Nitanda et al., 2006

Audio signal segmentation and classification using fuzzy c‐means clustering

Document ID: 10256460057148240164
Author: Nitanda N; Haseyama M; Kitajima H
Publication year: 2006
Publication venue: Systems and Computers in Japan

Snippet

This paper proposes a method for segmenting and classifying audio signals that are coded by MPEG Audio. The proposed method first detects the boundaries between two different audio signals, called audio‐cuts, and then classifies segments, which are …
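
The snippet names fuzzy c‐means clustering but does not show how the paper applies it. As a rough illustration only, the following is a minimal sketch of the generic fuzzy c‐means algorithm in plain Python. It is not the authors' method: the function name `fuzzy_c_means`, its parameters, and the toy two-dimensional feature vectors are all assumptions made for this example, and the MPEG Audio feature extraction and audio‐cut detection steps are not covered.

```python
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Generic fuzzy c-means on a list of equal-length feature vectors.

    Hypothetical helper for illustration; `m` is the fuzzifier (m > 1).
    Returns (centers, memberships), where memberships[i][k] is the degree
    to which point i belongs to cluster k.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Membership update from Euclidean distances to each center.
        max_change = 0.0
        for i, p in enumerate(points):
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_row = []
            for k in range(n_clusters):
                denom = sum((dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                new_row.append(1.0 / denom)
            max_change = max(
                max_change, max(abs(a - b) for a, b in zip(new_row, u[i]))
            )
            u[i] = new_row

        # Stop once memberships have effectively stopped moving.
        if max_change < tol:
            break

    return centers, u


# Toy feature vectors standing in for per-segment audio features (assumed data).
features = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
            [5.0, 5.1], [5.1, 5.0], [4.95, 5.05]]
centers, memberships = fuzzy_c_means(features, n_clusters=2)
# Hard labels: pick the cluster with the highest membership for each segment.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
```

Unlike hard k-means, each segment keeps a graded membership in every cluster, so a segment that mixes two kinds of audio can be marked as ambiguous rather than forced into one class.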

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation

Similar Documents

Publication	Publication Date	Title
Lu et al.	2002	Content analysis for audio classification and segmentation
US7263485B2 (en)	2007-08-28	Robust detection and classification of objects in audio using limited training data
US20050131688A1 (en)	2005-06-16	Apparatus and method for classifying an audio signal
Li et al.	2001	Classification of general audio data for content-based retrieval
Lu et al.	2002	Speaker change detection and tracking in real-time news broadcasting analysis
US8635065B2 (en)	2014-01-21	Apparatus and method for automatic extraction of important events in audio signals
US6434520B1 (en)	2002-08-13	System and method for indexing and querying audio archives
US20050160457A1 (en)	2005-07-21	Annotating programs for automatic summary generations
KR20030070179A (en)	2003-08-29	Method of the audio stream segmantation
US8838452B2 (en)	2014-09-16	Effective audio segmentation and classification
Jiang et al.	2000	Video segmentation with the support of audio segmentation and classification
Rosenberg et al.	1998	Speaker detection in broadcast speech databases
Jiang et al.	2000	Video segmentation with the assistance of audio content analysis
Kim et al.	2004	Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation
US20050160449A1 (en)	2005-07-21	Apparatus and method for automatic dissection of segmented audio signals
US7680654B2 (en)	2010-03-16	Apparatus and method for segmentation of audio data into meta patterns
Nitanda et al.	2006	Audio signal segmentation and classification using fuzzy c‐means clustering
JP2001147697A (en)	2001-05-29	Acoustic data analysis method and apparatus
US7340398B2 (en)	2008-03-04	Selective sampling for sound signal classification
EP1542206A1 (en)	2005-06-15	Apparatus and method for automatic classification of audio signals
Magrin-Chagnolleau et al.	1999	Detection of target speakers in audio databases
JP2010038943A (en)	2010-02-18	Sound signal processing device and method
Nitanda et al.	2005	Accurate audio-segment classification using feature extraction matrix
Imoto et al.	2017	Acoustic scene classification based on generative model of acoustic spatial words for distributed microphone array
Nguyen et al.	2011	Audio segmentation and classification using a temporally weighted fuzzy c-means algorithm