Nitanda et al., 2006 - Google Patents
Audio signal segmentation and classification using fuzzy c‐means clustering
- Document ID
- 10256460057148240164
- Author
- Nitanda N
- Haseyama M
- Kitajima H
- Publication year
- 2006
- Publication venue
- Systems and Computers in Japan
Snippet
This paper proposes a method for segmenting and classifying audio signals that are coded with MPEG Audio. The proposed method first detects the boundaries between two different audio signals, which are called audio‐cuts, and then classifies segments, which are …
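The snippet is truncated and does not say how the clustering is applied, so the following is only a minimal sketch of the standard fuzzy c-means update rules that the title names, not the authors' method. The function name `fuzzy_c_means`, its parameters, and the two-dimensional toy feature vectors are assumptions made for illustration; nothing here reflects the features the paper extracts from MPEG Audio data.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length feature vectors.

    Hypothetical helper, not taken from the paper. Requires m > 1.
    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])

        # Membership update from squared distances to the new centers.
        max_change = 0.0
        for i, p in enumerate(points):
            d2 = [sum((p[d] - c[d]) ** 2 for d in range(dim)) for c in centers]
            zeros = [k for k, v in enumerate(d2) if v == 0.0]
            if zeros:
                # The point sits exactly on one or more centers.
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(n_clusters)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once no membership moved by more than the tolerance.
        if max_change < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Made-up 2-D feature vectors forming two well-separated groups.
    toy = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
           [5.0, 5.1], [5.1, 5.0], [5.05, 5.05]]
    centers, memberships = fuzzy_c_means(toy, n_clusters=2)
    for point, row in zip(toy, memberships):
        print(point, [round(v, 3) for v in row])
```

Unlike hard clustering, each vector receives a graded membership in every cluster, so an ambiguous segment can be kept as partly one class and partly another rather than being forced into a single label.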
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lu et al. | | Content analysis for audio classification and segmentation |
US7263485B2 (en) | | Robust detection and classification of objects in audio using limited training data |
US20050131688A1 (en) | | Apparatus and method for classifying an audio signal |
Li et al. | | Classification of general audio data for content-based retrieval |
Lu et al. | | Speaker change detection and tracking in real-time news broadcasting analysis |
US8635065B2 (en) | | Apparatus and method for automatic extraction of important events in audio signals |
US6434520B1 (en) | | System and method for indexing and querying audio archives |
US20050160457A1 (en) | | Annotating programs for automatic summary generations |
KR20030070179A (en) | | Method of the audio stream segmantation |
US8838452B2 (en) | | Effective audio segmentation and classification |
Jiang et al. | | Video segmentation with the support of audio segmentation and classification |
Rosenberg et al. | | Speaker detection in broadcast speech databases |
Jiang et al. | | Video segmentation with the assistance of audio content analysis |
Kim et al. | | Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation |
US20050160449A1 (en) | | Apparatus and method for automatic dissection of segmented audio signals |
US7680654B2 (en) | | Apparatus and method for segmentation of audio data into meta patterns |
Nitanda et al. | | Audio signal segmentation and classification using fuzzy c‐means clustering |
JP2001147697A (en) | | Acoustic data analysis method and apparatus |
US7340398B2 (en) | | Selective sampling for sound signal classification |
EP1542206A1 (en) | | Apparatus and method for automatic classification of audio signals |
Magrin-Chagnolleau et al. | | Detection of target speakers in audio databases |
JP2010038943A (en) | | Sound signal processing device and method |
Nitanda et al. | | Accurate audio-segment classification using feature extraction matrix |
Imoto et al. | | Acoustic scene classification based on generative model of acoustic spatial words for distributed microphone array |
Nguyen et al. | | Audio segmentation and classification using a temporally weighted fuzzy c-means algorithm |