Gramaglia, 2014 - Google Patents
A binary auditory words model for audio content identificationGramaglia, 2014
View PDF- Document ID
- 1040819657232560156
- Author
- Gramaglia A
- Publication year
External Links
Snippet
Abstract An Audio Content Identification method is presented, that uses Local Binary Descriptors and Machine Learning techniques to build an audio fingerprinting model based on “auditory words” inspired to the “visual words” model used for image recognition. This …
- 238000000034 method 0 abstract description 22
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30778—Audio database index structures and management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30755—Query formulation specially adapted for audio data retrieval
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G06K9/4671—Extracting features based on salient regional features, e.g. Scale Invariant Feature Transform [SIFT] keypoints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cano et al. | A review of algorithms for audio fingerprinting | |
Ke et al. | Computer vision for music identification | |
Casey et al. | Analysis of minimum distances in high-dimensional musical spaces | |
EP2791935B1 (en) | Low complexity repetition detection in media data | |
US20130275421A1 (en) | Repetition Detection in Media Data | |
CN103729368B (en) | A kind of robust audio recognition methods based on local spectrum iamge description | |
Zhang et al. | SIFT-based local spectrogram image descriptor: a novel feature for robust music identification | |
Seyerlehner et al. | Frame level audio similarity-a codebook approach | |
Banuroopa et al. | MFCC based hybrid fingerprinting method for audio classification through LSTM | |
Wu et al. | A feature selection algorithm of music genre classification based on ReliefF and SFS | |
Williams et al. | Efficient music identification using ORB descriptors of the spectrogram image | |
Yan et al. | Detection of speech smoothing on very short clips | |
Agarwaal et al. | Robust and lightweight audio fingerprint for Automatic Content Recognition | |
Thiruvengatanadhan | Music Classification using MFCC and SVM | |
Tsai et al. | Known-artist live song identification using audio hashprints | |
Guzman-Zavaleta et al. | A robust audio fingerprinting method using spectrograms saliency maps | |
KR100733145B1 (en) | Fingerprint generation method and audio fingerprinting system based on normalized spectral subband center point | |
Amatov et al. | A semi-supervised deep learning approach to dataset collection for query-by-humming task | |
Saravanos et al. | Audio-fingerprinting via dictionary learning | |
Zanoni et al. | Efficient music identification approach based on local spectrogram image descriptors | |
Gramaglia | A binary auditory words model for audio content identification | |
Fan et al. | Notice of violation of ieee publication principles: A music identification system based on audio fingerprint | |
Wijesena et al. | Metric Learning with Sequence-to-sequence Autoencoder for Content-based Music Identification | |
Han et al. | A filtering method for audio fingerprint based on multiple measurements | |
US20220335082A1 (en) | Method for audio track data retrieval, method for identifying audio clip, and mobile device |