Gramaglia, 2014 - Google Patents

A binary auditory words model for audio content identification

Document ID: 1040819657232560156
Author: Gramaglia A
Publication year: 2014

Snippet

Abstract: An Audio Content Identification method is presented that uses Local Binary Descriptors and Machine Learning techniques to build an audio fingerprinting model based on “auditory words”, inspired by the “visual words” model used for image recognition. This …
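The "auditory words" idea can be illustrated with a toy example. The sketch below is not the author's method, whose details are not given in this snippet; it only shows the generic bag-of-words pattern the abstract alludes to, assuming binary descriptors are stored as integers and quantized to the nearest codebook entry by Hamming distance. The codebook, the descriptor values, and all function names are hypothetical.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def nearest_word(descriptor: int, codebook: list[int]) -> int:
    """Index of the codebook entry closest to the descriptor (Hamming distance)."""
    return min(range(len(codebook)), key=lambda i: hamming(descriptor, codebook[i]))

def word_histogram(descriptors: list[int], codebook: list[int]) -> Counter:
    """Count how often each 'word' index occurs among the descriptors."""
    return Counter(nearest_word(d, codebook) for d in descriptors)

# Hypothetical 8-bit codebook ("auditory words") and descriptors,
# chosen only for illustration.
codebook = [0b00000000, 0b11110000, 0b00001111, 0b11111111]
descriptors = [0b00000001, 0b11110001, 0b11100000, 0b00001110, 0b11111110]

histogram = word_histogram(descriptors, codebook)
print(dict(histogram))
```

In a real system the codebook would be learned from training data and the descriptors extracted from a time-frequency representation of the audio; both steps are outside the scope of this sketch.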

Continue reading at www.audioneex.com (PDF) (other versions)

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30778—Audio database index structures and management thereof
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30755—Query formulation specially adapted for audio data retrieval
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G06K9/4671—Extracting features based on salient regional features, e.g. Scale Invariant Feature Transform [SIFT] keypoints
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building

Similar Documents

Publication	Publication Date	Title
Cano et al.	2002	A review of algorithms for audio fingerprinting
Ke et al.	2005	Computer vision for music identification
Casey et al.	2008	Analysis of minimum distances in high-dimensional musical spaces
EP2791935B1 (en)	2016-03-09	Low complexity repetition detection in media data
US20130275421A1 (en)	2013-10-17	Repetition Detection in Media Data
CN103729368B (en)	2016-12-21	A kind of robust audio recognition methods based on local spectrum image description
Zhang et al.	2015	SIFT-based local spectrogram image descriptor: a novel feature for robust music identification
Seyerlehner et al.	2008	Frame level audio similarity-a codebook approach
Banuroopa et al.	2021	MFCC based hybrid fingerprinting method for audio classification through LSTM
Wu et al.	2015	A feature selection algorithm of music genre classification based on ReliefF and SFS
Williams et al.	2017	Efficient music identification using ORB descriptors of the spectrogram image
Yan et al.	2019	Detection of speech smoothing on very short clips
Agarwaal et al.	2023	Robust and lightweight audio fingerprint for Automatic Content Recognition
Thiruvengatanadhan	2018	Music Classification using MFCC and SVM
Tsai et al.	2017	Known-artist live song identification using audio hashprints
Guzman-Zavaleta et al.	2014	A robust audio fingerprinting method using spectrograms saliency maps
KR100733145B1 (en)	2007-06-27	Fingerprint generation method and audio fingerprinting system based on normalized spectral subband center point
Amatov et al.	2023	A semi-supervised deep learning approach to dataset collection for query-by-humming task
Saravanos et al.	2020	Audio-fingerprinting via dictionary learning
Zanoni et al.	2017	Efficient music identification approach based on local spectrogram image descriptors
Gramaglia	2014	A binary auditory words model for audio content identification
Fan et al.	2016	Notice of violation of IEEE publication principles: A music identification system based on audio fingerprint
Wijesena et al.	2024	Metric Learning with Sequence-to-sequence Autoencoder for Content-based Music Identification
Han et al.	2018	A filtering method for audio fingerprint based on multiple measurements
US20220335082A1 (en)	2022-10-20	Method for audio track data retrieval, method for identifying audio clip, and mobile device