Audio and Multimedia Processing in Digital Libraries

Document ID: 5162803211236080723
Author: Elhilali M

Snippet

As digital libraries grow rapidly, it becomes important to develop technologies for audio and multimedia processing, because many of the archives and resources in use today take the form of audio and speech recordings. This paper …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal

Similar Documents

Publication	Publication Date	Title
Hansen et al.	2005	Speechfind: Advances in spoken document retrieval for a national gallery of the spoken word
US6697564B1 (en)	2004-02-24	Method and system for video browsing and editing by employing audio
US6714909B1 (en)	2004-03-30	System and method for automated multimedia content indexing and retrieval
Morgan et al.	2001	The meeting project at ICSI
Kim et al.	2006	MPEG-7 audio and beyond: Audio content indexing and retrieval
US6490553B2 (en)	2002-12-03	Apparatus and method for controlling rate of playback of audio data
JP2007519987A (en)	2007-07-19	Integrated analysis system and method for internal and external audiovisual data
US20060065102A1 (en)	2006-03-30	Summarizing digital audio data
Arons	1994	Interactively skimming recorded speech
Yu et al.	2000	New developments in automatic meeting transcription
Shao et al.	2006	Automatic summarization of music videos
CN100538696C (en)	2009-09-09	Systems and methods for integrated analysis of intrinsic and extrinsic audiovisual data
de Jong et al.	2006	Automated speech and audio analysis for semantic access to multimedia
Melih et al.	1998	Audio retrieval using perceptually based structures
Cano et al.	2005	Nearest-neighbor automatic sound annotation with a WordNet taxonomy
Gerhard	2003	Computationally measurable differences between speech and song
Xie	2008	Discovering salient prosodic cues and their interactions for automatic story segmentation in Mandarin broadcast news
Ordelman et al.	2006	Exploration of audiovisual heritage using audio indexing technology
De Poli et al.	2006	From audio to content
Jones et al.	1995	Video mail retrieval using voice: an overview of the Stage 2 system
de Vries	1995	Multimedia information access
de Jong et al.	2006	The role of automated speech and audio analysis in semantic multimedia annotation
Chowdhury et al.	2022	AVCL: Audio Video clustering for learning Conversation labeling using Neural Network and NLP
Hauptmann et al.	2001	Beyond the Informedia digital video library: video and audio analysis for remembering conversations