Krusche, 2021 - Google Patents

Visualization and auralization of features learned by neural networks for musical instrument recognition

Document ID: 18169282700840059244
Author: Krusche A
Publication year: 2021
Publication venue: PQDT-Global

Snippet

In computer vision, a number of feature visualization techniques have been developed to make convolutional networks more interpretable. For audio classification, those methods are used as well but have not been investigated as extensively. This thesis picks up on that and investigates …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/215 Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
    • G10H2250/235 Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]

Similar Documents

Publication Publication Date Title
Klapuri Automatic music transcription as we know it today
US20110225196A1 (en) Moving image search device and moving image search program
Hu et al. Separation of singing voice using nonnegative matrix partial co-factorization for singer identification
DE10123366C1 (en) Device for analyzing an audio signal for rhythm information
Durrieu et al. Main instrument separation from stereophonic audio signals using a source/filter model
Sainburg et al. Noisereduce: Domain general noise reduction for time series signals
DE112022006903T5 (en) METHOD FOR PROCESSING INFORMATION, INFORMATION PROCESSING SYSTEM AND STORAGE MEDIUM
Rao Audio signal processing
Ullrich et al. Music transcription with convolutional sequence-to-sequence models
Krusche Visualization and auralization of features learned by neural networks for musical instrument recognition
Chen et al. Synthesis and Restoration of Traditional Ethnic Musical Instrument Timbres Based on Time-Frequency Analysis.
Hashemi et al. Persian music source separation in audio-visual data using deep learning
Papadopoulos Music-content-adaptive robust principal component analysis for a semantically consistent separation of foreground and background in music audio signals
Lewis et al. Knowledge discovery-based identification of musical pitches and instruments in polyphonic sounds
Akman et al. Audio explanation synthesis with generative foundation models
BASTIANELLO Sound generation using GAN Models
Battenberg Techniques for machine understanding of live drum performances
Kumar et al. Machine learning for audio processing: From feature extraction to model selection
dos Santos Moura et al. Source Extraction based on Binary Masking and Machine Learning
Parsons et al. Effects of Prosodic Information on Dialect Classification Using Whisper Features
Paulino et al. Analysis of Frequency Range Effect on the Detection of Voice Disorder Using Convolutional Neural Networks Trained on Spectogram Images
EP1743324B1 (en) Device and method for analysing an information signal
Kadi et al. Real-Time Musical Instruments Recognition for Scenography Purposes
Serdyuk et al. Information technologies of neural network speech recognition in real-time