Chun et al., 2021

Comparison of CNN-based speech dereverberation using neural vocoder

Document ID: 8547262815816157674
Authors: Chun C, Jeon K, Leem C, Lee B, Choi W
Publication year: 2021
Publication venue: 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)

Snippet

Reverberation degrades speech quality and intelligibility, particularly for hearing-impaired people. In an automatic speech recognition (ASR) system, a dereverberation technique, which removes reverberation, is widely employed as a pre-processing to …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general

Similar Documents

Publication, Title
Zhao et al. Monaural speech dereverberation using temporal convolutional networks with self attention
Luo et al. Real-time single-channel dereverberation and separation with time-domain audio separation network
Yuliani et al. Speech enhancement using deep learning methods: A review
JP7258182B2 (en) Speech processing method, device, electronic device and computer program
Han et al. Learning spectral mapping for speech dereverberation and denoising
Zhao et al. A two-stage algorithm for noisy and reverberant speech enhancement
CN109887489B (en) Speech dereverberation method based on depth features for generating countermeasure network
Zhao et al. Late reverberation suppression using recurrent neural networks with long short-term memory
CN113571047B (en) A method, device and equipment for processing audio data
Kothapally et al. SkipConvGAN: Monaural speech dereverberation using generative adversarial networks via complex time-frequency masking
Ratnarajah et al. Ts-rir: Translated synthetic room impulse responses for speech augmentation
Adiga et al. Speech Enhancement for Noise-Robust Speech Synthesis Using Wasserstein GAN
US12148442B2 (en) Signal processing device and signal processing method
Delfarah et al. Deep learning for talker-dependent reverberant speaker separation: An empirical study
Li et al. Single-channel speech dereverberation via generative adversarial training
CN118212929A (en) A personalized Ambisonics speech enhancement method
Abel et al. Novel two-stage audiovisual speech filtering in noisy environments
Close et al. Non intrusive intelligibility predictor for hearing impaired individuals using self supervised speech representations
Chun et al. Comparison of CNN-based speech dereverberation using neural vocoder
Zhao et al. Multi-resolution convolutional residual neural networks for monaural speech dereverberation
Delfarah et al. Recurrent neural networks for cochannel speech separation in reverberant environments
Li et al. On loss functions for deep-learning based T60 estimation
CN116648747A (en) Device for providing a processed audio signal, method for providing a processed audio signal, device for providing neural network parameters and method for providing neural network parameters
Nasir et al. Noise Reduction Techniques for Enhancing Speech
Kashani et al. Speech enhancement via deep spectrum image translation network