Chun et al., 2021 - Google Patents
Comparison of CNN-based speech dereverberation using neural vocoder
- Document ID
- 8547262815816157674
- Author
- Chun C
- Jeon K
- Leem C
- Lee B
- Choi W
- Publication year
- 2021
- Publication venue
- 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)
Snippet
Reverberation degrades speech quality and intelligibility, particularly for hearing-impaired people. In an automatic speech recognition (ASR) system, a dereverberation technique, which removes reverberation, is widely employed as a pre-processing to …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
Similar Documents
| Publication | Title |
|---|---|
| Zhao et al. | Monaural speech dereverberation using temporal convolutional networks with self attention |
| Luo et al. | Real-time single-channel dereverberation and separation with time-domain audio separation network |
| Yuliani et al. | Speech enhancement using deep learning methods: A review |
| JP7258182B2 | Speech processing method, device, electronic device and computer program |
| Han et al. | Learning spectral mapping for speech dereverberation and denoising |
| Zhao et al. | A two-stage algorithm for noisy and reverberant speech enhancement |
| CN109887489B | Speech dereverberation method based on deep features of a generative adversarial network |
| Zhao et al. | Late reverberation suppression using recurrent neural networks with long short-term memory |
| CN113571047B | A method, device and equipment for processing audio data |
| Kothapally et al. | SkipConvGAN: Monaural speech dereverberation using generative adversarial networks via complex time-frequency masking |
| Ratnarajah et al. | TS-RIR: Translated synthetic room impulse responses for speech augmentation |
| Adiga et al. | Speech enhancement for noise-robust speech synthesis using Wasserstein GAN |
| US12148442B2 | Signal processing device and signal processing method |
| Delfarah et al. | Deep learning for talker-dependent reverberant speaker separation: An empirical study |
| Li et al. | Single-channel speech dereverberation via generative adversarial training |
| CN118212929A | A personalized Ambisonics speech enhancement method |
| Abel et al. | Novel two-stage audiovisual speech filtering in noisy environments |
| Close et al. | Non-intrusive intelligibility predictor for hearing-impaired individuals using self-supervised speech representations |
| Chun et al. | Comparison of CNN-based speech dereverberation using neural vocoder |
| Zhao et al. | Multi-resolution convolutional residual neural networks for monaural speech dereverberation |
| Delfarah et al. | Recurrent neural networks for cochannel speech separation in reverberant environments |
| Li et al. | On loss functions for deep-learning based T60 estimation |
| CN116648747A | Device and method for providing a processed audio signal, and device and method for providing neural network parameters |
| Nasir et al. | Noise reduction techniques for enhancing speech |
| Kashani et al. | Speech enhancement via deep spectrum image translation network |