Eskimez et al., 2022 - Google Patents
Real-time joint personalized speech enhancement and acoustic echo cancellation with e3net
- Document ID
- 5485441688547315926
- Author
- Eskimez S
- Yoshioka T
- Ju A
- Tang M
- Parnamaa T
- Wang H
- Publication year
- 2022
- Publication venue
- arXiv preprint arXiv:2211.02773
Snippet
Personalized speech enhancement (PSE), a process of estimating a clean target speech signal in real time by leveraging a speaker embedding vector of the target talker, has garnered much attention from the research community due to the recent surge of online …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Interconnection arrangements not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for suppressing echoes or otherwise conditioning for one or other direction of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for suppressing echoes or otherwise conditioning for one or other direction of traffic using echo cancellers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B3/00—Line transmission systems
- H04B3/02—Details
- H04B3/20—Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other
- H04B3/23—Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other using a replica of transmitted signal in the time domain, e.g. echo cancellers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Zhang et al. | Deep Learning for Joint Acoustic Echo and Noise Cancellation with Nonlinear Distortions | |
Westhausen et al. | Acoustic echo cancellation with the dual-signal transformation LSTM network | |
Zhang et al. | Deep learning for acoustic echo cancellation in noisy and double-talk scenarios | |
Valin et al. | Low-complexity, real-time joint neural echo control and speech enhancement based on percepnet | |
Zhang et al. | FT-LSTM based complex network for joint acoustic echo cancellation and speech enhancement | |
US9269368B2 (en) | Speaker-identification-assisted uplink speech processing systems and methods | |
Cutler et al. | INTERSPEECH 2021 Acoustic Echo Cancellation Challenge | |
Carbajal et al. | Multiple-input neural network-based residual echo suppression | |
TWI463488B (en) | Echo suppression comprising modeling of late reverberation components | |
Zhang et al. | Multi-task deep residual echo suppression with echo-aware loss | |
Zhang et al. | Deep adaptive AEC: Hybrid of deep learning and adaptive acoustic echo cancellation | |
US20240105199A1 (en) | Learning method based on multi-channel cross-tower network for jointly suppressing acoustic echo and background noise | |
Yen et al. | Adaptive co-channel speech separation and recognition | |
Howard et al. | A neural acoustic echo canceller optimized using an automatic speech recognizer and large scale synthetic data | |
Lei et al. | Deep neural network based regression approach for acoustic echo cancellation | |
Peng et al. | ICASSP 2021 acoustic echo cancellation challenge: Integrated adaptive echo cancellation with time alignment and deep learning-based residual echo plus noise suppression | |
Shu et al. | Joint echo cancellation and noise suppression based on cascaded magnitude and complex mask estimation | |
Eskimez et al. | Real-time joint personalized speech enhancement and acoustic echo cancellation with e3net | |
Eskimez et al. | Real-time joint personalized speech enhancement and acoustic echo cancellation | |
Ivry et al. | Objective metrics to evaluate residual-echo suppression during double-talk | |
Li et al. | Deep multi-task cascaded acoustic echo cancellation and noise suppression | |
Cui et al. | Multi-scale refinement network based acoustic echo cancellation | |
Srinivasan | Using a remote wireless microphone for speech enhancement in non-stationary noise | |
Sun et al. | Time-frequency complex mask network for echo cancellation and noise suppression | |
Xiong et al. | Deep subband network for joint suppression of echo, noise and reverberation in real-time fullband speech communication |