Eskimez et al., 2022 - Google Patents

Real-time joint personalized speech enhancement and acoustic echo cancellation with e3net

Document ID
5485441688547315926
Author
Eskimez S
Yoshioka T
Ju A
Tang M
Parnamaa T
Wang H
Publication year
2022
Publication venue
arXiv preprint arXiv:2211.02773

Snippet

Personalized speech enhancement (PSE), a process of estimating a clean target speech signal in real time by leveraging a speaker embedding vector of the target talker, has garnered much attention from the research community due to the recent surge of online …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Interconnection arrangements not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for suppressing echoes or otherwise conditioning for one or other direction of traffic
    • H04M9/082 Two-way loud-speaking telephone systems with means for suppressing echoes or otherwise conditioning for one or other direction of traffic using echo cancellers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B3/00 Line transmission systems
    • H04B3/02 Details
    • H04B3/20 Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other
    • H04B3/23 Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other using a replica of transmitted signal in the time domain, e.g. echo cancellers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00

Similar Documents

Publication Publication Date Title
Zhang et al. Deep Learning for Joint Acoustic Echo and Noise Cancellation with Nonlinear Distortions
Westhausen et al. Acoustic echo cancellation with the dual-signal transformation LSTM network
Zhang et al. Deep learning for acoustic echo cancellation in noisy and double-talk scenarios
Valin et al. Low-complexity, real-time joint neural echo control and speech enhancement based on percepnet
Zhang et al. FT-LSTM based complex network for joint acoustic echo cancellation and speech enhancement
US9269368B2 (en) Speaker-identification-assisted uplink speech processing systems and methods
Cutler et al. INTERSPEECH 2021 Acoustic Echo Cancellation Challenge
Carbajal et al. Multiple-input neural network-based residual echo suppression
TWI463488B (en) Echo suppression comprising modeling of late reverberation components
Zhang et al. Multi-task deep residual echo suppression with echo-aware loss
Zhang et al. Deep adaptive AEC: Hybrid of deep learning and adaptive acoustic echo cancellation
US20240105199A1 (en) Learning method based on multi-channel cross-tower network for jointly suppressing acoustic echo and background noise
Yen et al. Adaptive co-channel speech separation and recognition
Howard et al. A neural acoustic echo canceller optimized using an automatic speech recognizer and large scale synthetic data
Lei et al. Deep neural network based regression approach for acoustic echo cancellation
Peng et al. ICASSP 2021 acoustic echo cancellation challenge: Integrated adaptive echo cancellation with time alignment and deep learning-based residual echo plus noise suppression
Shu et al. Joint echo cancellation and noise suppression based on cascaded magnitude and complex mask estimation
Eskimez et al. Real-time joint personalized speech enhancement and acoustic echo cancellation with e3net
Eskimez et al. Real-time joint personalized speech enhancement and acoustic echo cancellation
Ivry et al. Objective metrics to evaluate residual-echo suppression during double-talk
Li et al. Deep multi-task cascaded acoustic echo cancellation and noise suppression
Cui et al. Multi-scale refinement network based acoustic echo cancellation
Srinivasan Using a remote wireless microphone for speech enhancement in non-stationary noise
Sun et al. Time-frequency complex mask network for echo cancellation and noise suppression
Xiong et al. Deep subband network for joint suppression of echo, noise and reverberation in real-time fullband speech communication