WO2001019062A1 - Suppression of residual acoustic echo - Google Patents

Info

Publication number
WO2001019062A1
Authority
WO
WIPO (PCT)
Prior art keywords
echo
signal
residual
residual echo
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IL2000/000527
Other languages
French (fr)
Inventor
Adoram Erell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DSPC Technologies Ltd
Original Assignee
DSPC Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DSPC Technologies Ltd
Priority to GB0203162A (patent GB2369001A)
Priority to DE10084970T (patent DE10084970T1)
Priority to AU68622/00A (patent AU6862200A)
Publication of WO2001019062A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M9/00: Arrangements for interconnection not involving centralised switching
    • H04M9/08: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers

Landscapes

  • Engineering & Computer Science
  • Signal Processing
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo

Abstract

A communication device having residual echo suppression includes an echo canceller (22) and a residual echo system (50, 52). The echo canceller cancels (30) most of an echo of a far end signal found in a microphone signal and provides an echo canceled signal with a residual echo therein. The residual echo system suppresses (60) the residual echo when the echo canceller cancels a significant amount of echo. The residual echo system can either transmit (42) comfort noise or can transmit (38) the echo canceled signal at a low rate when the echo canceller cancels a significant amount of echo. If the former, the comfort noise parameters are updated (40) from background noise but not when the echo canceller cancels a significant amount of echo.

Description

SUPPRESSION OF RESIDUAL ACOUSTIC ECHO
FIELD OF THE INVENTION
The present invention relates to echo cancellation generally and to the cancellation of residual echoes, in particular.
BACKGROUND OF THE INVENTION
When a telephone operates in hands-free mode (i.e. with the speaker open), the user speaks into the microphone and hears the voice of his conversant through the loudspeaker. Unfortunately, in duplex transmission (both sides can talk at the same time), the microphone also "hears" the output of the loudspeaker.
This acoustic coupling between the near end signal (the signal picked up by the microphone) and the far end signal (as reproduced by the loudspeaker) creates an unpleasant echo. Echo cancellers are used to eliminate the echo due to the acoustic coupling.
Fig. 1, to which reference is now made, illustrates the situation for a digital telephone. Fig. 1 shows a telephone having a loudspeaker 10 and a microphone 12. A frame of the far end signal, which eventually is provided to loudspeaker 10, is first decoded by a decoder 14, after which its volume is adjusted by digital volume control 16. The resultant signal is converted by a codec 18 to an analog signal whose volume is further adjusted by analog volume control 20. As mentioned above, microphone 12 picks up both the near end signal, as spoken by the user, and the output of loudspeaker 10. This is noted by the arrow labeled "acoustic echo". The output of microphone 12 is digitally converted by codec 18, formed into a frame, and provided to an echo canceller 22 which also receives the frame of the far end signal as produced by digital volume control 16. Echo canceller 22 uses the received signal in order to remove the portion of the microphone signal which is an echo of the far end signal. The result is passed to an encoder 26 for transmission.
The telephone of Fig. 1 operates in discontinuous transmission mode (DTX) as shown in Fig. 2, to which reference is now made. The echo cancellation (step 30) is followed by voice activity detection (step 32). If speech was found (as checked in step 34), the frame is encoded (step 36) and then transmitted (step 38). If there was no speech, parameters of "comfort noise" are transmitted so that the receiver can generate a comfort noise frame. The parameters are first updated (step 40) from the current, non-speech frame, and then the parameters are transmitted (step 42). The comfort noise parameters are calculated from the background noise and are continually updated from each background noise frame. However, it takes a few frames until the telephone has sufficient knowledge of the background noise. Thus, for the first few frames of operation the transmitter transmits as if the signal had speech content, even if the signal was one of non-speech.
Echo cancellation involves subtracting a synthesized estimate of the echo, where the synthesized echo is estimated by adaptive filtering techniques. Since the estimated echo is never an exact replica of the true echo, the subtraction is usually followed by some kind of further suppression of the residual echo. Methods originally designed for analog communication, such as those required by the G.167 standard of the Telecommunication Industry Association (TIA), use center clipping to suppress the low-level residual echo. A review of the pros and cons of center clipping can be found in PCT Publication WO 99/03093. Generally, the disadvantage of center clipping is that it is a relatively "brutal" processing of the signal, creating discontinuities and large variations in the background noise level and overall sound quality.
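As a rough illustration of why center clipping is described as "brutal", the following is a minimal sketch of one common form of center clipper. The text does not specify which clipping variant is meant, so the function name `center_clip`, the hard-zeroing rule and the threshold value are all assumptions made for illustration.

```python
def center_clip(samples, threshold):
    """Zero every sample whose magnitude is below `threshold`.

    Samples at or above the threshold pass through unchanged, so
    low-level residual echo is removed together with any low-level
    background noise, which produces the abrupt level changes noted
    in the text.
    """
    return [0.0 if abs(x) < threshold else x for x in samples]


# Hypothetical frame: small values stand for residual echo or noise.
frame = [0.02, -0.5, 0.01, 0.3, -0.04]
clipped = center_clip(frame, threshold=0.05)
```

Because everything below the threshold is zeroed, the background noise disappears and reappears from sample to sample, which is the discontinuity the comfort noise approaches try to avoid.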
To overcome some of these shortcomings, WO 99/03093 suggests a smooth addition of synthetic comfort noise to the center-clipped signal. Another method, which totally replaces the residual echo by synthetic noise, is presented in US Patent 5,687,229. A disadvantage of the noise-generation method is that it requires a relatively complex signal processing module to continuously monitor the background noise and to generate a faithful comfort noise to replace or mask the residual echo.
For a digital communication network, comfort noise is commonly used for the purpose of discontinuous transmission (DTX) as explained hereinabove with respect to Fig. 2. In DTX, when there is no speech in the up link channel (to the mobile station), the base station generates comfort noise based on noise information sent from the mobile station. Thus, PCT Publication WO 96/42142 describes a method where the already existing comfort-noise mechanism is extended to an echo-suppressor device in the base station. In this arrangement, the comfort noise generation extends, in addition to non-speech periods, also to periods of residual echo. The method presented in WO 96/42142 is similar to those of US Patent 5,222,251 and US Patent 5,835,851, both of which were formulated for the older half-duplex systems. The method in WO 96/42142 involves echo-suppression in the base station rather than in the mobile station. In US 5,222,251, on the other hand, the full echo, rather than the residual echo, is replaced by one or more "noise codewords". These "noise codewords" can be thought of as comfort noise. In US 5,835,851 the "noise codewords" are variably mixed with the echo rather than replacing it, to avoid cutting of the near-end speech.
None of WO 96/42142, US 5,222,251 and US 5,835,851 describes a full solution for the elimination of residual noise in the mobile station. For example, the criterion used in US 5,222,251 and US 5,835,851 to decide when to replace the residual echo by comfort noise is problematic for a residual-echo suppressor in the handset. In US 5,222,251 and US 5,835,851, which aim at half-duplex systems, the criterion for replacing or mixing-in generated-noise is the existence of a signal in the down-link channel, which means that an echo exists in the up-link channel. This criterion cannot be used for full-duplex communication because echo can be accompanied by near-end speech in a double-talk situation.
The criterion presented in WO 96/42142 for the base station can, in principle, be extended also to the mobile station in a full-duplex conversation. However, this criterion has several drawbacks, which will be explained herein. In WO 96/42142, the decision to process the residual echo after echo cancellation is based on the mutual fulfillment of two conditions:
1. Speech activity is detected in the down-link channel.
2. There is no speech activity in the up-link channel.
The criterion creates two problems:
1. Condition (2) is difficult to detect, because it is difficult to discriminate between echo-only and double-talk (i.e., echo plus near-end speech). Thus a situation can occur in which the near-end speech is erroneously replaced by comfort noise, or vice-versa: the echo is not replaced by comfort-noise and is therefore not suppressed.
2. Even if a double-talk situation is properly detected, during the double-talk period, the residual echo is not suppressed. During the low-energy segments of the near-end speech, the residual echo is not masked by the near-end speech and can be disturbing.
Another issue is that the method of WO 96/42142 can be implemented only for a digital network with DTX. However, the new generation of CDMA networks does not use DTX, but rather a variable rate method. In the variable rate method there is no comfort-noise generator in the base station, so residual echo cannot be as easily replaced by comfort noise.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the appended drawings in which:
Fig. 1 is a schematic illustration of a prior art hands-free telephone;
Fig. 2 is a flow chart illustration of the operation of the telephone of Fig. 1 in hands-free mode;
Fig. 3 is a schematic illustration of a hands-free telephone capable of suppressing the residual echo, constructed and operative in accordance with a preferred embodiment of the present invention; and
Fig. 4 is a flow chart illustration of the operation of the telephone of Fig. 3 in hands-free mode.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
Reference is now made to Figs. 3 and 4, which illustrate a digital telephone capable of suppressing the residual echo, constructed and operative in accordance with a preferred embodiment of the present invention, and its method of operation, respectively. The telephone of Fig. 3 has elements similar to those of the prior art. These similar elements, labeled with similar reference numerals, will not be further described.
The present invention adds a residual echo detector 50 to the prior art telephone and replaces the encoder of the prior art with a residual echo suppressing encoder 52 that can operate with detector 50.
In accordance with a preferred embodiment of the present invention, residual echo detector 50 takes input signal IN and output signal OUT of echo canceller 22 and determines when a significant amount of echo has been cancelled, as discussed hereinbelow. When this occurs, detector 50 indicates to encoder 52 to suppress the echo in the frame.
Residual echo detector 50 calculates the following criterion:
$E_{\mathrm{IN}} / E_{\mathrm{OUT}} > \mathrm{ENR\_THRES}$    [1]
where $E_{\mathrm{IN}}$ and $E_{\mathrm{OUT}}$ are the estimated energies in the input IN and output OUT frames of echo canceller 22 and ENR_THRES is a threshold level, such as 10 dB. As explained hereinbelow, because echo canceller 22 can attenuate the echo but not the near-end signal, criterion [1] is fulfilled only when the microphone signal is predominately echo while, at the same time, a large part of this echo is cancelled in the output.
It is noted that all echo cancellers have adaptive filters which attempt to represent the echo transfer function. Unfortunately, they are initialized with some default values and they take some time, after an echo first appears, to converge to their optimal value. During this time they are not effective. Therefore, in accordance with a preferred embodiment of the present invention, detector 50 also enables suppression whenever the adaptive filter (not shown) of echo canceller 22 has not yet converged. For example, suppression can be enabled for the initial 80 samples (10 msec with a sampling rate of 8 kHz) of the adaptation. It may be safely assumed that during this very short initial interval, the near-end signal does not contain near-end speech, because double-talk (two sides talking at the same time) is not likely to occur at the very beginning of a conversation.
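The decision made by residual echo detector 50 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the frame energies are estimated, so a plain sum of squares is assumed, and the names `frame_energy`, `energy_ratio_db`, `should_suppress` and `samples_adapted` are hypothetical. The 10 dB threshold and the 80-sample convergence window are the example values given in the text.

```python
import math

ENR_THRES_DB = 10.0        # example threshold from the text ("such as 10 dB")
CONVERGENCE_SAMPLES = 80   # example from the text: 10 msec at 8 kHz


def frame_energy(frame):
    """Energy estimate of one frame (assumed here: sum of squared samples)."""
    return sum(x * x for x in frame)


def energy_ratio_db(in_frame, out_frame, eps=1e-12):
    """Ratio of input to output energy of the echo canceller, in dB.

    `eps` is an assumed guard against division by zero on silent frames.
    """
    ratio = (frame_energy(in_frame) + eps) / (frame_energy(out_frame) + eps)
    return 10.0 * math.log10(ratio)


def should_suppress(in_frame, out_frame, samples_adapted):
    """Return True when the frame should be treated as residual echo.

    Suppression is enabled unconditionally while the adaptive filter is
    assumed not to have converged, and otherwise whenever criterion [1]
    holds.
    """
    if samples_adapted < CONVERGENCE_SAMPLES:
        return True
    return energy_ratio_db(in_frame, out_frame) > ENR_THRES_DB
```

A frame whose energy drops by a factor of 100 across the canceller (20 dB) would be flagged, whereas a frame that passes through unchanged (0 dB) would not.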
In a digital network with discontinuous transmission (DTX), each frame for which the above criterion is fulfilled is declared by detector 50 to be of type "noise". In addition, the mechanism for updating the comfort noise variables in the encoder is inhibited. Thus, the encoder sends a comfort-noise indication with comfort-noise parameters of previous noise frames that did not contain echo. This is shown in Fig. 4, which illustrates the operation of the telephone in the present invention. Similar steps to those of the prior art have similar reference numerals.
In step 30, echo canceller 22 performs the echo cancellation. Afterward (step 60), residual echo detector 50 utilizes the above criterion to decide whether or not to suppress the echo. If not, the telephone continues as before, performing voice activity detection (step 32) and transmitting an encoded frame (step 38) if there was voice and comfort noise parameters (step 42) if there was no voice. If there is residual echo to be suppressed, encoder 52 first determines (step 62) if echo canceller 22 is before convergence. If it is, then encoder 52 generates (step 64) default comfort noise parameters and transmits them (step 42). If echo canceller 22 is after convergence, then encoder 52 utilizes the comfort noise parameters most recently calculated and transmits them (step 42).
It will be appreciated that, by utilizing the most recently calculated comfort noise parameters, the present invention does not update the comfort noise with data from a signal with too much residual echo. This ensures that the comfort noise calculation is based on background noise and not on echoed signals.
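The flow of Fig. 4 described above can be sketched as a single per-frame decision. This is a simplified illustration, not the encoder's actual implementation: the comfort noise parameter format, the default values, and the names `EncoderState`, `process_frame`, `estimate_cn` and `encode` are hypothetical, and the detector, convergence and voice activity decisions are passed in as booleans.

```python
from dataclasses import dataclass, field

# Hypothetical default comfort noise parameters (step 64).
DEFAULT_CN_PARAMS = {"level": 0.0}


@dataclass
class EncoderState:
    """Holds the most recently calculated comfort noise parameters."""
    cn_params: dict = field(default_factory=lambda: dict(DEFAULT_CN_PARAMS))


def process_frame(state, frame, suppress, converged, has_speech,
                  estimate_cn, encode):
    """Return a (kind, payload) pair describing what is transmitted."""
    if suppress:                                   # step 60
        if not converged:                          # step 62
            # Before convergence: send default parameters (steps 64, 42).
            return ("comfort_noise", dict(DEFAULT_CN_PARAMS))
        # After convergence: reuse stored parameters, with no update
        # from this echo-contaminated frame (step 42).
        return ("comfort_noise", dict(state.cn_params))
    if has_speech:                                 # steps 32, 34
        return ("speech", encode(frame))           # steps 36, 38
    state.cn_params = estimate_cn(frame)           # step 40
    return ("comfort_noise", dict(state.cn_params))  # step 42
```

The point of the sketch is that `state.cn_params` is written only on the non-suppressed, non-speech path, so frames flagged as residual echo never influence the comfort noise estimate.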
It will be appreciated that criterion [1] of the present invention may replace the residual echo by comfort noise also during double-talk, at least during the very low energy, near-end speech frames for which the input signal is dominated by the echo. This means that during double-talk, the very low energy frames of the near-end speech are also suppressed. However, this is barely felt by the far-end listener.
It will further be appreciated that the part of the residual echo that is most annoying is the high-energy, voiced part. In accordance with an alternative preferred embodiment of the present invention operative for digital networks having a variable-rate mode, the high-energy portion can be reduced by transmitting each frame for which the above criterion is fulfilled (i.e. which has residual echo therein) at a low rate, rather than sending comfort noise parameters. For example, with EVRC and QCELP13 vocoders, the low rate is 1/8. With these vocoders, speech encoded at rate 1/8 sounds weaker and more "whisper like". The effect is particularly strong for voiced speech, which is the high-energy part of the speech. Forcing a low rate of 1/8 has a more dramatic effect on the voiced speech than on the unvoiced, and thus, forcing rate 1/8 whenever criterion [1] is fulfilled will result mainly in attenuating the voiced parts of the residual echo.
It will also be appreciated that, in this second embodiment of the present invention, the low rate will be forced both on the residual echo and on the low-energy near-end speech, in the case of double-talk. It will be appreciated that the effect on the residual echo is strong, whereas it is weak for the near-end speech. For the residual echo, the low rate is forced for all of its segments, including the voiced parts. These voiced parts are the most audible and annoying ones and they are the ones that are most effectively suppressed. For the near-end speech, on the other hand, the low rate is forced only on the low energy frames, which are mostly unvoiced, and the effect of forcing the low rate on these frames is small.
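For the variable-rate embodiment, the forcing of the low rate can be sketched as a cap on whatever rate the vocoder would otherwise have chosen. The function name `select_rate` and the representation of rates as fractions are assumptions for illustration; only the value 1/8 comes from the text.

```python
RATE_EIGHTH = 0.125  # the low rate named in the text for EVRC and QCELP13


def select_rate(vocoder_rate, suppress):
    """Return the rate at which the frame is transmitted.

    `vocoder_rate` is the rate the vocoder would pick on its own,
    expressed as a fraction of full rate. When `suppress` is True
    (criterion [1] fulfilled), the rate is capped at 1/8.
    """
    if suppress:
        return min(vocoder_rate, RATE_EIGHTH)
    return vocoder_rate
```

A frame that the vocoder would already have sent at rate 1/8 is unaffected by the cap, which is why forcing the low rate on low-energy frames has little audible effect.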
Explanation of the Suppression Criterion
Suppression occurs when the ratio of input and output energies of echo canceller 22 is above a threshold level ENR_THRES, or:
$E_{\mathrm{IN}} / E_{\mathrm{OUT}} > \mathrm{ENR\_THRES}$    [1]
To understand the criterion, consider the following: quantitatively, the microphone signal (which is the input to echo canceller 22) can be described as the sum of echo and the near end signal:
IN = NEAR + ECHO [2]
Echo cancellers in hands-free environments achieve a significant amount of echo attenuation after a very short initial convergence period. Typically, the echo energy is attenuated by at least a factor of 10 after a few milliseconds, where fluctuations in the echo path do not increase the residual echo above this minimum attenuation $\mathrm{EC\_ATT}_{\min}$ of about 10 dB. Denoting the momentary attenuation by EC_ATT, we write:
OUT = NEAR + ECHO / EC_ATT [3]
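To see how models [2] and [3] feed criterion [1], the energy ratio can be evaluated numerically. The sketch below assumes that the near-end signal and the echo are uncorrelated, so that their energies simply add, and it treats EC_ATT as an energy attenuation factor; both are assumptions made for illustration, and the function name `ratio_db` is hypothetical.

```python
import math


def ratio_db(e_near, e_echo, ec_att):
    """Input-to-output energy ratio of the echo canceller, in dB.

    e_near : energy of the near-end signal
    e_echo : energy of the echo at the microphone
    ec_att : energy attenuation factor applied to the echo (e.g. 10.0)
    """
    e_in = e_near + e_echo              # energy form of model [2]
    e_out = e_near + e_echo / ec_att    # energy form of model [3]
    return 10.0 * math.log10(e_in / e_out)


# Echo only: the ratio equals the attenuation itself.
echo_only = ratio_db(e_near=0.0, e_echo=1.0, ec_att=100.0)

# Near-end speech dominates: the ratio stays close to 0 dB.
near_dominant = ratio_db(e_near=1.0, e_echo=0.1, ec_att=100.0)
```

With echo only, the ratio reaches the full attenuation, while a dominant near-end signal keeps the ratio near 0 dB regardless of how well the echo is cancelled, which is the behaviour the threshold ENR_THRES relies on.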
If the threshold ENR_THRES of criterion [1] is equal to the minimum attenuation, $\mathrm{EC\_ATT}_{\min}$, then, approximately, $E_{\mathrm{IN}} / E_{\mathrm{OUT}}$ exceeds this threshold when both conditions [4] and [5] hereinbelow are satisfied:
$E_{E/A} > E_{\mathrm{NEAR}}$    [4]
$\mathrm{EC\_ATT} > \mathrm{EC\_ATT}_{\min}$    [5]
where $E_{E/A}$ is the energy of the attenuated signal ECHO/EC_ATT and $E_{\mathrm{NEAR}}$ is the energy in the near-end signal. Condition [4] means that the input signal is predominately echo and condition [5] means that, indeed, the echo was attenuated by at least $\mathrm{EC\_ATT}_{\min}$. Thus, condition [1] indicates a condition where the residual echo might be audible. Condition [1] does not necessarily indicate that the residual echo is audible, because the actual attenuation may be such that:
ECHO / EC_ATT ≪ NEAR [6]
However, experience with EC algorithms for acoustic echo of speech signals shows that a residual echo is almost always audible. Therefore, criterion [1] almost always indicates that a residual echo needs to be suppressed. Moreover, because of the particular suppression method suggested in the present invention, there is almost no price paid for suppressing non-audible residual-echo in the case that condition [6] is fulfilled.
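The approximation behind conditions [4] to [6] can be written out as a short derivation. It is a sketch under two assumptions that the text does not state explicitly: the near-end signal and the echo are uncorrelated, so that their energies add, and EC_ATT is treated as an energy ratio (a factor of 10 corresponding to 10 dB).

```latex
\begin{align*}
E_{IN} &\approx E_{NEAR} + E_{ECHO}, \qquad
E_{OUT} \approx E_{NEAR} + E_{E/A}, \qquad
E_{E/A} = \frac{E_{ECHO}}{\mathrm{EC\_ATT}} \\[4pt]
\frac{E_{IN}}{E_{OUT}} &\approx
  \frac{E_{NEAR} + \mathrm{EC\_ATT}\cdot E_{E/A}}{E_{NEAR} + E_{E/A}} \\[4pt]
E_{E/A} \gg E_{NEAR} &\;\Rightarrow\;
  \frac{E_{IN}}{E_{OUT}} \approx \mathrm{EC\_ATT} \\[4pt]
E_{E/A} \ll E_{NEAR} &\;\Rightarrow\;
  \frac{E_{IN}}{E_{OUT}} \approx 1 + \frac{E_{ECHO}}{E_{NEAR}}
\end{align*}
```

In the first limit, which is condition [4], the ratio exceeds EC_ATTmin when condition [5] holds. The second limit corresponds to condition [6]: the ratio can still exceed the threshold when the unattenuated echo is much stronger than the near-end signal, even though the residual echo itself is small.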
For a variable-rate vocoder, the suppression consists of merely "forcing" a low rate. Most likely, if the residual echo were indeed very low, a low rate would have been used anyway, so "suppression" in this case has no effect.
In the case of forcing a comfort noise frame, the same thing happens: If the residual echo is small, the VAD would have indicated a "noise" frame, and comfort noise parameters would have been transmitted anyway. However, without "suppression", the current frame would have been used for updating the comfort noise parameters, whereas with "suppression" it will not be used. However, this has a negligible effect on the overall sound quality.
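The two suppression modes described above can be sketched as a per-frame encoder decision. This is a minimal illustration, not the encoder of any particular standard: the rate labels, the ComfortNoiseState class and the exponential smoothing used to update the comfort noise level are all hypothetical.

```python
from dataclasses import dataclass

LOW_RATE = "eighth"  # hypothetical label for the lowest vocoder rate


def select_rate(vocoder_rate, suppress):
    """Variable-rate mode: force the low rate when suppression is flagged."""
    return LOW_RATE if suppress else vocoder_rate


@dataclass
class ComfortNoiseState:
    level: float = 0.0      # running estimate of background noise energy
    smoothing: float = 0.9  # assumed smoothing factor for the update


def encode_frame_dtx(frame_energy, vad_is_speech, suppress, cn):
    """Comfort-noise mode: return which kind of frame is transmitted.

    When suppression is flagged, a comfort noise frame is forced and the
    current frame is NOT used to update the comfort noise parameters.
    """
    if suppress:
        return "comfort_noise"
    if vad_is_speech:
        return "speech"
    # Ordinary noise frame: update the comfort noise estimate from it.
    cn.level = cn.smoothing * cn.level + (1.0 - cn.smoothing) * frame_energy
    return "comfort_noise"


cn = ComfortNoiseState(level=1.0)
forced = encode_frame_dtx(100.0, True, True, cn)   # residual echo frame
level_after_forced = cn.level                      # estimate left untouched
normal = encode_frame_dtx(2.0, False, False, cn)   # background noise frame
```

In this sketch a forced frame leaves `cn.level` unchanged, whereas an ordinary noise frame moves it toward the frame energy, which mirrors the difference between the "suppression" and "no suppression" cases described above.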
As previously explained, the above is also true if the input signal includes low-energy, near-end speech frames. The frames that will be suppressed will be those for which the echo is high and the near-end speech is low. Since, for acoustic echo situations, the average echo level is the same as or lower than the average near-end speech level, the highest-energy near-end speech frames that may be suppressed will be those with energy that is EC_ATTmin times less than the average near-end energy (i.e., in the worst case, frames that are typically 10 dB below the average level will be suppressed). Because this is only a rare worst case (after all, most echo frames are below the near-end average level), and because the far-end speaker hears his own speech at the same time, the above distortion of the near-end speech is barely audible to the far-end.
It will be appreciated that criterion [1] is not fulfilled when criteria [4] and [5] are not both satisfied. Criterion [5] is not satisfied only when echo canceller 22 has not yet converged. When criterion [4] is not satisfied, the noise additionally fully masks the residual echo, leaving the residual echo inaudible. In the latter case, there is no need for suppression. In the former case, there is a need for suppression since the echo canceller is not yet fully operative.
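The worst-case figure quoted above (near-end frames about 10 dB below the average level) follows from condition [4] together with the minimum attenuation. The arithmetic can be checked with a few lines; the function names and the reference energy of 1.0 are assumptions made for this sketch.

```python
import math


def to_db(energy_ratio):
    """Convert an energy ratio to decibels."""
    return 10.0 * math.log10(energy_ratio)


def max_suppressed_near_energy(echo_energy, ec_att_min=10.0):
    """Upper bound on near-end frame energy that can still satisfy [4].

    Condition [4] requires E_NEAR < E_ECHO / EC_ATT, and EC_ATT >= EC_ATTmin
    once the canceller has converged, so E_NEAR < E_ECHO / EC_ATTmin.
    """
    return echo_energy / ec_att_min


avg_near = 1.0   # arbitrary reference energy (assumed)
echo = avg_near  # worst case: echo as strong as the average near-end speech
bound = max_suppressed_near_energy(echo)
margin_db = to_db(avg_near / bound)  # how far below average such frames lie
```

With the echo as strong as the average near-end speech and EC_ATTmin equal to 10, `margin_db` comes out at 10 dB; a weaker echo pushes the bound further below the average level.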
The methods and apparatus disclosed herein have been described without reference to specific hardware or software. Rather, the methods and apparatus have been described in a manner sufficient to enable persons of ordinary skill in the art to readily adapt commercially available hardware and software as may be needed to reduce any of the embodiments of the present invention to practice without undue experimentation and using conventional techniques.
It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described herein above. Particularly, the method is not limited to a mobile phone, hands-free operation or acoustic echo. Rather the scope of the invention is defined by the claims that follow:

Claims

1. A communication device comprising: an echo canceller adapted to cancel most of an echo of a far end signal found in a microphone signal, said echo canceller providing an echo canceled signal with a residual echo therein; and a residual echo system adapted to suppress said residual echo when said echo canceller cancels a significant amount of echo.
2. A communication device according to claim 1 and wherein said residual echo system includes a residual echo detector adapted to detect when the energy of said microphone signal is significantly higher than the energy of said echo canceled signal.
3. A communication device according to claim 1 and wherein said residual echo system includes a residual echo suppressing encoder adapted to transmit comfort noise when said echo canceller cancels a significant amount of echo.
4. A communication device according to claim 3 and wherein said encoder is adapted to update said comfort noise from background noise but not when said echo canceller cancels a significant amount of echo.
5. A communication device according to claim 3 wherein a significant amount of echo means 10dB of echo.
6. A communication device according to claim 1 and wherein said residual echo system includes a residual echo suppressing encoder adapted to transmit said echo canceled signal at a low rate when said echo canceller cancels a significant amount of echo.
7. A communication device according to claim 1 wherein significantly higher means 10dB.
8. A residual echo system comprising: a residual echo detector adapted to detect when the energy of a microphone signal is significantly higher than the energy of an echo canceled signal and adapted to produce a detection signal in response; and a residual echo suppressing encoder adapted to transmit comfort noise upon receipt of said detection signal.
9. A system according to claim 8 and wherein said encoder is adapted to update said comfort noise from background noise but not when said encoder receives said detection signal.
10. A residual echo system comprising: a residual echo detector adapted to detect when the energy of a microphone signal is significantly higher than the energy of an echo canceled signal and adapted to produce a detection signal in response; and a residual echo suppressing encoder adapted to transmit said echo canceled signal at a low rate upon receipt of said detection signal.
11. An integrated circuit (IC) having a residual echo system, the system comprising: a residual echo detector adapted to detect when the energy of a microphone signal is significantly higher than the energy of an echo canceled signal and adapted to produce a detection signal in response; and a residual echo suppressing encoder adapted to transmit comfort noise upon receipt of said detection signal.
12. An IC according to claim 11 and wherein said encoder is adapted to update said comfort noise from background noise but not when said encoder receives said detection signal.
13. An IC having a residual echo system, the system comprising: a residual echo detector adapted to detect when the energy of a microphone signal is significantly higher than the energy of an echo canceled signal and adapted to produce a detection signal in response; and a residual echo suppressing encoder adapted to transmit said echo canceled signal at a low rate upon receipt of said detection signal.
14. A method comprising: canceling most of an echo of a far end signal found in a microphone signal and providing an echo canceled signal with a residual echo therein; and suppressing said residual echo when a significant amount of echo is canceled in said canceling.
15. A method according to claim 14 and wherein said suppressing includes detecting when the energy of said microphone signal is significantly higher than the energy of said echo canceled signal.
16. A method according to claim 14 and wherein said suppressing includes transmitting comfort noise when a significant amount of echo is cancelled in said canceling.
17. A method according to claim 16 and wherein said transmitting includes updating said comfort noise from background noise but not when a significant amount of echo is cancelled in said canceling.
18. A method according to claim 14 and wherein said suppressing includes transmitting said echo canceled signal at a low rate when a significant amount of echo is cancelled in said canceling.

Priority Applications (3)

Application Number Priority Date Filing Date Title
GB0203162A GB2369001A (en) 1999-09-07 2000-09-04 Suppression of residual acoustic echo
DE10084970T DE10084970T1 (en) 1999-09-07 2000-09-04 Suppression of residual acoustic echo
AU68622/00A AU6862200A (en) 1999-09-07 2000-09-04 Suppression of residual acoustic echo

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US39069799A 1999-09-07 1999-09-07
US09/390,697 1999-09-07

Publications (1)

Publication Number Publication Date
WO2001019062A1 true WO2001019062A1 (en) 2001-03-15

Family

ID=23543548

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2000/000527 Ceased WO2001019062A1 (en) 1999-09-07 2000-09-04 Suppression of residual acoustic echo

Country Status (4)

Country Link
AU (1) AU6862200A (en)
DE (1) DE10084970T1 (en)
GB (1) GB2369001A (en)
WO (1) WO2001019062A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5016271A (en) * 1989-05-30 1991-05-14 At&T Bell Laboratories Echo canceler-suppressor speakerphone
US5283784A (en) * 1990-08-03 1994-02-01 Coherent Communications Systems Echo canceller processing techniques and processing
US5646991A (en) * 1992-09-25 1997-07-08 Qualcomm Incorporated Noise replacement system and method in an echo canceller
US5835486A (en) * 1996-07-11 1998-11-10 Dsc/Celcore, Inc. Multi-channel transcoder rate adapter having low delay and integral echo cancellation
US5937060A (en) * 1996-02-09 1999-08-10 Texas Instruments Incorporated Residual echo suppression
US6052462A (en) * 1997-07-10 2000-04-18 Tellabs Operations, Inc. Double talk detection and echo control circuit

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004036889A1 (en) * 2002-10-16 2004-04-29 Ericsson Inc. Integrated noise cancellation and residual echo supression
US7027591B2 (en) 2002-10-16 2006-04-11 Ericsson Inc. Integrated noise cancellation and residual echo suppression
WO2007109949A1 (en) * 2006-03-27 2007-10-04 Huawei Technologies Co., Ltd. A method and a device for generating comfort noise in echo cancellation
CN101046965B (en) * 2006-03-27 2010-05-12 华为技术有限公司 A Method of Generating Comfort Noise in Echo Cancellation

Also Published As

Publication number Publication date
GB2369001A (en) 2002-05-15
GB0203162D0 (en) 2002-03-27
DE10084970T1 (en) 2002-11-21
AU6862200A (en) 2001-04-10

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 200203162

Country of ref document: GB

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 008126062

Country of ref document: CN

122 Ep: pct application non-entry in european phase
RET De translation (de og part 6b)

Ref document number: 10084970

Country of ref document: DE

Date of ref document: 20021121

WWE Wipo information: entry into national phase

Ref document number: 10084970

Country of ref document: DE

NENP Non-entry into the national phase

Ref country code: JP