CN1121684C

CN1121684C - System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions

Info

Publication number: CN1121684C
Application number: CN96198008A
Authority: CN
Inventors: T·W·索尔维
Original assignee: Ericsson Inc
Current assignee: Ericsson Inc
Priority date: 1995-09-14
Filing date: 1996-09-13
Publication date: 2003-09-17
Anticipated expiration: 2016-09-13
Also published as: JPH11514453A; BR9610290A; DE69613380D1; NO981074L; CN1201547A; RU2163032C2; NO981074D0; AU724111B2; WO1997010586A1; EP0852052A1; TR199800475T1; MX9801857A; CA2231107A1; PL185513B1; PL325532A1; AU7078496A; KR100423029B1; EE03456B1; KR19990044659A; EE9800068A

Abstract

A method and system are provided for adaptively reducing noise in frames of digitized audio signals that include both speech and background noise. Frames of digitized audio signals are passed through an adjustable, high-pass filter circuit to filter a portion of background noise located in a low frequency range of the digitized signal. The filter circuit is adjusted by a filter control circuit adapted for a current frame to exhibit a selected frequency response curve. The filter control circuit includes a speech detector for detecting the presence or absence of speech in the frames of digitized audio signals. The filter circuit is adjusted when no speech is detected in the current frame. In a first preferred embodiment, the filter control circuit controls the filter circuit by calculating a noise estimate corresponding to the background noise, and adjusting the filter circuit based on the noise estimate. As the noise estimates increase, the filter circuit is adjusted to extract increasing amounts of energy falling in low frequency ranges of speech. In a second preferred embodiment, the filter circuit is adjusted as a function of a noise profile estimate. A noise profile estimate for a current frame is determined as a function of speech detection and is compared to a reference noise profile. Based on this comparison, the filter circuit is adaptively adjusted.

Description

Method and apparatus for selectively changing a frame of digital signal

Technical field

The invention relates to noise reduction systems, and more particularly to an adaptive speech intelligibility enhancement system for portable digital radiotelephones and to a method and apparatus for selectively altering a frame of a digital signal.

Background art

The cellular telephone industry has made remarkable strides in commercial operations in the United States and the rest of the world. In major metropolitan areas, demand for cellular service is outstripping the capacity of existing systems. Assuming this trend continues, cellular radio communications will reach even the smallest rural markets. Cellular capacity must therefore be increased while maintaining a high quality of service at a reasonable cost. An important step towards increasing capacity is the conversion of cellular systems from analog to digital transmission. This conversion is also important because the first generation of Personal Communications Networks (PCNs), which employ low-cost, pocket-sized cordless telephones that can be carried comfortably and used to make or receive calls in the home, office, street, car, and so on, will likely be provided by cellular carriers using the next-generation digital cellular infrastructure.

Digital communication systems take advantage of powerful digital signal processing techniques. Digital signal processing generally refers to the mathematical or other manipulation of digitized signals. For example, after an analog signal has been converted (digitized) to digital form, it may be filtered, amplified, and attenuated using simple mathematical routines in a digital signal processor (DSP). Digital signal processors are typically manufactured as high-speed integrated circuits so that data processing operations are performed essentially in real time. Digital signal processors can also be used to reduce the bit rate of digitized speech, which reduces the spectral occupancy of the transmitted radio signal and increases system capacity. For example, if a speech signal is digitized using 14-bit linear pulse code modulation (PCM) and sampled at 8 kHz, a serial bit rate of 112 kbit/s results. By mathematically exploiting the redundancy and other predictable characteristics of human speech, voice coding techniques can compress the 112 kbit/s serial bit rate to 7.95 kbit/s, a 14:1 reduction in bit rate. A lower bit rate means more available bandwidth.
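The bit-rate figures in the paragraph above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function and constant names are assumptions introduced here and do not appear in the patent.

```python
# Worked check of the bit-rate figures quoted in the text.
# All names are illustrative assumptions, not taken from the patent.
BITS_PER_SAMPLE = 14        # 14-bit linear PCM
SAMPLE_RATE_HZ = 8000       # 8 kHz sampling
CODED_RATE_BPS = 7950       # 7.95 kbit/s after voice coding


def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int) -> int:
    """Serial bit rate of uncompressed PCM, in bits per second."""
    return bits_per_sample * sample_rate_hz


raw_rate_bps = pcm_bit_rate(BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
compression_ratio = raw_rate_bps / CODED_RATE_BPS

print(f"raw PCM rate: {raw_rate_bps} bit/s")
print(f"compression ratio: {compression_ratio:.1f}:1")
```

The raw rate comes out at 112,000 bit/s, and the ratio is slightly above 14, which the text rounds to 14:1.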

In the United States, a popular speech compression technique adopted by the TIA as the digital standard for the second generation of cellular telephone systems (i.e., IS-54) is vector sourcebook excited linear predictive coding (VSELP). Unfortunately, when audio signals that include speech mixed with high levels of ambient noise (especially "colored noise") are encoded/compressed using VSELP, the result may include undesirable audio signal characteristics. For example, if a digital mobile telephone is used in a noisy environment (e.g., inside a moving vehicle), both the ambient noise and the desired speech are compressed using the VSELP encoding algorithm and transmitted to a base station, where the compressed signal is decoded and reconstructed into audible speech. When the background noise is reconstructed into analog form, undesirable audible distortion of the noise, and occasionally of the speech, is introduced. This distortion is very annoying to the average listener.

Much of this distortion is caused by the environment in which the mobile telephone is used. Mobile telephones are typically used inside vehicles, where there is often ambient noise from the vehicle's engine and from surrounding traffic. Ambient noise inside a vehicle is typically concentrated in the low audio frequency range, and its magnitude varies with factors such as the speed and acceleration of the vehicle and the amount of surrounding traffic. This low frequency noise also tends to severely degrade the intelligibility of speech from a person talking in the vehicle. The loss of intelligibility caused by low frequency noise can be particularly pronounced in communication systems that employ a VSELP voice coder, but it can also occur in communication systems that do not.

The effect of ambient noise on a mobile telephone may also depend on how the telephone is used. In particular, a mobile telephone may be used in hands-free mode, i.e., the user speaks towards a telephone that sits in a cradle. This frees the user's hands for driving, but it also increases the distance that the user's speech must travel before reaching the microphone input of the telephone. This increased distance between the user and the telephone, combined with varying ambient noise, can cause noise to make up a significant portion of the total power spectral energy of the audio signal input to the telephone.

The prior art is described in EP 0 645 756, EP 0 558 312, EP 0 665 530, DE 4 012 349, and U.S. Patent Nos. 4,811,404, 4,461,025, and 5,251,263, all of which describe ways of filtering out unwanted signal components.

In theory, various digital signal processing algorithms could be implemented in a digital signal processor to filter out VSELP-encoded background noise. Such solutions, however, often require significant digital signal processing overhead, measured in millions of instructions per second (MIPS), which consumes valuable processing time, memory space, and power. In a portable radiotelephone, each of these signal processing resources is limited. Simply increasing the processing burden on the DSP is therefore not an optimal solution for minimizing VSELP-encoded background noise or other types of background noise.

Summary of the invention

The invention provides an adaptive noise reduction system that reduces the undesirable effects of encoded background noise while minimizing both any negative impact on the quality of the encoded speech and any additional consumption of digital signal processing resources. The method and system of the invention increase the intelligibility of speech in a digitized audio signal by passing frames of the digitized audio signal through a filter circuit. The filter circuit acts as an adjustable high-pass filter: it removes a portion of the digitized signal in the low audio frequency range and passes the portion that falls in the higher frequency range. Because noise inside a vehicle tends to be concentrated in the low audio frequency range, and only a small portion of speech intelligibility resides in that range, the filter circuit removes most of the noise in the digitized audio signal while removing only an unimportant part of the speech. As a result, a relatively larger fraction of the noise energy is removed than of the speech energy. By adaptively adjusting and selecting the frequency response curve of the filter circuit, the amount of speech filtered out is limited and has minimal impact on the intelligibility of the speech output by the radio.

A filter control circuit adjusts the filter circuit to exhibit different frequency response curves as a function of a noise estimate and/or spectral profile corresponding to the noise in the audio signal. The noise estimate and/or spectral profile is updated for the digitized signal on a frame-by-frame basis as a function of speech detection. If no speech is detected, the noise estimate and/or spectral profile is updated for the current frame. If speech is detected, the noise estimate and/or spectral profile is left unchanged.

In a first embodiment, the filter control circuit calculates a noise estimate for each frame of the digitized audio signal. The noise estimate corresponds to the amount of background noise in the frame, and it increases as the amount of background noise relative to speech in the low frequency range of speech increases. As this relative amount grows, the filter control circuit uses the noise estimate to adjust the filter circuit to filter out a larger portion of the low frequency range of speech. When no background noise is present, no speech signal is filtered out. At higher noise levels, a larger portion of the noise and speech information is removed. Because noise tends to be concentrated in the low frequency range, and only a relatively small portion of speech intelligibility resides there, the overall intelligibility of the audio signal can be improved by increasing the portion of low frequency energy that is filtered out as the noise estimate increases.

In a second embodiment, a modified filter control circuit adjusts the filter circuit to exhibit different frequency response curves as a function of a noise profile, i.e., the profile of noise estimates over a selected frequency range of the audio signal. The filter control circuit includes a spectral analyzer that determines a noise profile estimate as a function of speech detection. A noise profile estimate is determined for the current frame and compared with a reference noise profile. Based on this comparison, the filter circuit is adaptively adjusted to remove different amounts of low frequency energy from the current frame.

The adaptive noise reduction system according to the invention can be applied advantageously to radio communication systems in which portable/mobile radio transceivers communicate with one another, and with fixed telephone line subscribers, over RF channels. Each transceiver includes an antenna, a receiver for converting radio signals received over an RF channel via the antenna into analog audio signals, and a transmitter. The transmitter includes a coder-decoder (codec) that digitizes the analog audio signal to be transmitted into frames of digitized speech information, which includes both speech and background noise. A digital signal processor processes the current frame, based on the background noise estimate and on speech detection in the current frame, to minimize the background noise. A modulator modulates the processed frames of digitized speech information onto an RF carrier for subsequent transmission via the antenna.

Brief description of the drawings

The features and advantages of the invention will be readily apparent to those of ordinary skill in the art from the following written description, read in conjunction with the accompanying drawings.

Figure 1 is a general functional block diagram of the invention.

Figure 2 illustrates the frame and slot structure of the U.S. digital standard IS-54 for cellular radio communications.

Figure 3 is a block diagram of a first preferred embodiment of the invention implemented using a digital signal processor.

Figure 4 is a functional block diagram of an exemplary embodiment of the invention as applied to one of a plurality of portable radio transceivers in a radio communication system.

Figures 5A and 5B are a flowchart illustrating the functions/operations performed by the digital signal processor in implementing the first preferred embodiment of the invention.

Figure 6A is a first example graph illustrating the attenuation versus frequency characteristic of the filter circuit according to the first preferred embodiment of the invention.

Figure 6B is a second example graph illustrating the attenuation versus frequency characteristic of the filter circuit according to the first preferred embodiment of the invention.

Figure 7 is an example look-up table that can be accessed by the filter control circuit in the first preferred embodiment of the invention.

Figures 8A and 8B illustrate the amplitude versus frequency characteristics of example input audio signals.

Figures 9A and 9B illustrate the amplitude versus frequency characteristics of the input audio signals of Figures 8A and 8B, respectively, after filtering by the filter circuit of the invention.

Figure 10 is a block diagram of a second preferred embodiment of the invention implemented using a digital signal processor.

Figure 11 is a flowchart, corresponding to the flowchart of Figure 5B, illustrating the functions/operations performed by the digital signal processor in implementing the second preferred embodiment of the invention.

Figure 12 is an example look-up table that can be accessed by the filter control circuit in the second preferred embodiment of the invention.

Detailed description of the drawings

In the following description, for purposes of explanation and not limitation, specific details such as particular circuits, circuit elements, techniques, and flowcharts are set forth in order to provide a thorough understanding of the invention. It will be apparent to those skilled in the art, however, that the invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known methods, devices, and circuits are omitted so as not to obscure the description of the invention with unnecessary detail.

Figure 1 is a general block diagram of an adaptive noise reduction system 100 according to the invention. The adaptive noise reduction system 100 includes a filter control circuit 105 connected to a filter circuit 115. The filter control circuit 105 generates a filter control signal for the current frame of the digitized audio signal. The filter control signal is output to the filter circuit 115, which is adjusted in response to exhibit a high-pass frequency response curve selected on the basis of the filter control signal. The adjusted filter circuit 115 filters the current frame of the digitized audio signal. The filtered signal is processed by a voice coder 120 to produce an encoded signal representing the digitized audio signal.

For an exemplary implementation of the invention in a portable/mobile radiotelephone transceiver in a cellular radio communication system, Figure 2 illustrates the time division multiple access (TDMA) frame structure adopted by the IS-54 standard for digital cellular radio communications. A "frame" is a 20 millisecond time period that includes a transmit block TX, a receive block RX, and a signal strength measurement block used for mobile-assisted handoff (MAHO). The two consecutive frames shown in Figure 2 are transmitted over a 40 millisecond period. Digitized speech and background noise information is processed and filtered on a frame-by-frame basis, as described further below.

Preferably, the functions of the filter control circuit 105, the filter circuit 115, and the voice coder 120 in Figure 1 are implemented in a high-speed digital signal processor. One suitable digital signal processor is the TMS320C53 DSP available from Texas Instruments. The TMS320C53 DSP includes, on a single integrated chip, a 16-bit microprocessor, on-chip RAM for storing data such as speech frames to be processed, and ROM for storing various data processing algorithms, including the VSELP speech compression algorithm and the other algorithms described below that carry out the functions of the filter control circuit 105 and the filter circuit 115.

A first embodiment of the invention is shown in Figure 3. In the first embodiment, the filter circuit 115 is adjusted as a function of a background noise estimate determined by the filter control circuit. Frames of pulse code modulated (PCM) audio information are stored sequentially in the on-chip RAM of the DSP. Other digitization techniques may also be used to digitize the audio information. Each PCM frame is retrieved from the DSP on-chip RAM, processed by the frame energy estimator 210, and then held in the temporary frame memory 220. The energy of the current frame, as determined by the frame energy estimator 210, is provided to the noise estimator 230 and speech detector 240 functional blocks. The speech detector 240 indicates that speech is present in the current frame when the frame energy estimate exceeds the sum of the previous noise estimate and a speech threshold. If the speech detector determines that no speech is present, the digital signal processor 200 calculates an updated noise estimate as a function of the current noise estimate and the current frame energy.

The updated noise estimate is output to the filter selector 235, which generates a filter control signal based on the noise estimate. In the preferred embodiment, the filter selector 235 consults a look-up table when generating the filter control signal. The look-up table contains a series of filter control values, each matched to a noise estimate or to a range of noise estimates. A filter control value is selected from the look-up table on the basis of the updated noise estimate, and a filter control signal representing that value is output to the filter bank 265 for the filter circuit 115. To stabilize the process and avoid continual switching between filters, a hold-off period of N frames is imposed on the selection of a new filter: a new filter may be selected only once every N frames, where N is an integer greater than 1 and preferably greater than 10.
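The look-up and hold-off behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the table boundaries, the control values, the dB units, and the class and method names are all hypothetical and are not taken from the patent's look-up table.

```python
from bisect import bisect_right

# Hypothetical look-up table (assumed values, not from the patent):
# upper bounds of noise-estimate ranges in dB, and one filter control
# value per range. Control value 0 stands for "least filtering".
NOISE_UPPER_BOUNDS_DB = [30.0, 40.0, 50.0]
FILTER_CONTROL_VALUES = [0, 1, 2, 3]


class FilterSelector:
    """Picks a filter control value, changing it at most once every N frames."""

    def __init__(self, hold_frames: int = 10):
        self.hold_frames = hold_frames
        # Start "expired" so the very first selection is allowed.
        self.frames_since_switch = hold_frames
        self.current = FILTER_CONTROL_VALUES[0]

    def select(self, noise_estimate_db: float) -> int:
        """Return the control value to use for the current frame."""
        self.frames_since_switch += 1
        index = bisect_right(NOISE_UPPER_BOUNDS_DB, noise_estimate_db)
        candidate = FILTER_CONTROL_VALUES[index]
        # Only switch filters once the hold-off period has elapsed.
        if candidate != self.current and self.frames_since_switch >= self.hold_frames:
            self.current = candidate
            self.frames_since_switch = 0
        return self.current


selector = FilterSelector(hold_frames=10)
first = selector.select(45.0)                            # noisy frame
later = [selector.select(10.0) for _ in range(10)]       # noise drops
print(first, later)
```

In this sketch the selector keeps the stronger filter for the frames that follow a switch even though the noise estimate has already dropped, and only moves to the weaker filter once N frames have passed.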

The filter circuit 115 is adjusted in response to the filter control signal to exhibit a high-pass frequency response curve corresponding to the filter control signal and hence to the noise estimate. Various types of filter circuit that are well known in the art can be used to exhibit the selected frequency response curve in response to the filter control signal. These include IIR filters such as Butterworth, Chebyshev, or elliptic filters. FIR filters may also be used, but IIR filters are preferred because of their lower processing requirements.
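As an illustration of the kind of IIR high-pass filter mentioned above, the sketch below builds a second-order Butterworth-type high-pass section from the widely used bilinear-transform biquad formulas and applies it sample by sample. The filter order, the 300 Hz cutoff, and all function names are assumptions made for this example; the patent does not specify them here.

```python
import math


def highpass_biquad_coeffs(cutoff_hz, sample_rate_hz, q=1 / math.sqrt(2)):
    """Second-order high-pass coefficients (q = 1/sqrt(2) gives a Butterworth response)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    # Coefficients are normalised so that a[0] == 1.
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def iir_filter(b, a, samples):
    """Direct-form I biquad: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(values):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


SAMPLE_RATE_HZ = 8000
b, a = highpass_biquad_coeffs(cutoff_hz=300.0, sample_rate_hz=SAMPLE_RATE_HZ)

# Compare a 50 Hz tone (low-frequency "noise") with a 2000 Hz tone.
low = [math.sin(2 * math.pi * 50 * n / SAMPLE_RATE_HZ) for n in range(4000)]
high = [math.sin(2 * math.pi * 2000 * n / SAMPLE_RATE_HZ) for n in range(4000)]
low_gain = rms(iir_filter(b, a, low)[-1600:]) / rms(low[-1600:])
high_gain = rms(iir_filter(b, a, high)[-1600:]) / rms(high[-1600:])
print(f"gain at 50 Hz: {low_gain:.3f}, gain at 2000 Hz: {high_gain:.3f}")
```

A bank of such sections with different cutoff frequencies, indexed by the filter control value, would be one way to realize a set of selectable frequency response curves. A single recursive section needs only five multiplications per sample, which is consistent with the stated preference for IIR filters on processing grounds.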

The filtered signal is processed by the voice coder 120, which compresses the bit rate of the filtered signal. In the preferred embodiment, the voice coder 120 encodes the audio signal using vector sourcebook excited linear predictive coding (VSELP). Other voice coding techniques and algorithms may also be used, such as code excited linear prediction (CELP) coding, residual pulse excited linear prediction (RPE-LTP) coding, and improved multiband excitation (IMBE) coding. Filtering the frames of the audio signal according to the invention before voice coding minimizes the background noise, which substantially reduces any undesirable noise effects in the speech when it is reconstructed. It also prevents the speech from being "swamped" by low frequency noise.

The digital signal processor 200 described in connection with Figure 3 can be used in a device such as the transceiver of a digital portable/mobile radiotelephone used in a radio communication system. Figure 4 illustrates one such digital radio transceiver, which may be used in a cellular radio communication network.

An audio signal that includes speech and background noise is input from a microphone 400 to a codec 402, which is preferably an application specific integrated circuit (ASIC). The band-limited audio signal detected at the microphone 400 is sampled by the codec 402 at a rate of 8000 samples per second and divided into frames, so that each 20 millisecond frame contains 160 speech samples. The samples are quantized and converted to a coded digital format such as 14-bit linear PCM. Once the 160 samples of digitized speech for the current frame have been stored in the on-chip RAM of the transmit DSP 200, the transmit DSP 200 performs channel coding functions and, as described above in connection with Figure 3, frame energy estimation, noise estimation, speech detection, FFT, filtering, and digital speech coding/compression according to the VSELP algorithm.

A supervisory microprocessor 432 controls the overall operation of all the components of the transceiver shown in Figure 4. The filtered PCM data stream generated by the transmit DSP 200 is provided for quadrature modulation and transmission. To this end, an ASIC gate array 404 generates an in-phase (I) information channel and a quadrature (Q) information channel from the filtered PCM data stream supplied by the DSP 200. The I and Q bit streams are processed by matched low-pass filters 406 and 408 and passed to the IQ mixers in a balanced modulator 410. A reference oscillator 412 and a multiplier 414 provide the transmit intermediate frequency (IF). The I signal is mixed with the in-phase IF, and the Q signal is mixed with the quadrature IF (i.e., the in-phase IF delayed by 90 degrees by phase shifter 416). The mixed I and Q signals are summed, up-converted to the RF channel frequency selected by channel synthesizer 430, and transmitted via duplexer 420 and antenna 422 on the selected radio frequency channel.

On the receive side, the signal received via antenna 422 and duplexer 420 is down-converted in mixer 424 from the selected receive channel frequency to a first IF frequency, using a local oscillator signal synthesized by channel synthesizer 430 from the output of reference oscillator 428. The output of the first IF mixer 424 is filtered and down-converted in frequency to a second IF frequency, based on another output from channel synthesizer 430 and demodulator 426. A receive gate array 434 then converts the second IF signal into a series of phase samples and a series of frequency samples. The receive DSP 436 performs demodulation, filtering, gain/attenuation, channel decoding, and digital speech expansion on the received signal. The processed speech data is then sent to the codec 402 and converted into a baseband audio signal for driving the loudspeaker 438.

The operations performed by the digital signal processor 200 to implement the functions of the filter control circuit 105, the filter circuit 115, and the voice coder 120 will now be described with reference to the flowcharts in Figures 5A and 5B. The frame energy estimator 210 determines the energy of each frame of the audio signal by computing the sum of the squared values of the PCM samples in the frame (step 505). Because each 20 millisecond frame contains 160 samples at a sampling rate of 8000 samples per second, 160 squared PCM samples are summed. Expressed mathematically, the frame energy estimate is determined according to Equation 1:

Frame energy estimate = \sum_{i=1}^{160} (PCM sample_i)^2 (Equation 1)

The frame energy value calculated for the current frame is stored in the on-chip RAM 202 of the DSP 200 (step 510).
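The frame energy calculation of step 505 amounts to a sum of squares over the 160 samples of a frame. A minimal sketch follows; the function names and the framing helper are assumptions introduced for illustration.

```python
FRAME_LENGTH = 160  # 20 ms at 8000 samples per second


def split_into_frames(samples, frame_length=FRAME_LENGTH):
    """Split a sample sequence into consecutive full frames (any tail is dropped)."""
    last_start = len(samples) - frame_length
    return [samples[i:i + frame_length] for i in range(0, last_start + 1, frame_length)]


def frame_energy(pcm_frame):
    """Sum of the squared PCM sample values in one frame."""
    return sum(sample * sample for sample in pcm_frame)


# Example: 400 samples give two full frames; the last 80 samples are dropped.
pcm = [((n * 37) % 200) - 100 for n in range(400)]   # arbitrary test signal
energies = [frame_energy(frame) for frame in split_into_frames(pcm)]
print(len(energies), energies)
```

On a fixed-point DSP the accumulation would normally be done in a wide accumulator to avoid overflow; Python integers make that concern irrelevant in this sketch.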

The speech detector 240 function begins by retrieving from the on-chip RAM 202 of the DSP 200 the noise estimate previously determined by the noise estimator 230 (step 515). Of course, no noise estimate exists when the transceiver is first powered up. Decision block 520 anticipates this situation, and an initial noise estimate is assigned in step 525. To force an update of the noise estimate, as described below, the initial noise estimate is preferably set to an arbitrarily high value, for example 20 dB above the normal speech level. The frame energy determined by the frame energy estimator 210 is retrieved from the on-chip RAM 202 of the DSP 200 (block 530). Block 535 determines whether the frame energy estimate exceeds the sum of the retrieved noise estimate and a predetermined speech threshold, as expressed in Equation 2:

帧能量估计值＞(噪声估计值+语音阈值) (等式2)Frame energy estimate > (noise estimate + speech threshold) (Equation 2)

The speech threshold may be a fixed value that is empirically determined to exceed the short-term energy variance of typical background noise, and may be set, for example, to 9 dB. Alternatively, the speech threshold may be adaptively modified to reflect changing speech conditions, such as when the speaker moves into a noisier or quieter environment. If the frame energy estimate exceeds the sum in Equation 2, a flag indicating that speech is present is set in block 570. If the speech detector 240 detects speech, the noise estimator 230 is bypassed, and the noise estimate calculated for the previous digitized audio frame is retrieved and used as the current noise estimate. Conversely, if the frame energy estimate is less than the sum in Equation 2, the speech flag is cleared in block 540.
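The comparison of Equation 2 might be sketched as below. Because the threshold is quoted in dB, the sketch assumes that the frame energy and the noise estimate are both expressed in dB before they are compared; the text does not state where that conversion takes place. The names `energy_to_db` and `speech_present` and the `floor` parameter are hypothetical.

```python
import math

SPEECH_THRESHOLD_DB = 9.0  # example threshold value given in the text


def energy_to_db(energy, floor=1.0):
    """Convert a linear energy to dB; the floor (an assumption) avoids log10(0)."""
    return 10.0 * math.log10(max(energy, floor))


def speech_present(frame_energy_db, noise_estimate_db,
                   threshold_db=SPEECH_THRESHOLD_DB):
    """Equation 2: speech is flagged when the frame energy exceeds
    the noise estimate plus the speech threshold."""
    return frame_energy_db > noise_estimate_db + threshold_db


# Usage: a frame 10 dB above the noise estimate sets the speech flag,
# while a frame only 5 dB above it leaves the flag cleared.
flag_set = speech_present(60.0, 50.0)
flag_cleared = speech_present(55.0, 50.0)
```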

Other systems for detecting speech in the current frame may also be used. For example, the European Telecommunications Standards Institute (ETSI) has developed a standard for voice activity detection (VAD) in the Global System for Mobile communications (GSM), described in ETSI reference RE/SMG-020632P, which is incorporated herein by reference.

If speech is not present, the noise estimate update routine of the noise estimator 230 is executed. During periods when no speech is present, the noise estimate is essentially a running average of the frame energy. As described above, if the initial start-up noise estimate is chosen high enough, no speech is detected, and the speech flag is therefore cleared, which forces an update of the noise estimate.

In the noise estimation routine executed by the noise estimator 230, a difference/error (Δ) is determined in block 545. According to the following equation, this error is the difference between the frame energy produced by the frame energy estimator 210 and the noise estimate previously calculated by the noise estimator 230:

Δ = current frame energy - previous noise estimate (Equation 3)

Decision block 550 determines whether Δ exceeds zero. If Δ is negative, as occurs when the noise estimate is high, the noise estimate is recalculated in block 560 according to the following equation:

Noise estimate = previous noise estimate + Δ/2 (Equation 4)

Because Δ is negative, this corrects the noise estimate downward. The relatively large step size Δ/2 is chosen so that the estimate converges quickly toward a lower noise level. However, if the frame energy exceeds the noise estimate, giving a Δ greater than zero, the noise estimate is updated in block 555 according to the following equation:

Noise estimate = previous noise estimate + Δ/256 (Equation 5)

Because Δ is positive, the noise estimate necessarily increases. Here, however, a much smaller step size Δ/256 (compared with Δ/2) is chosen so that the noise estimate rises only gradually and is largely immune to transient noise.
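The asymmetric update of Equations 3 to 5 can be illustrated with the short sketch below. The function name `update_noise_estimate` and its argument names are assumptions; the step sizes Δ/2 and Δ/256 are taken from the text.

```python
def update_noise_estimate(prev_estimate, frame_energy, speech_detected):
    """Return the noise estimate for the current frame.

    When speech is detected the estimator is bypassed and the previous
    estimate is reused. Otherwise the estimate moves quickly downward
    (Equation 4) or slowly upward (Equation 5).
    """
    if speech_detected:
        return prev_estimate
    delta = frame_energy - prev_estimate      # Equation 3
    if delta > 0:
        return prev_estimate + delta / 256    # Equation 5: slow rise
    return prev_estimate + delta / 2          # Equation 4: fast fall


# Usage: starting from a deliberately high initial estimate, a run of
# quiet frames pulls the estimate down by half the error on each frame.
estimate = 1000.0
for _ in range(3):
    estimate = update_noise_estimate(estimate, 10.0, speech_detected=False)
```

The fast-fall, slow-rise asymmetry is what lets a high start-up value converge within a few frames while a brief noise burst barely moves the estimate.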

The noise estimate calculated for the current frame is output to the filter selector 235. In the first preferred embodiment, the filter selector 235 reads a look-up table and uses the current noise estimate to select a filter control value (step 572). The filter circuit 115 is then adjusted (step 574) as a function of the selected filter control value to exhibit a frequency response curve intended to increase the amount of noise filtered out as the noise estimate, and hence the background noise, increases. The PCM samples stored in the DSP RAM are then passed through the adjusted filter circuit 265 to remove noise (step 576). The filtered PCM samples are subsequently processed by the voice encoder 120 (step 578), and the encoded samples are output to the RF transmit circuit (step 580).

FIGS. 6A and 6B give several examples of how the filter circuit 115 can be adjusted to exhibit different frequency response curves F1-F4 for different filter control signals input to it. As shown in FIG. 6A, the filter circuit 115 can be set to exhibit a series of different frequency response curves F1-F4 having cutoff frequencies F1c-F4c, respectively. In the preferred embodiment, the cutoff frequency of the filter circuit 115 may range from 300 Hz to 800 Hz. As the noise estimate increases, the filter circuit 115 is set to exhibit a frequency response curve with a higher cutoff frequency. The higher cutoff frequency causes the filter circuit 115 to remove a larger portion of the frame energy that falls in the low-frequency range of speech.

Similarly, as shown in FIG. 6B, the filter circuit 115 can be set to exhibit a series of different frequency response curves F1-F4, each having a different slope but the same cutoff frequency. The cutoff frequency of the frequency response curves F1-F4 lies within the range mentioned above. As the noise estimate increases, the filter circuit 115 is adjusted to exhibit a frequency response curve with a steeper slope. The steeper slope causes the filter circuit 115 to remove a larger portion of the frame energy that falls in the lower frequency range of speech.

The filter circuit 115 filters the current frame as a function of the noise estimate calculated for that frame. The current frame is filtered so that noise is reduced while the major portion of the speech passes through. The major portion of the speech that passes unfiltered yields an intelligible speech output with only a slight reduction in speech signal quality. Combinations of different cutoff frequencies and different slopes may be used to adaptively remove selected portions of the frame energy that fall in the low-frequency range of speech.

FIG. 7 shows an example look-up table that the filter selector 235 reads in order to select one of the filter response curves F1-F4 for the filter circuit 115. The look-up table contains a series of possible noise estimates N1-Nn and filter control values F1-Fn, which correspond to the possible response curves exhibited by the filter circuit 115. Each of the noise estimates N1-Nn may represent a range of noise estimates, and each is matched to a particular filter control value F1-F4. The filter control circuit 105 generates a filter control signal by calculating a noise estimate and retrieving the associated filter control value from the look-up table.
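A look-up of this kind could be sketched as follows. The range boundaries and the intermediate cutoff frequencies are purely hypothetical values chosen for illustration; only the 300 Hz to 800 Hz span and the labels F1-F4 come from the text. The function name `select_filter_control` is likewise an assumption.

```python
import bisect

# Hypothetical boundaries (in dB) between the noise-estimate ranges N1..N4:
# N1 is below 40, N2 is [40, 50), N3 is [50, 60), N4 is 60 and above.
NOISE_RANGE_BOUNDS_DB = [40.0, 50.0, 60.0]
FILTER_CONTROL_VALUES = ["F1", "F2", "F3", "F4"]

# Hypothetical cutoff frequencies for the FIG. 6A style of adjustment;
# the two end points match the 300-800 Hz range given in the text.
CUTOFF_HZ = {"F1": 300, "F2": 450, "F3": 600, "F4": 800}


def select_filter_control(noise_estimate_db):
    """Map a noise estimate onto the filter control value of its range."""
    index = bisect.bisect_right(NOISE_RANGE_BOUNDS_DB, noise_estimate_db)
    return FILTER_CONTROL_VALUES[index]


# Usage: a higher noise estimate selects a higher cutoff frequency.
quiet_cutoff = CUTOFF_HZ[select_filter_control(35.0)]
noisy_cutoff = CUTOFF_HZ[select_filter_control(75.0)]
```

Storing range boundaries rather than one entry per possible estimate keeps the table small, which fits the text's remark that each of N1-Nn may represent a range.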

FIGS. 8A-8B and 9A-9B show how each of two frames of an audio signal is adaptively filtered to produce an improved audio signal for output to the RF transmitter. FIGS. 8A and 8B show a first frame and a second frame of the audio signal, containing speech components s1, s2 and noise components n1, n2, respectively. As shown, the noise energy n1 and n2 in both frames is concentrated in the low audio frequency range, whereas the speech energy s1 and s2 is concentrated in the higher audio frequency range. FIG. 9A shows the noise signal n1 and the speech signal s1 of the first frame after filtering. FIG. 9B shows the noise signal n2 and the speech signal s2 of the second frame after filtering.

As discussed, the adaptive audio noise reduction system 100 is designed to account for the difference in noise level between the first frame and the second frame by adjusting the filter control circuit 105 based on the noise estimate calculated for the current frame. For example, the filter control circuit 105 calculates a noise estimate N1 and a spectral profile s1 and selects a filter control value F1 for the first frame. In the preferred embodiment, the filter circuit 115 is adjusted on the basis of the filter control value F1 to exhibit a frequency response curve having the cutoff frequency F1c, as shown in FIG. 6A. The first frame is then passed through the adjusted filter circuit 115. The filter characteristic is selected so that most of the noise n1 and only a small portion of the speech s1 fall below the cutoff frequency F1c of the frequency response curve F1. As a result, the noise n1 is effectively filtered out, and only a relatively unimportant portion of the speech s1 is removed. The filtered first frame of the audio signal is shown in FIG. 9A.

In the second frame, shown in FIG. 8B, the background noise is higher; assuming that no speech is detected, the filter control circuit 105 calculates a higher noise estimate n2. Based on this higher noise estimate, a correspondingly higher filter control value F2 is determined for the second frame. In the first preferred embodiment, the filter circuit 115 is adjusted according to the higher filter control value F2 to exhibit a frequency response curve with a higher cutoff frequency F2c, as shown in FIG. 6A. Subsequent frames of the audio signal are then passed through the adjusted filter circuit 115. Because the cutoff frequency F2c of the frequency response curve F2 is higher for these frames, a larger portion of both the noise n2 and the speech s2 is filtered out. The portion of the speech s2 that is removed is nevertheless relatively insignificant compared with the intelligibility information contained in the frame, so the effect on the speech is small. The disadvantage of filtering out a larger portion of the speech s2 is offset by the advantage of removing more of the noise n2 in the second frame; the portion of the speech spectrum that is filtered out does not contribute significantly to intelligibility. The filtered audio signal of the second frame is shown in FIG. 9B.

A second preferred embodiment of the adaptive noise reduction system 100 is shown in FIGS. 10-12. In the second preferred embodiment, the filter control circuit 105 adjusts the filter circuit 115 as a function of a noise profile estimate. The noise profile estimate is calculated for each frame and compared with a reference noise profile. Based on this comparison, the filter circuit 115 is adaptively adjusted to remove different amounts of low-frequency energy from the current frame.

Referring to FIG. 10, a DSP 200 configured according to the second preferred embodiment is shown. As illustrated, in addition to the frame energy estimator 210, the noise estimator 230, the speech detector 240, and the filter selector 235 described with reference to the first preferred embodiment, the filter control circuit 105 includes a spectral analyzer 270. As described for the first embodiment and shown in the flowcharts of FIGS. 5A and 5B, the filter control circuit 105 determines a noise estimate for each received frame and detects the presence of speech. As a function of the speech detection for the current frame, the spectral analyzer 270 updates the noise profile estimate, which is then used in adjusting the filter circuit 115.

Referring to FIG. 11, the steps for updating the noise profile estimate and adjusting the filter circuit 115 are shown. FIG. 11 shows the steps performed by the spectral analyzer 270, which are invoked within the overall process described earlier in the flowcharts of FIGS. 5A and 5B for the first preferred embodiment.

If no speech is detected in the current frame, the spectral analyzer 270 first determines a noise profile for the current frame (step 600). The noise profile determined for the current frame comprises energy calculations at different frequencies (i.e., frequency bins) within a selected low-frequency range of speech. In the preferred embodiment, the selected frequency range is approximately 300 to 800 Hz. The noise profile of the current frame can be determined by processing the frame with a fast Fourier transform (FFT) having N frequency bins. Processing digital signals with an FFT is well known in the art, and has the advantage of requiring little processing power when the FFT is limited to relatively few frequency bins, for example 32. An FFT with N frequency bins produces energy calculations at N different frequencies. The energy calculations for the frequency bins falling within the selected frequency range form the noise profile of the current frame.
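One way to picture step 600 is the sketch below. Several points are assumptions rather than details from the text: a direct DFT stands in for the FFT for brevity, the 160-sample frame is split into five 32-sample segments whose bin energies are averaged, and only the bins up to half the sampling rate are considered. The names `segment_bin_energy` and `noise_profile` are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000  # sampling rate stated in the text
DFT_SIZE = 32          # example bin count mentioned in the text


def segment_bin_energy(segment, k):
    """Squared magnitude of DFT bin k for one segment of samples."""
    n = len(segment)
    acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
              for t, x in enumerate(segment))
    return abs(acc) ** 2


def noise_profile(frame, low_hz=300.0, high_hz=800.0):
    """Return (frequency, energy) pairs for the bins inside the selected
    range, ordered from the highest frequency to the lowest."""
    spacing = SAMPLE_RATE_HZ / DFT_SIZE  # 250 Hz per bin
    segments = [frame[i:i + DFT_SIZE]
                for i in range(0, len(frame) - DFT_SIZE + 1, DFT_SIZE)]
    if not segments:
        raise ValueError("frame is shorter than one DFT segment")
    profile = []
    for k in range(DFT_SIZE // 2, 0, -1):
        freq = k * spacing
        if low_hz <= freq <= high_hz:
            energy = sum(segment_bin_energy(seg, k) for seg in segments)
            profile.append((freq, energy / len(segments)))
    return profile


# Usage: a 500 Hz tone places its energy in the 500 Hz bin of the profile.
tone = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE_HZ) for t in range(160)]
profile = noise_profile(tone)
```

With a 32-point transform at 8000 samples per second, only the 500 Hz and 750 Hz bins fall inside the 300 to 800 Hz range, so a finer frequency resolution would require a larger transform.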

To determine the noise profile estimate for the current frame (step 604), the noise profile of the current frame is averaged with the noise profile estimates determined for previous frames of the audio signal. When no previous noise profile estimate is available, for example after initialization, a stored initial noise profile estimate may be used. The noise profile estimate comprises noise energy estimates e_i (where i = 1, 2, …, n) at successively lower frequencies; that is, within the selected frequency range, e_1 is the noise energy estimate at the highest frequency and e_n is the noise energy estimate at the lowest frequency. In the preferred embodiment, each noise energy estimate e_i corresponds to the average of the energy calculations at a particular frequency bin within the selected frequency range, taken over a large number of consecutive frames in which no speech was detected. Because a large number of frames is used to determine the noise profile estimate, the filter circuit 115 is adjusted on a more gradual basis. In another embodiment, the noise profile estimate may simply equal the noise profile of the current frame.
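The averaging described above could take many forms; the text does not specify one. The sketch below assumes a simple cumulative mean over speech-free frames and, when no previous estimate exists, starts from the current frame's profile rather than from a stored initial estimate. The function name `update_profile_estimate` and its arguments are hypothetical.

```python
def update_profile_estimate(prev_estimate, frames_averaged, current_profile):
    """Fold the current frame's noise profile into a running mean.

    prev_estimate and current_profile are lists of bin energies ordered
    e_1 (highest frequency) to e_n (lowest frequency). Returns the new
    estimate and the number of frames it now represents.
    """
    if prev_estimate is None or frames_averaged == 0:
        return list(current_profile), 1
    count = frames_averaged + 1
    updated = [prev + (cur - prev) / count
               for prev, cur in zip(prev_estimate, current_profile)]
    return updated, count


# Usage: two speech-free frames are averaged bin by bin.
estimate, count = update_profile_estimate(None, 0, [4.0, 8.0])
estimate, count = update_profile_estimate(estimate, count, [8.0, 0.0])
```

As more frames are folded in, each new frame shifts the estimate less, which matches the gradual adjustment of the filter circuit described in the text.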

The noise energy estimates e_i of the noise profile estimate are then compared with a reference noise profile (step 604). The reference noise profile comprises reference energy thresholds e_ri (where i = 1, 2, …, n) at the frequency bins corresponding to the noise energy estimates e_i. The reference energy thresholds e_ri may be determined empirically. The noise energy estimates e_i are compared in succession with the corresponding reference energy thresholds e_ri, proceeding from the highest-frequency estimate e_1 to the lowest-frequency estimate e_n.

More specifically, the noise energy estimate e_1 is first compared with the reference energy threshold e_r1. If e_1 exceeds e_r1, the comparison value C_1 is selected and input to the filter selector 235. If e_1 is less than e_r1, the noise energy estimate e_2 (the estimate obtained at a frequency lower than that of e_1) is compared with the reference energy threshold e_r2. If e_2 exceeds e_r2, the comparison value C_2 is selected and input to the filter selector 235. The comparison continues in this way until a comparison value C_i (where i = 1, 2, …, n) has been selected.
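The successive comparison can be sketched as a simple loop. The function name `select_comparison_value` is an assumption, as is the choice to return `None` when no estimate exceeds its threshold; the text does not say what happens in that case.

```python
def select_comparison_value(noise_energy_estimates, reference_thresholds):
    """Return the 1-based index i of the first estimate e_i that exceeds
    its reference threshold e_ri, scanning from the highest frequency
    (e_1) to the lowest (e_n). Returns None if none exceeds its threshold.
    """
    pairs = zip(noise_energy_estimates, reference_thresholds)
    for i, (estimate, threshold) in enumerate(pairs, start=1):
        if estimate > threshold:
            return i
    return None


# Usage: e_1 is below its threshold but e_2 exceeds its own, so the
# comparison value C_2 would be selected.
selected = select_comparison_value([1.0, 5.0, 9.0], [4.0, 4.0, 4.0])
```

Scanning from the highest frequency first means that noise reaching further up into the speech band is recognized before lower-frequency noise is even examined.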

The filter selector 235 uses the selected comparison value C_i to determine a filter control value. The filter control value is chosen from a look-up table such as the one shown in FIG. 12, which contains a series of comparison values C_i and corresponding filter control values F_i. The filter circuit 115 is adjusted as a function of the selected filter control value to exhibit a frequency response curve that removes low-frequency energy from the current frame. When the noise energy estimates at successively higher frequencies exceed their corresponding reference energy thresholds, the filter circuit 115 is adjusted to remove more low-frequency energy. FIGS. 6A and 6B show example frequency response curves for the selected filter control values.

Use of the noise profile estimate improves the ability to adjust the filter circuit adaptively to remove low-frequency energy in a manner that improves overall speech quality. The automobile is not the only environment in which mobile radio communication devices are used, and in some environments the noise profile may be skewed toward higher frequencies. When the noise energy at low frequencies is small, the spectral analyzer 270 can be selectively disabled. Conversely, when a large part of the noise spectrum lies at low frequencies, a steeper filter slope is applied even at the cost of some processing power; this additional processing requirement is still small.

As is apparent from the above description, the adaptive noise filtering system of the invention is simple to implement and does not significantly increase the computational load on the DSP. More complex noise reduction methods, such as "spectral subtraction", require several MIPS of computation and a large amount of memory for data and program code. By comparison, the invention can be implemented with only a fraction of the MIPS and memory required by a spectral subtraction algorithm, which also introduces more speech distortion. The reduced memory requirement reduces the size of the DSP integrated circuit, and the reduced MIPS lowers power consumption. Both attributes are desirable for battery-powered portable/mobile radiotelephones.

While the invention has been particularly shown and described with reference to its preferred embodiments, it is not limited to those embodiments. For example, although the DSP is described as performing the functions of the frame energy estimator 210, the noise estimator 230, the speech detector 240, the filter selector 235, and the filter circuit 265, these functions may be implemented with other digital and/or analog components. In addition, the adaptive filtering system 100 may be implemented with the filter circuit 115 adjusted as a function of both the noise estimate and the noise profile estimate.

Claims

1. A method for selectively altering a frame of a digital signal formed of a plurality of successive frames, the digital signal being representative of an audio signal received at a transmitter, the audio signal being formed of a speech component, of a noise component, or of both a speech component and a noise component, said method characterized by the steps of:

estimating a value of the energy level of the digital signal frame;

determining, from the estimate obtained in said estimating step, whether the digital signal frame includes a speech component;

updating a noise estimate, when it is determined in said determining step that a speech component does not form part of the frame, as a function of a previous noise estimate and the energy level value estimated in said estimating step;

reading an entry of a look-up table, the look-up table containing filter characteristics indexed by noise estimate level, the entry read corresponding to the noise estimate updated in said updating step;

selecting a filter circuit filtering characteristic to be exhibited by a filter, the selected filtering characteristic corresponding to the filter characteristic stored in the entry read in said reading step;

filtering the digital signal frame with the filter exhibiting the filter circuit filtering characteristic, thereby altering the digital signal frame according to the filter circuit filtering characteristic.

2. The method of claim 1, further characterized by the additional intermediate step of determining a noise profile estimate of the digital signal frame if the frame is determined not to include a speech component.

3. The method of claim 2, wherein the noise profile estimate determined in said step of determining the noise profile estimate is used to update the noise estimate in said updating step.

4. The method of claim 1, wherein the look-up table read in said reading step is characterized by a plurality of entries, each entry containing a separate filter characteristic.

5. The method of claim 4, wherein the separate filter characteristics of the plurality of entries of the look-up table comprise separate high-pass filter characteristics, each high-pass filter characteristic being defined by a separate cutoff frequency.

6. The method of claim 4, wherein the separate filter characteristics of the plurality of entries of the look-up table comprise separate high-pass filter characteristics, each high-pass filter characteristic being defined by a separate frequency response curve slope.

7. The method of claim 1, further characterized by the step of incrementing a counter value so as to count the number of frames for which an energy level value has been estimated in said estimating step.

8. The method of claim 7, wherein said step of selecting the filter circuit filtering characteristic is performed once for every N increments of the counter value, N being an integer greater than 1.

9. An apparatus (100; 200) for selectively altering a frame of a digital signal formed of a plurality of successive frames, the digital signal being representative of an audio signal received at a transmitter, the audio signal being formed of a speech component, of a noise component, or of both components, said apparatus characterized by:

an energy value estimator (210) coupled to receive indications of the digital signal frame, said energy value estimator for estimating an energy value of the digital signal frame;

a speech detector (240) coupled to said energy value estimator, said speech detector serving as a speech component determiner for determining whether the digital signal frame includes a speech component;

a noise estimator (230) operable when said speech component determiner determines that a speech component does not form part of the frame, said noise estimator for updating a noise estimate as a function of a previous noise estimate and the energy value estimated by said energy value estimator;

a look-up table containing a plurality of entries, each entry indexed by noise estimate, the entries of the look-up table being read according to the noise estimate formed by said noise estimator;

a filter (265) coupled to receive the digital signal frame, said filter exhibiting a selectable filter circuit filtering characteristic, the filter circuit filtering characteristic of the filter being selected according to the entry of the look-up table that is read according to the noise estimate updated by said noise estimator.

10. The apparatus of claim 9, further characterized by a noise profile estimator (270) which, if said speech component determiner determines that the digital signal frame does not include a speech component, determines a noise profile estimate of that frame.