HK1204718B - Techniques for distortion reducing multi-band compressor with timbre preservation - Google Patents

Techniques for distortion reducing multi-band compressor with timbre preservation

Info

Publication number
HK1204718B
Authority
HK
Hong Kong
Prior art keywords
frequency band
threshold
time
varying
level
Prior art date
Application number
HK15105094.7A
Other languages
Chinese (zh)
Other versions
HK1204718A1 (en)
Inventor
A. J. Seefeldt
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corporation
Publication of HK1204718A1
Publication of HK1204718B

Description

Techniques for distortion-reducing multi-band compressor with timbre preservation
The present application is a divisional application of invention patent application No. 201180014412.5, filed on March 15, 2011, and entitled "Techniques for distortion-reducing multi-band compressor with timbre preservation".
Cross Reference to Related Applications
The present invention claims priority from U.S. Provisional Patent Application No. 61/315,172, filed March 18, 2010, which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates generally to audio rendering, and more particularly to distortion reduction during rendering.
Background
Playback devices, such as laptop computers, cellular radiotelephones, portable music players, and televisions, include amplifiers and audio transducers (e.g., speakers) with limited output capabilities. In these devices, audio playback can become perceptibly, and often severely, distorted as the playback level is increased during presentation. Furthermore, this distortion is often frequency dependent for a given playback device. For example, the form factor of a television may exhibit a resonant response at a particular frequency when the output signal exceeds a certain level, producing an annoying sound.
Multi-band compression may be applied to the audio signal prior to playback to reduce distortion and attempt to maximize playback levels. A distortion threshold is specified for each frequency band of the compressor. The compressor applies different gain values to each frequency band independently to ensure that the output signal does not exceed any corresponding distortion threshold.
However, this approach may drastically alter the timbre, that is, the attribute of listener perception by which two sounds of equal loudness and pitch may be perceived as dissimilar. When certain frequencies reach their distortion threshold and others do not, the compressor, with each band operating in isolation, introduces a drawback of its own by changing the relative balance between these frequencies. The resulting sound appears distorted or otherwise unnatural.
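To make the balance problem concrete, the following minimal sketch limits two bands independently and compares the level difference between them before and after. The function name, levels, and thresholds are hypothetical illustration values in dB, not figures from this application.

```python
def limit_independently(levels_db, thresholds_db):
    """Clamp each band level to its own distortion threshold (all values in dB)."""
    return [min(level, thr) for level, thr in zip(levels_db, thresholds_db)]


# Hypothetical two-band input: band 0 is well above its threshold, band 1 is below.
levels_db = [0.0, -25.0]
thresholds_db = [-20.0, -20.0]

out_db = limit_independently(levels_db, thresholds_db)

# Relative balance between the two bands before and after limiting.
balance_before = levels_db[0] - levels_db[1]  # 25 dB apart at the input
balance_after = out_db[0] - out_db[1]         # 5 dB apart at the output
```

Only band 0 is attenuated, so the 25 dB difference between the bands collapses to 5 dB, which is the kind of change in relative balance described above.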
Accordingly, the inventors have appreciated that techniques for timbre preservation in multi-band compression are desirable for a clear and natural listening experience.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Accordingly, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, problems identified with respect to one or more methods should not be assumed to have been recognized in any prior art based on this section.
International patent application publication No. WO2008/057173 describes a hierarchical control path for dynamic processing, wherein constraints on the dynamic processing of audio signals are applied in a hierarchical manner to provide improved performance. European patent application publication EP1251715 describes a multi-band digital hearing aid device which attenuates narrow frequency bands in the frequency range in which the hearing aid is prone to oscillate.
Disclosure of Invention
Methods and apparatus are provided for timbre preservation in a multi-band compressor. Timbre preservation is achieved by determining a time-varying threshold in each of a plurality of frequency bands, the time-varying threshold being a function of (i) a respective fixed threshold for that frequency band and, at least in part, (ii) an audio signal level (whether of a digital or analog audio signal) in a second frequency band and (iii) a fixed threshold in the second frequency band. Thus, each time-varying threshold is input-signal adaptive. If a particular frequency band receives significant gain reduction because it is above (or, alternatively, close to) its fixed threshold, the time-varying thresholds of one or more other frequency bands are also lowered so that they receive some gain reduction.
In an embodiment of the invention, a fixed threshold for the first frequency band is provided or otherwise determined. A first level of an audio signal within the first frequency band is determined. The first level may be less than the fixed threshold. A second level of the audio signal in a second frequency band is also determined. A time-varying threshold for the first frequency band is calculated using the second level, the time-varying threshold being less than the fixed threshold. The audio signal is attenuated to be equal to or less than the time-varying threshold in the first frequency band, or alternatively, the audio signal may be gradually attenuated in the first frequency band as the time-varying threshold is approached. The time-varying threshold may be calculated from the average difference between the audio input signal in each frequency band and its respective fixed threshold. Optionally, a second fixed threshold for the second frequency band may be further determined. The second level of the audio signal may exceed the second fixed threshold, resulting in attenuation of the audio signal within the second frequency band to the second fixed threshold.
In another embodiment, the compressor comprises a multiband filter bank, a compression function element (compression function element) and at least one timbre preservation element. Each compression function element may be dedicated to one frequency band. The timbre preservation element is coupled to the multi-band filter bank and the compression function element. The timbre preservation element receives a fixed threshold for each frequency band and provides a time-varying threshold for each frequency band. The time-varying threshold for one frequency band is determined in part by the level of the audio signal outside of that frequency band.
In yet another embodiment of the present invention, a system includes a multi-band filter bank, a timbre preservation element, and a compression function element. The timbre preservation element receives a fixed threshold for each of a plurality of frequency bands and in turn provides a time-varying threshold for the frequency bands. The time-varying threshold is determined at least in part by the level of the audio signal outside of the frequency band. In a particular embodiment, the time-varying threshold is a function of the average difference of the audio input signal in each frequency band from its corresponding fixed threshold.
As another embodiment of the present invention, a system includes a multiband filter means, a compression function means, and a timbre preservation means. The timbre preservation apparatus receives a fixed limit for each of a plurality of frequency bands and provides a time-varying threshold for each of the frequency bands. These time-varying thresholds are determined at least in part by the levels of the audio signals outside the respective frequency bands and the associated fixed thresholds.
As an embodiment of the present invention, a predetermined threshold for the first frequency band is provided. A first level of an audio signal within the first frequency band is determined. The first level may be less than the predetermined threshold. A second level of the audio signal in a second frequency band is also determined. A signal-adaptive threshold for the first frequency band is calculated using the second level, the signal-adaptive threshold being less than the predetermined threshold. The audio signal is attenuated within the first frequency band based on the signal-adaptive threshold. In a particular embodiment, the predetermined threshold may be independent of the level of the audio signal within the first frequency band, or indeed independent of the level in any frequency band. Conversely, the signal-adaptive threshold depends on the audio signal, in particular on the level of the audio signal outside the first frequency band.
Drawings
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1A illustrates an exemplary compressor according to an embodiment of the present invention;
FIGS. 1B and 1C provide exemplary input/output characteristics of a compression function according to embodiments of the present invention;
FIG. 2 is a simplified diagram illustrating exemplary results according to an embodiment of the present invention; and
FIG. 3 shows a simplified flow diagram according to an embodiment of the invention.
Detailed Description
FIG. 1A illustrates an exemplary multi-band compressor 100 with timbre preservation constraints in accordance with an embodiment of the present invention. Compressor 100 receives an input signal $x[n]$, which is divided into a plurality of frequency bands (e.g., $B$ frequency bands, which may be 2, 3, 4, 5, ... 20 or more frequency bands) by the filter bank 102. As an example, the output of each frequency band of the filter bank 102 may be calculated as the convolution of the input signal $x[n]$ with a band-pass filter response $h_b[n]$:
$$x_b[n] = h_b[n] * x[n], \quad b = 1 \ldots B$$
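The band-splitting step can be sketched as a direct convolution of the input with one impulse response per band. This is only an illustration: the function names are hypothetical, and the two-tap low-pass and high-pass responses are toy filters chosen so that the bands sum back to the input, not the filter bank 102 of the embodiment.

```python
def convolve(h, x):
    """Linear convolution of response h with signal x, truncated to len(x) samples."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        acc = 0.0
        # Sum h[k] * x[n - k] over the taps that overlap the signal so far.
        for k in range(min(len(h), n + 1)):
            acc += h[k] * x[n - k]
        y[n] = acc
    return y


def filter_bank(x, band_filters):
    """Return one band signal x_b per impulse response h_b."""
    return [convolve(h_b, x) for h_b in band_filters]


# Toy two-band bank (assumption): the two responses sum to a unit impulse.
band_filters = [[0.5, 0.5], [0.5, -0.5]]
x = [1.0, 2.0, 3.0, 4.0]
bands = filter_bank(x, band_filters)

# Because the responses sum to an impulse, adding the bands recovers the input.
reconstructed = [low + high for low, high in zip(bands[0], bands[1])]
```

A practical filter bank would use many more taps (or a transform-domain implementation); the structure of one convolution per band is the same.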
Next, each band signal is passed, together with a respective time-varying threshold $T_b[n]$, to the respective compression function CF 104(a), 104(b), ... 104(B). FIG. 1B provides exemplary input/output characteristics of CF 104(a), 104(b), ... 104(B) as a function of $T_b[n]$. The input level of the compression function can be calculated as a function of the band signal $x_b[n]$ in various ways. For example, a fast attack/slow release single-pole smoother (e.g., energy estimator 108) may be applied to signal $x_b[n]$ to calculate an estimate of the time-varying energy $e_b[n]$ in each frequency band.
The attack time value ($\lambda_A$) may be on the order of 10 ms, and the release time value ($\lambda_R$) may be on the order of 100 ms (e.g., a release time 10 times the attack time, or greater). As the level of the input signal $x_b[n]$, as estimated by $e_b[n]$, approaches the threshold $T_b[n]$, the output signal rises more slowly and is eventually limited to the threshold (e.g., as reflected by a change in the output gain $g_b[n]$).
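A fast attack/slow release single-pole smoother of the kind described above can be sketched as follows. The exact recursion is not reproduced in the text, so this sketch makes several assumptions: the smoother runs on the squared band signal, it switches to the attack coefficient whenever the instantaneous power is at or above the current estimate, and each coefficient is derived from its time constant as exp(-1 / (time * sample rate)). All function names are hypothetical.

```python
import math


def smoothing_coeff(time_s, sample_rate_hz):
    """One-pole coefficient for a given time constant (assumed mapping)."""
    return math.exp(-1.0 / (time_s * sample_rate_hz))


def estimate_energy(x_b, sample_rate_hz, attack_s=0.010, release_s=0.100):
    """Fast attack/slow release estimate of the energy of one band signal."""
    lam_a = smoothing_coeff(attack_s, sample_rate_hz)
    lam_r = smoothing_coeff(release_s, sample_rate_hz)
    e_prev = 0.0
    energy = []
    for sample in x_b:
        power = sample * sample
        # Rising power uses the fast attack coefficient, falling power the slow release.
        lam = lam_a if power >= e_prev else lam_r
        e_prev = lam * e_prev + (1.0 - lam) * power
        energy.append(e_prev)
    return energy


# 100 ms of full-scale signal followed by 100 ms of silence at a 1 kHz sample rate.
signal = [1.0] * 100 + [0.0] * 100
energy = estimate_energy(signal, sample_rate_hz=1000.0)
```

With these settings the estimate climbs close to 1 within the first 100 ms and then decays much more slowly once the signal stops, which is the asymmetry the 10 ms and 100 ms values are meant to produce.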
FIG. 1C shows another compression function. In this case, the input/output slope 110 below the threshold $T_b[n]$ exceeds the slope 112 above the threshold. Rather than approaching the time-varying threshold gradually, it may be desirable for attenuation to continue beyond the time-varying threshold at a different rate (e.g., a decreased or greater rate). In particular embodiments, slope 110 is equal to 1 or less, while slope 112 is less than slope 110 or even zero. It should be further understood that the CFs 104(a), 104(b), ... 104(B) may each have different or individualized input/output characteristics for a particular frequency band.
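A two-slope characteristic like the one in FIG. 1C can be sketched as a piecewise-linear curve in dB with its knee at the threshold. The function name, parameter names, and the choice to pass the curve through the point (threshold, threshold) are assumptions made for illustration.

```python
def compressor_gain_db(level_db, threshold_db, slope_below=1.0, slope_above=0.0):
    """Gain in dB for a piecewise-linear input/output curve with a knee at the threshold.

    slope_below plays the role of slope 110 and slope_above the role of slope 112.
    """
    if level_db <= threshold_db:
        out_db = threshold_db + slope_below * (level_db - threshold_db)
    else:
        out_db = threshold_db + slope_above * (level_db - threshold_db)
    # The gain is the difference between the output and input levels.
    return out_db - level_db


gain_below = compressor_gain_db(-30.0, -20.0)                   # below the knee
gain_limited = compressor_gain_db(-10.0, -20.0)                 # slope 112 equal to zero
gain_soft = compressor_gain_db(-10.0, -20.0, slope_above=0.5)   # slope 112 equal to 0.5
```

With a slope of zero above the knee the output is held at the threshold, while a nonzero slope lets the output keep rising past the threshold at a reduced rate.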
These time-varying thresholds $T_b[n]$ are calculated using a Timbre Preservation Function (TPF) element 106. In this embodiment, each time-varying threshold $T_b[n]$ is calculated as a function of all band signals $x_b[n]$ and all fixed thresholds $L_b$ across bands $b = 1 \ldots B$:
$$T_b[n] = \mathrm{TPF}(\{x_i[n], L_i \mid i = 1 \ldots B\})$$
The gain $g_b[n]$ for each frequency band is then calculated as $g_b[n] = \mathrm{CF}(x_b[n], T_b[n])$.
Alternatively, each threshold $T_b[n]$ may be calculated as a function of a plurality, but fewer than all, of the band signals $x_b[n]$ and/or a plurality, but fewer than all, of the fixed thresholds $L_b$. The time-varying threshold for a frequency band may be calculated based on its closest neighboring frequency band or a range of neighboring frequency bands. In some cases, it may be desirable to allow a particular band to operate completely isolated from the TPF, without contributing to the TPF. For example, some audio systems may have an abnormally low fixed threshold at bass frequencies due to small speaker sizes. If these bass frequency bands are allowed to contribute to the TPF, a sharp reduction in the overall playback level may result. In this case, it may be desirable to allow these bass frequency bands to operate independently and apply the TPF to the remaining frequency bands. Alternatively, additional frequency-dependent weighting may be employed to weight the bass frequency bands less heavily.
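The effect of isolating a band from the TPF can be sketched as below. As a stand-in for the TPF, the sketch uses a plain average-difference rule (one of the TPF variants in this description), and the function name, the `contributes` flags, and all levels are hypothetical.

```python
def thresholds_with_isolated_bands(levels_db, fixed_thresholds_db, contributes):
    """Time-varying thresholds where flagged-off bands neither affect nor use the TPF."""
    active = [i for i, flag in enumerate(contributes) if flag]
    # Average distance between level and fixed threshold over contributing bands only.
    avg_diff = sum(levels_db[i] - fixed_thresholds_db[i] for i in active) / len(active)
    thresholds = list(fixed_thresholds_db)  # isolated bands keep their fixed thresholds
    for i in active:
        thresholds[i] = min(levels_db[i] - avg_diff, fixed_thresholds_db[i])
    return thresholds


# Band 0 is a bass band with an abnormally low fixed threshold (hypothetical values).
levels_db = [0.0, -5.0, -25.0]
fixed_db = [-40.0, -20.0, -20.0]

isolated = thresholds_with_isolated_bands(levels_db, fixed_db, [False, True, True])
all_in = thresholds_with_isolated_bands(levels_db, fixed_db, [True, True, True])
```

With the bass band isolated, the threshold of band 2 drops from -20 dB to -30 dB. If the bass band is allowed to contribute, its 40 dB excess pulls the threshold of band 2 down below -40 dB, which illustrates the sharp reduction in overall playback level mentioned above.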
In the compressor 100, the TPF element 106 lowers the time-varying thresholds of frequency bands whose input levels are below their fixed thresholds $L_b$ as a function of the other frequency bands that exceed their fixed thresholds $L_b$. In other words, if a frequency band receives significant gain reduction because it is above its fixed threshold, the time-varying thresholds of other frequency bands are also lowered so that they receive some gain reduction. Because the time-varying thresholds for the frequency bands are reduced below their respective fixed thresholds, the compressor 100 still reduces distortion while mitigating or otherwise preventing changes to the timbre.
As an embodiment of the invention, TPF element 106 may be configured to calculate the average difference between the audio input signal level in each frequency band and its corresponding fixed threshold $L_b$. The time-varying threshold in each frequency band may then be the audio input signal level in that frequency band minus the average difference.
Furthermore, the time-varying thresholds may be smoothed over time, at least more so than the gains $g_b[n]$. That is, the level of the audio input signal used to calculate the thresholds may be smoothed more heavily than the signal (e.g., $e_b[n]$) used to calculate the gains $g_b[n]$. A smoother energy signal $s_b[n]$ may be calculated using a single-pole smoother with a longer time constant.
In this case, attack and release times on the order of 10 times those of a conventional multi-band compressor can be used. The smoothed energy signal may then be expressed in dB:
$$S_b[n] = 10 \log_{10}(s_b[n])$$
The difference between the smoothed energy signal in each band and the fixed threshold $L_b$ in each band, also expressed in dB, is calculated as:
$$D_b[n] = S_b[n] - L_b$$
and the minimum of these distances over all bands is found:
$$D_{\min}[n] = \min_{b} D_b[n]$$
A weighted average $D_{avg}[n]$ of these differences over the frequency bands is then calculated, where $\beta$ represents the weighting factor.
When $\beta = 1$, the true average of the differences is calculated; when $\beta > 1$, larger differences contribute more to the average. In other words, bands with levels further above the threshold $L_b$ contribute more. In practice, $\beta = 8$ produces sufficient weighting for TPF element 106. Finally, the threshold $T_b[n]$ is calculated as the smoothed signal energy in the frequency band minus the average difference when this value is less than the fixed threshold $L_b$. Otherwise, the time-varying threshold is kept equal to the fixed threshold:
$$T_b[n] = \begin{cases} S_b[n] - D_{avg}[n], & S_b[n] - D_{avg}[n] < L_b \\ L_b, & \text{otherwise} \end{cases}$$
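The threshold calculation described above can be sketched as follows. The weighted-average equation itself is not reproduced in the text, so the generalized (power) mean used here, taken over offsets from the minimum distance, is an assumption chosen to match the stated behavior: a weighting factor of 1 gives the true average, and larger values emphasize larger differences. All function names are hypothetical.

```python
import math


def energy_to_db(s_b):
    """Convert a smoothed energy value (must be positive) to dB."""
    return 10.0 * math.log10(s_b)


def weighted_average_distance(distances_db, beta=8.0):
    """Assumed form: power mean of the offsets from the minimum distance."""
    d_min = min(distances_db)
    mean_pow = sum((d - d_min) ** beta for d in distances_db) / len(distances_db)
    return mean_pow ** (1.0 / beta) + d_min


def timbre_preserving_thresholds(smoothed_energy, fixed_thresholds_db, beta=8.0):
    """Time-varying threshold per band, never above the band's fixed threshold."""
    s_db = [energy_to_db(s) for s in smoothed_energy]
    distances = [s - l for s, l in zip(s_db, fixed_thresholds_db)]
    d_avg = weighted_average_distance(distances, beta)
    return [min(s - d_avg, l) for s, l in zip(s_db, fixed_thresholds_db)]


# Two bands: one 20 dB above its fixed threshold, one 10 dB below (hypothetical).
plain_average = weighted_average_distance([20.0, -10.0], beta=1.0)
weighted = weighted_average_distance([20.0, -10.0], beta=8.0)
thresholds = timbre_preserving_thresholds([1.0, 0.001], [-20.0, -20.0])
```

With a weighting factor of 1 the result is the ordinary mean of the two distances (5 dB); with a factor of 8 it moves most of the way toward the larger distance, so the band that is over its threshold dominates and the quiet band's threshold is lowered further.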
As an alternative implementation of the TPF element, rather than a weighted average, the thresholds may be calculated from the maximum of the distances $D_b[n]$:
$$D_{\max}[n] = \max_{b} D_b[n]$$
Each threshold may then be calculated as the smoothed signal energy in the frequency band minus the maximum distance plus some tolerance value $D_{tol}$, if this value is less than the fixed threshold:
$$T_b[n] = \begin{cases} S_b[n] - D_{\max}[n] + D_{tol}, & S_b[n] - D_{\max}[n] + D_{tol} < L_b \\ L_b, & \text{otherwise} \end{cases}$$
The tolerance value $D_{tol}$ may be designed to allow some variation in the amount of compression applied to each band. In a specific embodiment, a practical value of $D_{tol} = 12$ dB allows for sufficient variation.
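The maximum-distance alternative can be sketched in a few lines. The function and parameter names are hypothetical, the levels are illustration values in dB, and keeping the fixed threshold as an upper bound follows the rule given for the weighted-average variant.

```python
def thresholds_from_max_distance(levels_db, fixed_thresholds_db, tolerance_db=12.0):
    """Thresholds from the largest level-minus-threshold distance plus a tolerance."""
    d_max = max(s - l for s, l in zip(levels_db, fixed_thresholds_db))
    return [
        min(s - d_max + tolerance_db, l)  # never above the fixed threshold
        for s, l in zip(levels_db, fixed_thresholds_db)
    ]


# Band 0 is 20 dB over its fixed threshold, band 1 is 10 dB under (hypothetical).
max_based = thresholds_from_max_distance([0.0, -30.0], [-20.0, -20.0])
```

Here band 0 is limited at its fixed threshold (20 dB of reduction), while band 1 is limited at -38 dB (8 dB of reduction), so the two bands differ in compression by exactly the 12 dB tolerance.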
FIG. 2 shows an exemplary result of applying the TPF to a 20-band compressor on a real-world audio signal. In this case, the twenty frequency bands are selected and spaced to mimic the perceptual resolution of human hearing, and a fixed threshold for each frequency band is determined by listening tests to prevent distortion on the speakers of the playback device. The resulting energies $e_b[n]$ of the band signals supplied to the compression functions are represented by bars 202. Line 204 depicts the resulting gains $g_b[n]$, where the middle of FIG. 2 represents 0 dB and the bottom -30 dB. Line 206 depicts the smoothed signal energies $s_b[n]$. Lines 208 and 210 depict the fixed thresholds $L_b$ and the time-varying thresholds $T_b[n]$, respectively.
In this example, the signal energies $e_b[n]$ and $s_b[n]$ are substantially greater than the fixed thresholds $L_b$ for frequency bands 1 to 4, so these bands receive significant attenuation. Frequency bands 1 to 4 do not need a reduced time-varying threshold, and $T_b[n] = L_b$. On the other hand, for bands 5 to 20, the signal energies $e_b[n]$ and $s_b[n]$ are not much greater than, or are entirely less than, their fixed thresholds $L_b$. As a result, the thresholds of these bands are lowered, $T_b[n] < L_b$, in some cases significantly, as a function of bands 1 to 4 showing significant attenuation. The end result is that all 20 bands are attenuated. Without the timbre preservation constraint according to embodiments of the present invention, bands 6 to 20 would not be attenuated at all, because $e_b[n] < L_b$, resulting in a significant change in timbre. For example, there would be a 20 dB difference between bands 4 and 9, but with the TPF the difference is reduced to 8 dB.
FIG. 3 shows a simplified flow diagram 300 according to an embodiment of the invention. In step 302, a fixed threshold for the first frequency band is determined or provided. Next, in step 304, a first level of the audio signal is determined within the first frequency band. The first level may be less than the fixed threshold. In step 306, a second level of the audio signal is determined for a second frequency band. The second frequency band is different from the first frequency band. In step 308, a time-varying threshold for the first frequency band is calculated or otherwise determined using the second level in the second frequency band and the fixed threshold. The time-varying threshold is less than or equal to the fixed threshold for the first frequency band. Finally, in step 310, the audio signal is attenuated to less than or equal to the time-varying threshold in the first frequency band. It will be appreciated that attenuation of the signal may begin before the threshold (whether fixed or time-varying) is reached, as shown in FIG. 1B, with gradual attenuation being applied as the time-varying threshold is approached.
Optionally, in steps 312 and 314, a second fixed threshold for the second frequency band is determined. The second level of the audio signal may exceed a second fixed threshold. The audio signal may be attenuated to a second fixed threshold in a second frequency band. Other alternatives may be provided in addition to steps 312 and 314, where steps are added, one or more steps are removed, or one or more steps are provided in a different order than above, without departing from the scope of the appended claims. These above steps may be performed by one or more devices including a processor.
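The steps of flow diagram 300 can be sketched for two bands as follows. This is a simplified illustration working on band levels in dB: the function name is hypothetical, step 308 uses the plain average-difference rule as its TPF, and steps 310 and 314 are modeled as hard limits rather than the gradual curve of FIG. 1B.

```python
def process_two_bands(first_level_db, second_level_db, first_fixed_db, second_fixed_db):
    """Apply steps 302 to 314 to one pair of band levels (all values in dB)."""
    # Steps 302, 304, 306 and 312: the fixed thresholds and band levels are the inputs.

    # Step 308: time-varying threshold for the first band, using the second band's
    # level and fixed threshold through the average difference across both bands.
    avg_diff = (
        (first_level_db - first_fixed_db) + (second_level_db - second_fixed_db)
    ) / 2.0
    time_varying_db = min(first_level_db - avg_diff, first_fixed_db)

    # Step 310: attenuate the first band to at most the time-varying threshold.
    first_out_db = min(first_level_db, time_varying_db)

    # Step 314: attenuate the second band to at most its own fixed threshold.
    second_out_db = min(second_level_db, second_fixed_db)

    return time_varying_db, first_out_db, second_out_db


# First band 5 dB under its fixed threshold, second band 15 dB over (hypothetical).
result = process_two_bands(-25.0, -5.0, -20.0, -20.0)
```

In this example the first band would pass untouched under independent limiting, but the lowered threshold of -30 dB attenuates it by 5 dB, so the level difference between the bands shrinks from 20 dB to 10 dB rather than to 5 dB.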
Implementation Mechanisms - Hardware Overview
According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. A special purpose computing device may be hardwired to perform the techniques, or may include digital electronic devices that are persistently programmed to perform the techniques, such as one or more Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs), or may include one or more general purpose hardware processors that are programmed to perform the techniques according to program instructions in firmware, memory, other storage, or a combination. These special purpose computing devices may also combine custom hardwired logic, ASICs, or FPGAs with custom programming to implement the techniques. A special purpose computing device may be a desktop computer system, portable computer system, handheld device, networked device, or any other device that includes hardwired and/or program logic to implement the techniques. The techniques are not limited to any specific combination of hardware circuitry and software nor to any particular source for the instructions executed by the computing device or data processing system.
The term "storage medium" as used herein refers to any non-transitory medium that stores data and/or instructions that cause a machine to operate in a specific manner. The storage medium may include non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks. Volatile media includes dynamic memory. Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge. Storage media is distinct from, but may be used in conjunction with, transmission media. Transmission media participate in the transfer of information between storage media. Transmission media include, for example, coaxial cables, copper wire and fiber optics. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
As used herein, "audio transducer" may include, but is not limited to, speakers (e.g., direct radiating electro-dynamic drivers mounted on a housing), horn speakers, piezoelectric speakers, magnetostrictive speakers, electrostatic speakers, ribbon and planar magnetic speakers, bending wave speakers, flat panel speakers, distributed mode speakers, Heil air motion transducers, plasma arc speakers, digital speakers, and any combination/mixture thereof.
Equivalents, extensions, substitutions and miscellaneous
In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It should also be understood that, for clarity, "e.g." means "for the sake of example" (and not an exhaustive list), as opposed to "i.e.", which means "that is".
Furthermore, in the preceding description, numerous specific details are set forth, such as examples of specific components, devices, methods, etc., in order to provide a thorough understanding of embodiments of the present invention. It will be appreciated, however, by one skilled in the art that these specific details need not be employed to practice embodiments of the present invention. In other instances, well-known materials or methods have not been described in detail in order to avoid unnecessarily obscuring the embodiments of the invention.

Claims (13)

1. A method for audio presentation, the method comprising:
determining a first fixed threshold for a first frequency band;
determining a first level of audio signals within the first frequency band, the first level being less than the first fixed threshold;
determining a second level of the audio signal for a second frequency band, the second frequency band being different from the first frequency band;
determining a second fixed threshold for the second frequency band;
calculating a time-varying threshold for the first frequency band using the second level and the second fixed threshold, the time-varying threshold being less than or equal to the first fixed threshold, wherein a particular frequency band does not contribute to the calculation of the time-varying threshold for the first frequency band; and
attenuating audio signals within the first frequency band based on the time-varying threshold.
2. The method of claim 1, wherein the attenuation reduces audio signals within the first frequency band to less than or equal to the time-varying threshold.
3. The method of claim 1, further comprising:
attenuating the audio signal within a second frequency band to be less than or equal to the second fixed threshold if a second level of the audio signal exceeds the second fixed threshold.
4. The method of claim 1, wherein the time-varying threshold is a function of an average difference of the audio input signal in each frequency band from its respective fixed threshold.
5. The method of claim 1, further comprising: the attenuated audio signal is audibly presented by at least one output transducer of the playback device.
6. The method of claim 5, wherein the playback device is at least one of: televisions, cellular radiotelephones, radios, personal music players, laptop computers, desktop computers, and portable DVD players.
7. A multi-band compression system comprising:
multiband filter means for dividing the audio signal into a plurality of frequency bands;
compression function means, each compression function means being dedicated to one frequency band; and
timbre preservation means coupled to said multi-band filter means and said compression function means, said timbre preservation means receiving fixed limits for each frequency band and providing time-varying thresholds for each frequency band,
wherein the time-varying threshold for a first frequency band is determined by a level of the audio signal outside said first frequency band, wherein a particular frequency band does not contribute to the calculation of the time-varying threshold for the first frequency band.
8. The system of claim 7, wherein the time-varying threshold for the first frequency band is further determined by a plurality of fixed thresholds outside the first frequency band.
9. An apparatus for audio presentation, the apparatus comprising:
means for determining a first fixed threshold for a first frequency band;
means for determining a first level of audio signals within the first frequency band, the first level being less than the first fixed threshold;
means for determining a second level of the audio signal for a second frequency band, the second frequency band being different from the first frequency band;
means for determining a second fixed threshold for the second frequency band;
means for calculating a time-varying threshold for the first frequency band using the second level and the second fixed threshold, the time-varying threshold being less than or equal to the first fixed threshold, wherein a particular frequency band does not contribute to the calculation of the time-varying threshold for the first frequency band; and
means for attenuating audio signals within the first frequency band based on the time-varying threshold.
10. The apparatus of claim 9, wherein the means for attenuating reduces the audio signals within the first frequency band to less than or equal to the time-varying threshold.
11. The apparatus of claim 9, further comprising:
means for attenuating audio signals within a second frequency band to be less than or equal to the second fixed threshold if a second level of audio signals exceeds the second fixed threshold.
12. The apparatus of claim 9, wherein the time-varying threshold is a function of an average difference of the audio input signal in each frequency band from its respective fixed threshold.
13. The apparatus of claim 9, further comprising: means for audibly presenting the attenuated audio signal.
HK15105094.7A 2010-03-18 2015-05-28 Techniques for distortion reducing multi-band compressor with timbre preservation HK1204718B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US31517210P 2010-03-18 2010-03-18
US61/315,172 2010-03-18

Publications (2)

Publication Number Publication Date
HK1204718A1 HK1204718A1 (en) 2015-11-27
HK1204718B HK1204718B (en) 2018-05-18

Similar Documents

Publication Publication Date Title
CN102812636B (en) Techniques for Distortion-Reducing Multiband Compressors with Sound Quality Preservation
CN105164918B (en) Band compression with dynamic threshold
US8634578B2 (en) Multiband dynamics compressor with spectral balance compensation
US20050015252A1 (en) Speech correction apparatus
CN111970628B (en) Audio signal enhancement method, device, storage medium and processor
WO2015111084A2 (en) Dynamic range compression with low distortion for use in hearing aids and audio systems
CN112534717A (en) Multi-channel audio enhancement, decoding and rendering responsive to feedback
CN112585868A (en) Audio enhancement in response to compression feedback
US20190052962A1 (en) Signal processor
WO2009093416A1 (en) Sound signal processing device and method
HK1204718B (en) Techniques for distortion reducing multi-band compressor with timbre preservation
HK1177057A (en) Techniques for distortion reducing multi-band compressor with timbre preservation
HK1177057B (en) Techniques for distortion reducing multi-band compressor with timbre preservation
JP2023178550A (en) Sound processing device and sound processing method
CN115086837A (en) Sound effect enhancing method and device
HK1258837B (en) Frequency band compression with dynamic thresholds
HK1222478B (en) Frequency band compression with dynamic thresholds
JP2014209683A (en) Audio signal processing device, acoustic device, method of controlling audio signal processing device, and program