DE102011106034A1

DE102011106034A1 - Method for enabling spectral band replication in e.g. digital audio broadcast, involves determining spectral band replication period and source frequency segment, and performing spectral band replication on null bit code sub bands at period

Info

Publication number: DE102011106034A1
Application number: DE201110106034
Authority: DE
Inventors: Dongping Jiang; Hao Yuan; Guoming Chen; Ke Peng; Jiali Li
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2011-06-30
Filing date: 2011-06-30
Publication date: 2013-01-03

Abstract

The method involves determining a spectral band replication period (102) and a source frequency segment. The period is the bandwidth from the zero-frequency point to the frequency point of a tone position. The source frequency segment runs from the frequency point obtained by shifting the zero-frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the tone position backward by copyband_offset frequency points, where the offset copyband_offset is greater than or equal to zero. Spectral band replication is then performed (103) on zero-bit coding subbands according to the period. Independent claims are also included for the following: (1) a device for spectral band replication; (2) a method for audio decoding; (3) an audio decoding system.

Description

Technical field

The present invention relates to audio decoding technology, and in particular to a method and apparatus for spectral band replication for reconstructing the spectrum of uncoded coding subbands, and to a method and system for audio decoding.

State of the art

Audio coding technology is at the core of multimedia applications such as digital audio broadcasting, music distribution over the Internet and audio communication, and these applications benefit greatly from improvements in the compression performance of the audio coder. The perceptual audio coder, a form of lossy transform domain coding, is the established modern audio coder. In audio coding, the limited coding bit rate generally means that some frequency domain coefficients or frequency components cannot be encoded. To better restore the spectral components of the uncoded subbands, current audio encoders and decoders generally use noise filling or spectral band replication to reconstruct them. G.722.1 C uses noise filling, HE-AAC-V1 uses spectral band replication, and G.719 combines noise filling with simple spectral band replication. Noise filling cannot restore well the spectral envelope of an uncoded subband or the tone and noise components within it. The spectral band replication method of HE-AAC-V1 requires analyzing the spectrum of the audio signal before encoding, estimating the tone and noise of the high-frequency components, extracting parameters, and encoding the downsampled audio signal with the AAC encoder. This involves a high computational effort, requires more parameter information to be transmitted to the decoding end, occupies more coded bits and also increases the coding delay.
The replication scheme of G.719, however, is too simple to restore well the spectral envelope of the uncoded subbands and the tone and noise components within them.

Brief description of the invention

The technical problem to be solved by the present invention is to provide a method and apparatus for spectral band replication, and a method and system for audio decoding, that properly solve the problem of restoring the audio signal of uncoded coding subbands during audio encoding and decoding.

To solve the above technical problem, the present invention provides a method for spectral band replication, comprising:

A. searching for the position of a particular tone of an audio signal in the MDCT frequency domain coefficients;
B. according to the tone position, determining a spectral band replication period and a source frequency segment, the spectral band replication period being the bandwidth from the zero-frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment that runs from the frequency point obtained by shifting the zero-frequency point backward (that is, toward higher frequencies) by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points, the offset copyband_offset being greater than or equal to 0;
C. according to the spectral band replication period, performing spectral band replication on zero-bit coding subbands.
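Step B can be illustrated with a minimal Python sketch. The function name and the tuple it returns are assumptions made for illustration. The relation copy_period = Tonal_pos + 1 is taken from the description of step C in this document, and frequency points are assumed to be numbered from 0.

```python
def replication_period_and_source(tonal_pos, copyband_offset=0):
    """Return (copy_period, first_src, last_src) for a tone at index tonal_pos.

    Hypothetical helper: indices are MDCT frequency point sequence numbers
    starting at 0, and the source segment bounds are inclusive.
    """
    if tonal_pos < 0 or copyband_offset < 0:
        raise ValueError("tonal_pos and copyband_offset must be >= 0")
    # Bandwidth from the zero-frequency point up to and including the tone.
    copy_period = tonal_pos + 1
    # Source segment: [0, tonal_pos] shifted toward higher frequencies.
    first_src = copyband_offset
    last_src = tonal_pos + copyband_offset
    return copy_period, first_src, last_src
```

With a tone at frequency point 23 and an offset of 4, the sketch yields a period of 24 and a source segment covering frequency points 4 to 27.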

Preferably, in step A, the following method is used to search for the position of the particular tone:
taking the absolute values or the squared values of the frequency domain coefficients of a first frequency segment and performing smoothing filtering; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment, and taking the position of this maximum extreme value as the position of the particular tone.

Preferably, the smoothing filtering on the absolute values of the frequency domain coefficients of the first frequency segment is computed as

$$\mathrm{X\_amp}_i(k) = \mu \, \mathrm{X\_amp}_{i-1}(k) + (1 - \mu) \, |X_i(k)|$$

or the smoothing filtering on the squared values of the frequency domain coefficients of the first frequency segment is computed as

$$\mathrm{X\_amp}_i(k) = \mu \, \mathrm{X\_amp}_{i-1}(k) + (1 - \mu) \, X_i(k)^2$$

where μ is the smoothing filtering coefficient, X_amp_i(k) denotes the filtering output of the k-th frequency point of the i-th frame, X_i(k) is the decoded MDCT coefficient of the k-th frequency point of the i-th frame, and X_amp_{i-1}(k) = 0 when i = 0.
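The recursion can be sketched in a few lines of Python. This is an illustrative sketch, not the patented implementation: the function name and the `use_square` flag are assumptions, and the previous-frame term is assumed to be taken at the same frequency point k in both variants, consistent with the initial condition X_amp_{i-1}(k) = 0 for i = 0.

```python
def smooth_spectrum(prev_amp, coeffs, mu, use_square=False):
    """One frame of first-order smoothing across frames, per frequency point.

    prev_amp:   filtering outputs of the previous frame, or None for frame 0
    coeffs:     decoded MDCT coefficients X_i(k) of the current frame
    mu:         smoothing coefficient, assumed here to lie in [0, 1)
    use_square: smooth X_i(k)^2 instead of |X_i(k)|
    """
    if not 0.0 <= mu < 1.0:
        raise ValueError("mu is assumed to lie in [0, 1)")
    if prev_amp is None:
        # Initial condition: the previous outputs are zero for the first frame.
        prev_amp = [0.0] * len(coeffs)
    out = []
    for prev, x in zip(prev_amp, coeffs):
        magnitude = x * x if use_square else abs(x)
        out.append(mu * prev + (1.0 - mu) * magnitude)
    return out
```

A decoder would call this once per frame and feed each result back in as `prev_amp` for the next frame, so that short-lived peaks are damped before the tone search.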

Preferably, the first frequency segment is a low-frequency segment in which, according to the statistical properties of the spectrum, the energy is more concentrated, where low frequencies refer to spectral components lying below half of the total bandwidth of the signal.

Preferably, the following method is used to determine the maximum extreme value of the filtering outputs: searching directly for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and taking this maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.
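The direct search amounts to taking the index of the largest filtering output within the first frequency segment. The sketch below assumes inclusive segment bounds and a hypothetical function name. The source does not say how ties are resolved, so returning the lowest index is an assumption.

```python
def find_tone_direct(amp, seg_start, seg_end):
    """Index of the largest filtering output in amp[seg_start..seg_end].

    amp holds the smoothed outputs X_amp_i(k); bounds are inclusive.
    On ties the lowest index is returned (behaviour of Python's max).
    """
    if not 0 <= seg_start <= seg_end < len(amp):
        raise ValueError("segment bounds are outside the spectrum")
    return max(range(seg_start, seg_end + 1), key=lambda k: amp[k])
```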

Preferably, the following method is used to determine the maximum extreme value of the filtering outputs:
taking a segment within the first frequency segment as a second frequency segment, searching for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and, depending on the position of the frequency domain coefficient corresponding to this initial maximum value, proceeding as follows:

a. if this initial maximum value is the filtering output of the frequency domain coefficient at the lowest frequency of the second frequency segment, comparing this filtering output with the filtering output of the preceding, next-lower frequency domain coefficient in the first frequency segment, and continuing the comparison forward in this way until the filtering output of the current frequency domain coefficient is greater than the filtering output of the preceding frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value; or continuing until the frequency domain coefficient at the lowest frequency of the first frequency segment is reached and its filtering output is greater than that of the following frequency domain coefficient, in which case the filtering output of the frequency domain coefficient at the lowest frequency of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the frequency domain coefficient at the highest frequency of the second frequency segment, comparing this filtering output with the filtering output of the following, next-higher frequency domain coefficient in the first frequency segment, and continuing the comparison backward in this way until the filtering output of the current frequency domain coefficient is greater than the filtering output of the following frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value; or continuing until the frequency domain coefficient at the highest frequency of the first frequency segment is reached and its filtering output is greater than that of the preceding frequency domain coefficient, in which case the filtering output of the frequency domain coefficient at the highest frequency of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient lying between the lowest frequency and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.
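Cases a to c can be read as a boundary-aware hill climb: the maximum is first located inside the second segment, and if it sits on an edge of that segment, the search walks outward through the first segment until the outputs stop rising. The Python sketch below follows that reading. The function name, the inclusive bounds and the handling of equal neighbouring values (the walk continues across plateaus) are assumptions.

```python
def find_tone_refined(amp, first_lo, first_hi, second_lo, second_hi):
    """Tone position from a search in [second_lo, second_hi], refined
    inside the enclosing first segment [first_lo, first_hi] (inclusive)."""
    if not first_lo <= second_lo < second_hi <= first_hi < len(amp):
        raise ValueError("second segment must lie inside the first segment")
    # Initial maximum within the second segment.
    k = max(range(second_lo, second_hi + 1), key=lambda j: amp[j])
    if k == second_lo:
        # Case a: walk toward lower frequencies until the current output
        # exceeds its lower neighbour or the first segment's edge is reached.
        while k > first_lo and amp[k - 1] >= amp[k]:
            k -= 1
    elif k == second_hi:
        # Case b: walk toward higher frequencies in the same manner.
        while k < first_hi and amp[k + 1] >= amp[k]:
            k += 1
    # Case c: an interior maximum is already the tone position.
    return k
```

Restricting the initial search to the second segment keeps the search cheap, while the outward walk avoids reporting a segment edge that is only the flank of a peak lying just outside it.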

Preferably, in step C, when spectral band replication is performed for a zero-bit coding subband, a source frequency segment replication start sequence number for this zero-bit coding subband is first calculated from the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication; then, taking the spectral band replication period as the period, the frequency domain coefficients of the source frequency segment are periodically replicated onto the zero-bit coding subband, starting from the source frequency segment replication start sequence number.

Preferably, in step C, the source frequency segment replication start sequence number of this zero-bit coding subband is calculated as follows:
obtaining the sequence number of the frequency point of the first MDCT frequency domain coefficient of the zero-bit coding subband whose frequency domain coefficients are to be reconstructed, denoted fillband_start_freq; the sequence number of the frequency point corresponding to the tone, denoted Tonal_pos; the spectral band replication period, denoted copy_period, whose value equals Tonal_pos plus 1; and the spectral band replication offset, denoted copyband_offset; then repeatedly subtracting copy_period from the value of fillband_start_freq until the result lies within the range of sequence numbers of the source frequency segment. This result is the source frequency segment replication start sequence number, denoted copy_pos_mod.
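The calculation of copy_pos_mod can be sketched as a short loop. The sketch reads the step as repeatedly subtracting copy_period from fillband_start_freq, and it assumes that the source frequency segment covers the inclusive range from copyband_offset to Tonal_pos + copyband_offset and that the subband to be filled does not start below copyband_offset. The function name is hypothetical.

```python
def source_start_index(fillband_start_freq, tonal_pos, copyband_offset=0):
    """Compute copy_pos_mod, the index in the source segment from which
    replication into the zero-bit subband starts."""
    if tonal_pos < 0 or copyband_offset < 0:
        raise ValueError("tonal_pos and copyband_offset must be >= 0")
    if fillband_start_freq < copyband_offset:
        raise ValueError("subband is assumed to start at or above the source")
    copy_period = tonal_pos + 1
    last_src = tonal_pos + copyband_offset
    copy_pos_mod = fillband_start_freq
    # Step down one period at a time until the index is inside the source.
    while copy_pos_mod > last_src:
        copy_pos_mod -= copy_period
    return copy_pos_mod
```

Because the source segment is exactly one period wide, the first value that drops to or below its upper bound is guaranteed to be at or above copyband_offset, so no lower-bound check is needed inside the loop.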

Preferably, in step C, the periodic replication of the frequency domain coefficients of the source frequency segment onto the zero-bit coding subband, taking the spectral band replication period as the period and starting from the source frequency segment replication start sequence number, is performed as follows: the frequency domain coefficients from the source frequency segment replication start sequence number onward are replicated one after another, in the backward direction, onto the zero-bit coding subband beginning at fillband_start_freq, until the source frequency point being replicated reaches the frequency point Tonal_pos + copyband_offset; replication then continues with the frequency domain coefficients from the copyband_offset frequency point onward, again in the backward direction, and so on, until all frequency domain coefficients of the current zero-bit coding subband have been replicated.
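The wrap-around copy can be sketched as follows. This is an illustrative reading of the step rather than a reference implementation: the function name and the parameter `fillband_end_freq` (last index of the subband, inclusive) are hypothetical, and the subband is assumed to lie entirely above the source frequency segment.

```python
def replicate_subband(coeffs, fillband_start_freq, fillband_end_freq,
                      tonal_pos, copyband_offset=0):
    """Fill coeffs[fillband_start_freq..fillband_end_freq] in place by
    periodic replication from the source frequency segment."""
    copy_period = tonal_pos + 1
    last_src = tonal_pos + copyband_offset
    if fillband_start_freq <= last_src:
        raise ValueError("subband is assumed to lie above the source segment")
    # Source start index (copy_pos_mod): step down by whole periods.
    src = fillband_start_freq
    while src > last_src:
        src -= copy_period
    for dst in range(fillband_start_freq, fillband_end_freq + 1):
        coeffs[dst] = coeffs[src]
        src += 1
        if src > last_src:
            # Past Tonal_pos + copyband_offset: restart at copyband_offset.
            src = copyband_offset
    return coeffs
```

In this sketch every destination index differs from its source index by a whole number of periods, so the tone spacing of the source segment carries over into the reconstructed subband.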

To solve the above technical problem, the present invention also provides an apparatus for spectral band replication, comprising a tone position search module, a period and source frequency segment calculation module, a source frequency segment replication start sequence number calculation module and a spectral band replication module, connected in series, wherein
the tone position search module is provided to search for the position of a particular tone of an audio signal in the MDCT frequency domain coefficients;
the period and source frequency segment calculation module is provided to determine, according to the tone position, a spectral band replication period and a source frequency segment for replication, the spectral band replication period being the bandwidth from the zero-frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment that runs from the frequency point obtained by shifting the zero-frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points;
the source frequency segment replication start sequence number calculation module is provided to calculate the source frequency segment replication start sequence number of a zero-bit coding subband from the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication;
the spectral band replication module is provided to take the spectral band replication period as the period and, starting from the source frequency segment replication start sequence number, periodically replicate the frequency domain coefficients of the source frequency segment onto the zero-bit coding subband.

Preferably, the tone position search module searches for the tone position by taking the absolute values or the squared values of the MDCT frequency domain coefficients of the first frequency segment and performing smoothing filtering; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment and taking the position of this maximum extreme value as the position of the tone.

Preferably, the tone position search module performs the smoothing filtering on the absolute values of the frequency domain coefficients of the first frequency segment as

$$\mathrm{X\_amp}_i(k) = \mu \, \mathrm{X\_amp}_{i-1}(k) + (1 - \mu) \, |X_i(k)|$$

or performs the smoothing filtering on the squared values of the frequency domain coefficients of the first frequency segment as

$$\mathrm{X\_amp}_i(k) = \mu \, \mathrm{X\_amp}_{i-1}(k) + (1 - \mu) \, X_i(k)^2$$

where μ is the smoothing filtering coefficient, X_amp_i(k) denotes the filtering output of the k-th frequency point of the i-th frame, X_i(k) is the decoded MDCT coefficient of the k-th frequency point of the i-th frame, and X_amp_{i-1}(k) = 0 when i = 0.

Preferably, the tone position search module searches directly for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and takes this maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.

Preferably, when the tone position search module determines the maximum extreme value of the filtering outputs, a segment within the first frequency segment is taken as a second frequency segment, an initial maximum value is searched for among the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and, depending on the position of the frequency domain coefficient corresponding to this initial maximum value, the module proceeds as follows:
a. if this initial maximum value is the filtering output of the frequency domain coefficient at the lowest frequency of the second frequency segment, comparing this filtering output with the filtering output of the preceding, next-lower frequency domain coefficient in the first frequency segment, and continuing the comparison forward in this way until the filtering output of the current frequency domain coefficient is greater than the filtering output of the preceding frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value; or continuing until the frequency domain coefficient at the lowest frequency of the first frequency segment is reached and its filtering output is greater than that of the following frequency domain coefficient, in which case the filtering output of the frequency domain coefficient at the lowest frequency of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the frequency domain coefficient at the highest frequency of the second frequency segment, comparing this filtering output with the filtering output of the following, next-higher frequency domain coefficient in the first frequency segment, and continuing the comparison backward in this way until the filtering output of the current frequency domain coefficient is greater than the filtering output of the following frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value; or continuing until the frequency domain coefficient at the highest frequency of the first frequency segment is reached and its filtering output is greater than that of the preceding frequency domain coefficient, in which case the filtering output of the frequency domain coefficient at the highest frequency of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient between the lowest frequency and the highest frequency in the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the sound position, that is, this initial maximum value is the final maximum maximum value.
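The three cases a to c can be read as a local search that starts from the maximum of the second frequency segment and, if that maximum sits on a segment boundary, walks outward into the first frequency segment. The following is only a minimal sketch under stated assumptions: the function name `refine_tone_position`, the inclusive `(low, high)` index pairs and the treatment of equal neighbouring values (the walk continues on ties) are illustrative choices that the text does not specify.

```python
def refine_tone_position(x_amp, first_seg, second_seg):
    """Return the index of the finally determined maximum extreme value.

    x_amp      -- smoothed filtering outputs, indexed by frequency point
    first_seg  -- inclusive (low, high) indices of the first frequency segment
    second_seg -- inclusive (low, high) indices of the second frequency
                  segment, assumed to lie inside the first one
    """
    f_lo, f_hi = first_seg
    s_lo, s_hi = second_seg

    # Initial maximum value inside the second frequency segment.
    pos = max(range(s_lo, s_hi + 1), key=lambda k: x_amp[k])

    if pos == s_lo:
        # Case a: walk toward lower frequencies until the current output
        # exceeds its lower neighbour or the first segment's edge is reached.
        # Continuing on ties is an assumption.
        while pos > f_lo and x_amp[pos] <= x_amp[pos - 1]:
            pos -= 1
    elif pos == s_hi:
        # Case b: walk toward higher frequencies in the same way.
        while pos < f_hi and x_amp[pos] <= x_amp[pos + 1]:
            pos += 1
    # Case c: an interior maximum is already the tone position.
    return pos
```

With `x_amp = [1, 5, 4, 3, 2, 6, 7, 2]` and the first segment `(0, 7)`, a second segment of `(2, 4)` starts at its lower edge and walks down to index 1, while a second segment of `(3, 5)` starts at its upper edge and walks up to index 6.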

Preferably, the process by which the source frequency segment replication start sequence number calculation module calculates the source frequency segment replication start sequence number of the zero-bit coding subband requiring spectral band replication comprises:
obtaining the sequence number of the start frequency point of the zero-bit coding subband whose frequency domain coefficients are currently to be reconstructed, denoted fillband_start_freq; the sequence number of the frequency point corresponding to the tone, denoted Tonal_pos; the spectral band replication period, denoted copy_period, whose value equals Tonal_pos plus 1; and the source frequency segment start sequence number, denoted copyband_offset; and repeatedly subtracting copy_period from the value of fillband_start_freq until the result lies within the range of sequence numbers of the source frequency segment, this result being the source frequency segment replication start sequence number, denoted copy_pos_mod.
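The calculation of copy_pos_mod amounts to stepping fillband_start_freq down in units of copy_period until it lands in the source frequency segment. A minimal sketch follows, assuming the source frequency segment covers the sequence numbers copyband_offset to Tonal_pos + copyband_offset inclusive and that fillband_start_freq is not below copyband_offset; the function name `copy_start_index` is hypothetical.

```python
def copy_start_index(fillband_start_freq, tonal_pos, copyband_offset):
    """Compute copy_pos_mod by repeated subtraction of copy_period."""
    if tonal_pos < 0 or copyband_offset < 0:
        raise ValueError("tonal_pos and copyband_offset must be >= 0")
    if fillband_start_freq < copyband_offset:
        # Assumption: the subband to fill never starts below the source segment.
        raise ValueError("fillband_start_freq lies below the source segment")

    copy_period = tonal_pos + 1              # replication period
    src_hi = tonal_pos + copyband_offset     # last index of the source segment

    copy_pos_mod = fillband_start_freq
    while copy_pos_mod > src_hi:
        copy_pos_mod -= copy_period          # step down one period at a time
    return copy_pos_mod
```

For example, with Tonal_pos = 4 and copyband_offset = 2 the source frequency segment spans indices 2 to 6, and a subband starting at index 12 is reduced via 7 to a start index of 2.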

Preferably, when the spectral band replication module performs the spectral band replication, frequency domain coefficients are replicated consecutively backward (toward higher frequencies), starting from the source frequency segment replication start sequence number, onto the zero-bit coding subband beginning at fillband_start_freq, until the source frequency point being replicated reaches the frequency point Tonal_pos + copyband_offset; replication then continues backward from the frequency point copyband_offset onto the zero-bit coding subband, and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is complete.
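The replication described above is a cyclic copy: the read position advances through the source frequency segment and wraps from Tonal_pos + copyband_offset back to copyband_offset. The sketch below illustrates this under stated assumptions; the function name `replicate_subband`, the explicit `fillband_len` parameter and the in-place update of a Python list are illustrative and not taken from the text.

```python
def replicate_subband(coeffs, fillband_start_freq, fillband_len,
                      tonal_pos, copyband_offset):
    """Fill one zero-bit coding subband by periodic replication (in place)."""
    copy_period = tonal_pos + 1
    src_lo = copyband_offset                 # first index of the source segment
    src_hi = tonal_pos + copyband_offset     # last index of the source segment

    # Replication start index (copy_pos_mod): step down by whole periods.
    src = fillband_start_freq
    while src > src_hi:
        src -= copy_period

    for dst in range(fillband_start_freq, fillband_start_freq + fillband_len):
        coeffs[dst] = coeffs[src]
        # Wrap to the start of the source segment after its last point.
        src = src_lo if src == src_hi else src + 1
    return coeffs
```

Calling it on `list(range(7)) + [0] * 11` with `fillband_start_freq=12`, `fillband_len=6`, `tonal_pos=4` and `copyband_offset=2` copies source indices 2, 3, 4, 5, 6 and then 2 again into positions 12 to 17.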

In order to solve the above technical problem, the present invention also provides a method for audio decoding, the method comprising:

A. performing decoding and inverse quantization on the amplitude envelope coded bits in a bitstream to be decoded, to obtain an amplitude envelope of each coding subband;
B. performing bit allocation for each coding subband and performing inverse quantization on the non-zero-bit coding subbands, to obtain the frequency domain coefficients of the non-zero-bit coding subbands;
C. searching for the position of a particular tone of the audio signal in the MDCT frequency domain coefficients; taking the bandwidth from the zero frequency point to the frequency point of the tone position as the spectral band replication period; taking, as the source frequency segment, the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points; performing spectral band replication on the zero-bit coding subbands; performing, according to the amplitude envelope of the current coding subband, energy adjustment on the frequency domain coefficients obtained by replication; and combining the result with noise filling to obtain the reconstructed frequency domain coefficients of the zero-bit coding subband, wherein the offset copyband_offset is greater than or equal to 0;
D. performing the inverse modified discrete cosine transform (IMDCT) on the frequency domain coefficients of the non-zero-bit coding subbands and on the reconstructed frequency domain coefficients of the zero-bit coding subbands, to obtain the final audio signal.
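As an illustration of how step C derives the replication period and the source frequency segment from the tone position, consider the sketch below. It assumes that frequency points are numbered from 0, so that the bandwidth from the zero frequency point to the tone position contains Tonal_pos + 1 points, and it represents the source frequency segment as an inclusive index pair; the function name `replication_layout` is hypothetical.

```python
def replication_layout(tonal_pos, copyband_offset):
    """Return (copy_period, (source_low, source_high)) for a tone position."""
    if tonal_pos < 0:
        raise ValueError("tonal_pos must be >= 0")
    if copyband_offset < 0:
        raise ValueError("copyband_offset must be >= 0")

    # Bandwidth from frequency point 0 up to and including the tone position.
    copy_period = tonal_pos + 1
    # Both ends of that band shifted toward higher frequencies by the offset.
    source_segment = (copyband_offset, tonal_pos + copyband_offset)
    return copy_period, source_segment
```

With a tone at index 4 and an offset of 2, the period is 5 and the source frequency segment spans indices 2 to 6, so the segment always contains exactly one period of coefficients.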

Preferably, in step C, the following method is used to search for the position of the particular tone:
taking the absolute values or the square values of the frequency domain coefficients of the first frequency segment and performing smoothing filtering on them; and
according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment, and taking the position of this maximum extreme value as the position of the particular tone.

Preferably, the smoothing filtering of the absolute values of the frequency domain coefficients of the first frequency segment is computed as follows:
$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1-\mu)\,\lvert X_i(k)\rvert$$
or the smoothing filtering of the square values of the frequency domain coefficients of the first frequency segment is computed as follows:
$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1-\mu)\, X_i(k)^2$$
where $\mu$ is the smoothing filter coefficient, $X\_amp_i(k)$ denotes the filtering output at the k-th frequency point of the i-th frame, $X_i(k)$ is the decoded MDCT coefficient at the k-th frequency point of the i-th frame, and $X\_amp_{i-1}(k) = 0$ when $i = 0$.
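The smoothing filter is a first-order recursion across frames, applied separately at each frequency point. The sketch below assumes that the recursion uses the previous frame's output at the same frequency point k in both the absolute-value and the square-value variant; the function name `smooth_amplitudes`, the default `mu=0.5` and the list-based interface are illustrative, since the text gives no value for μ.

```python
def smooth_amplitudes(x, prev_amp=None, mu=0.5, use_square=False):
    """One frame of smoothing: mu * previous output + (1 - mu) * magnitude.

    x        -- decoded MDCT coefficients of the current frame
    prev_amp -- filtering outputs of the previous frame (None for frame 0,
                which corresponds to a previous output of 0)
    mu       -- smoothing filter coefficient (0.5 is an arbitrary placeholder)
    """
    if prev_amp is None:
        prev_amp = [0.0] * len(x)
    if len(prev_amp) != len(x):
        raise ValueError("prev_amp and x must have the same length")

    out = []
    for xk, pk in zip(x, prev_amp):
        magnitude = xk * xk if use_square else abs(xk)
        out.append(mu * pk + (1.0 - mu) * magnitude)
    return out
```

The output of one frame is passed as `prev_amp` when processing the next frame, so that `smooth_amplitudes([2.0, -4.0])` gives `[1.0, 2.0]` for the first frame.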

Preferably, the following method is used to determine the maximum extreme value of the filtering outputs: searching directly for the maximum value among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and taking this maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.
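The direct search reduces to locating the largest filtering output within the first frequency segment. A minimal sketch, assuming the segment is given as an inclusive index pair and that the lowest index wins when several outputs are equal (the text does not address ties); the function name `find_tone_position` is hypothetical.

```python
def find_tone_position(x_amp, first_seg):
    """Return the index of the largest filtering output in the first segment."""
    lo, hi = first_seg
    if lo > hi:
        raise ValueError("empty frequency segment")
    # max() returns the first maximal index, so ties resolve to the lowest one.
    return max(range(lo, hi + 1), key=lambda k: x_amp[k])
```

The returned index serves as Tonal_pos in the subsequent period and source segment calculations.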

Preferably, the following method is used to determine the maximum extreme value of the filtering outputs:
taking a segment within the first frequency segment as a second frequency segment, searching for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and performing different processes depending on the position of the frequency domain coefficient corresponding to this initial maximum value:

Preferably, in step C, when spectral band replication is performed for a zero-bit coding subband, the source frequency segment replication start sequence number of this zero-bit coding subband is first calculated from the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication; the spectral band replication period is then taken as the period, and the frequency domain coefficients of the source frequency segment are replicated periodically onto the zero-bit coding subband, starting from the source frequency segment replication start sequence number.

Preferably, in step C, the method for calculating the source frequency segment replication start sequence number of this zero-bit coding subband is:
obtaining the sequence number of the frequency point of the starting MDCT frequency domain coefficient of the zero-bit coding subband whose frequency domain coefficients are to be reconstructed, denoted fillband_start_freq; the sequence number of the frequency point corresponding to the tone, denoted Tonal_pos; the spectral band replication period, denoted copy_period, whose value equals Tonal_pos plus 1; and the spectral band replication offset, denoted copyband_offset; and repeatedly subtracting copy_period from the value of fillband_start_freq until the result lies within the range of sequence numbers of the source frequency segment, this result then being the source frequency segment replication start sequence number, denoted copy_pos_mod.
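Under the assumptions that the source frequency segment covers the sequence numbers copyband_offset to Tonal_pos + copyband_offset inclusive and that fillband_start_freq is not smaller than copyband_offset, the repeated subtraction can equivalently be written in closed form:

$$\text{copy\_pos\_mod} = \text{copyband\_offset} + \bigl((\text{fillband\_start\_freq} - \text{copyband\_offset}) \bmod \text{copy\_period}\bigr)$$

For example, with Tonal_pos = 4 (so copy_period = 5), copyband_offset = 2 and fillband_start_freq = 12, this gives 2 + (10 mod 5) = 2, which matches the repeated subtraction 12, 7, 2.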

Preferably, in step C, the method for taking the spectral band replication period as the period and periodically replicating the frequency domain coefficients of the source frequency segment onto the zero-bit coding subband, starting from the source frequency segment replication start sequence number, is:
replicating the frequency domain coefficients consecutively backward (toward higher frequencies), starting from the source frequency segment replication start sequence number, onto the zero-bit coding subband beginning at fillband_start_freq, until the source frequency point being replicated reaches the frequency point Tonal_pos + copyband_offset; then continuing to replicate the frequency domain coefficients backward from the frequency point copyband_offset onto the zero-bit coding subband, and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is complete.

Preferably, the above spectral band replication method is combined with noise filling to perform spectrum reconstruction for all zero-bit coding subbands; or random noise filling is used to perform spectrum reconstruction for zero-bit coding subbands below a certain frequency point, and frequency domain coefficient replication combined with noise filling is used to perform spectrum reconstruction for zero-bit coding subbands above that frequency point.

In order to solve the above technical problem, the present invention also provides a system for audio decoding, the system comprising: a bitstream demultiplexer (DeMUX), an amplitude envelope decoding unit, a bit allocation unit, a frequency domain coefficient decoding unit, a spectral band replication unit, a noise filling unit and an inverse modified discrete cosine transform (IMDCT) unit, wherein:
the DeMUX is configured to separate amplitude envelope coded bits, frequency domain coefficient coded bits and noise level coded bits from a bitstream to be decoded;
the amplitude envelope decoding unit, which is connected to the DeMUX, is configured to perform decoding and inverse quantization on the amplitude envelope coded bits output by the DeMUX, to obtain an amplitude envelope of each coding subband;
the bit allocation unit, which is connected to the amplitude envelope decoding unit, is configured to perform bit allocation, to obtain the number of coded bits allocated to each frequency domain coefficient of each coding subband;
the frequency domain coefficient decoding unit, which is connected to the amplitude envelope decoding unit and the bit allocation unit, is configured to perform decoding, inverse quantization and inverse normalization for the coding subbands, to obtain frequency domain coefficients;
the spectral band replication unit, which is connected to the DeMUX, the frequency domain coefficient decoding unit, the amplitude envelope decoding unit and the bit allocation unit, is configured to search for the position of a particular tone of the audio signal in the MDCT frequency domain coefficients; to take the bandwidth from the zero frequency point to the frequency point of the tone position as the spectral band replication period; to take, as the source frequency segment, the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points; and to perform spectral band replication on the zero-bit coding subbands, wherein the offset copyband_offset is greater than or equal to 0; and is further configured to perform, according to the amplitude envelope of the current coding subband, energy adjustment on the frequency domain coefficients obtained by replication;
the noise filling unit, which is connected to the amplitude envelope decoding unit, the bit allocation unit and the spectral band replication unit, is configured to perform noise filling for the current zero-bit coding subband according to its amplitude envelope, to obtain the reconstructed frequency domain coefficients of the zero-bit coding subband;
the IMDCT unit, which is connected to the noise filling unit, is configured to perform the IMDCT on the frequency domain coefficients after noise filling, to obtain an audio signal.

Preferably, the spectral band replication unit comprises a tone position search module, a period and source frequency segment calculation module, a source frequency segment replication start sequence number calculation module and a spectral band replication module, which are connected in series, wherein:
the tone position search module is configured to search for the position of a particular tone of the audio signal in the MDCT frequency domain coefficients;
the period and source frequency segment calculation module is configured to determine, according to the tone position, a spectral band replication period and a source frequency segment, the spectral band replication period being the bandwidth from the zero frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points;
the source frequency segment replication start sequence number calculation module is configured to calculate the source frequency segment replication start sequence number of a zero-bit coding subband from the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication;
the spectral band replication module is configured to take the spectral band replication period as the period and to replicate the frequency domain coefficients of the source frequency segment periodically onto the zero-bit coding subband, starting from the source frequency segment replication start sequence number.

Preferably, the tone position search module uses the following method to search for the tone position: taking the absolute values or the square values of the MDCT frequency domain coefficients of the first frequency segment and performing smoothing filtering on them; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment and taking the position of this maximum extreme value as the position of the tone.

Preferably, the tone position search module computes the smoothing filtering of the absolute values of the frequency domain coefficients of the first frequency segment as:
$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1-\mu)\,\lvert X_i(k)\rvert$$
or computes the smoothing filtering of the square values of the frequency domain coefficients of the first frequency segment as:
$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1-\mu)\, X_i(k)^2$$
where $\mu$ is the smoothing filter coefficient, $X\_amp_i(k)$ denotes the filtering output at the k-th frequency point of the i-th frame, $X_i(k)$ is the decoded MDCT coefficient at the k-th frequency point of the i-th frame, and $X\_amp_{i-1}(k) = 0$ when $i = 0$.

Preferably, the tone position search module searches directly for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and takes this maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.

Preferably, when the tone position search module determines the maximum extreme value of the filtering outputs, a segment within the first frequency segment is taken as a second frequency segment, an initial maximum value is searched for among the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and different processing is performed depending on the position of the frequency domain coefficient corresponding to this initial maximum value:

a. if this initial maximum value is the filtering output of the lowest-frequency coefficient of the second frequency segment, that filtering output is compared with the filtering output of the preceding, next-lower-frequency coefficient in the first frequency segment, and the comparison continues forward (toward lower frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the preceding coefficient, in which case the filtering output of the current coefficient is the finally determined maximum extreme value; or until the filtering output of the lowest-frequency coefficient of the first frequency segment is found to be greater than that of the following coefficient, in which case the filtering output of the lowest-frequency coefficient of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the highest-frequency coefficient of the second frequency segment, that filtering output is compared with the filtering output of the following, next-higher-frequency coefficient in the first frequency segment, and the comparison continues backward (toward higher frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the following coefficient, in which case the filtering output of the current coefficient is the finally determined maximum extreme value; or until the filtering output of the highest-frequency coefficient of the first frequency segment is found to be greater than that of the preceding coefficient, in which case the filtering output of the highest-frequency coefficient of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient lying between the lowest and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.

Preferably, the process by which the source frequency segment replication start sequence number calculation module calculates the source frequency segment replication start sequence number of a zero-bit coding subband that requires spectral band replication comprises:
obtaining the sequence number of the start frequency point of the zero-bit coding subband whose frequency domain coefficients are currently to be reconstructed, denoted fillband_start_freq; the sequence number of the frequency point corresponding to the tone, denoted Tonal_pos; the spectral band replication period, denoted copy_period, whose value equals Tonal_pos plus 1; and the source frequency segment start sequence number, denoted copyband_offset; and repeatedly subtracting copy_period from the value of fillband_start_freq until the result lies within the range of sequence numbers of the source frequency segment, this result being the source frequency segment replication start sequence number, denoted copy_pos_mod.
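The repeated subtraction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `copy_start_index` is hypothetical, and the code assumes that the source frequency segment covers the sequence numbers copyband_offset to Tonal_pos + copyband_offset inclusive and that the zero-bit subband does not start below that segment.

```python
def copy_start_index(fillband_start_freq: int, tonal_pos: int, copyband_offset: int) -> int:
    """Return copy_pos_mod, the source index at which replication starts (sketch)."""
    if tonal_pos < 0 or copyband_offset < 0:
        raise ValueError("tonal_pos and copyband_offset must be non-negative")
    copy_period = tonal_pos + 1              # replication period, as defined in the text
    seg_lo = copyband_offset                 # assumed first index of the source segment
    seg_hi = copyband_offset + tonal_pos     # assumed last index of the source segment
    if fillband_start_freq < seg_lo:
        raise ValueError("fill band starts below the source segment (outside this sketch)")
    copy_pos_mod = fillband_start_freq
    # Subtract one period at a time until the index falls inside [seg_lo, seg_hi].
    while copy_pos_mod > seg_hi:
        copy_pos_mod -= copy_period
    return copy_pos_mod
```

With the hypothetical values Tonal_pos = 39 and copyband_offset = 10, the period is 40 and the source segment is 10 to 49, so a subband starting at 100 is reduced 100, 60, 20 and replication starts at source index 20.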

Preferably, when the spectral band replication module performs the spectral band replication, the frequency domain coefficients starting at the source frequency segment replication start sequence number are copied successively backward (toward higher frequencies) into the zero-bit coding subband, starting at fillband_start_freq, until the source frequency point reaches the frequency point Tonal_pos + copyband_offset; copying then continues from the frequency point copyband_offset, again successively backward, into the zero-bit coding subband, and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is completed.
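The wrap-around copy can be sketched as below. The function name `replicate_band` and the inclusive end index `fillband_end_freq` are hypothetical names introduced for illustration; the sketch assumes the zero-bit subband lies entirely above the source segment and omits the energy level adjustment and noise filling that the text combines with replication.

```python
from typing import List


def replicate_band(coeffs: List[float], fillband_start_freq: int, fillband_end_freq: int,
                   copy_pos_mod: int, tonal_pos: int, copyband_offset: int) -> None:
    """Fill coeffs[fillband_start_freq..fillband_end_freq] in place by periodic copying."""
    seg_lo = copyband_offset                 # first source index
    seg_hi = copyband_offset + tonal_pos     # last source index
    if not seg_lo <= copy_pos_mod <= seg_hi:
        raise ValueError("copy_pos_mod must lie inside the source segment")
    if fillband_start_freq <= seg_hi:
        raise ValueError("this sketch assumes the fill band lies above the source segment")
    src = copy_pos_mod
    for dst in range(fillband_start_freq, fillband_end_freq + 1):
        coeffs[dst] = coeffs[src]
        # After the last source point, wrap around to the start of the source segment.
        src = seg_lo if src == seg_hi else src + 1
```

For example, with Tonal_pos = 3 and copyband_offset = 2 the source segment is indices 2 to 5; filling indices 8 to 11 from copy_pos_mod = 4 copies source indices 4, 5, 2, 3 in that order.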

Preferably, the frequency domain coefficient replication applied by the spectral band replication unit, combined with the noise filling applied by the noise filling unit, is used to perform spectrum reconstruction for all zero-bit coding subbands; or the noise filling unit performs spectrum reconstruction for zero-bit coding subbands below a certain frequency point using random noise filling, and the frequency domain coefficient replication applied by the spectral band replication unit is combined with the noise filling by the noise filling unit to perform spectrum reconstruction for zero-bit coding subbands above that frequency point.

The present invention searches for the position of a particular tone of an audio signal in the MDCT frequency domain coefficients decoded at the decoding end of an audio coding and decoding system, determines a frequency domain replication period from this tone position, then performs spectral band replication according to this period and combines it with energy level adjustment and noise filling to reconstruct the frequency domain coefficients of uncoded coding subbands, the energy level of the noise filling and of the spectral band replication being controlled by the spectral envelope values of the uncoded coding subbands. This method can restore the spectral envelope of an uncoded coding subband and its internal tone information well, and achieves a better subjective listening impression.

Brief description of the drawings

Fig. 1 is a schematic diagram of the spectral band replication method according to the invention;

Fig. 2 is a schematic diagram of the audio decoding method according to the invention;

Fig. 3 is a schematic structural diagram of the modules of the spectral band replication device according to the invention;

Fig. 4 is a schematic structural diagram of the audio decoding system according to the invention.

Preferred embodiments of the present invention

The core idea of the present invention is to search for the position of a particular tone of an audio signal in the MDCT frequency domain coefficients decoded at the decoding end of an audio coding and decoding system, to determine a frequency domain replication period from this tone position, then to perform spectral band replication according to this period and to combine it with energy level adjustment and noise filling in order to reconstruct the frequency domain coefficients of uncoded coding subbands, the energy level of the noise filling and of the spectral band replication being controlled by the spectral envelope values of the uncoded coding subbands. This method can restore the spectral envelope of an uncoded coding subband and its internal tone information well, and achieves a better subjective listening impression.

All frequency domain coefficients mentioned in the present invention are MDCT frequency domain coefficients.

As shown in Fig. 1, the spectral band replication method according to the invention comprises:
101: the position of a particular tone of an audio signal is searched for in the MDCT frequency domain coefficients;
the preferred method of searching for the tone position according to the invention is to perform smoothing filtering on the MDCT frequency domain coefficients, and the method comprises:

a1. the absolute values or squared values of the MDCT frequency domain coefficients in a certain low-frequency segment are taken, and smoothing filtering is performed; this segment may be a low-frequency segment in which, according to the statistical properties of the spectrum, the energy is more concentrated, and it is referred to as the first frequency segment. Low frequency here refers to the frequency components below half of the total bandwidth of the signal.

The MDCT frequency domain coefficients here are those decoded at the decoding end of the audio coding and decoding system. They are ordered from low frequency to high frequency; the sequence number of the first frequency point is 0, and the sequence numbers of the following frequency points increase successively by 1.

The formula for smoothing filtering on the absolute values of the frequency domain coefficients of the first frequency segment is:

$$X\_amp_{i}(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu)\,|X_{i}(k)|$$

or the formula for smoothing filtering on the squared values of the frequency domain coefficients of the first frequency segment is:

$$X\_amp_{i}(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu)\,X_{i}(k)^{2}$$

where μ is the smoothing filter coefficient, whose value range is (0, 1) and which may be 0.125; X_amp_i(k) denotes the filtering output at the k-th frequency point of the i-th frame; X_i(k) denotes the decoded MDCT coefficient at the k-th frequency point of the i-th frame; and X_amp_{i-1}(k) = 0 when i = 0.

a2. based on the result of the smoothing filtering, the position of the maximum extreme value of the filtering outputs is searched for, and the position of this maximum extreme value is taken as the tone position;
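Step a1 amounts to a first-order recursive average over frames, applied independently at each frequency point. A minimal sketch follows; the function name `smooth_frame` and the parameters `prev` and `use_square` are hypothetical, and the default μ = 0.125 is the example value mentioned in the text.

```python
from typing import List, Optional, Sequence


def smooth_frame(x: Sequence[float], prev: Optional[Sequence[float]] = None,
                 mu: float = 0.125, use_square: bool = False) -> List[float]:
    """Return X_amp_i(k) for one frame, given X_i(k) and X_amp_{i-1}(k)."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie in the open interval (0, 1)")
    if prev is None:
        # First frame (i = 0): the previous filtering output is defined as zero.
        prev = [0.0] * len(x)
    if len(prev) != len(x):
        raise ValueError("prev and x must cover the same frequency points")
    out = []
    for p, v in zip(prev, x):
        magnitude = v * v if use_square else abs(v)
        out.append(mu * p + (1.0 - mu) * magnitude)
    return out
```

The caller keeps the returned list and passes it as `prev` when the next frame is decoded, so that only the coefficients of the first frequency segment need to be stored between frames.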

The tone of the audio signal in the present invention is the fundamental (pitch) of the audio signal or a particular harmonic of the fundamental.

There are the following two methods of searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment:

(1) an initial maximum value is searched for directly among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment; this maximum value is taken as the maximum extreme value of the filtering outputs of the first frequency segment, and the sequence number of the corresponding frequency point is taken as the position of the maximum extreme value (i.e., of the tone);
(2) during the search for the maximum extreme value, a segment within the first frequency segment is taken as the second frequency segment, and an initial maximum value is searched for among the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment; this initial maximum value is taken as the maximum extreme value of the filtering outputs of the first frequency segment, and the sequence number of the corresponding frequency point is taken as the position of the maximum extreme value (i.e., of the tone).

The start point of the second frequency segment is higher than the start point of the first frequency segment, and the end point of the second frequency segment is lower than the end point of the first frequency segment; preferably, the number of frequency domain coefficients in each of the first and second frequency segments is not less than 8.

To guard against the case in which the frequency domain coefficient corresponding to the initial maximum value found is not the tone position of the audio signal, the search for the tone position first finds the initial maximum value among the filtering outputs of the second frequency segment and then performs different processing depending on the position of the frequency domain coefficient corresponding to this initial maximum value:

(a) if this initial maximum value is the filtering output of the lowest-frequency coefficient of the second frequency segment, that filtering output is compared with the filtering output of the preceding, next-lower-frequency coefficient in the first frequency segment, and the comparison continues forward (toward lower frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the preceding coefficient; the current coefficient is then the tone position, that is, its filtering output is the finally determined maximum extreme value. Alternatively, the comparison continues until the filtering output of the lowest-frequency coefficient of the first frequency segment is found to be greater than that of the following coefficient; the lowest-frequency coefficient of the first frequency segment is then the tone position, that is, its filtering output is the finally determined maximum extreme value;
(b) if this initial maximum value is the filtering output of the highest-frequency coefficient of the second frequency segment, that filtering output is compared with the filtering output of the following, next-higher-frequency coefficient in the first frequency segment, and the comparison continues backward (toward higher frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the following coefficient; the current coefficient is then the tone position, that is, its filtering output is the finally determined maximum extreme value. Alternatively, the comparison continues until the filtering output of the highest-frequency coefficient of the first frequency segment is found to be greater than that of the preceding coefficient; the highest-frequency coefficient of the first frequency segment is then the tone position, that is, its filtering output is the finally determined maximum extreme value;
(c) if this initial maximum value is the filtering output of a frequency domain coefficient lying between the lowest and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.

The method of determining the tone position is described below, taking as an example the case in which the frequency domain coefficients of the first frequency segment are the 24th to 64th MDCT frequency domain coefficients and those of the second frequency segment are the 33rd to 56th MDCT frequency domain coefficients:
The maximum value is searched for among the filtering outputs of the 33rd to 56th MDCT frequency domain coefficients. If the maximum value corresponds to the 33rd frequency domain coefficient, it is checked whether the filtering output of the 32nd frequency domain coefficient is greater than that of the 33rd; if so, the comparison continues forward, and it is checked whether the filtering output of the 31st frequency domain coefficient is greater than that of the 32nd. The comparison continues forward in this way until the filtering output of the current frequency domain coefficient is greater than that of the preceding one, or until the filtering output of the 24th frequency domain coefficient is found to be greater than that of the 25th; the current frequency domain coefficient or the 24th frequency domain coefficient, respectively, is then the tone position.

If the maximum value corresponds to the 56th frequency domain coefficient, a corresponding procedure is used to search successively backward until the filtering output of the current frequency domain coefficient is greater than that of the following one, and the current frequency domain coefficient is then the tone position; or until the filtering output of the 64th frequency domain coefficient is found to be greater than that of the 63rd, in which case the 64th frequency domain coefficient is the tone position.

If the maximum lies between the 33rd and the 56th frequency domain coefficient, the frequency domain coefficient corresponding to this maximum is the tone position.
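The three cases above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_tonal_pos` is hypothetical, the sequence numbers in the text (24 to 64 for the first frequency segment, 33 to 56 for the inner search range) are used directly as list indices, and ties between neighbouring filtering outputs stop the walk.

```python
def find_tonal_pos(amp, first=(24, 64), second=(33, 56)):
    """Return the tone position from smoothed filtering outputs.

    amp    -- filtering outputs, indexed by frequency point sequence number
    first  -- (lowest, highest) index of the first frequency segment (assumed)
    second -- (lowest, highest) index of the inner search range (assumed)
    """
    lo1, hi1 = first
    lo2, hi2 = second
    # Initial maximum within the inner search range.
    pos = max(range(lo2, hi2 + 1), key=lambda k: amp[k])
    if pos == lo2:
        # Maximum on the lower edge: compare forward (toward lower
        # frequencies) while the lower neighbour is larger.
        while pos > lo1 and amp[pos - 1] > amp[pos]:
            pos -= 1
    elif pos == hi2:
        # Maximum on the upper edge: compare backward (toward higher
        # frequencies) while the higher neighbour is larger.
        while pos < hi1 and amp[pos + 1] > amp[pos]:
            pos += 1
    return pos
```

When the initial maximum lies strictly inside the inner range, the function returns it unchanged; the edge walks only apply when the maximum sits on the 33rd or 56th coefficient.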

The value of this position is denoted Tonal_pos, that is, the sequence number of the frequency point corresponding to the maximum extreme value.
102: a spectral band replication period is determined according to the tone position; this spectral band replication period is the bandwidth from the zero frequency point to the frequency point of the tone position;

The spectral band replication period is denoted copy_period, and copy_period equals Tonal_pos plus 1.
103: the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points is taken as the source frequency segment, and spectral band replication is performed for the zero-bit coding subbands.

The zero-bit coding subband referred to in the present invention is a coding subband to which 0 bits are allocated; it is also called an uncoded coding subband.

That is, the start sequence number of the frequency points of the source frequency segment is copyband_offset, and the end sequence number is copyband_offset + Tonal_pos.

In the present invention, the value of the spectral band replication offset (denoted copyband_offset) is preset, with copyband_offset ≥ 0. If the preset copyband_offset = 0, the source frequency segment is the frequency segment from the zero frequency point to the frequency point of the tone position. To reduce spectral jumps in the spectral band replication, copyband_offset is set greater than zero; the source frequency segment then consists of the MDCT frequency domain coefficients from the zero frequency point shifted backward by a small range of frequency points to the frequency point of the position of the maximum extreme value shifted backward by the same small range of frequency points. The spectral filling of the zero-bit coding subbands above a certain frequency point is replicated entirely from the source frequency segment;
during spectral band replication, a source frequency segment replication start sequence number is first calculated for the zero-bit coding subband, according to the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication; then, taking the spectral band replication period as the period, the frequency domain coefficients of the source frequency segment are replicated periodically onto the zero-bit coding subband, starting from the source frequency segment replication start sequence number.

One method of determining the source frequency segment replication start sequence number is as follows:
First, starting with the first zero-bit coding subband requiring replication, the sequence number of the frequency point of the starting MDCT frequency domain coefficient of the zero-bit coding subband whose frequency domain coefficients are to be reconstructed is obtained and denoted fillband_start_freq. The sequence number of the frequency point corresponding to the tone is denoted Tonal_pos, and the replication period copy_period is obtained as Tonal_pos plus 1. The spectral band replication offset is denoted copyband_offset. copy_period is repeatedly subtracted from the value of fillband_start_freq until the value lies within the range of sequence numbers of the source frequency segment; this value is the source frequency segment replication start sequence number, denoted copy_pos_mod.

The source frequency segment replication start sequence number copy_pos_mod can be obtained by the following pseudocode algorithm:

Upon completion of the operation, copy_pos_mod is the source frequency segment replication start sequence number.
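The repeated subtraction described in the text can be illustrated with a short Python sketch. It is not the patent's own listing; the function name `source_start_index` is hypothetical, and the sketch assumes that fillband_start_freq is not smaller than copyband_offset, so that the result always falls inside the source frequency segment [copyband_offset, copyband_offset + Tonal_pos].

```python
def source_start_index(fillband_start_freq, tonal_pos, copyband_offset):
    """Return copy_pos_mod for one zero-bit coding subband (illustrative)."""
    copy_period = tonal_pos + 1          # replication period from the text
    copy_pos_mod = fillband_start_freq
    # Subtract the period until the index no longer exceeds the end
    # sequence number of the source frequency segment.
    while copy_pos_mod > tonal_pos + copyband_offset:
        copy_pos_mod -= copy_period
    return copy_pos_mod
```

Because the source frequency segment is exactly copy_period points long, the loop gives the same result as `copyband_offset + (fillband_start_freq - copyband_offset) % copy_period` under the stated assumption.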

During replication, the frequency domain coefficients are replicated consecutively backward (toward higher frequencies), starting from the source frequency segment replication start sequence number, onto the zero-bit coding subband, with fillband_start_freq as the start position. When the source frequency point being replicated reaches the frequency point Tonal_pos + copyband_offset, replication continues backward from the frequency point copyband_offset, and so on by analogy, until spectral band replication of all frequency domain coefficients in the current zero-bit coding subband is complete.

For example, if the spectral band replication offset copyband_offset is set to 10, the frequency band beginning at copy_pos_mod is replicated onto the zero-bit coding subband starting at fillband_start_freq, in order from low frequency to high frequency; after the frequency point Tonal_pos + 10, replication restarts from the 10th frequency domain coefficient, and so on by analogy. All signals of this zero-bit coding subband are thus replicated from the 10th to the (Tonal_pos + 10)th frequency domain coefficients, and the frequency domain coefficients at frequency points 10 to Tonal_pos + 10 form the source frequency segment of the spectral band replication.
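The periodic copy with wrap-around can be sketched as below. The function name `replicate_subband` and the parameter `fillband_end_freq` (last frequency point of the subband, inclusive) are hypothetical; the sketch also assumes that the subband to be filled lies entirely above the source frequency segment.

```python
def replicate_subband(coeffs, fillband_start_freq, fillband_end_freq,
                      tonal_pos, copyband_offset):
    """Fill one zero-bit coding subband by periodic replication (sketch).

    coeffs is a list of MDCT coefficients indexed by frequency point;
    a new list is returned and the input is left unchanged.
    """
    copy_period = tonal_pos + 1
    src_end = copyband_offset + tonal_pos      # last source index
    # Replication start index inside the source frequency segment.
    src = copyband_offset + (fillband_start_freq - copyband_offset) % copy_period
    out = list(coeffs)
    for dst in range(fillband_start_freq, fillband_end_freq + 1):
        out[dst] = coeffs[src]
        src += 1
        if src > src_end:                      # passed Tonal_pos + offset:
            src = copyband_offset              # restart at the offset point
    return out
```

With copyband_offset = 10 and Tonal_pos = 4, for instance, the source frequency segment is points 10 to 14, and a subband starting at point 20 receives the coefficients of points 10, 11, 12, 13, 14, 10, 11, and so on.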

The spectral band replication method of the present invention can be used to replicate the spectrum for all zero-bit coding subbands. Alternatively, spectrum reconstruction can be performed by applying random noise filling to zero-bit coding subbands below a certain frequency point, and applying frequency domain coefficient replication combined with noise filling to zero-bit coding subbands above that frequency point.

Fig. 2 is a schematic diagram of the audio decoding method according to an example of the present invention. As shown in Fig. 2, the method includes:
201: for all amplitude envelope coded bits in the bitstream to be decoded, decoding and inverse quantization are performed to obtain the amplitude envelope of each coding subband;
the coded bits of one frame are extracted from the coded bitstream transmitted from the coding end (that is, by the bitstream demultiplexer DeMUX); after the coded bits have been extracted, the amplitude envelope coded bits in this frame are decoded to obtain the quantization index of the amplitude envelope of each coding subband, Th_q(j), j = 0, ..., L − 1. Inverse quantization is performed on the quantization index of the amplitude envelope to obtain the amplitude envelope rms(r), r = 0, ..., L − 1.
202: bit allocation is performed for each coding subband;
an initial value of the significance of each coding subband is calculated according to the quantization index of the amplitude envelope of that coding subband, and bit allocation is performed for each coding subband using its significance, to obtain the number of bits allocated to each coding subband; the bit allocation method at the decoding end is identical to that at the coding end. During bit allocation, the bit allocation step size and the step size by which the coding subband significance is reduced after bit allocation are variable.
203: according to the number of bits allocated to each coding subband, inverse quantization and decoding are performed on each non-zero-bit coding subband to obtain the MDCT frequency domain coefficients of the non-zero-bit coding subbands;
204: the position of a certain tone of the audio signal is searched for in the MDCT frequency domain coefficients; the bandwidth from the zero frequency point to the frequency point of the tone position is taken as the spectral band replication period; the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points is taken as the source frequency segment; and spectral band replication is performed on the zero-bit coding subbands. The detailed process of this step is explained in the spectral band replication method above and is not repeated here.
205: according to the amplitude envelope of the current coding subband, energy adjustment is performed on the frequency domain coefficients obtained by replication, and, in combination with noise filling, the reconstructed frequency domain coefficients of the zero-bit coding subbands are obtained;
according to the noise level coded bits transmitted from the coding end, energy adjustment is performed on the frequency domain coefficients obtained by replication in each zero-bit coding subband:
the amplitude envelope of the frequency domain coefficients obtained by replication for zero-bit coding subband r is calculated and denoted sbr_rms(r).

The formula for performing the energy adjustment on the frequency domain coefficients is: $\overline{X\_sbr}(r) = X\_sbr(r) \cdot sbr\_lev\_scale(r) \cdot rms(r) / sbr\_rms(r)$

where $\overline{X\_sbr}(r)$ denotes the frequency domain coefficients of zero-bit coding subband r after energy adjustment, $X\_sbr(r)$ denotes the frequency domain coefficients obtained by replication for zero-bit coding subband r, sbr_rms(r) is the amplitude envelope (that is, the root mean square) of the replicated frequency domain coefficients $X\_sbr(r)$ of zero-bit coding subband r, rms(r) is the amplitude envelope of the frequency domain coefficients of zero-bit coding subband r before coding, and sbr_lev_scale(r) is the energy gain scaling factor of the spectral band replication for zero-bit coding subband r, with a value range of (0, 2). Depending on the practical auditory perception, the subbands may use the same or different coefficient values.

After the energy adjustment of the replicated frequency domain coefficients is complete, white noise is added to the energy-adjusted frequency domain coefficients to generate the final reconstructed frequency domain coefficients X: $X(r) = \overline{X\_sbr}(r) + rms(r) \cdot noise\_lev\_scale(r) \cdot random()$

where $X(r)$ denotes the reconstructed frequency domain coefficients of zero-bit coding subband r, $\overline{X\_sbr}(r)$ denotes the frequency domain coefficients of zero-bit coding subband r after energy adjustment, rms(r) is the amplitude envelope of the frequency domain coefficients of zero-bit coding subband r before coding, random() is the random phase value produced by a random phase generator that returns +1 or −1 at random, and noise_lev_scale(r) is the noise level scaling factor of zero-bit coding subband r, with a value range of (0, 2). Depending on the practical auditory perception, the subbands may use the same or different coefficient values.
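The two formulas can be combined in a small Python sketch for a single zero-bit coding subband. The function name `reconstruct_subband`, the optional `rng` argument, and the guard for an all-zero replicated band are assumptions made for illustration; sbr_rms is computed here as the root mean square of the replicated coefficients, as the text defines it.

```python
import math
import random


def reconstruct_subband(x_sbr, rms, sbr_lev_scale, noise_lev_scale, rng=None):
    """Energy-adjust replicated coefficients and add +/-1-phase noise (sketch)."""
    rng = rng or random.Random()
    # Amplitude envelope (root mean square) of the replicated coefficients.
    sbr_rms = math.sqrt(sum(v * v for v in x_sbr) / len(x_sbr))
    # Assumption: if the replicated band is all zeros, use a gain of 0
    # to avoid dividing by zero.
    gain = sbr_lev_scale * rms / sbr_rms if sbr_rms > 0.0 else 0.0
    adjusted = [v * gain for v in x_sbr]
    # random() in the text returns +1 or -1; rng.choice plays that role here.
    return [a + rms * noise_lev_scale * rng.choice((1.0, -1.0)) for a in adjusted]
```

With noise_lev_scale set to 0 and sbr_lev_scale set to 1, the output band has the same root mean square as the transmitted envelope rms(r), which is the purpose of the energy adjustment.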

For zero-bit coding subbands whose highest frequency is lower than the frequency of the tone found by the search, the frequency domain coefficients are reconstructed using the noise filling method.

The spectral band replication method of the present invention can be used to perform spectrum reconstruction for all zero-bit coding subbands; alternatively, random noise filling can be used to perform spectrum reconstruction for zero-bit coding subbands below a certain frequency point, and frequency domain coefficient replication combined with noise filling can be used for zero-bit coding subbands above that frequency point.
206: the inverse modified discrete cosine transform (IMDCT) is performed on the frequency domain coefficients of the non-zero-bit coding subbands and on the reconstructed frequency domain coefficients of the zero-bit coding subbands to obtain the final audio signal.

To implement the above spectral band replication method, the present invention also provides a spectral band replication device, as shown in Fig. 3. The device comprises a tone position search module, a period and source frequency segment calculation module, a source frequency segment replication start sequence number calculation module, and a spectral band replication module, connected in series, wherein:
the tone position search module is configured to search the MDCT frequency domain coefficients for the position of a certain tone of an audio signal, which specifically comprises: taking the absolute values or squared values of the MDCT frequency domain coefficients of the first frequency segment and performing smoothing filtering; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment, the position of this maximum extreme value being the tone position;
the period and source frequency segment calculation module is configured to determine a spectral band replication period and a source frequency segment according to the tone position; the spectral band replication period is the bandwidth from the zero frequency point to the frequency point of the tone position, and the source frequency segment is the frequency segment running from the zero frequency point shifted backward by copyband_offset frequency points to the frequency point of the tone position shifted backward by copyband_offset frequency points;
if the sequence number of the frequency point of the tone position is denoted Tonal_pos and the preset spectral band replication offset is denoted copyband_offset, then copyband_offset is the start sequence number of the frequency domain coefficients of the source frequency segment, and the end sequence number is copyband_offset + Tonal_pos.

The source frequency segment replication start sequence number calculation module is configured to calculate, according to the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication, the source frequency segment replication start sequence number of that zero-bit coding subband.

The spectral band replication module is configured to take the spectral band replication period as the period and, starting from the source frequency segment replication start sequence number, to replicate the frequency domain coefficients of the source frequency segment periodically onto the zero-bit coding subband.

Preferably, the tone position search module takes the absolute values of the MDCT frequency domain coefficients of the first frequency segment and performs the smoothing filtering according to: $X\_amp_i(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu)\,|X_i(k)|$

Alternatively, it takes the squared values of the frequency domain coefficients of the first frequency segment and performs the smoothing filtering according to: $X\_amp_i(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu)\,X_i(k)^2$

where $\mu$ is the smoothing filter coefficient, $X\_amp_i(k)$ denotes the filtering output at the k-th frequency point of the i-th frame, $X_i(k)$ is the decoded MDCT coefficient at the k-th frequency point of the i-th frame, and $X\_amp_{i-1}(k) = 0$ when i = 0.
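A minimal Python sketch of this frame-to-frame smoothing is given below. The function name `smooth_frame` and the flag `use_square` are hypothetical, and the sketch assumes that the recursion is applied independently at each frequency point k in both the absolute-value and the squared-value variant.

```python
def smooth_frame(prev_amp, coeffs, mu, use_square=False):
    """Return the filtering outputs X_amp_i(k) for one frame (sketch).

    prev_amp   -- outputs of the previous frame, or None for the first frame
    coeffs     -- decoded MDCT coefficients X_i(k) of the current frame
    mu         -- smoothing filter coefficient
    use_square -- use X_i(k)**2 instead of |X_i(k)|
    """
    if prev_amp is None:
        # For i = 0 the previous outputs are defined as zero.
        prev_amp = [0.0] * len(coeffs)
    return [
        mu * p + (1.0 - mu) * (c * c if use_square else abs(c))
        for p, c in zip(prev_amp, coeffs)
    ]
```

Calling the function once per decoded frame and passing the previous result back in as `prev_amp` yields the running filtering outputs in which the tone position is then searched.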

Preferably, the first frequency segment is a low-frequency segment in which, according to the statistical properties of the spectrum, the energy is more concentrated; low frequencies here means the spectral components below half of the total bandwidth of the signal.

Preferably, the tone position search module searches directly for an initial maximum value among the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and this maximum value is taken as the maximum extreme value of the filtering outputs of the first frequency segment.

Preferably, when the tone position search module determines the maximum extreme value of the filtering outputs, a segment within the first frequency segment is taken as a second frequency segment, an initial maximum value is searched for in the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and different processes are performed depending on the position of the frequency domain coefficient corresponding to this initial maximum value:

a. if this initial maximum value is the filtering output of the frequency domain coefficient of the lowest frequency of the second frequency segment, that filtering output is compared with the filtering output of the frequency domain coefficient of the preceding lower frequency in the first frequency segment, and the comparison continues forward (toward lower frequencies) until the filtering output of the current frequency domain coefficient is greater than the filtering output of the preceding frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the frequency domain coefficient of the lowest frequency of the first frequency segment is greater than the filtering output of the following frequency domain coefficient, in which case the filtering output of the frequency domain coefficient of the lowest frequency of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the frequency domain coefficient of the highest frequency of the second frequency segment, that filtering output is compared with the filtering output of the frequency domain coefficient of the following higher frequency in the first frequency segment, and the comparison continues backward (toward higher frequencies) until the filtering output of the current frequency domain coefficient is greater than the filtering output of the following frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the frequency domain coefficient of the highest frequency of the first frequency segment is greater than the filtering output of the preceding frequency domain coefficient, in which case the filtering output of the frequency domain coefficient of the highest frequency of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient between the lowest frequency and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.
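Cases a to c can be read as a peak search that starts inside the second frequency segment and, if the maximum lies on one of its edges, keeps walking outward through the first frequency segment until a local peak or the edge of the first segment is reached. The Python sketch below follows that reading; the function name, the inclusive (start, end) tuples, and the handling of equal neighbouring values (the walk continues) are assumptions, not details given in the description.

```python
from typing import Sequence, Tuple


def find_tone_position(filt: Sequence[float],
                       first_seg: Tuple[int, int],
                       second_seg: Tuple[int, int]) -> int:
    """Return the index of the maximum extreme value of the filtering outputs.

    first_seg and second_seg are inclusive (start, end) index pairs, with the
    second segment lying inside the first one.
    """
    s1, e1 = first_seg
    s2, e2 = second_seg
    if not (0 <= s1 <= s2 <= e2 <= e1 < len(filt)):
        raise ValueError("second segment must lie inside the first segment")

    # Initial maximum inside the second frequency segment.
    k = max(range(s2, e2 + 1), key=lambda i: filt[i])

    if k == s2:
        # Case a: walk toward lower frequencies until the current output
        # exceeds its lower neighbour or the first segment's lower edge is hit.
        while k > s1 and filt[k] <= filt[k - 1]:
            k -= 1
    elif k == e2:
        # Case b: walk toward higher frequencies until the current output
        # exceeds its upper neighbour or the first segment's upper edge is hit.
        while k < e1 and filt[k] <= filt[k + 1]:
            k += 1
    # Case c: an interior maximum is already the tone position.
    return k
```

Restricting the initial search to the second segment keeps the search short, while the outward walk avoids reporting a segment edge that is merely the flank of a peak lying just outside it.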

Preferably, the process by which the source frequency segment replication start sequence number calculation module calculates the source frequency segment replication start sequence number of the zero-bit coding subband requiring spectral band replication comprises: obtaining the sequence number of the start frequency point of the zero-bit coding subband whose frequency domain coefficients are currently to be reconstructed, denoted fillband_start_freq; the sequence number of the frequency point corresponding to the tone, denoted Tonal_pos; the spectral band replication period, denoted copy_period, whose value is Tonal_pos plus 1; and the source frequency segment start sequence number, denoted copyband_offset; and repeatedly subtracting copy_period from the value of fillband_start_freq until the value falls within the range of sequence numbers of the source frequency segment, this value then being the source frequency segment replication start sequence number.
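The calculation can be sketched in Python as follows. The identifiers fillband_start_freq, tonal_pos, copy_period and copyband_offset follow the description, and copy_pos_mod is the name the claims give to the result. The function name and the error raised for a band that starts below the source frequency segment are assumptions for illustration.

```python
def source_start_index(fillband_start_freq: int,
                       tonal_pos: int,
                       copyband_offset: int = 0) -> int:
    """Return copy_pos_mod, the index in the source segment where copying starts.

    The source frequency segment covers the inclusive index range
    [copyband_offset, tonal_pos + copyband_offset].
    """
    copy_period = tonal_pos + 1
    src_lo = copyband_offset
    src_hi = tonal_pos + copyband_offset

    copy_pos_mod = fillband_start_freq
    # Subtract the replication period until the index lands in the source segment.
    while copy_pos_mod > src_hi:
        copy_pos_mod -= copy_period
    if copy_pos_mod < src_lo:
        # Assumption of this sketch: the band to fill never starts below the source.
        raise ValueError("fillband_start_freq lies below the source segment")
    return copy_pos_mod
```

For example, with tonal_pos = 9 and copyband_offset = 2 the period is 10 and the source segment spans indices 2 to 11, so a zero-bit subband starting at index 25 maps to 25 − 10 − 10 = 5.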

Preferably, the spectral band replication module performs the spectral band replication as follows:
starting from the source frequency segment replication start sequence number, the frequency domain coefficients are replicated successively backward (toward higher frequencies) into the zero-bit coding subband, beginning at fillband_start_freq, until the frequency point being replicated from the source frequency segment reaches the frequency point Tonal_pos + copyband_offset; the replication into this zero-bit coding subband then continues backward from the frequency point copyband_offset, and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is completed.
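The periodic copy can be illustrated with the short Python sketch below. It recomputes the start index inside the function so that it is self-contained. The function name, the band_len parameter, the in-place update of the coefficient list, and the requirement that the band to fill lies entirely above the source frequency segment are assumptions of this sketch.

```python
from typing import List


def replicate_band(coeffs: List[float],
                   fillband_start_freq: int,
                   band_len: int,
                   tonal_pos: int,
                   copyband_offset: int = 0) -> List[float]:
    """Fill one zero-bit coding subband by periodic copying from the source segment.

    The source segment is coeffs[copyband_offset .. tonal_pos + copyband_offset]
    (inclusive); the subband is coeffs[fillband_start_freq : fillband_start_freq + band_len].
    """
    copy_period = tonal_pos + 1
    src_lo = copyband_offset
    src_hi = tonal_pos + copyband_offset
    if fillband_start_freq <= src_hi:
        raise ValueError("sketch assumes the subband lies above the source segment")
    if fillband_start_freq + band_len > len(coeffs):
        raise ValueError("subband exceeds the coefficient array")

    # Replication start index inside the source segment (copy_pos_mod).
    src = fillband_start_freq
    while src > src_hi:
        src -= copy_period

    for dst in range(fillband_start_freq, fillband_start_freq + band_len):
        coeffs[dst] = coeffs[src]
        src += 1
        if src > src_hi:
            # End of the source segment reached: wrap to its first frequency point.
            src = src_lo
    return coeffs
```

Because the start index is obtained by subtracting whole periods, every filled coefficient equals the source coefficient located an integer number of periods below it, which keeps the tone spacing of the source segment in the reconstructed band.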

In order to implement the above decoding method, the present invention also provides a system for audio decoding. As shown in Fig. 4, this system comprises: a bitstream demultiplexer (DeMUX), an amplitude envelope decoding unit, a bit allocation unit, a frequency domain coefficient decoding unit, a spectral band replication unit, a noise filling unit and an inverse modified discrete cosine transform (IMDCT) unit, wherein:
the bitstream demultiplexer (DeMUX) is provided to separate amplitude envelope coded bits, frequency domain coefficient coded bits and noise level coded bits from a bitstream to be decoded;
the amplitude envelope decoding unit, which is connected to the bitstream demultiplexer, is provided to decode and inversely quantize the amplitude envelope coded bits output by the bitstream demultiplexer in order to obtain the amplitude envelope of each coding subband;
the bit allocation unit, which is connected to the amplitude envelope decoding unit, is provided to perform the bit allocation in order to obtain the number of coded bits assigned to each frequency domain coefficient in each coding subband.

The bit allocation unit comprises a significance calculation module, a bit allocation module and a bit allocation modification module, wherein:
the significance calculation module is provided to calculate the initial value of the significance of each coding subband according to the quantization index of the amplitude envelope of that coding subband;
the bit allocation module is provided to perform the bit allocation for each frequency domain coefficient in the coding subbands according to the initial value of the significance of each coding subband, the bit allocation step size and the significance reduction step size after the bit allocation being variable during the bit allocation process;
the bit allocation modification module is provided to modify, after the bit allocation has been performed, the bit allocation of the coding subbands count times, according to the iteration count value of the bit allocation modification at the coding end and the significance of each coding subband.

When the bit allocation module performs the bit allocation, the bit allocation step size and the significance reduction step size after the bit allocation are greater for low-bit coding subbands than for zero-bit coding subbands and high-bit coding subbands.

When the bit allocation modification module performs the bit modification, the bit allocation step size and the significance reduction step size after the bit allocation are smaller for low-bit coding subbands than for zero-bit coding subbands and high-bit coding subbands.

The frequency domain coefficient decoding unit, which is connected to the amplitude envelope decoding unit and the bit allocation unit, is provided to perform decoding, inverse quantization and inverse normalization on the coding subbands in order to obtain the frequency domain coefficients.

The spectral band replication unit, which is connected to the DeMUX, the frequency domain coefficient decoding unit, the amplitude envelope decoding unit and the bit allocation unit, is provided to search for the position of a particular tone of the audio signal in the MDCT frequency domain coefficients, to take the bandwidth from the zero frequency point to the frequency point of the tone position as the spectral band replication period, to take as the source frequency segment the frequency segment from the frequency point obtained by shifting the zero frequency point backward (toward higher frequencies) by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points, and to perform the spectral band replication on the zero-bit coding subband. It is also provided to perform the energy adjustment on the frequency domain coefficients obtained after the replication, according to the amplitude envelope of the current zero-bit coding subband.

The specific implementation of this spectral band replication unit is the same as that of the spectral band replication apparatus described above and is not repeated here.

The noise filling unit, which is connected to the amplitude envelope decoding unit, the bit allocation unit and the spectral band replication unit, is provided to perform noise filling on the coding subband according to the amplitude envelope of the current zero-bit coding subband and to obtain the reconstructed frequency domain coefficients of the zero-bit coding subbands. Either the spectral band replication method applied by the spectral band replication unit is combined with the noise filling performed by the noise filling unit to carry out the spectrum reconstruction for all zero-bit coding subbands; or the noise filling unit applies random noise filling to carry out the spectrum reconstruction for zero-bit coding subbands below a certain frequency point, and for zero-bit coding subbands above that frequency point the spectral band replication unit applies frequency domain coefficient replication, combined with the noise filling performed by the noise filling unit, to carry out the spectrum reconstruction.

The inverse modified discrete cosine transform (IMDCT) unit, which is connected to the noise filling unit, is provided to perform the IMDCT on the frequency domain coefficients after the noise filling in order to obtain the audio signal.

Claims

Spectral band replication method comprising:
A. searching for a position of a particular tone of an audio signal in MDCT frequency domain coefficients;
B. determining, according to the tone position, a spectral band replication period and a source frequency segment, the spectral band replication period being the bandwidth from a zero frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment from the frequency point obtained by shifting the zero frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points, the offset copyband_offset being greater than or equal to 0;
C. performing spectral band replication on zero-bit coding subbands according to the spectral band replication period.

The method of claim 1, wherein in step A, the following method is used to search for the position of the particular tone: taking the absolute values or square values of the frequency domain coefficients of a first frequency segment and performing smoothing filtering on them; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment, and taking the position of this maximum extreme value as the position of the particular tone.

The method of claim 2, wherein the formula for taking the absolute values of the frequency domain coefficients of the first frequency segment to perform the smoothing filtering is as follows:

$\mathrm{X\_amp}_i(k) = \mu\,\mathrm{X\_amp}_{i-1}(k) + (1-\mu)\,|X_i(k)|$

or the formula for taking the square values of the frequency domain coefficients of the first frequency segment to perform the smoothing filtering is as follows:

$\mathrm{X\_amp}_i(k) = \mu\,\mathrm{X\_amp}_{i-1}(k) + (1-\mu)\,X_i(k)^2$

where μ is the smoothing filter coefficient, $\mathrm{X\_amp}_i(k)$ denotes the filtering output at the k-th frequency point of the i-th frame, $X_i(k)$ is the decoded MDCT coefficient at the k-th frequency point of the i-th frame, and $\mathrm{X\_amp}_{i-1}(k) = 0$ when i = 0.

The method of claim 2, wherein the first frequency segment is a low-frequency segment in which the energy is determined, according to the statistical characteristics of the spectrum, to be more concentrated, the low frequencies referring to spectral components below half of the total bandwidth of a signal.

The method of claim 2, wherein the following method is used to determine the maximum extreme value of the filtering outputs: directly searching for an initial maximum value in the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment, and taking that maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.

The method of claim 2, wherein the following method is used to determine the maximum extreme value of the filtering outputs: taking a segment in the first frequency segment as a second frequency segment, searching for an initial maximum value in the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and performing different processes depending on the position of the frequency domain coefficient corresponding to that initial maximum value:
a. if this initial maximum value is the filtering output of the frequency domain coefficient of the lowest frequency of the second frequency segment, comparing that filtering output with the filtering output of the frequency domain coefficient of the preceding lower frequency in the first frequency segment, and continuing the comparison forward until the filtering output of a current frequency domain coefficient is greater than the filtering output of the preceding frequency domain coefficient, the filtering output of the current frequency domain coefficient then being the finally determined maximum extreme value, or until the filtering output of the frequency domain coefficient of the lowest frequency of the first frequency segment is greater than the filtering output of the following frequency domain coefficient, the filtering output of the frequency domain coefficient of the lowest frequency of the first frequency segment then being the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the frequency domain coefficient of the highest frequency of the second frequency segment, comparing that filtering output with the filtering output of the frequency domain coefficient of the following higher frequency in the first frequency segment, and continuing the comparison backward until the filtering output of a current frequency domain coefficient is greater than the filtering output of the following frequency domain coefficient, the filtering output of the current frequency domain coefficient then being the finally determined maximum extreme value, or until the filtering output of the frequency domain coefficient of the highest frequency of the first frequency segment is greater than the filtering output of the preceding frequency domain coefficient, the filtering output of the frequency domain coefficient of the highest frequency of the first frequency segment then being the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient between the lowest frequency and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.

A method according to any one of claims 1 to 6, wherein in step C, when the spectral band replication is performed for a zero-bit coding subband, a source frequency segment replication start sequence number of that zero-bit coding subband is first calculated according to the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication, and then, starting from the source frequency segment replication start sequence number, the frequency domain coefficients of the source frequency segment are periodically replicated into the zero-bit coding subband, with the spectral band replication period as the period.

The method of claim 7, wherein in step C, the method of calculating the source frequency segment replication start sequence number of the zero-bit coding subband is: obtaining the sequence number of the frequency point of the starting MDCT frequency domain coefficient of the zero-bit coding subband whose frequency domain coefficients are to be reconstructed, called fillband_start_freq; the sequence number of the frequency point corresponding to the tone, called Tonal_pos; the spectral band replication period, called copy_period, whose value is Tonal_pos plus 1; and the spectral band replication offset, called copyband_offset; and repeatedly subtracting copy_period from the value of fillband_start_freq until that value lies within the range of sequence numbers of the source frequency segment, that value then being the source frequency segment replication start sequence number, which is called copy_pos_mod.

The method of claim 7, wherein in step C, the method of periodically replicating the frequency domain coefficients of the source frequency segment into the zero-bit coding subband, starting from the source frequency segment replication start sequence number and with the spectral band replication period as the period, is: replicating the frequency domain coefficients, starting from the source frequency segment replication start sequence number, successively backward into the zero-bit coding subband beginning at fillband_start_freq, until the frequency point being replicated from the source frequency segment reaches the frequency point Tonal_pos + copyband_offset; then continuing to replicate the frequency domain coefficients backward into the zero-bit coding subband from the frequency point copyband_offset; and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is completed.

A spectral band replication apparatus comprising: a tone position search module, a period and source frequency segment calculation module, a source frequency segment replication start sequence number calculation module, and a spectral band replication module connected in series, wherein
the tone position search module is provided to search for a position of a particular tone of an audio signal in MDCT frequency domain coefficients;
the period and source frequency segment calculation module is provided to determine, according to the position of the tone, a spectral band replication period and a source frequency segment, the spectral band replication period being the bandwidth from a zero frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment from the frequency point obtained by shifting the zero frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points;
the source frequency segment replication start sequence number calculation module is provided to calculate a source frequency segment replication start sequence number of a zero-bit coding subband according to the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication;
the spectral band replication module is provided to periodically replicate the frequency domain coefficients of the source frequency segment into the zero-bit coding subband, starting from the source frequency segment replication start sequence number and with the spectral band replication period as the period.

The apparatus of claim 10, wherein the tone position search module directly searches the filtering outputs of the frequency domain coefficients corresponding to the first frequency segment for an initial maximum value and takes this maximum value as the maximum extreme value of the filtering outputs of the first frequency segment.

The apparatus of claim 10, wherein, when the tone position search module determines the maximum extreme value of the filtering outputs, a segment within the first frequency segment is taken as the second frequency segment, an initial maximum value is searched for in the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and different processing is performed according to the position of the frequency domain coefficient corresponding to this initial maximum value:
a. if this initial maximum value is the filtering output of the lowest-frequency frequency domain coefficient of the second frequency segment, this filtering output is compared with the filtering output of the preceding, next lower-frequency frequency domain coefficient in the first frequency segment, and the comparison continues forward (toward lower frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the preceding frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the lowest-frequency frequency domain coefficient of the first frequency segment is found to be greater than that of the following frequency domain coefficient, in which case the filtering output of the lowest-frequency frequency domain coefficient of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the highest-frequency frequency domain coefficient of the second frequency segment, this filtering output is compared with the filtering output of the following, next higher-frequency frequency domain coefficient in the first frequency segment, and the comparison continues backward (toward higher frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the following frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the highest-frequency frequency domain coefficient of the first frequency segment is found to be greater than that of the preceding frequency domain coefficient, in which case the filtering output of the highest-frequency frequency domain coefficient of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient between the lowest frequency and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.
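The three cases a, b and c can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `refine_maximum`, the list `filtered` (the filtering outputs of the first frequency segment, ordered from lowest to highest frequency) and the inclusive index bounds `seg2_lo` and `seg2_hi` of the second frequency segment are all hypothetical names chosen for illustration, and the handling of equal neighbouring values is an assumption.

```python
def refine_maximum(filtered, seg2_lo, seg2_hi):
    """Return the index (within the first frequency segment) of the final maximum.

    filtered  -- filtering outputs of the first frequency segment, low to high frequency
    seg2_lo   -- first index of the second frequency segment (inclusive, assumed name)
    seg2_hi   -- last index of the second frequency segment (inclusive, assumed name)
    """
    last = len(filtered) - 1
    if not (0 <= seg2_lo < seg2_hi <= last):
        raise ValueError("second segment must lie inside the first segment")

    # Initial maximum: searched only inside the second frequency segment.
    pos = max(range(seg2_lo, seg2_hi + 1), key=filtered.__getitem__)

    if pos == seg2_lo:
        # Case a: walk toward lower frequencies until the current value
        # exceeds its lower neighbour, or the segment boundary is reached.
        # Assumption: equal values do not stop the walk.
        while pos > 0 and filtered[pos] <= filtered[pos - 1]:
            pos -= 1
    elif pos == seg2_hi:
        # Case b: walk toward higher frequencies, mirrored logic.
        while pos < last and filtered[pos] <= filtered[pos + 1]:
            pos += 1
    # Case c: an interior maximum is accepted as it is.
    return pos
```

For example, with `filtered = [1, 5, 4, 3, 2, 2.5, 1]` and a second segment covering indices 2 to 4, the initial maximum lies on the lower edge (index 2), so the search walks down to index 1, where the value 5 exceeds its lower neighbour.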

The apparatus according to any one of claims 10 to 12, wherein the process by which the source frequency segment replication start sequence number calculation module calculates the source frequency segment replication start sequence number of the zero-bit coding subband requiring spectral band replication comprises: obtaining the sequence number of the start frequency point of the zero-bit coding subband whose frequency domain coefficients are currently to be reconstructed, called fillband_start_freq; obtaining the sequence number of the frequency point corresponding to the tone, called Tonal_pos; setting the spectral band replication period, called copy_period, to Tonal_pos plus 1; taking copyband_offset as the start sequence number of the source frequency segment; and repeatedly subtracting copy_period from the value of fillband_start_freq until the value falls within the range of sequence numbers of the source frequency segment, this value then being the source frequency segment replication start sequence number, called copy_pos_mod.
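The calculation of copy_pos_mod can be sketched in Python as follows. The function name `source_segment_start` is hypothetical, and the sketch assumes that the source frequency segment covers the inclusive range of sequence numbers from copyband_offset to copyband_offset + Tonal_pos, which is how the claims describe its two end points.

```python
def source_segment_start(fillband_start_freq, tonal_pos, copyband_offset):
    """Return copy_pos_mod, the source index from which replication starts."""
    if tonal_pos < 0 or copyband_offset < 0:
        raise ValueError("tonal_pos and copyband_offset must be non-negative")

    copy_period = tonal_pos + 1          # period as stated in the claim
    src_lo = copyband_offset             # first index of the source segment
    src_hi = copyband_offset + tonal_pos # last index (assumed inclusive)

    if fillband_start_freq < src_lo:
        raise ValueError("fill band starts below the source segment")

    # Repeatedly subtract the period until the value lands inside the
    # source segment; the segment is exactly one period long, so the
    # first value that is <= src_hi is also >= src_lo.
    copy_pos_mod = fillband_start_freq
    while copy_pos_mod > src_hi:
        copy_pos_mod -= copy_period
    return copy_pos_mod
```

With fillband_start_freq = 100, Tonal_pos = 23 and copyband_offset = 10, the period is 24 and the source segment spans 10 to 33; the value steps through 76, 52 and 28, so copy_pos_mod is 28.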

The apparatus of any one of claims 10 to 12, wherein, when the spectral band replication module performs the spectral band replication, the frequency domain coefficients starting at the source frequency segment replication start sequence number are replicated successively backward to the zero-bit coding subband beginning at fillband_start_freq; when the frequency point being replicated from the source frequency segment reaches the frequency point Tonal_pos + copyband_offset, replication continues backward from the frequency point copyband_offset; and so on, until the spectral band replication of all frequency domain coefficients of the current zero-bit coding subband is completed.
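The wrap-around copy described in this claim might look like the following Python sketch. The function name `replicate_into_subband` and the parameter `fillband_end_freq` (the last index of the zero-bit coding subband, inclusive) are hypothetical, and the sketch assumes the subband to be filled lies entirely above the source frequency segment.

```python
def replicate_into_subband(coeffs, fillband_start_freq, fillband_end_freq,
                           tonal_pos, copyband_offset):
    """Fill coeffs[fillband_start_freq..fillband_end_freq] by periodic copying.

    coeffs is a mutable list of frequency domain coefficients; it is
    modified in place and also returned for convenience.
    """
    copy_period = tonal_pos + 1
    src_lo = copyband_offset
    src_hi = copyband_offset + tonal_pos  # inclusive upper end of the source
    if fillband_start_freq <= src_hi:
        raise ValueError("fill band must lie above the source segment")

    # Start index in the source segment (copy_pos_mod in the claims).
    src = fillband_start_freq
    while src > src_hi:
        src -= copy_period

    for dst in range(fillband_start_freq, fillband_end_freq + 1):
        coeffs[dst] = coeffs[src]
        # After the last source point, wrap back to the first one.
        src = src_lo if src == src_hi else src + 1
    return coeffs
```

Because the start index is derived by subtracting whole periods, every filled coefficient keeps the same phase relative to the replication period, regardless of where the zero-bit subband begins.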

A method of audio decoding comprising:
A. performing decoding and inverse quantization on each amplitude envelope coded bit in a bitstream to be decoded to obtain an amplitude envelope of each coding subband;
B. performing bit allocation on each coding subband and performing decoding and inverse quantization on non-zero-bit coding subbands to obtain frequency domain coefficients of the non-zero-bit coding subbands;
C. searching for the position of a particular tone of an audio signal in MDCT frequency domain coefficients; taking the bandwidth from the zero frequency point to the frequency point of the tone position as the spectral band replication period; taking the frequency segment from the frequency point obtained by shifting the zero frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points as the source frequency segment; performing spectral band replication on zero-bit coding subbands; and, according to the amplitude envelope of the current coding subband, performing energy adjustment on the frequency domain coefficients obtained by replication and combining them with noise filling to obtain the reconstructed frequency domain coefficients of the zero-bit coding subband, wherein the offset copyband_offset is greater than or equal to 0;
D. performing the Inverse Modified Discrete Cosine Transform (IMDCT) on the frequency domain coefficients of the non-zero-bit coding subbands and the reconstructed frequency domain coefficients of the zero-bit coding subbands to obtain the final audio signal.

The method of claim 15, wherein in step C the following method is used to search for the position of the particular tone: taking the absolute values or squared values of the frequency domain coefficients of a first frequency segment and applying smoothing filtering to them; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment and taking the position of this maximum extreme value as the position of the particular tone.
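A minimal Python sketch of this search follows. The claim does not specify the smoothing filter, so the 3-point moving average used here is purely an assumption, as are the function name `find_tone_position` and the parameters `seg_start`, `seg_end` (inclusive bounds of the first frequency segment) and `use_square`.

```python
def find_tone_position(mdct, seg_start, seg_end, use_square=False):
    """Return the index of the largest smoothed magnitude in the first segment."""
    if not (0 <= seg_start <= seg_end < len(mdct)):
        raise ValueError("segment bounds outside the coefficient array")

    def magnitude(x):
        # Absolute value or squared value, as the claim allows either.
        return x * x if use_square else abs(x)

    smoothed = []
    for k in range(seg_start, seg_end + 1):
        # Assumed smoothing: 3-point moving average, clipped at the
        # segment edges. The patent text does not name the filter.
        lo = max(seg_start, k - 1)
        hi = min(seg_end, k + 1)
        window = [magnitude(mdct[i]) for i in range(lo, hi + 1)]
        smoothed.append(sum(window) / len(window))

    # Position of the maximum filtering output, as an absolute index.
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return seg_start + best
```

The returned index plays the role of Tonal_pos in the later claims; a different smoothing filter would change the filtering outputs but not the overall search structure.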

A method according to claim 16, wherein in step C, when the spectral band replication is performed for a zero-bit coding subband, a source frequency segment replication start sequence number of that zero-bit coding subband is first calculated according to the source frequency segment and the start sequence number of the zero-bit coding subband requiring spectral band replication, and then, starting from the source frequency segment replication start sequence number, the frequency domain coefficients of the source frequency segment are periodically replicated to the zero-bit coding subband, taking the spectral band replication period as the period.

The method of claim 15, wherein the above spectral band replication method is used in combination with a noise filling method to perform the spectrum reconstruction for all zero-bit coding subbands, or a random noise filling method is used to perform the spectrum reconstruction for zero-bit coding subbands below a particular frequency point and frequency domain coefficient replication is used in combination with noise filling to perform the spectrum reconstruction for zero-bit coding subbands above that frequency point.

A system for audio decoding, comprising: a bitstream demultiplexer (DeMUX), an amplitude envelope decoding unit, a bit allocation unit, a frequency domain coefficient decoding unit, a spectral band replicating unit, a noise filling unit and an inverse modified discrete cosine transform (IMDCT) unit, wherein: the DeMUX is provided to separate amplitude envelope coded bits, frequency domain coefficient coded bits, and noise level coded bits from a bitstream to be decoded; the amplitude envelope decoding unit, connected to the DeMUX, is provided to perform decoding and inverse quantization on the amplitude envelope coded bits output by the DeMUX to obtain an amplitude envelope of each coding subband; the bit allocation unit, connected to the amplitude envelope decoding unit, is provided to perform bit allocation to obtain the number of coded bits assigned to each frequency domain coefficient of each coding subband; the frequency domain coefficient decoding unit, connected to the amplitude envelope decoding unit and the bit allocation unit, is provided to perform decoding, inverse quantization and inverse normalization on the coding subbands to obtain frequency domain coefficients; the spectral band replicating unit, connected to the DeMUX, the frequency domain coefficient decoding unit, the amplitude envelope decoding unit and the bit allocation unit, is provided to search for the position of a particular tone of an audio signal in MDCT frequency domain coefficients, to take the bandwidth from the zero frequency point to the frequency point of the tone position as a spectral band replication period, to take the frequency segment from the frequency point obtained by shifting the zero frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points as a source frequency segment, and to perform the spectral band replication on zero-bit coding subbands, wherein the offset copyband_offset is greater than or equal to 0, and also to perform, according to the amplitude envelope of the current coding subband, energy adjustment on the frequency domain coefficients obtained by replication; the noise filling unit, connected to the amplitude envelope decoding unit, the bit allocation unit and the spectral band replicating unit, is provided to perform noise filling on the current zero-bit coding subband according to its amplitude envelope to obtain the reconstructed frequency domain coefficients of the zero-bit coding subband; and the IMDCT unit, connected to the noise filling unit, is provided to perform the IMDCT on the frequency domain coefficients after the noise filling to obtain an audio signal.

The system of claim 19, wherein the spectral band replicating unit comprises a tone position search module, a period and source frequency segment calculation module, a source frequency segment replication start sequence number calculation module and a spectral band replication module connected in series, wherein: the tone position search module is provided to search for the position of a particular tone of an audio signal in MDCT frequency domain coefficients; the period and source frequency segment calculation module is provided to determine, according to the tone position, a spectral band replication period and a source frequency segment, the spectral band replication period being the bandwidth from the zero frequency point to the frequency point of the tone position, and the source frequency segment being the frequency segment from the frequency point obtained by shifting the zero frequency point backward by copyband_offset frequency points to the frequency point obtained by shifting the frequency point of the tone position backward by copyband_offset frequency points; the source frequency segment replication start sequence number calculation module is provided to calculate, according to the source frequency segment and a start sequence number of a zero-bit coding subband requiring spectral band replication, a source frequency segment replication start sequence number of that zero-bit coding subband; and the spectral band replication module is provided to replicate the frequency domain coefficients of the source frequency segment periodically to the zero-bit coding subband, starting from the source frequency segment replication start sequence number and taking the spectral band replication period as the period.

The system of claim 19, wherein the tone position search module uses the following method to search for the tone position: taking the absolute values or squared values of the MDCT frequency domain coefficients of the first frequency segment and applying smoothing filtering to them; and, according to the result of the smoothing filtering, searching for the position of the maximum extreme value of the filtering outputs of the first frequency segment, the position of this maximum extreme value being the tone position.

The system of claim 21, wherein, when the tone position search module determines the maximum extreme value of the filtering outputs, a segment within the first frequency segment is taken as a second frequency segment, an initial maximum value is searched for in the filtering outputs of the frequency domain coefficients corresponding to the second frequency segment, and different processing is performed according to the position of the frequency domain coefficient corresponding to this initial maximum value:
a. if this initial maximum value is the filtering output of the lowest-frequency frequency domain coefficient of the second frequency segment, this filtering output is compared with the filtering output of the preceding, next lower-frequency frequency domain coefficient in the first frequency segment, and the comparison continues forward (toward lower frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the preceding frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the lowest-frequency frequency domain coefficient of the first frequency segment is found to be greater than that of the following frequency domain coefficient, in which case the filtering output of the lowest-frequency frequency domain coefficient of the first frequency segment is the finally determined maximum extreme value;
b. if this initial maximum value is the filtering output of the highest-frequency frequency domain coefficient of the second frequency segment, this filtering output is compared with the filtering output of the following, next higher-frequency frequency domain coefficient in the first frequency segment, and the comparison continues backward (toward higher frequencies) until the filtering output of the current frequency domain coefficient is greater than that of the following frequency domain coefficient, in which case the filtering output of the current frequency domain coefficient is the finally determined maximum extreme value, or until the filtering output of the highest-frequency frequency domain coefficient of the first frequency segment is found to be greater than that of the preceding frequency domain coefficient, in which case the filtering output of the highest-frequency frequency domain coefficient of the first frequency segment is the finally determined maximum extreme value;
c. if this initial maximum value is the filtering output of a frequency domain coefficient between the lowest frequency and the highest frequency of the second frequency segment, the frequency domain coefficient corresponding to this initial maximum value is the tone position, that is, this initial maximum value is the finally determined maximum extreme value.

The system of claim 19, wherein the frequency domain coefficient replication method employed by the spectral band replicating unit is used in combination with the noise filling method employed by the noise filling unit to perform the spectrum reconstruction for all zero-bit coding subbands, or a random noise filling method is applied by the noise filling unit to perform the spectrum reconstruction for zero-bit coding subbands below a particular frequency point and the frequency domain coefficient replication method employed by the spectral band replicating unit is used in combination with the noise filling method employed by the noise filling unit to perform the spectrum reconstruction for zero-bit coding subbands above that frequency point.