WO2014030928A1 - Audio signal encoding method, audio signal decoding method, and apparatus implementing the methods - Google Patents
- Publication number
- WO2014030928A1 (PCT/KR2013/007505)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- track
- pulses
- pulse
- tracks
- encoding target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
Definitions
- the present invention relates to encoding and decoding of an audio signal, and more particularly, to a method and apparatus for searching an encoding / decoding target of an audio signal.
- audio signals include signals of various frequencies; the human audible frequency range is about 20 Hz to 20 kHz, whereas the average human voice lies in the range of about 200 Hz to 3 kHz.
- the input audio signal may include not only the band in which human voice exists but also components in the high frequency region of 7 kHz or more, where human voice is rarely present.
- a coding scheme suitable for NB (sampling rate of about 8 kHz) or a coding scheme suitable for WB (sampling rate of about 16 kHz) is applied to a signal of SWB (sampling rate of about 32 kHz).
- An object of the present invention is to provide a method and apparatus for band extension of a voice and audio encoder in a digital communication environment.
- An object of the present invention is to provide a method and apparatus for encoding / decoding voice and audio signals having backward compatibility in decoding a bitstream encoded in a sine mode.
- An object of the present invention is to provide a pulse search method and apparatus applicable to both CELP mode and sine mode.
- the present invention relates to encoding and decoding of an audio signal.
- the encoding according to the present invention includes determining the importance of the tracks constituting a track pair according to the track-specific energy of an audio signal, determining an encoding target pulse by searching for pulses starting from the track of high importance, and quantizing information on the determined encoding target pulse.
- the encoder may encode the audio signal by determining the importance of the tracks constituting the track pair according to the track-specific energy of the audio signal, determining the encoding target pulse by searching for pulses starting from the track of high importance, and quantizing information on the determined encoding target pulse.
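- As a rough illustration of the ordering step described above, the following Python sketch ranks the two tracks of a pair by energy so that the higher-energy track is searched first. The helper names (`track_energy`, `search_order`), the representation of a track as a list of coefficients, and the use of the sum of squares as the energy measure are assumptions made for illustration, not details taken from the specification.

```python
def track_energy(coeffs):
    """Energy of a track, taken here as the sum of squared coefficients (assumption)."""
    return sum(c * c for c in coeffs)


def search_order(track_pair):
    """Return the track indices of a pair, highest energy (most important) first.

    `track_pair` maps a track index to that track's coefficients.
    Ties keep the lower track index first (an arbitrary choice).
    """
    return sorted(track_pair, key=lambda t: (-track_energy(track_pair[t]), t))


# Hypothetical pair: track 1 carries more energy than track 0.
pair = {0: [1.0, -2.0, 0.5], 1: [4.0, 0.0, -3.0]}
order = search_order(pair)
```

- In this made-up pair, track 1 has the larger energy, so it would be searched before track 0.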
- Decoding according to the present invention also includes generating pulses for the audio signal from the tracks constituting the track pair based on inverse quantization and reconstructing the audio signal based on the pulses. At this time, the generation of the pulse may be performed for each track in a predetermined order.
- the decoder may generate pulses for the audio signal from the tracks constituting the track pair based on inverse quantization, and may restore the audio signal based on the pulses.
- the decoder may generate a pulse for each track.
- the dense pulses can be efficiently retrieved and transmitted / stored without using additional bits.
- because a bitstream encoded according to the existing sine wave mode can be decoded, backward compatibility is guaranteed.
- the dense pulses can be effectively searched without using additional bits.
- FIG. 1 schematically illustrates an example of an encoder configuration that may be used when a super wideband (SWB) signal is processed by a band extension method.
- FIG. 2 is a diagram for explaining an example of a configuration of an encoder based on the configuration of a core encoder.
- FIG. 3 schematically illustrates an example of a decoder configuration that may be used when a super wideband (SWB) signal is processed by a band extension method.
- FIG. 4 is a diagram illustrating an example of a decoder configuration based on the configuration of a core decoder.
- FIG. 5 is a diagram schematically illustrating a method of encoding a sine wave in a sine mode.
- FIG. 6 is a diagram schematically illustrating track information encoded / decoded in a layer 6 to which a sine mode is applied.
- FIG. 7 schematically illustrates an example of track information regarding a sine wave mode in layer 6, which is a first SWB layer.
- FIG. 8 is a diagram schematically illustrating an example in which two tracks are paired in the case of two steps.
- FIG. 9 is a diagram schematically illustrating an example in which three tracks are paired in the case of three steps.
- FIG. 10 is a flowchart schematically illustrating a sine wave search method applied to each layer according to an embodiment of the present invention.
- FIG. 11 is a diagram schematically illustrating a case where an independent search is performed for each track without considering the characteristics of track pairs.
- FIG. 12 is a diagram schematically illustrating an example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- FIG. 13 is a view schematically illustrating another example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- FIG. 14 is a view schematically illustrating another example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- FIG. 16 is a block diagram schematically illustrating an example of an encoder to which the methods of FIGS. 14 and 16 are applied.
- FIG. 17 is a flowchart schematically illustrating an example of a method of searching for a pulse of a track according to frame energy or tonality according to the present invention.
- FIG. 18 is a flowchart schematically illustrating a method for searching / selecting a pulse based on a CELP mode in the present invention.
- FIG. 19 is a flowchart schematically illustrating an example of an audio signal encoding method according to the present invention.
- FIG. 20 is a flowchart schematically illustrating an example of an audio signal decoding method according to the present invention.
- terms such as first and second may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another.
- Components shown in the embodiments of the present invention are shown independently to represent different characteristic functions, and do not mean that each component is made of separate hardware or one software component unit.
- Each component is included in a list of components for convenience of description, and at least two of the components may be combined to form one component, or one component may be divided into a plurality of components to perform a function.
- NB: narrow band
- WB: wide band
- SWB: super wide band
- as a speech and audio encoding / decoding technique, a Code Excited Linear Prediction (CELP) mode, a sinusoidal mode, or the like may be used.
- CELP Code Excited Linear Prediction
- the coder may be divided into a baseline coder and an enhancement layer.
- the enhancement layer may be further divided into a lower band enhancement (LBE) layer, a bandwidth extension (BWE) layer, and a higher band enhancement (HBE) layer.
- LBE lower band enhancement layer
- BWE bandwidth extension
- HBE higher band enhancement layer
- the LBE layer improves low-band sound quality by encoding / decoding a difference signal, that is, an excitation signal, between a sound source processed by a core encoder / core decoder and an original sound. Since the high band signal has similarity with the low band signal, it is possible to recover the high band signal at a low bit rate through the high band extension method using the low band.
- a method of scaling and processing a SWB signal may be considered.
- the method of band extending the SWB signal may operate in the Modified Discrete Cosine Transform (MDCT) domain.
- MDCT Modified Discrete Cosine Transform
- the enhancement layers may be handled by being divided into a generic mode and a sinusoidal mode. For example, when three enhancement layers are used, the first enhancement layer may be processed in generic mode and sine mode, and the second and third enhancement layers may be processed in sine mode.
- a sinusoid includes both a sine wave and a cosine wave, the cosine wave being a sine wave shifted in phase by a quarter period (π/2). Therefore, in the present invention, a sinusoid may mean a sine wave or a cosine wave. If the input sinusoid is a cosine wave, it may be converted into a sine wave or a cosine wave in the encoding / decoding process, depending on the transform method applied to the input signal. Likewise, if the input sinusoid is a sine wave, it may be converted into a cosine wave or a sine wave in the encoding / decoding process, depending on the transform method applied to the input signal.
- in generic mode, coding is based on adaptive replication of the coded wideband signal subbands.
- in sine mode coding, sine waves are added to the high frequency content.
- the sine mode is an efficient encoding technique for a signal having strong periodicity or a tonal component, and may encode sign, amplitude, and position information for each sine wave component.
- a predetermined number of MDCT coefficients, for example 10, may be encoded for each layer.
- FIG. 1 schematically illustrates an example of an encoder configuration that may be used when a super wideband (SWB) signal is processed by a band extension method.
- an encoder structure of a G.718 Annex B scalable extension to which a sine mode is applied will be described as an example.
- the encoder of FIG. 1 is composed of a generic mode and a sine mode for SWB extension, and when additional bits are allocated, the sine mode can be extended and used.
- the encoder 100 includes a down sampling unit 105, a core encoder 110, an MDCT unit 115, a tonality estimation unit 120, a tonality determination unit 125, and a SWB (Super Wide Band) encoding unit 130.
- the SWB encoder 130 includes a generic mode unit 135, a sine wave mode unit 140, and additional sine wave units 145 and 150.
- the down sampling unit 105 down-samples the input signal to generate a WB signal that can be processed by a core encoder.
- SWB encoding is performed in the MDCT domain.
- the core encoder 110 encodes the WB signal, applies the MDCT to the synthesized WB signal, and outputs the MDCT coefficients.
- the MDCT (Modified Discrete Cosine Transform) takes the windowed time-domain input signal as its input, and the window is a symmetric window function.
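- To make the transform step concrete, the following Python sketch computes the MDCT of a single frame using a common textbook definition and a sine window. The equation of the specification is not reproduced in this text, so the exact formula, the sine window, and the function names (`sine_window`, `mdct`) are assumptions; only the use of a windowed time-domain input and a symmetric window comes from the description.

```python
import math


def sine_window(length):
    """Symmetric sine window of the given (even) length."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame):
    """MDCT of one frame of 2N samples, returning N coefficients.

    Textbook definition (assumed here):
    X[k] = sum_n w[n] * x[n] * cos(pi / N * (n + 0.5 + N / 2) * (k + 0.5))
    """
    two_n = len(frame)
    if two_n % 2:
        raise ValueError("frame length must be even")
    n_half = two_n // 2
    window = sine_window(two_n)
    return [
        sum(
            window[n] * frame[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]


coeffs = mdct([0.0, 1.0, 0.5, -0.25])  # 4 input samples give 2 coefficients
```

- A frame of 2N time-domain samples yields N coefficients, which is why consecutive frames overlap by half in a practical codec.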
- the MDCT unit 115 applies the MDCT to the SWB signal, and the tonality estimator 120 estimates the tonality of the MDCT-transformed signal.
- the choice between generic mode and sine mode is based on tonality. For example, when using three layers in the scalable SWB band extension method, the first layer, that is, layer 6mo (layer 7mo) may be selected based on the tonality estimate.
- the generic mode and / or the sine mode may be used in the layer 6mo of the three layers, and the sine mode may be used in the upper layer (layer 7mo, layer 8mo).
- the tonality estimation may be performed based on correlation analysis between spectral peaks in a current frame and a past frame.
- the tonality estimator 120 outputs the tonality estimate to the tonality determiner 125.
- the tonality determiner 125 determines whether the MDCT-converted signal is tonal based on the degree of tonality, and transmits it to the SWB encoder 130. For example, the tonality determination unit 125 compares the tonality estimation value input from the tonality estimator 120 with a predetermined reference value to determine whether the MDCT-converted signal is a tonal signal or a non-tonal signal.
- the SWB encoder 130 processes the MDCT coefficients of the MDCT SWB signal.
- the SWB encoder 130 may process the MDCT coefficients of the SWB signal by using the MDCT coefficients of the synthesized WB signal input through the core encoder 110.
- when the signal is determined not to be tonal, it is transmitted to the generic mode unit 135, and when it is determined to be tonal, it is transmitted to the sine wave mode unit 140.
- the generic mode may be used when it is determined that the input frame is not tonal.
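- The mode decision described above amounts to a threshold comparison, sketched below in Python. The function name `select_swb_mode`, the numeric scale of the tonality estimate, and the choice of a strict comparison are assumptions; the text only states that the estimate is compared with a predetermined reference value.

```python
def select_swb_mode(tonality_estimate, reference_value):
    """Return 'sine' for a tonal frame and 'generic' for a non-tonal frame.

    The frame is treated as tonal when the estimate exceeds the reference
    value (strict comparison is an assumption).
    """
    return "sine" if tonality_estimate > reference_value else "generic"


# Hypothetical values on an arbitrary 0..1 scale.
mode_tonal = select_swb_mode(0.8, 0.5)
mode_noisy = select_swb_mode(0.2, 0.5)
```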
- the low frequency spectrum is directly transposed to high frequencies and parameterized to follow the envelope of the original high frequency. At this time, the parameterization can be made more coarsely than the case of the original high frequency.
- high frequency content can be coded at a low bit rate.
- the high frequency band is divided into sub-bands, and according to a predetermined similarity criterion, the one that is most similarly matched among coded and block normalized broadband contents is selected.
- the selected contents are scaled and output as synthesized high frequency content.
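- The generic-mode steps above (pick the most similar piece of coded wideband content for each high-frequency subband, then scale it) can be sketched as follows. The similarity criterion is only described as predetermined, so normalized correlation is used here as an assumed stand-in, and the names `best_match` and `synthesize_subband` are hypothetical.

```python
import math


def best_match(low_band, target):
    """Return (offset, gain) of the low-band segment most similar to `target`.

    Similarity is the normalized correlation (assumed criterion). The gain
    scales the chosen segment to the energy of the target subband.
    """
    width = len(target)
    target_energy = sum(t * t for t in target)
    best_offset, best_score, best_gain = 0, -math.inf, 0.0
    for offset in range(len(low_band) - width + 1):
        segment = low_band[offset:offset + width]
        seg_energy = sum(s * s for s in segment)
        if seg_energy == 0.0 or target_energy == 0.0:
            continue  # nothing to correlate against
        corr = sum(s * t for s, t in zip(segment, target))
        score = corr / math.sqrt(seg_energy * target_energy)
        if score > best_score:
            best_offset, best_score = offset, score
            best_gain = math.sqrt(target_energy / seg_energy)
    return best_offset, best_gain


def synthesize_subband(low_band, target):
    """Copy the best-matching low-band segment and scale it to the target energy."""
    offset, gain = best_match(low_band, target)
    return [gain * c for c in low_band[offset:offset + len(target)]]


low = [0.0, 1.0, 2.0, 1.0, 0.0, 5.0, 0.0]   # made-up coded wideband coefficients
high = [2.0, 4.0, 2.0]                      # made-up high-frequency subband
offset, gain = best_match(low, high)
synth = synthesize_subband(low, high)
```

- Only the offset and a coarse gain need to be transmitted in such a scheme, which is consistent with the low bit rate noted above.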
- the sinusoidal mode unit 140 may be used when the input frame is tonal. In sine mode, a finite set of sinusoidal components is added to the high frequency (HF) spectrum to generate a SWB signal. At this time, the HF spectrum is generated using the MDCT coefficients of the WB synthesis signal.
- HF high frequency
- the additional sine wave units 145 and 150 add additional sine waves to the signal output in the generic mode and to the signal output in the sine mode to improve the generated signal. For example, when additional bits are allocated, the additional sine wave units 145 and 150 extend the sine mode by determining and quantizing the additional sine waves (pulses) to be transmitted, thereby improving the signal.
- additional sine wave pulse
- the outputs of the core encoder 110, the tonality determination unit 125, the generic mode unit 135, the sine wave mode unit 140, and the additional sine wave units 145 and 150 may be converted into a bitstream and sent to the decoder.
- FIG. 2 is a diagram for explaining an example of a configuration of an encoder based on the configuration of a core encoder.
- the encoder 200 includes a bandwidth checker 205, a sampling converter 210, an MDCT converter 215, a core encoder 220, and an important MDCT coefficient extraction and quantization unit 265.
- the bandwidth checking unit 205 may determine whether the input signal (audio signal) is a narrow band (NB) signal, a wide band (WB) signal, or a super wide band (SWB) signal.
- the NB signal may have a sampling rate of 8 kHz
- the WB signal may have a sampling rate of 16 kHz
- the SWB signal may have a sampling rate of 32 kHz.
- the bandwidth checking unit 205 may convert an input signal into a frequency domain to determine a component and a zone of upper band bins of the spectrum.
- the encoder 200 may not include the bandwidth checking unit 205 when the input signal is fixed, for example, when the input signal is fixed to NB.
- the bandwidth checking unit 205 determines the input signal and outputs the NB or WB signal to the sampling converter 210, and outputs the SWB signal to the sampling converter 210 or the MDCT converter 215.
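- A minimal sketch of the classification performed by the bandwidth checking unit is given below, using only the sampling rates cited above (8 kHz, 16 kHz, 32 kHz). Classifying by sampling rate alone is a simplification, since the text also mentions inspecting the upper band bins of the spectrum; the function name `classify_bandwidth` and the handling of in-between rates are assumptions.

```python
def classify_bandwidth(sampling_rate_hz):
    """Map a sampling rate to 'NB', 'WB', or 'SWB'.

    Rates up to 8 kHz are treated as NB, up to 16 kHz as WB, and anything
    higher as SWB (the treatment of intermediate rates is an assumption).
    """
    if sampling_rate_hz <= 8000:
        return "NB"
    if sampling_rate_hz <= 16000:
        return "WB"
    return "SWB"


labels = [classify_bandwidth(rate) for rate in (8000, 16000, 32000)]
```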
- the sampling converter 210 performs sampling for converting an input signal into a WB signal input to the core encoder 220.
- the sampling converter 210 up-samples the input signal to a sampling rate of 12.8 kHz when the input signal is an NB signal, and down-samples it to a sampling rate of 12.8 kHz when the input signal is a WB signal, producing a 12.8 kHz lower band signal.
- the sampling converter 210 down-samples the signal to a sampling rate of 12.8 kHz to generate the input signal of the core encoder 220.
- the core encoder 220 includes a preprocessor 225, a linear prediction analyzer 230, a quantizer 235, a CELP mode performer 240, a quantizer 245, an inverse quantizer 250, and a synthesis and post-processing unit 255.
- the preprocessor 225 may filter low frequency components among the lower band signals input to the core encoder 220 and transmit only a signal of a desired band to the linear prediction analyzer.
- the linear prediction analyzer 230 may extract a linear prediction coefficient (LPC) from the signal processed by the preprocessor 225.
- LPC linear prediction coefficient
- the linear prediction analyzer 230 may extract 16th-order linear prediction coefficients from the input signal and transfer them to the quantization unit 235.
- the quantization unit 235 quantizes the linear prediction coefficients transmitted from the linear prediction analyzer 230.
- the linear prediction residual signal is generated by filtering the original lower band signal using the quantized linear prediction coefficients in the lower band.
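- The linear prediction analysis and residual generation described above can be sketched with the autocorrelation method and the Levinson-Durbin recursion, as below. This is a generic textbook procedure rather than the method of the specification; in particular the sketch filters with unquantized coefficients, omits windowing and lag weighting, and the function names are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    return [
        sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], with x[n] predicted as sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # signal fully predicted (or silent); keep remaining coefficients at 0
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:]


def lp_residual(signal, coeffs):
    """Residual e[n] = x[n] - sum_k a[k] * x[n - k]; samples before n = 0 count as zero."""
    return [
        x - sum(c * signal[n - k] for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        for n, x in enumerate(signal)
    ]


# Made-up test signal: a decaying exponential, which a first-order predictor fits.
signal = [0.9 ** n for n in range(200)]
lpc = levinson_durbin(autocorrelation(signal, 2), 2)
residual = lp_residual(signal, lpc)
```

- For a 16th-order analysis as mentioned in the text, the same functions would be called with `order` set to 16.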
- the linear prediction residual signal generated by the quantization unit 235 is input to the CELP mode performing unit 240.
- the CELP mode performing unit 240 detects a pitch of the input linear prediction residual signal by using an autocorrelation function.
- a first open-loop pitch search method, a first closed-loop pitch search method, and AbS (analysis by synthesis) may be used.
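- As an illustration of pitch detection with an autocorrelation function, the following Python sketch performs a simple open-loop search: it returns the lag with the highest normalized autocorrelation. The normalization by the energy of the delayed signal, the lag range, and the name `detect_pitch_lag` are assumptions; closed-loop and analysis-by-synthesis refinements are not shown.

```python
def detect_pitch_lag(residual, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        corr = sum(residual[n] * residual[n - lag] for n in range(lag, len(residual)))
        energy = sum(residual[n - lag] ** 2 for n in range(lag, len(residual)))
        score = corr / energy ** 0.5 if energy > 0.0 else float("-inf")
        if score > best_score:  # strict comparison keeps the smallest lag on ties
            best_lag, best_score = lag, score
    return best_lag


# Made-up residual: an impulse train with a period of 40 samples.
train = [1.0 if n % 40 == 0 else 0.0 for n in range(200)]
lag = detect_pitch_lag(train, 20, 100)
```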
- the CELP mode performing unit 240 may extract the adaptive codebook index and the gain information based on the detected pitch information.
- the CELP mode performing unit 240 may extract the index and the gain of the fixed codebook based on the components remaining after the contribution of the adaptive codebook is removed from the linear prediction residual signal.
- the CELP mode performing unit 240 transfers the parameters (pitch, adaptive codebook index and gain, fixed codebook index and gain) related to the linear prediction residual signal, extracted through the pitch search, the adaptive codebook search, and the fixed codebook search, to the quantizer 245.
- the quantizer 245 quantizes the parameters transmitted from the CELP mode performer 240.
- Parameters related to the quantized linear prediction residual signal in the quantization unit 245 may be output as a bit stream and transmitted to the decoder.
- the parameters related to the quantized linear prediction residual signal may be transferred to the inverse quantizer 250.
- the inverse quantization unit 250 generates an excitation signal reconstructed using the extracted and quantized parameters through the CELP mode.
- the generated excitation signal is transmitted to the synthesis and post processor 255.
- the synthesis and post-processing unit 255 synthesizes the reconstructed excitation signal and the quantized linear prediction coefficient, generates a synthesized signal of 12.8 kHz, and restores the 16 kHz WB signal through upsampling.
- the MDCT converter 260 applies the MDCT (modified discrete cosine transform) to the restored WB signal.
- the MDCT transformed WB signal is output to the important MDCT coefficient extraction and quantization unit 265.
- the important MDCT coefficient extraction and quantization unit 265 corresponds to the SWB coding unit shown in FIG. 1.
- the important MDCT coefficient extraction and quantization unit 265 receives the MDCT transform coefficients for the SWB from the MDCT transform unit 215 and the MDCT transform coefficients for the synthesized WB from the MDCT transform unit 260.
- the important MDCT coefficient extraction and quantization unit 265 extracts a transform coefficient to be quantized by using the input MDCT transform coefficients.
- the details of the important MDCT coefficient extraction and quantization unit 265 extracting MDCT coefficients are the same as those of the SWB encoder of FIG. 1.
- the important MDCT coefficient extraction and quantization unit 265 quantizes the extracted MDCT coefficients, outputs them as a bitstream, and transmits them to the decoder.
- FIG. 3 schematically illustrates an example of a decoder configuration that may be used when a super wideband (SWB) signal is processed by a band extension method.
- the decoder 300 includes a core decoder 305, a first post processor 310, an up sampling unit 315, a SWB decoder 320, an IMDCT unit 350, a second post processor 355, and an adder 360.
- the SWB decoder 320 includes a generic mode unit 325, a sinusoidal wave unit 330, and additional sinusoidal wave units 335 and 340.
- the core decoder 305, the generic mode unit 325, the sine wave unit 330, and the additional sine wave unit 335 may receive, from the bitstream, the target information to be processed and / or auxiliary information for the processing.
- the core decoder 305 decodes the wideband signal to synthesize the WB signal.
- the synthesized WB signal is input to the first post processor 310, and the MDCT transform coefficients of the synthesized WB signal are input to the SWB decoder 320.
- the first post processor 310 improves the synthesized WB signal in the time domain.
- the up sampling unit 315 upsamples the WB signal to form a SWB signal.
- the SWB decoder 320 decodes the MDCT coefficients of the SWB signal input from the bitstream.
- in this case, the MDCT coefficients of the synthesized WB signal input from the core decoder 305 may be used.
- the decoding of the SWB signal is mainly performed in the MDCT domain.
- the generic mode unit 325 and the sine wave mode unit 330 decode the first layer of the enhancement layer, and the upper layer may be decoded by the additional sine wave units 335 and 340.
- the SWB decoder 320 performs a decoding process in the reverse order of the encoding process, corresponding to the encoding process described for the SWB encoder. In this case, the SWB decoder 320 determines from the bitstream whether the input information is tonal. If it is tonal, the decoding process may be performed by the sine wave mode unit 330, or by the sine wave mode unit 330 and the additional sine wave unit 340. If it is not tonal, the decoding process may be performed by the generic mode unit 325, or by the generic mode unit 325 and the additional sine wave unit 335.
- the generic mode unit 325 constructs the HF signal by adaptive subband replication. Two sinusoidal components are then added to the spectrum of the first SWB enhancement layer. The generic mode and the sine mode use similar enhancement layers, which are based on sine mode coding.
- the sine wave mode unit 330 generates a high frequency (HF) signal based on a finite set of sine wave components.
- the additional sine wave units 335 and 340 add sine waves to the upper SWB layer and improve the quality of the high band content.
- the IMDCT unit 350 performs an inverse MDCT to output a signal in the time domain, and the second post-processing unit 355 improves the inverse MDCT processed signal in the time domain.
- the adder 360 adds the SWB signal decoded and upsampled by the core decoder and the SWB signal output from the SWB decoder 320 and outputs a reconstructed signal.
- the decoder 400 includes a core decoder 410, a post-processing / sampling transformer 450, an inverse quantizer 460, an upper MDCT coefficient generator 470, an MDCT inverse transformer 480, and a post-processing filtering unit 490.
- the bitstream including the NB signal or WB signal transmitted from the encoder is input to the core decoder 410.
- the core decoder 410 includes an inverse transformer 420, a linear prediction synthesizer 430, and an MDCT transformer 440.
- the inverse transform unit 420 may inverse transform the audio information encoded in the CELP mode and restore the excitation signal based on a parameter received from the encoder.
- the inverse transform unit 420 may transmit the reconstructed excitation signal to the linear prediction synthesis unit 430.
- the linear prediction synthesizer 430 may reconstruct a lower band signal (NB signal, WB signal, etc.) using the excitation signal transmitted from the inverse transformer 420 and the linear prediction coefficient transmitted from the encoder.
- the lower band signal (12.8 kHz) reconstructed by the linear prediction synthesis unit 430 may be downsampled to NB or upsampled to WB.
- the WB signal is output to the post-processing / sampling converter 450 or to the MDCT converter 440.
- the post-processing / sampling converter 450 may up-sample the NB signal or the WB signal to generate a synthesized signal for use in restoring the SWB signal.
- the MDCT converter 440 applies the MDCT to the restored lower band signal and transmits the result to the upper MDCT coefficient generator 470.
- the inverse quantizer 460 and the upper MDCT coefficient generator 470 correspond to the SWB decoder of the decoder illustrated in FIG. 3.
- the dequantizer 460 receives the SWB signal and the parameter quantized through the bitstream from the encoder and dequantizes the received information.
- the dequantized SWB signal and the parameter are transmitted to the upper MDCT coefficient generator 470.
- the upper MDCT coefficient generator 470 receives the MDCT coefficients for the synthesized NB signal or WB signal from the core decoder 410, receives the necessary parameters for the SWB signal from the bitstream, and generates the MDCT coefficients for the dequantized SWB signal. As described with reference to FIG. 3, the upper MDCT coefficient generator 470 may apply the generic mode or the sine mode according to whether the signal is tonal, and may apply an additional sine wave to the signal of the enhancement layer.
- the MDCT inverse transform unit 480 restores a signal through an inverse transform on the generated MDCT coefficients.
- the post processing filter 490 may apply filtering to the restored signal. Filtering allows for post-processing such as reducing quantization errors, emphasizing spectral peaks, and attenuating valleys.
- the SWB signal may be restored by synthesizing the signal restored by the post-processing filter 490 and the signal restored by the post-processing / sampling converter 450.
- the band extension method passes through a core encoder and an enhancement layer processor (SWB encoder) to encode a SWB input signal.
- SWB encoder an enhancement layer processor
- SWB decoder an enhancement layer processor
- the SWB signal is downsampled at a sampling rate corresponding to the WB and encoded by a WB encoder (core encoder).
- the encoded WB signal is synthesized and then MDCT transformed, and the MDCT coefficients for the WB may be input to the SWB encoder.
- the SWB input signal is encoded by being divided into a generic mode and a sine mode according to the degree of tonality in the MDCT coefficient domain after MDCT conversion.
- encoding for an enhancement layer may be further performed using an additional sine wave.
- Signal information corresponding to WB among SWB signals is decoded by a WB decoder (core decoder).
- the decoded WB signal is synthesized and then MDCT-converted so that the MDCT coefficients for the WB can be input to the SWB decoder.
- the encoded SWB signal is decoded by being divided into a generic mode and a sine mode corresponding to the encoded mode, and further, decoding of an enhancement layer may be performed using an additional sine wave.
- the inverse-transformed SWB signal and the WB signal may be synthesized, after additional post-processing such as upsampling, and restored to the SWB signal.
- the sine mode does not encode all sine waves constituting the audio signal (also called sine wave components constituting the audio signal), but encodes only the sine waves having high energy among them. Therefore, unlike the case in which all sine waves are encoded, in the sine mode the encoder encodes not only the amplitude information and the sign information of each selected sine wave but also its position information, and transmits them to the decoder.
- the sine waves constituting the audio signal mean the MDCT coefficients X(k) obtained by MDCT transforming the respective sine wave components constituting the audio signal. Therefore, when the characteristics of a sine wave in the sine mode are described in the present specification, they refer to the magnitude (C) of the MDCT coefficient obtained by MDCT transforming the sine wave component, the sign of that coefficient, and its position (pos).
- the position of the sine wave is a position in the frequency domain, and may be a wave number k specifying each sine wave constituting the audio signal, or an index corresponding to the wave number k.
- 'sine wave' or 'pulse' may mean an MDCT coefficient of each sine wave component constituting the input audio signal.
- the position of the sine wave is described by specifying the wave number of the sine wave.
- this is for convenience of description and the present invention is not limited thereto, and the contents of the present invention may be equally applied even when using separate information for specifying the positions of the sine waves in the frequency domain as the position of the sine wave.
- the sine mode is not suitable for encoding all sine waves because the position information of each sine wave must be transmitted, but it is effective when sound quality must be guaranteed with a small number of sine waves or when transmission must take place at a low bit rate. Therefore, it can be used for a band extension technique or a low bit rate audio codec.
- FIG. 5 is a diagram schematically illustrating a method of encoding a sine wave in a sine mode.
- sine waves constituting the input audio signal are located corresponding to the wave number k of each sine wave.
- An upward sine wave represents a positive MDCT coefficient
- a downward sine wave represents a negative MDCT coefficient.
- the magnitude of the sine wave (MDCT coefficient) corresponds to the length of the sine wave.
- FIG. 5 illustrates a case where a positive sine wave having a size 126 is positioned at position 4 and a negative sine wave having a size 18 is positioned at position 74 as an example.
- magnitude information, sign information, and position information of a sine wave are transmitted.
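- The following is a minimal sketch of the sine-mode parameterization, assuming the MDCT coefficients are held in a plain list indexed by wave number. The function name `encode_sine_mode` and the tuple layout are illustrative and do not come from the specification.

```python
def encode_sine_mode(mdct, num_pulses):
    """Return (position, sign, magnitude) for the num_pulses largest coefficients.

    mdct: list of MDCT coefficients indexed by wave number k (assumed layout).
    """
    # Rank wave numbers by descending coefficient magnitude.
    ranked = sorted(range(len(mdct)), key=lambda k: abs(mdct[k]), reverse=True)
    chosen = sorted(ranked[:num_pulses])  # report in ascending position order
    return [(k, 1 if mdct[k] >= 0 else -1, abs(mdct[k])) for k in chosen]


# The two sine waves of the FIG. 5 example: +126 at position 4, -18 at position 74.
coeffs = [0] * 80
coeffs[4] = 126
coeffs[74] = -18
pulses = encode_sine_mode(coeffs, 2)
```

- Each tuple carries exactly the three items that are transmitted for a selected sine wave: position, sign, and magnitude.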
- a predetermined number of sine waves may be searched and quantized for each track.
- FIG. 6 is a diagram schematically illustrating track information encoded / decoded in a layer 6 to which a sine mode is applied.
- Two sine waves are searched and quantized for each of tracks 0 to 3 of the six tracks (tracks 0 to 5), and one sine wave is searched and quantized for each of track 4 and track 5. The search can be performed for each track.
- S_i_j (i, j is an integer of 0 or more) means a j-th sine wave of the i-th track segment.
- The search may proceed in the order of track 0 → track 1 → track 2 → track 3 → track 4 → track 5.
- A total of 10 sine waves in layer 6 may be searched in the order of S_0_1 → S_0_2 → S_1_1 → S_1_2 → S_2_1 → S_2_2 → S_3_1 → S_3_2 → S_4_1 → S_5_1.
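- As a small illustration, the search order of layer 6 can be generated from the number of pulses allocated to each track. The helper name `search_order` is hypothetical; the per-track counts (2, 2, 2, 2, 1, 1) are those stated for layer 6.

```python
def search_order(pulses_per_track):
    """List the labels S_i_j in track order; j counts from 1 within each track."""
    return [f"S_{i}_{j}"
            for i, count in enumerate(pulses_per_track)
            for j in range(1, count + 1)]


layer6_order = search_order([2, 2, 2, 2, 1, 1])
```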
- A baseline coder and an enhancement layer may be included in order to encode a 32 kHz SWB input signal.
- Downsampling to 16 kHz may be performed, and the downsampled signal may be encoded by a WB encoder. Since the encoded WB signal is used for encoding in the enhancement layer, it is MDCT-transformed after synthesis, and the WB MDCT coefficients are input to the enhancement layer.
- the SWB input signal may be converted into MDCT and then divided into two modes (general mode and sine mode) according to the degree of tonality in the MDCT coefficient domain.
- the sinusoidal mode may be processed in parallel with the generic mode in the case of layer 6.
- A frame determined to be tonal may be encoded in the sine mode.
- 10 pulses may be extracted from an HF (High Frequency) signal and encoded. For example, the first four pulses are extracted from the band corresponding to 7000-8600 Hz, the next four pulses are extracted from the 8600-10200 Hz band, and the last two pulses are extracted from the 10200-11800 Hz band and the 11800-12600 Hz band, respectively.
- the position of the pulse is quantized / coded and transmitted.
- The position of an extracted pulse can be determined using the difference between the original signal M_32(k) and the HF synthesized signal.
- M is the magnitude of the MDCT coefficient
- k is the wave number indicating the position of the pulse (sine wave).
- M_32(k) represents the pulse magnitude at position k for the SWB signal up to 32 kHz.
- Ten pulses (positions of pulses) to be encoded may be determined through Equation 1. Specifically, through Equation 2, the ten pulses with the largest difference between the original signal M_32(k) and the HF synthesized signal can be determined.
- A signal having a large difference between the original signal and the HF synthesized signal may be determined as a signal to be encoded in the current layer.
- In layer 6, the pulse search process using Equation 2 is a process of finding the maximum values of the original signal M_32(k).
- For D(k), the entire band may be divided into five subbands to form D_j(k) for each subband.
- The number of pulses N_j of each subband may be predetermined.
- D_j(k) is the difference between the original signal and the HF synthesized signal at k of subband j
- N_j is the number of pulses searched in subband j.
- Table 1 schematically illustrates the process of finding the N_j largest D values (the largest original signal values in the case of layer 6) for each subband.
- The N maximum values can be retrieved, and the retrieved N values can be stored in an array called input_data.
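- The per-subband search can be sketched as follows, assuming D(k) is the absolute difference between the original and the HF synthesized MDCT coefficients, as described for Equation 2. The function name, the half-open subband ranges, and the (position, value) layout of `input_data` are assumptions made for illustration; only the array name `input_data` comes from the text.

```python
def search_subband_maxima(original, synthesized, subbands, pulses_per_subband):
    """Find the N_j largest D_j(k) values in each subband j.

    subbands: list of (start, end) wave-number ranges, end exclusive (assumed).
    pulses_per_subband: list of N_j, one entry per subband.
    """
    # D(k): absolute difference between original and synthesized coefficients.
    d = [abs(o - s) for o, s in zip(original, synthesized)]
    input_data = []
    for (start, end), n_j in zip(subbands, pulses_per_subband):
        best = sorted(range(start, end), key=lambda k: d[k], reverse=True)[:n_j]
        input_data.extend((k, d[k]) for k in best)
    return input_data


original = [0, 5, -9, 2, 1, -7, 3, 8]
synthesized = [0] * len(original)  # all zeros: the original itself is the target
input_data = search_subband_maxima(original, synthesized, [(0, 4), (4, 8)], [2, 1])
```

- Passing an all-zero synthesized signal reduces the search to the maxima of the original signal, which corresponds to the layer 6 case.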
- FIG. 7 schematically illustrates an example of track information regarding a sine wave mode in layer 6, which is a first SWB layer.
- each sine wave (MDCT coefficient) constituting the audio signal in the frequency domain is displayed at a position corresponding to the wave number of each sine wave.
- Track 0 is located in the position range of 280 to 342, and consists of sine waves with a spacing of two (2 steps) in the position unit (for example, wave number or frequency).
- Track 1 is located in the position range of 281 to 343, and consists of sine waves with a spacing of two.
- Track 2 is located in the position range of 344 to 406, and consists of sine waves with a spacing of two.
- Track 3 is located in the position range of 345 to 407, and consists of sine waves with a spacing of two.
- Track 4 is located in the position range of 408 to 471, and consists of sine waves with a spacing of one (1 step).
- Track 5 is located in the position range of 472 to 503, and consists of sine waves with a spacing of one.
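- The track layout listed above can be written down as (start, end, step) triples and expanded into pulse positions. The names `LAYER6_TRACKS` and `track_positions` are illustrative; the numbers are the ones given for FIG. 7.

```python
# (start, end inclusive, step) per track, as listed for FIG. 7.
LAYER6_TRACKS = [
    (280, 342, 2),  # track 0
    (281, 343, 2),  # track 1
    (344, 406, 2),  # track 2
    (345, 407, 2),  # track 3
    (408, 471, 1),  # track 4
    (472, 503, 1),  # track 5
]


def track_positions(start, end, step):
    """Expand one track description into its list of candidate positions."""
    return list(range(start, end + 1, step))


positions = [track_positions(*track) for track in LAYER6_TRACKS]
covered = sorted(k for track in positions for k in track)
```

- With these values the six tracks cover every position from 280 to 503 exactly once, and tracks 0/1 and 2/3 interleave position by position.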
- a sine wave (pulse) satisfying a predetermined condition is searched by a predetermined number for each track according to the track order, and quantized.
- quantizing a sine wave (pulse) when sine wave mode is applied may include quantizing a MDCT coefficient of a sine wave (pulse).
- the MDCT coefficient may mean the magnitude of a sine wave at a specific frequency.
- quantizing a sinusoid includes (1) quantizing the magnitude of the sine wave (absolute value of the MDCT coefficients), (2) quantizing the frequency of the sine wave (position of the MDCT coefficients), and (3) Quantizing the phase (Sign) of the MDCT coefficients.
- a pulse may mean an MDCT coefficient that is a magnitude of a sine wave. It may also be referred to as a maximum sinusoid or sinusoidal maximum in that a pulse may mean a sinusoidal peak at a particular frequency.
- A notation such as 'maximum sine wave (pulse)' or 'pulse (maximum sine wave)' indicates that the maximum sine wave and the pulse may have the same meaning. It does not mean that they refer to different things.
- quantizing a pulse herein includes (1) quantizing the magnitude of the pulse (MDCT coefficient) and (2) quantizing the position of the pulse. In this case, quantizing the magnitude of the pulse may include quantizing the absolute value of the pulse and quantizing the sign of the pulse.
- 'Quantizing a sine wave' may be used to mean that a specific pulse is selected from among the pulses (MDCT coefficients) constituting the sine wave and quantized, in the sense that the sine wave is thereby encoded so that it can be recovered.
- the sine wave may mean a signal of each frequency in the sine wave mode
- the pulse may mean a signal at a specific position in the CELP mode.
- 'Sine wave' does not exclude 'pulse', and 'pulse' does not exclude 'sine wave'.
- In layer 6, two pulses are searched and quantized in each of the four tracks from track 0 to track 3 according to the bit allocation, and one pulse is searched and quantized in each of track 4 and track 5.
- The search in each track can be described as finding the largest pulses in the track, as many as the number allocated to that track.
- Table 2 shows, for each track in the sine mode, the number of sine waves (pulses) extracted by the search, the starting position of the track (starting position of the search), the interval size between pulse positions, and the number of pulses.
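- A per-track search driven by these parameters might look like the sketch below. It assumes the criterion is the absolute value of the target coefficient, and the function name `search_track` and the toy signal are hypothetical.

```python
def search_track(target, start, end, step, num_pulses):
    """Return the positions of the num_pulses largest-magnitude pulses on a track.

    target: list of target MDCT coefficients indexed by position.
    start, end, step: first position, last position (inclusive), pulse interval.
    """
    candidates = range(start, end + 1, step)
    ranked = sorted(candidates, key=lambda k: abs(target[k]), reverse=True)
    return ranked[:num_pulses]


target = [3, -1, -8, 4, 0, 6, 5, -2, 1, 0, -9, 2, 0, 0, 7, 0]
track_a = search_track(target, 0, 6, 2, 2)   # even positions 0..6, two pulses
track_b = search_track(target, 1, 7, 2, 2)   # odd positions 1..7, two pulses
track_c = search_track(target, 8, 15, 1, 1)  # positions 8..15, one pulse
```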
- The magnitude c_j(l) of the extracted pulse may be encoded as shown in Equation 3.
- In Equation 3, the magnitude value is encoded, but the sign information is lost. Therefore, the sign value of the pulse may be separately encoded by the following Equation 4.
- pos_j(0), Sign_sin_j(0), and c_j(0) denote the position, sign, and magnitude of the larger pulse
- pos_j(1), Sign_sin_j(1), and c_j(1) denote the position, sign, and magnitude of the smaller pulse.
- In layer 6, encoding is performed using the original signal as the target signal in Equation 2. In a layer above layer 6, for example Layer 7 or Layer 8, encoding is instead performed using, as the target signal in Equation 2, the difference between the original signal and the synthesized signal of the previous layer.
- In this way, a signal left uncoded in a lower layer may be encoded and transmitted in the upper layer.
- the encoding method performed on the upper layer of layer 6 is also similar to the encoding method described above with respect to layer 6.
- an additional 10 pulses may be extracted from the HF (7 to 14 kHz) signal.
- a frequency band to be encoded may be set differently according to a generic mode and a sine mode.
- The HF signal output from the generic mode is divided into eight subbands, and the energy is calculated for each subband.
- Each subband is composed of 32 MDCT coefficients as shown in Table 2, and the energy calculation method in each subband is shown in Equation 5.
- The signal in Equation 5 is the HF signal resynthesized via the generic mode.
- The eight subbands may be sorted in descending order of energy by comparing the energies of the subbands with each other. The five subbands with the highest energy among the sorted subbands are selected, and five pulses are extracted for each subband according to the sine wave coding method described for Layer 6. At this time, the position of the track defined in the sine wave coding method depends on the energy characteristic of the HF signal for each frame.
- A total of 10 pulses are extracted from the HF signal output in the sine mode through two processes, one extracting four pulses and one extracting six pulses. Four pulses may be extracted at positions corresponding to the 9400 to 11000 Hz band, and six pulses may be extracted at positions corresponding to the 11000 to 13400 Hz band.
- Table 4 shows information for each track in the sine mode (sine mode frame) of layer 7.
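- The subband selection can be sketched as below. Equation 5 is not reproduced here, so the energy is assumed to be the sum of squared MDCT coefficients in the subband; the function name and default arguments are likewise assumptions.

```python
def top_energy_subbands(hf, num_subbands=8, width=32, keep=5):
    """Return the indices of the `keep` highest-energy subbands and all energies.

    Energy is assumed to be the sum of squared coefficients per subband.
    """
    energies = [sum(x * x for x in hf[j * width:(j + 1) * width])
                for j in range(num_subbands)]
    order = sorted(range(num_subbands), key=lambda j: energies[j], reverse=True)
    return order[:keep], energies


# Toy HF signal: one non-zero coefficient at the start of each subband.
hf = [0] * (8 * 32)
for j, amplitude in enumerate([3, 1, 4, 7, 5, 9, 2, 6]):
    hf[j * 32] = amplitude
selected, energies = top_energy_subbands(hf)
```

- Because the selection is recomputed for every frame, the subbands in which pulses are searched follow the energy distribution of that frame.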
- Table 4 shows the number of sine waves extracted by the search for each track of the layer 7 as the encoding target, the start position of the track (start position of the search), the interval size of the pulse position of each track, and the number of pulses.
- The remaining four of the first 10 pulses can be extracted two each from two tracks, and the band from which the pulses are extracted is 12150 to 13750 Hz.
- The extraction of the remaining 10 pulses out of the 20 pulses is similar.
- The first six of the ten pulses can be extracted from three tracks, two per track, and the band from which the pulses are extracted is 8600 to 11000 Hz.
- The remaining four pulses can be extracted two each from two tracks, and the band from which the pulses are extracted is 11000 to 12600 Hz.
- Table 5 describes an example of a sine wave track structure in the generic mode frame of Layer 8.
- two different processes of extracting 10 pulses are performed.
- The remaining four of the first ten pulses can be extracted two each from two tracks, and the band from which the pulses are extracted is 11000 to 12600 Hz.
- The extraction of the remaining 10 pulses out of the 20 pulses is similar.
- The first six of the ten pulses can be extracted from three tracks, two per track, and the band from which the pulses are extracted is 9400 to 11000 Hz.
- The remaining four pulses can be extracted two each from two tracks, and the band from which the pulses are extracted is 11000 to 13400 Hz.
- Table 6 shows an example of a sinusoidal track structure for a first set of extracting the first 10 pulses of 20 pulses in a sine mode frame of Layer 8.
- Table 7 shows an example of a sinusoidal track structure for a second set of extracting the second 10 of 20 pulses in a sine mode frame of Layer 8.
- Each track has a pulse interval of either 2 steps or 3 steps.
- FIG. 8 is a diagram schematically illustrating an example in which two tracks are paired in the case of two steps. Referring to FIG. 8, when track 0 and track 1 in two steps are paired, it can be seen that pulse positions of both tracks are adjacent to each other.
- FIG. 9 is a diagram schematically illustrating an example in which three tracks are paired in the case of three steps. Referring to FIG. 9, when tracks 2, 3, and 4 of 3 steps are paired, it can be seen that pulse positions of each track are adjacent to each other.
- the search is performed independently from each track while sequentially searching from the first track to the last track.
- With conventional methods, the search is performed independently for each track; even adjacent tracks (paired tracks) are searched independently, without considering the characteristics of the other tracks.
- the sine waves searched in the first track do not affect the search of the second track paired with the first track.
- the sine waves found in the first track do not affect the sine wave search in the second track, and the sine waves found in the second track do not affect the sine wave search in the third track.
- the present invention can be applied not only to layer 6 which is a base layer of SWB, but also to layer 7 and layer 8 which are enhancement layers of layer 6.
- So far, an example based on the absolute value of Equation 2 has been described as a method of searching for a signal.
- The present invention is not limited thereto; the search may instead be based on, for example, a convolution value with the impulse response of a linear prediction coefficient (LPC) synthesis filter, or on the mean square error (MSE). A method based on convolution and a method based on MSE will be described later.
- FIG. 10 is a flowchart schematically illustrating a sinusoid search method applied to each layer according to an example of the present invention.
- the example of FIG. 10 may be performed by the SWB encoder of FIG. 1.
- some or all of the steps in the example of FIG. 10 may be performed by the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- the steps of FIG. 10 are performed by the SWB encoder and / or the SWB decoder.
- a target signal is generated (S1010).
- the target signal may be MDCT coefficients to be quantized.
- the SWB encoder and / or the SWB decoder may generate MDCT coefficients (target signals) to be quantized.
- the absolute value of the generated target signal (MDCT coefficient to be quantized) is calculated (S1020).
- the SWB encoder and / or the SWB decoder calculates an absolute value for the MDCT coefficients to be quantized.
- the absolute value of the MDCT coefficient can be calculated using Equation 2.
- The absolute value of the MDCT coefficient M_32(k) of the original signal can be obtained.
- The absolute value of the difference between the MDCT coefficients M_32(k) of the original signal and the MDCT coefficients of the HF synthesized signal can also be obtained.
- a sine wave (maximum sine wave, maxima sinusoid) having a maximum value may be searched for (S1030).
- a sine wave maximizing D (k) of Equation 2 may be referred to as a maximum sinusoid.
- the SWB encoder and / or the SWB decoder may search for at least one maximum sine wave in each track. At this time, the number of searched maximum sine waves (maxima) may be determined for each track.
- The SWB encoder and / or the SWB decoder may search each track for a predetermined number of sine waves having the largest absolute values according to Equation 2. For example, in the case of layer 6, at least two maximum sine waves may be searched in each of tracks 0 to 3, and at least one maximum sine wave may be searched in each of tracks 4 and 5.
- a position change for quantizing a sign of the sine wave may be performed (S1040).
- the position change may be performed in a track in which two or more sine waves are searched. Therefore, this step may not be performed when one sine wave is found.
- The position change may be performed based on the method described in Table 3. For example, when two sine waves or two pulses are transmitted, only the sign (+ or -) of the first sine wave / pulse is encoded. If the magnitude of the first sine wave / pulse is greater than that of the second, the signs of the two sine waves / pulses are determined to be the same; if the magnitude of the first is smaller than that of the second, the signs of the two are determined to be different.
- The SWB encoder may set the positions of the sine waves / pulses so that the sign can be derived in this way.
- The SWB decoder decodes the sign of the first sine wave / pulse, determines that the signs of the two sine waves / pulses are the same when the magnitude of the first sine wave / pulse is larger than that of the second, and determines that the signs of the two are different when the magnitude of the first is smaller than that of the second.
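- The ordering rule can be illustrated with a small encoder / decoder pair. This is only a sketch of the rule as stated above: the function names, the (position, value) tuples, and the 0/1 convention for the sign bit are assumptions, and the case of two equal magnitudes, which the rule does not cover, is not handled.

```python
def order_for_sign(p, q):
    """Encoder side: p and q are (position, signed value) pulses of one track.

    Returns (first, second, sign_bit) so that only the sign of `first` is sent.
    """
    big, small = (p, q) if abs(p[1]) >= abs(q[1]) else (q, p)
    same_sign = (big[1] >= 0) == (small[1] >= 0)
    # Same signs: larger pulse goes first. Different signs: smaller pulse first.
    first, second = (big, small) if same_sign else (small, big)
    sign_bit = 0 if first[1] >= 0 else 1  # assumed convention: 0 = '+', 1 = '-'
    return first, second, sign_bit


def recover_signs(first_mag, second_mag, sign_bit):
    """Decoder side: rebuild both signs from one bit and the magnitude order."""
    s1 = 1 if sign_bit == 0 else -1
    s2 = s1 if first_mag > second_mag else -s1
    return s1, s2


first, second, bit = order_for_sign((10, 5.0), (12, -3.0))
signs = recover_signs(abs(first[1]), abs(second[1]), bit)
```

- In the usage above the two pulses have different signs, so the smaller pulse is placed first and the decoder infers the opposite sign for the second pulse from the magnitude order alone.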
- the signal amplitudes of the searched sine waves / pulses are grouped (S1070).
- the SWB encoder and / or the SWB decoder may group signal amplitudes according to a sine wave / pulse group to be quantized. Grouping may be performed regardless of the track.
- the amplitudes of 10 detected signals are grouped.
- Sine waves or pulses may be grouped in sequence of three, three, and four.
- three signal magnitudes are grouped in Group 1, where the signal magnitudes of the two signals found in track 0 and one of the two signals found in track 1 may be grouped.
- In the second group, three signal magnitudes may be grouped.
- That is, the signal magnitude of the other of the two signals found in track 1 and the signal magnitudes of the two signals found in track 2 may be grouped.
- In the third group, four signal magnitudes may be grouped.
- That is, the signal magnitudes of the two signals found in track 3, the signal magnitude of the signal found in track 4, and the signal magnitude of the signal found in track 5 may be grouped.
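- The grouping into three, three, and four magnitudes can be sketched as a simple sequential split of the ten values in search order. The function name `group_magnitudes` is hypothetical, and labels are used in place of numeric magnitudes so that the track origin of each group member stays visible.

```python
def group_magnitudes(magnitudes, sizes=(3, 3, 4)):
    """Split the magnitudes, taken in search order, into consecutive groups."""
    groups, start = [], 0
    for size in sizes:
        groups.append(magnitudes[start:start + size])
        start += size
    return groups


labels = ["S_0_1", "S_0_2", "S_1_1", "S_1_2", "S_2_1",
          "S_2_2", "S_3_1", "S_3_2", "S_4_1", "S_5_1"]
groups = group_magnitudes(labels)
```

- The split ignores track boundaries, which is why the two pulses of track 1 end up in different groups.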
- the grouped signal magnitudes may be quantized in group units (S1080).
- the SWB encoder and / or the SWB decoder may perform quantization based on multi-dimension vector quantization (VQ).
- pulses of adjacent positions between two paired tracks may be searched or pulses of mutually separated positions between two paired tracks may be searched.
- the SWB encoder may search for and encode adjacent pulses among pulses of paired tracks.
- the SWB decoder may restore the SWB signal by decoding adjacent pulses among the pulses of the paired tracks.
- the pulse may be a sinusoidal MDCT coefficient as described above.
- Track 0 and track 1 form one track pair of two steps
- Track 2 and track 3 form another track pair of two steps.
- track 0 and track 1 of two steps form a track pair
- track 2, track 3, and track 4 of three steps form another track pair.
- It can be seen that track 0, track 1, and track 2 of 3 steps form one track pair, and track 3 and track 4 of 2 steps form another track pair.
- It can be seen that track 0 and track 1 of 2 steps form one track pair, and track 2, track 3, and track 4 of 3 steps form another track pair.
- sine wave mode frame of layer 8 refers to frames processed in sine wave mode in layer 7 among frames processed in layer 8.
- the tracks of the track pairs are searched in consideration of characteristics of the tracks.
- the search is performed in consideration of pulses searched in other tracks.
- FIG. 11 is a diagram schematically illustrating a case where an independent search is performed for each track without considering the characteristics of track pairs.
- the same search may be performed for track 1 to quantize information about the searched pulses.
- the search / quantization for track 1 is performed separately and independently from the search result of track 0.
- a sine wave value may be retrieved from a second track based on a sine wave value first detected from the track.
- That tracks having adjacent positions form a track pair means that the tracks in the pair have the same step (pulse interval) and that each pulse position in one track of the pair is adjacent to a pulse position in a neighboring track of the pair.
- the present embodiment is applicable to adjacent tracks, for example, when three tracks having three steps form a track pair, or two tracks having two steps form a track pair.
- FIG. 12 is a diagram schematically illustrating an example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- FIG. 12 describes a method in which, when a search is performed on one of the tracks constituting a track pair, a pulse adjacent to a pulse found in another track is selected as a pulse to be encoded.
- each track constituting the track pair is adjacent to each other.
- the example of FIG. 12 may be performed by the SWB encoder of FIG. 1.
- some or all steps of the example of FIG. 12 may be performed by the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- the steps of FIG. 12 are performed by the SWB encoder and / or the SWB decoder.
- a target signal is first generated (S1200).
- the target signal may be pulses to be quantized, that is, MDCT coefficients.
- the SWB encoder and / or the SWB decoder may generate MDCT coefficients (target signals) to be quantized.
- the absolute value of the generated target signal (MDCT coefficient to be quantized) is calculated (S1205).
- the SWB encoder and / or the SWB decoder calculates an absolute value for the MDCT coefficients to be quantized.
- the absolute value of the MDCT coefficient can be calculated using Equation 2.
- The absolute value of the MDCT coefficient M_32(k) of the original signal can be obtained.
- The absolute value of the difference between the MDCT coefficients M_32(k) of the original signal and the MDCT coefficients of the HF synthesized signal can also be obtained.
- the absolute value is calculated immediately after generating the target signal.
- the absolute value of Equation 2 may be generated in the process of searching for the maximum value pulse for each track while searching for each track of the track pair.
- the SWB encoder and / or the SWB decoder determines whether a track pair exists in the MDCT coefficients (target signal) to be encoded.
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the neighboring track.
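- One possible way to test this condition is sketched below. The text does not give a formal test, so the rule used here (equal step, start positions offset by less than one step, overlapping ranges) is an assumption, as is the function name `is_track_pair`.

```python
def is_track_pair(a, b):
    """a, b: (start, end inclusive, step) track descriptions.

    Assumed rule: same step, start positions offset by less than one step
    (so the pulse positions interleave), and overlapping position ranges.
    """
    same_step = a[2] == b[2]
    interleaved = 0 < abs(a[0] - b[0]) < a[2]
    overlapping = a[0] <= b[1] and b[0] <= a[1]
    return same_step and interleaved and overlapping


pair_01 = is_track_pair((280, 342, 2), (281, 343, 2))  # layer 6 tracks 0 and 1
pair_12 = is_track_pair((281, 343, 2), (344, 406, 2))  # layer 6 tracks 1 and 2
pair_45 = is_track_pair((408, 471, 1), (472, 503, 1))  # layer 6 tracks 4 and 5
```

- Under this rule the two-step tracks 0 and 1 of layer 6 are paired, whereas tracks that merely follow each other in frequency, such as tracks 4 and 5, are not.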
- the energy of tracks constituting the track pair is calculated (S1215). For example, when track 0 and track 1 constitute a track pair, the SWB encoder and / or SWB decoder may calculate the energy of track 0 and the energy of track 1.
- the energy between the tracks constituting the track pair is compared with each other (S1220).
- the SWB encoder and / or the SWB decoder may search for the tracks in the order of the highest energy among the tracks constituting the track pair. For example, if track 0 and track 1 constitute a track pair and the energy of track 1 is greater than the energy of track 0, the SWB encoder and / or SWB decoder may first search for track 1 and then search for track 0.
- Processing in the order of energy applies to the search; the subsequent processes may be performed on the tracks according to the original track order.
- track 0 may be processed first and track 1 may be processed subsequently.
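- The energy comparison that fixes the search order can be sketched as follows. The track energy is assumed to be the sum of squared target coefficients over the track positions, and both function names are hypothetical.

```python
def track_energy(target, positions):
    """Assumed energy measure: sum of squared target coefficients on the track."""
    return sum(target[k] ** 2 for k in positions)


def search_order_by_energy(target, tracks):
    """tracks: dict track_id -> list of positions. Highest energy comes first."""
    return sorted(tracks, key=lambda t: track_energy(target, tracks[t]),
                  reverse=True)


target = [1, 4, -2, 3, 0, -5, 1, 2]
tracks = {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
order = search_order_by_energy(target, tracks)
```

- Here track 1 has the larger energy, so it is searched before track 0, while later steps can still process the tracks as track 0 and then track 1.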
- candidate pulses are searched for each track according to the search order (S1225).
- the SWB encoder and / or the SWB decoder may search for candidate pulses in a high energy track and then search for candidate pulses in a low energy track.
- the SWB encoder and / or SWB decoder may search for candidates of the pulse to be encoded by searching a predetermined number more than the number of pulses to be searched and encoded.
- N1 the number of pulses to be searched and encoded in a track with high energy
- N2 the number of pulses to be searched and encoded in a track with low energy
- N1 and N2 may be the same.
- The SWB encoder and / or the SWB decoder may search for N1 + n1 pulses (n1 is an integer greater than or equal to 0) on the high energy track and for N2 + n2 pulses (n2 is an integer greater than 0) on the low energy track.
- n2 may be greater than or equal to n1.
- N1 + n1 pulses and N2 + n2 pulses may be selected in the order of the largest absolute value according to Equation 2 in each track.
- When the number of pulses to be searched in each track of the track pair is 2, 2 + n1 pulses may be selected in the high energy track and 2 + n2 pulses may be selected in the low energy track.
- the SWB encoder and / or the SWB decoder may select the maximum sine wave (pulse) as many as the number to search and encode in a track having a large energy among tracks constituting the track pair.
- The pulses found in the low energy track are compared, in descending order of absolute value, with the pulses selected in the high energy track (S1235).
- The SWB encoder and / or the SWB decoder may compare the N2 + n2 pulses found in the low energy track with the pulses found in the high energy track. In this case, the N2 + n2 pulses may be compared with the pulses selected in the high energy track in descending order of absolute value.
- If two compared pulses are adjacent to each other, they are determined as a pulse to be encoded in the high energy track and a pulse to be encoded in the low energy track, respectively (S1240).
- The pulse with the largest absolute value (maximum sine wave) in the low energy track may be selected.
- N1 x (N2 + n2) pulse combinations (P_t1, P_t2) may be formed from the N1 pulses selected on the high energy track and the N2 + n2 pulses found on the low energy track. If P_t1 and P_t2 form an adjacent combination (P_t1,adj, P_t2,adj), then P_t1,adj is determined as the pulse to be encoded in the high energy track and P_t2,adj as the pulse to be encoded in the low energy track.
- Among the adjacent pulse pairs, pulses may be selected in descending order of absolute value.
- N2 pulses can be selected in the order of the greatest absolute value, as in the case of the high energy track, even in the low energy track.
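- The selection described above can be sketched as follows. The function and parameter names are hypothetical: `num_high` and `num_low` play the roles of N1 and N2, and `extra_low` plays the role of n2. Adjacency is assumed to mean a position difference of one, and filling any remaining slots with the largest non-adjacent candidates is an assumption based on the fallback described for the low energy track.

```python
def select_pair_pulses(target, high_track, low_track,
                       num_high, num_low, extra_low=2):
    """Pick pulses on a track pair, favoring low-track pulses next to high-track ones.

    high_track / low_track: lists of positions of the two paired tracks.
    """
    def magnitude(k):
        return abs(target[k])

    high = sorted(high_track, key=magnitude, reverse=True)[:num_high]
    candidates = sorted(low_track, key=magnitude,
                        reverse=True)[:num_low + extra_low]
    # Candidates adjacent to a selected high-track pulse are taken first;
    # both lists stay in descending order of magnitude.
    adjacent = [k for k in candidates if any(abs(k - h) == 1 for h in high)]
    others = [k for k in candidates if k not in adjacent]
    return high, (adjacent + others)[:num_low]


target = [1, 9, 12, 2, 0, 6, 9, 5, 0, 7, 1, 0]
track0 = [0, 2, 4, 6, 8, 10]   # higher energy in this toy signal
track1 = [1, 3, 5, 7, 9, 11]
high, low = select_pair_pulses(target, track0, track1, num_high=2, num_low=2)
```

- An independent search of track 1 would pick positions 1 and 9, the two largest magnitudes; taking the track 0 result into account replaces position 9 with position 5, which lies next to the pulse selected at position 6.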
- In step S1230, only as many pulses (maximum sine waves) as are to be encoded are selected in the high energy track.
- The present invention is not limited thereto.
- P_t1,adj may be determined as the pulse to be encoded in the high energy track, and P_t2,adj may be determined as the pulse to be encoded in the low energy track.
- the position may be changed to quantize the sign of the selected pulse (S1245).
- Steps S1235 and S1240 of selecting the encoding target pulses are performed in consideration of the pulses found in other tracks, but only the pulses in the same track are considered in the position change step.
- The position change makes it possible to transmit only one sign bit per track. If the two selected pulses in the track have the same sign, the pulse with the larger absolute value is placed in the front position; if the two pulses have different signs, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
- the position of the pulse is quantized (S1250).
- the quantization target position is a position determined in consideration of the sign of the pulse in step S1245.
- the sign and amplitude of the selected pulse may be encoded (S1265).
- Quantizing the information indicating the sign and magnitude of the pulse may include quantizing the sign of the pulse (S1270), grouping the magnitudes in order to quantize the pulse amplitudes (S1275), and quantizing the magnitudes of the pulses (S1280). Quantization of the magnitude information may be performed based on multi-dimensional vector quantization (VQ), and the grouping of magnitudes may be regarded as a prerequisite for multi-dimensional VQ.
- the maximum sine wave can be selected by the number of maximum sine waves (pulses) searched in each track (S1360).
- The SWB encoder and / or the SWB decoder may search each track for as many maximum sine waves (pulses) as the number of pulses to be encoded in that track and select them as the encoding / quantization target pulses, without considering the pulses found in other tracks.
- the quantization step S1365 of position / magnitude / sign may be performed in the same manner as if a track pair exists.
- a search for a predetermined number more pulses as candidate pulses than the number of encoding target pulses in each track is described, but the present invention is not limited thereto.
- a pulse serving as an encoding target may be searched without searching for a larger number of pulses as candidate pulses than the number of encoding target pulses.
- N1 pulses may be searched for in a track of high energy.
- step S1230 may not be performed.
- the candidate pulses are searched for in the tracks (tracks with high energy and tracks with low energy) constituting the track pair, and then pulses to be encoded for each track are selected.
- the candidate pulse may be searched and the encoding target pulse may be selected for each track constituting the track pair.
- The candidate pulses of the low energy track may be searched, and the encoding target pulse of the low energy track may be selected in consideration of the position of the selected pulse in the high energy track.
- In order to determine the importance of a track, the tracks are divided into a higher importance track and a lower importance track based on the energy of each track.
- the present invention is not limited thereto.
- other criteria may be applied in addition to energy.
- the tracks of the track pairs can be detected in the same manner as described with reference to FIG. 12 to determine the encoding target pulse and quantize the information of the pulse.
- the criteria for searching for candidate pulses in each track is the absolute value of the MDCT coefficient (pulse). In this case, other characteristic values may be added as another criterion.
- the pulses to be finally encoded / quantized may be selected based on whether they are pulses in a position adjacent to the pulses of the higher importance track (the track having higher energy in the example of FIG. 12).
- In FIG. 12, a case in which a track having a high energy and a track having a low energy form one track pair is described as an example.
- The present invention described with reference to FIG. 12 may be equally applied to a case in which two tracks constitute a track pair, as well as to a case in which more than two tracks constitute a track pair.
- steps of FIG. 12 may be applied in order to all tracks so that an encoding target pulse may be determined in all tracks of a target signal.
- FIG. 13 is a view schematically illustrating another example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- FIG. 13 describes a method in which, when a search is performed on one of the tracks constituting a track pair, a pulse adjacent to a pulse found in another track is selected as a pulse to be encoded.
- each track constituting the track pair is adjacent.
- the example of FIG. 13 may be performed by the SWB encoder of FIG. 1.
- some or all of the steps in the example of FIG. 13 may be performed by the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- the steps of FIG. 13 are performed by the SWB encoder and / or the SWB decoder.
- a target signal is first generated (S1300).
- the target signal may be pulses to be quantized, that is, MDCT coefficients.
- the SWB encoder and / or the SWB decoder may generate MDCT coefficients (target signals) to be quantized.
- the SWB encoder and / or the SWB decoder determines whether a track pair exists in the MDCT coefficients (target signal) to be encoded.
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the other track.
- The features of the tracks constituting the track pair are extracted (S1215).
- The same feature is extracted for every track of the track pair, but its value may differ for each track.
- the order of importance of the tracks may be determined (S1315). For example, considering a case where track-specific energy is used as a feature extracted for each track, a track having a high energy may be determined as a track of high importance, that is, a track in which a search for pulse is performed first.
- Processing in the order given by the extracted feature values applies to the search; in later steps, processing may proceed in the original track order.
- For example, track 0 may be processed first and track 1 next.
- an order of searching for pulses may be determined according to the feature value of the tracks constituting the track pair. For example, depending on what the feature is, when a track having a large feature value is a more important track, the pulse search may proceed from the track having a large feature value. Alternatively, in the case where a track having a small feature value is a more important track, a pulse search may be performed from a track having a small feature value.
- candidate pulses may be searched for each track according to the importance order.
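The ordering of tracks by importance can be sketched as follows. This is a minimal illustration rather than the patent's implementation. It assumes that the extracted feature is track energy (the sum of squared MDCT coefficients), that a larger energy means higher importance, and that a track is the set of positions `start, start + step, ...`. The helper names `track_energy` and `importance_order` are hypothetical.

```python
def track_energy(coeffs, start, step):
    """Energy of one track: sum of squared coefficients at its positions.

    Assumption: a track is the set of positions start, start + step, ...
    """
    return sum(c * c for c in coeffs[start::step])


def importance_order(coeffs, track_starts, step):
    """Return track indices sorted from highest to lowest energy.

    Assumption: energy is the extracted feature and a larger value
    means a more important track, which is searched first.
    """
    energies = [track_energy(coeffs, s, step) for s in track_starts]
    return sorted(range(len(track_starts)), key=lambda t: energies[t], reverse=True)


# Two interleaved tracks: track 0 holds even positions, track 1 odd positions.
coeffs = [0.1, 3.0, -0.2, 2.0, 0.5, -1.0]
order = importance_order(coeffs, track_starts=[0, 1], step=2)
```

Here track 1 carries far more energy than track 0, so it is searched first. The order affects only the search; quantization may still follow the original track order.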
- Candidate pulses are searched for in the first-priority track (S1320). If the number of pulses to be encoded in the first-priority track is M1 (an integer greater than 0), M1 + m1 pulses, that is, a predetermined number m1 more than the number to be encoded, may be searched for as candidate pulses in the first-priority track.
- From the candidates, M1 pulses, the number to be encoded in the first-priority track, may be selected (S1325).
- The number of pulses to be encoded may equal the number of pulses (maximum sine waves) that would be searched for in the first-priority track if the pulses found in the other tracks of the track pair were not considered.
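A candidate search of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the codec's actual routine. The criterion is the absolute value of the MDCT coefficient, a track is the set of positions `start, start + step, ...`, and the function names `search_candidates` and `select_targets` are hypothetical.

```python
def search_candidates(coeffs, start, step, count):
    """Return the `count` positions of a track with the largest |coefficient|.

    Positions are returned in decreasing order of absolute value.
    Call with count = M + m to get m extra candidates beyond the M targets.
    """
    positions = range(start, len(coeffs), step)
    ranked = sorted(positions, key=lambda p: abs(coeffs[p]), reverse=True)
    return ranked[:count]


def select_targets(candidates, num_targets):
    """Keep the `num_targets` strongest candidates (they are already ranked)."""
    return candidates[:num_targets]


coeffs = [0.1, 3.0, -0.2, 2.0, 0.5, -1.0, 4.0, 0.3]
M1, m1 = 2, 1
candidates = search_candidates(coeffs, start=0, step=2, count=M1 + m1)
targets = select_targets(candidates, M1)
```

With M1 = 2 and m1 = 1, three candidates are found in track 0 (positions 6, 4, 2) and the two strongest are kept as encoding targets.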
- Candidate pulses are searched for in the second-priority track (S1330). If the number of pulses to be encoded in the second-priority track is M2 (an integer greater than 0), M2 + m2 pulses, that is, a predetermined number m2 more than the number to be encoded, may be searched for as candidate pulses of the second-priority track. The number m2 of additionally searched pulses in the second-priority track may be equal to or greater than the number m1 of additionally searched pulses in the first-priority track.
- the SWB encoder and / or the SWB decoder may compare the positions of the M2 + m2 pulses found in the second priority track with the positions of the pulses selected in the first priority track.
- If, among the pulses found in the second-priority track, there is a pulse at a position adjacent to a pulse selected in the first-priority track, the two adjacent pulses may be selected as one of the pulses to be encoded in the first-priority track and one of the pulses to be encoded in the second-priority track, respectively (S1340).
- the pulse with the largest absolute value (maximum sine wave) in the second rank importance track may be selected.
- M1 × (M2 + m2) pulse combinations ($P_{tp1}$, $P_{tp2}$) can be formed from the M1 pulses selected in the first-priority track and the M2 + m2 pulses found in the second-priority track.
- If a combination of $P_{tp1}$ and $P_{tp2}$ at adjacent positions is denoted ($P_{tp1,adj}$, $P_{tp2,adj}$), then $P_{tp1,adj}$ may be determined as a pulse to be encoded in the first-priority track and $P_{tp2,adj}$ as a pulse to be encoded in the second-priority track.
- adjacent pulse combinations between tracks forming track pairs can be further selected.
- Adjacent pulse pairs may be selected in order of absolute value. For example, if two pulses are selected and encoded for each track, the adjacent pulse pair whose second-priority-track pulse has the largest absolute value is selected first, followed by the pair whose second-priority-track pulse has the next largest absolute value, and these pairs are encoded/quantized.
- In the second-priority track, M2 pulses can also be selected in order of absolute magnitude, in the same way as pulses are selected in the first-priority track.
- In step S1325, only as many pulses as are to be encoded in the first-priority track are selected, but the present invention is not limited thereto.
- If a combination of $P_{tp1}$ and $P_{tp2}$ at adjacent positions is denoted ($P_{tp1,adj}$, $P_{tp2,adj}$), then $P_{tp1,adj}$ may be determined as a pulse to be encoded in the first-priority track and $P_{tp2,adj}$ as a pulse to be encoded in the second-priority track. If a plurality of pulses are to be selected for each track, the adjacent pulse pairs may be selected in order of the absolute value of their second-priority-track pulse, largest first.
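The adjacency-based selection in the second-priority track can be sketched as follows. This is a sketch under stated assumptions, not the patent's exact procedure. Two positions are treated as adjacent when they differ by exactly 1, which holds for two interleaved tracks. When fewer adjacent candidates exist than pulses to encode, the remaining slots are filled by plain magnitude order; that fill rule is an assumption. The function name `select_adjacent` is hypothetical.

```python
def select_adjacent(coeffs, first_selected, second_candidates, num_targets):
    """Pick `num_targets` positions in the second-priority track.

    Candidates adjacent to a pulse already selected in the first-priority
    track are preferred, strongest first. Assumption: any remaining slots
    are filled by magnitude order among the other candidates.
    """
    def magnitude(p):
        return abs(coeffs[p])

    adjacent = [p2 for p2 in second_candidates
                if any(abs(p2 - p1) == 1 for p1 in first_selected)]
    adjacent.sort(key=magnitude, reverse=True)
    chosen = adjacent[:num_targets]

    # Fallback: fill remaining slots with the strongest non-chosen candidates.
    for p2 in sorted(second_candidates, key=magnitude, reverse=True):
        if len(chosen) >= num_targets:
            break
        if p2 not in chosen:
            chosen.append(p2)
    return chosen


# Odd positions form the first-priority track, even positions the second.
coeffs = [5.0, 0.0, 0.0, 9.0, 1.0, 0.0, 0.0, 8.0, 2.0, 0.0]
with_neighbours = select_adjacent(coeffs, [3, 7], [0, 8, 4], num_targets=2)
no_neighbours = select_adjacent(coeffs, [3], [0, 8, 6], num_targets=2)
```

In the first call, position 0 has the largest magnitude in the second track but is not adjacent to positions 3 or 7, so positions 8 and 4 are chosen instead. In the second call no candidate is adjacent, so the selection reduces to magnitude order.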
- For the third-priority track and each subsequent track, candidate pulses may likewise be searched for and the pulses to be encoded selected from among them.
- Candidate pulses are searched for and encoding target pulses selected track by track in order of importance, and finally candidate pulses are searched for in the lowest-priority track (S1345).
- If the number of pulses to be encoded in the lowest-priority track is Mk (an integer greater than 0), Mk + mk pulses, that is, a predetermined number mk more than the number to be encoded, can be searched for as candidate pulses of the lowest-priority track.
- the number mk of additionally searched pulses in the lowest priority track may be equal to or greater than the number mk-1 of additionally searched pulses in the previous priority track (k-1 priority track).
- the SWB encoder and / or the SWB decoder may compare the positions of the Mk + mk pulses found in the lowest priority track with the positions of the pulses selected in the previous rank priority track.
- $M_{k-1} \times (M_k + m_k)$ pulse combinations ($P_{tp_{k-1}}$, $P_{tp_k}$) can be formed from the $M_{k-1}$ pulses selected in the previous-priority track and the $M_k + m_k$ pulses found in the lowest-priority track.
- If a combination of $P_{tp_{k-1}}$ and $P_{tp_k}$ at adjacent positions is denoted ($P_{tp_{k-1},adj}$, $P_{tp_k,adj}$), then $P_{tp_{k-1},adj}$ may be determined as a pulse to be encoded in the previous-priority track and $P_{tp_k,adj}$ as a pulse to be encoded in the lowest-priority track.
- Further adjacent pulse combinations between the tracks forming the track pair can be selected.
- Adjacent pulse pairs may be selected in order of absolute value.
- Mk pulses (maximum sine waves) may also be selected in the lowest-priority track in order of absolute value.
- the position may be changed to quantize the sign of the selected pulse (S1345).
- The preceding steps take pulses found in other tracks into account, but the position change step considers only pulses in the same track.
- The position change allows only one sign bit to be transmitted per track. If the two selected pulses in a track have the same sign, the pulse with the larger absolute value is placed in the front position; if their signs differ, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
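The position change can be sketched as the following ordering rule, together with one way a decoder could undo it. This is only an illustration. The text states the ordering rule but not which sign is transmitted, so sending the sign of the front pulse is an assumption, and the names `order_pulses` and `decode_signs` are hypothetical. Pulses of equal magnitude are ambiguous under this scheme and are not handled here.

```python
def order_pulses(a, b):
    """Order two signed pulse amplitudes of one track as (front, back).

    Same sign      -> the larger |amplitude| goes in front.
    Different sign -> the smaller |amplitude| goes in front.
    """
    big, small = (a, b) if abs(a) >= abs(b) else (b, a)
    same_sign = (a >= 0) == (b >= 0)
    return (big, small) if same_sign else (small, big)


def decode_signs(front_mag, back_mag, front_negative):
    """Rebuild both signed amplitudes from magnitudes and one sign bit.

    Assumption: the transmitted bit is the sign of the front pulse.
    The magnitude order tells whether the second sign is equal or opposite.
    """
    same_sign = front_mag >= back_mag
    back_negative = front_negative if same_sign else not front_negative
    front = -front_mag if front_negative else front_mag
    back = -back_mag if back_negative else back_mag
    return front, back


front, back = order_pulses(-3.0, 1.0)  # signs differ: smaller pulse in front
restored = decode_signs(abs(front), abs(back), front < 0)
```

Because the magnitude order already encodes whether the two signs match, a single sign bit per track is enough to recover both signs.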
- the position, magnitude and / or sign of the pulse is quantized (S1365).
- the quantization target position is a position determined in consideration of the sign of the pulse in step S1245.
- As many maximum sine waves (pulses) as are searched for in each track can be selected (S1360).
- The SWB encoder and/or the SWB decoder may search each track for as many maximum sine waves as there are pulses to be encoded and select them as encoding/quantization target pulses, without considering the pulses found in other tracks.
- The position/magnitude/sign quantization step S1365 may be performed in the same manner as when a track pair exists.
- In step S1320, searching for a predetermined number more candidate pulses than the number of encoding target pulses has been described, but the present invention is not limited thereto.
- Instead of searching for more candidate pulses than the number of encoding target pulses, only the pulses that are encoding (quantization) targets may be searched for. That is, unlike in the lower-priority tracks, only M1 pulses may be searched for in the first-priority track, in which case step S1325 may be omitted.
- The search for and selection of candidate pulses have been described as performed track by track for the tracks constituting the track pair, but the present invention is not limited thereto. For example, candidate pulses exceeding the number of encoding target pulses by a predetermined number may first be searched for in all tracks constituting the track pair, and then, among the candidate pulses of each track, the pulses adjacent to the pulses selected in the higher-priority track may be selected as encoding target pulses. In this case, when candidate pulses are searched for in all tracks constituting the track pair, the highest-priority track may select exactly as many candidate pulses as there are encoding target pulses (that is, it searches for encoding target pulses rather than candidate pulses).
- To determine the importance of a track, the tracks are divided into a higher-importance track and a lower-importance track based on track energy.
- However, the present invention is not limited thereto.
- Other criteria may be applied in addition to energy.
- The tracks of a track pair may be searched for pulses in the same manner as described with reference to FIG. 13 to determine the encoding target pulses and quantize the pulse information.
- The steps of FIG. 13 may be applied to all tracks in order, so that encoding target pulses are determined in all tracks of the target signal.
- the method described with reference to FIGS. 12 and 13 can be applied to a case where the target signal includes all original signal components, such as the case of processing the higher band of the G.718 SWB.
- the method of FIG. 12 and FIG. 13 serves to cause the MDCT coefficient to be concentrated in a band having a strong tonal component.
- Selecting a pulse adjacent to the pulse position of the higher-importance track is effective for signals with tonality, and is efficient when the codec has different modes depending on tonal information, as in G.718 SWB (for example, a generic mode when no tonal component is present).
- the MDCT based enhancement layer may correspond to the SWB encoder of FIG. 1.
- the MDCT-based enhancement layer may correspond to the SWB decoder of FIG. 3.
- pulses separated from pulses selected from other tracks may be selected as encoding target pulses.
- the method of selecting a pulse at a position away from the pulse selected in another track as the encoding target pulse can be effectively used when energy is uniformly distributed in one frame of the target signal.
- A pulse at a position relatively far from the pulse position of the higher-priority track can be selected.
- the present method can be effectively used even when there are no modes depending on the tonality.
- FIG. 14 is a view schematically illustrating another example of a method of performing a search in consideration of a search result of another track among tracks of a track pair according to the present invention.
- In the example of FIG. 14, when a search is performed for any one of the tracks constituting a track pair, a pulse located away from a pulse found in another track is selected as a pulse to be encoded.
- each track constituting the track pair is adjacent to each other.
- the example of FIG. 14 may be performed by the SWB encoder of FIG. 1.
- some or all of the steps in the example of FIG. 14 may be performed by the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- the steps of FIG. 14 are performed by the SWB encoder and / or the SWB decoder.
- a target signal is generated (S1400).
- the target signal may be pulses to be quantized, that is, MDCT coefficients.
- the SWB encoder and / or the SWB decoder may generate MDCT coefficients (target signals) to be quantized.
- the SWB encoder and / or the SWB decoder determines whether a track pair exists in the MDCT coefficients (target signal) to be encoded.
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the other track.
- The features of the tracks constituting the track pair are extracted (S1410).
- The same feature is extracted for every track of the track pair, but its value may differ for each track.
- the feature to be extracted can be the energy of the track.
- the order of importance of the tracks may be determined (S1415). For example, considering a case where track-specific energy is used as a feature extracted for each track, a track having a high energy may be determined as a track of high importance, that is, a track in which a search for pulse is performed first.
- an order of searching for pulses may be determined according to the feature value of the tracks constituting the track pair. For example, depending on what the feature is, when a track having a large feature value is a more important track, the pulse search may proceed from the track having a large feature value. Alternatively, in the case where a track having a small feature value is a more important track, a pulse search may be performed from a track having a small feature value.
- candidate pulses may be searched for each track according to the importance order.
- Processing in the order given by the feature values applies to the search; in later steps, processing may proceed in the original track order. For example, when the bitstream is formed by quantization, track 0 may be processed first and track 1 next.
- candidate pulses are searched for in the first priority importance track (S1420).
- The SWB encoder and/or the SWB decoder may search the first-priority track for pulses having the maximum value.
- A predetermined number l1 more pulses than the number L1 of pulses to be encoded in the first-priority track may be selected (S1425).
- The number of pulses to be encoded may equal the number of pulses that would be searched for in the first-priority track if the pulses (maximum sine waves) selected in the other tracks of the track pair were not considered.
- If the number of pulses to be encoded in the first-priority track is L1 (an integer greater than 0), L1 + l1 pulses, that is, a predetermined number l1 (greater than or equal to 0) more than the number to be encoded, can be selected as candidate pulses of the track. For example, if two pulses are to be encoded in the first-priority track (that is, two pulses would be searched for if pulses selected in other tracks were not considered), 2 + l1 pulses can be selected.
- candidate pulses are searched for in the second most important track, which is the next important track (S1430).
- If the number of pulses to be encoded in the second-priority track is L2 (an integer greater than 0), L2 + l2 pulses, that is, a predetermined number l2 (greater than or equal to 0) more than the number to be encoded, can be searched for as candidate pulses of the track. For example, if two pulses are to be encoded in the second-priority track (two pulses would be searched for by the conventional method), 2 + l2 pulses can be searched for.
- the number l2 of additionally searched pulses in the second priority importance track may be equal to or greater than the number l1 of additionally searched pulses in the first priority importance track.
- the SWB encoder and / or the SWB decoder may compare the positions of the L2 + l2 pulses found in the second priority track with the positions of the pulses selected in the first priority track.
- Two pulses located apart from each other may be selected as one of the pulses to be encoded in the first-priority track and one of the pulses to be encoded in the second-priority track, respectively (S1440).
- the pulse having the largest absolute value may be selected in the second priority track.
- (L1 + l1) × (L2 + l2) pulse combinations ($P_{tp1}$, $P_{tp2}$) can be formed from the L1 + l1 pulses selected in the first-priority track and the L2 + l2 pulses found in the second-priority track.
- Among the pulse pairs, pairs may be selected in order of the distance between their two pulses, farthest first. For example, when two pulses are selected and encoded for each track, the pulse pair with the longest distance between its two pulses is selected first, then the pair with the next longest distance, and the selected pulses can be encoded (quantized) track by track.
- In the second-priority track as well, L2 pulses (maximum sine waves) can be selected in order of magnitude, in the same way as pulses are selected in the first-priority track.
- For pulses $P_{tp1}$ (index $1, \dots, L1$) and $P_{tp2}$ (index $1, \dots, L2 + l2$), $P_{tp1,away}$ is determined as the pulse to be encoded in the first-priority track, and $P_{tp2,away}$ may be determined as the pulse to be encoded in the second-priority track.
- If a plurality of (e.g., two) pulses are to be selected for each track, the pulse pair whose two pulses are farthest apart and the pair whose pulses are second farthest apart may be selected.
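The distance-based pair selection can be sketched as follows. This is a minimal sketch under stated assumptions. All candidate combinations of the two tracks are formed and ranked by the distance between their positions, farthest first. Using each pulse in at most one selected pair is an assumption, and ties are broken by candidate order. The name `select_far_pairs` is hypothetical.

```python
from itertools import product


def select_far_pairs(first_candidates, second_candidates, num_pairs):
    """Select up to `num_pairs` (p1, p2) position pairs, farthest apart first.

    Assumption: each position is used in at most one selected pair.
    """
    pairs = sorted(product(first_candidates, second_candidates),
                   key=lambda pq: abs(pq[0] - pq[1]), reverse=True)
    chosen, used_first, used_second = [], set(), set()
    for p1, p2 in pairs:
        if len(chosen) == num_pairs:
            break
        if p1 in used_first or p2 in used_second:
            continue
        chosen.append((p1, p2))
        used_first.add(p1)
        used_second.add(p2)
    return chosen


# L1 + l1 = 3 candidates in the first track, L2 + l2 = 3 in the second.
far_pairs = select_far_pairs([1, 9, 5], [0, 10, 4], num_pairs=2)
```

The nine combinations are ranked by distance, and the two farthest pairs, (1, 10) and (9, 0), supply the two pulses to be encoded in each track.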
- As many maximum sine waves (pulses) as are searched for in each track can be selected (S1445).
- The SWB encoder and/or the SWB decoder may search each track for as many maximum sine waves (pulses) as there are pulses to be encoded and select them as encoding/quantization target pulses, without considering the pulses found in other tracks.
- The position, magnitude, and sign of the selected pulses (maximum sine waves) are quantized (S1455).
- the position of the pulse can be changed to quantize information indicating the sign of the selected pulse.
- The position change allows only one sign bit to be transmitted per track. If the two selected pulses in a track have the same sign, the pulse with the larger absolute value is placed in the front position; if their signs differ, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
- Quantizing the information indicating the sign and magnitude of the pulse may include quantizing the sign of the pulse, magnitude grouping to quantize the pulse amplitude, and quantizing the magnitude of the pulse.
- Quantization of the magnitude information may be based on multi-dimensional vector quantization (VQ), and magnitude grouping may be regarded as a prerequisite for multi-dimensional VQ.
- Searching for a predetermined number more candidate pulses than the number of encoding target pulses in the first-priority track has been described, but the present invention is not limited thereto.
- Instead of searching for more candidate pulses than the number of encoding target pulses, only the pulses that are encoding (quantization) targets may be searched for. That is, unlike in the lower-priority tracks, only L1 pulses may be searched for in the first-priority track, in which case step S1425 may be omitted.
- The search for and selection of candidate pulses have been described as performed track by track for the tracks constituting the track pair, but the present invention is not limited thereto. For example, candidate pulses exceeding the number of encoding target pulses by a predetermined number may first be searched for in all tracks constituting the track pair, and then, among the candidate pulses of each track, the pulses located away from the pulses selected in the higher-priority track may be selected as encoding target pulses. In this case, when candidate pulses are searched for in all tracks constituting the track pair, the highest-priority track may select exactly as many candidate pulses as there are encoding target pulses (that is, it searches for encoding target pulses rather than candidate pulses).
- To determine the importance of a track, the tracks are classified into a higher-importance track and a lower-importance track based on track energy.
- the present invention is not limited thereto.
- other criteria may be applied in addition to energy.
- the tracks of the track pairs may be searched for pulses in the same manner as described with reference to FIG. 14 to determine an encoding target pulse, and quantize the information of the pulses.
- The steps of FIG. 14 may be applied to all tracks in order, so that encoding target pulses are determined in all tracks of the target signal.
- the example of FIG. 15 may be performed by the SWB encoder of FIG. 1.
- some or all of the steps in the example of FIG. 15 may be performed by the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- The steps of FIG. 15 are described as being performed by the SWB encoder and/or the SWB decoder.
- a target signal is first generated (S1500).
- the target signal may be pulses to be quantized, that is, MDCT coefficients.
- the SWB encoder and / or the SWB decoder may generate MDCT coefficients (target signals) to be quantized.
- the SWB encoder and / or the SWB decoder determines whether a track pair exists in the MDCT coefficients (target signal) to be encoded.
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the other track.
- The features of the tracks constituting the track pair are extracted (S1510).
- The same feature is extracted for every track of the track pair, but its value may differ for each track.
- the feature to be extracted may be the energy of the track.
- the order of importance of the tracks may be determined (S1515). For example, considering a case where track-specific energy is used as a feature extracted for each track, a track having a high energy may be determined as a track of high importance, that is, a track in which a search for pulse is performed first.
- an order of searching for pulses may be determined according to the feature value of the tracks constituting the track pair. For example, depending on what the feature is, when a track having a large feature value is a more important track, the pulse search may proceed from the track having a large feature value. Alternatively, in the case where a track having a small feature value is a more important track, a pulse search may be performed from a track having a small feature value.
- candidate pulses may be searched for each track according to the importance order.
- Processing in the order given by the feature values applies to the search; in later steps, processing may proceed in the original track order. For example, when the bitstream is formed by quantization, track 0 may be processed first and track 1 next.
- Candidate pulses are searched for in the first-priority track (S1520). If the number of pulses to be encoded in the first-priority track is P1 (an integer greater than 0), P1 + p1 pulses, that is, a predetermined number p1 (an integer greater than or equal to 0) more than the number to be encoded, can be searched for as candidate pulses of the track. For example, if two pulses are to be encoded in the first-priority track (that is, two pulses would be searched for if pulses selected in other tracks were not considered), 2 + p1 pulses may be searched for.
- From the candidates, P1 pulses, the number to be encoded in the first-priority track, may be selected (S1525).
- The number of pulses to be encoded may equal the number of pulses that would be searched for in the first-priority track if the pulses (maximum sine waves) selected in the other tracks of the track pair were not considered.
- Candidate pulses are searched for in the second-priority track, which is the next most important track (S1530). If the number of pulses to be encoded in the second-priority track is P2 (an integer greater than 0), P2 + p2 pulses, that is, a predetermined number p2 (an integer greater than or equal to 0) more than the number to be encoded, can be searched for as candidate pulses of the track. For example, if two pulses are to be encoded in the second-priority track (two pulses would be searched for by the conventional method), 2 + p2 pulses can be searched for.
- the number p2 of additionally searched pulses in the second priority track may be equal to or greater than the number p1 of additionally searched pulses in the first priority track.
- the SWB encoder and / or the SWB decoder may compare the positions of the P2 + p2 pulses found in the second priority track with the positions of the pulses selected in the first priority track.
- Two pulses located apart from each other may be selected as one of the pulses to be encoded in the first-priority track and one of the pulses to be encoded in the second-priority track, respectively (S1540).
- the pulse having the largest absolute value (sine wave having the maximum value) may be selected.
- P1 × (P2 + p2) pulse combinations ($P_{tp1}$, $P_{tp2}$) can be formed from the P1 pulses selected in the first-priority track and the P2 + p2 pulses found in the second-priority track.
- If a combination of $P_{tp1}$ and $P_{tp2}$ located apart from each other is denoted ($P_{tp1,away}$, $P_{tp2,away}$), then $P_{tp1,away}$ may be determined as a pulse to be encoded in the first-priority track and $P_{tp2,away}$ as a pulse to be encoded in the second-priority track.
- Among the pulse pairs, pairs may be selected in order of the distance between their two pulses, farthest first. For example, when two pulses are selected and encoded for each track, the pulse pair with the longest distance between its two pulses is selected first, then the pair with the next longest distance, and the selected pulses can be encoded (quantized) track by track.
- In the second-priority track as well, P2 pulses can be selected in order of absolute value, in the same way as pulses are selected in the first-priority track.
- In step S1525, only as many pulses as are to be encoded in the first-priority track are selected, but the present invention is not limited thereto.
- For pulses $P_{tp1}$ (index $1, \dots, P1 + p1$) and $P_{tp2}$ (index $1, \dots, P2 + p2$), $P_{tp1,away}$ is determined as the pulse to be encoded in the first-priority track, and $P_{tp2,away}$ may be determined as the pulse to be encoded in the second-priority track.
- If a plurality of (e.g., two) pulses are to be selected for each track, the pulse pair whose two pulses are farthest apart and the pair whose pulses are second farthest apart may be selected.
- For the third-priority track and each subsequent track, candidate pulses may likewise be searched for and the pulses to be encoded selected from among them.
- Candidate pulses are searched for and encoding target pulses selected track by track in order of importance, and finally candidate pulses are searched for in the lowest-priority track (S1545).
- If the number of pulses to be encoded in the lowest-priority track is Pk (an integer greater than 0), Pk + pk pulses, that is, a predetermined number pk more than the number to be encoded, can be searched for as candidate pulses of the lowest-priority track.
- the number pk of pulses additionally searched in the lowest priority track may be equal to or greater than the number of pulses pk-1 additionally searched in the previous priority track (k-1 priority track).
- the SWB encoder and / or the SWB decoder may compare the positions of the Pk + pk pulses found in the lowest priority track with the positions of the pulses selected in the previous priority track.
- Two pulses located apart from each other may be selected as one of the pulses to be encoded in the lowest-priority track (k-th priority track) and one of the pulses to be encoded in the previous-priority track ((k-1)-th priority track), respectively (S1355).
- $P_{k-1} \times (P_k + p_k)$ pulse combinations ($P_{tp_{k-1}}$, $P_{tp_k}$) can be formed from the $P_{k-1}$ pulses selected in the previous-priority track and the $P_k + p_k$ pulses found in the lowest-priority track.
- $P_{tp_{k-1},away}$ is determined as the pulse to be encoded in the previous-priority track, and $P_{tp_k,away}$ can be determined as the pulse to be encoded in the lowest-priority track.
- Further combinations of pulses located apart between the tracks constituting a track pair may be selected. For example, when there are a plurality of such combinations, the combination with the longest distance between the pulses of the two tracks is selected first, followed by the combination with the next longest distance.
- Pk pulses (maximum sine waves) may also be selected in the lowest-priority track in order of absolute value.
- As many maximum sine waves (pulses) as are searched for in each track can be selected (S1560).
- The SWB encoder and/or the SWB decoder may search each track for as many maximum sine waves (pulses) as there are pulses to be encoded and select them as encoding/quantization target pulses, without considering the pulses found in other tracks.
- The position, magnitude, and sign of the selected pulses (maximum sine waves) are quantized (S1565).
- the position of the pulse can be changed to quantize information indicating the sign of the selected pulse.
- The position change allows only one sign bit to be transmitted per track. If the two selected pulses in a track have the same sign, the pulse with the larger absolute value is placed in the front position; if their signs differ, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
- the position, magnitude and / or sign of the pulse is quantized
- In the above, a search for a predetermined number more candidate pulses than the number of encoding target pulses in the priority track has been described, but the present invention is not limited thereto.
- Instead of searching for more candidate pulses than the number of encoding target pulses, only the pulses that are encoding targets (quantization targets) may be searched.
- the P1 pulses may be searched for in the priority track. In this case, step S1525 may not be performed.
- In the above, the search and selection of candidate pulses are performed track by track for the tracks constituting the track pair, but the present invention is not limited thereto. For example, after retrieving, for all tracks constituting the track pair, a predetermined number more candidate pulses than the number of pulses to be encoded, the pulses far from the pulse selected in the higher-priority track may be selected among the candidate pulses of each track as the encoding target pulses. In this case, when searching for candidate pulses for all tracks constituting the track pair, the most significant track may select only as many candidate pulses as the number of encoding target pulses (that is, it may search for encoding target pulses rather than candidate pulses).
- In order to determine the importance of a track, the tracks are classified into upper-importance and lower-importance tracks based on the energy of each track.
- the present invention is not limited thereto.
- other criteria may be applied in addition to energy.
- the tracks of the track pairs may be searched for pulses in the same manner as described with reference to FIG. 15 to determine an encoding target pulse, and quantize the information of the pulses.
- steps of FIG. 15 may be applied to all tracks in order so that the encoding target pulse may be determined in all tracks of the target signal.
- The method of FIGS. 14 and 15 may be effective when the target is the difference (residual) signal remaining after encoding rather than the original signal, for example when encoding the G.718 SWB upper band.
- a basic core such as G.718 WB may be applied to encode an uncoded signal.
- FIG. 16 is a block diagram schematically illustrating an example of an encoder to which the methods of FIGS. 14 and 15 are applied.
- the WB signal is input to a basic core 1620.
- the signal output from the basic core 1620 may be encoded and transmitted in a bitstream.
- the difference between the signal decoded in the basic core 1620 and the original HP filtered signal may be processed in the MDCT-based enhancement layer 1630 and then output as a bitstream.
- the enhancement layer 1630 may correspond to the super wide band (SWB) encoder of FIG. 1.
- When the target signal is generated, it is possible to determine whether to select pulses adjacent to a pulse of another track in the track pair as encoding target pulses, or to select pulses apart from the pulses of another track in the track pair as encoding target pulses.
- an energy distribution may be used as a feature of the target signal as a reference for determining how to select an encoding target pulse.
- the tonality determination unit of FIG. 1 may determine how to select an encoding target pulse.
- If the target signal has a tonal component or the energy of the target signal is concentrated in a specific band, a method of selecting a pulse adjacent to a pulse selected from another track of the track pair as the encoding target pulse may be used. If the target signal has no tonal component or the energy distribution of the target signal is uniform, a method of selecting a pulse that is apart from a pulse selected from another track of the track pair as the encoding target pulse may be used.
- Information indicating whether to select a pulse adjacent to, or apart from, the pulse selected in another track of the track pair may be input from the module that extracts a feature of the target signal (for example, the tonality determination unit of FIG. 1) to the module that selects a pulse for each track (for example, the SWB encoder of FIG. 1).
- FIG. 17 is a flowchart schematically illustrating an example of a method of searching for a pulse of a track according to frame energy or tonality according to the present invention.
- FIG. 17 may be performed by the SWB encoder of FIG. 1 and / or the SWB decoder of FIG. 3.
- the operation may be performed in at least one of a sine wave mode unit and an additional sine wave unit of the SWB encoder and / or the SWB decoder.
- the determination and the indication of whether to select the adjacent pulse in FIG. 17 may be performed by the tonality determination unit of FIG. 1 and the SWB decoding unit of FIG. 3.
- For convenience, the steps of FIG. 17 are described as being performed by the SWB encoder and / or the SWB decoder.
- a target signal is generated (S1700).
- the target signal may be pulses to be quantized, that is, MDCT coefficients.
- the feature of the target signal is extracted (S1705).
- the feature of the extracted target signal may be tonality or may be a distribution of energy.
- When tonality is determined as the characteristic of the target signal, if the target signal is tonal, it may be instructed to select a pulse adjacent to the selected pulse in another track of the track pair. If the target signal is not tonal, it may be instructed to select a pulse away from the selected pulse in another track of the track pair.
- When the energy of the target signal is concentrated in a specific band, it may be instructed to select a pulse adjacent to a selected pulse in another track of the track pair. When the energy of the target signal is evenly distributed, it may be instructed to select a pulse away from the selected pulse in another track of the track pair.
- When it is determined in step S1705, for a track pair, whether to select a pulse adjacent to a pulse selected in another track or a pulse away from a pulse selected in another track, and the determined information is delivered to the stage that selects a pulse in the most significant track or to an earlier stage, the pulses can be selected in the same way in each track of the track pair.
- the pulses may be retrieved / selected by the method according to the example of FIG. If it is determined to select the distant pulses, the pulses can be retrieved by the method according to the example of FIG.
- When it is determined in step S1705, for a track pair, whether to select a pulse adjacent to a pulse selected in another track or a pulse away from a pulse selected in another track, how to select a pulse may also be determined for each track individually.
- When information on how to select a pulse is transferred to the step of selecting a pulse for each track, the pulse may be selected according to the method indicated by the transferred information.
- When the current track is instructed to select a pulse adjacent to a selected pulse in another track, the pulse may be searched / selected by the method according to the example of FIG. 13. When the current track is instructed to select a pulse at a position away from the selected pulse in another track, the pulse may be searched / selected by the method according to the example of FIG. 15.
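The decision between the two selection methods can be sketched as a small function. The text only says that tonality or the energy distribution may be used as the feature; the concentration measure (share of the strongest band), the threshold of 0.5, the combination of both features with a logical OR, and all names below are assumptions made for illustration.

```python
def energy_is_concentrated(band_energies, threshold=0.5):
    """Hypothetical measure: energy counts as concentrated when one band
    holds at least `threshold` of the total energy."""
    total = sum(band_energies)
    if total <= 0:
        return False
    return max(band_energies) / total >= threshold

def choose_selection_mode(is_tonal, band_energies, threshold=0.5):
    """Return 'adjacent' (method of the FIG. 13 example) for tonal or
    band-concentrated signals, otherwise 'apart' (method of the FIG. 15
    example)."""
    if is_tonal or energy_is_concentrated(band_energies, threshold):
        return "adjacent"
    return "apart"

print(choose_selection_mode(False, [1.0, 1.0, 1.0, 1.0]))  # flat spectrum
print(choose_selection_mode(False, [9.0, 0.5, 0.5, 0.0]))  # one dominant band
```

The returned string plays the role of the information passed from the feature-extraction module to the per-track pulse selection step.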
- The case where each track is instructed whether adjacent pulses or separated pulses should be selected is described here as an example.
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the neighboring track.
- the features of the tracks constituting the track pair are extracted (S1715).
- the feature to be extracted may be the energy of the track.
- the order of importance of the tracks may be determined (S1720). For example, when the feature extracted for each track is energy for each track, a track having a high energy may be determined as a track having a high importance, that is, a track in which a search for pulse is performed first.
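Ranking the tracks of a track pair by energy can be sketched as below. The sketch assumes each track is given as a list of its coefficient values and that energy is the plain sum of squares; the names `track_energy` and `search_order` are hypothetical.

```python
def track_energy(coeffs):
    """Energy of one track, taken here as the sum of squared coefficients."""
    return sum(c * c for c in coeffs)

def search_order(tracks):
    """tracks maps a track index to its coefficient list. Returns the track
    indices with the highest-energy (most important) track first."""
    return sorted(tracks, key=lambda t: track_energy(tracks[t]), reverse=True)

tracks = {0: [1.0, -1.0, 0.5], 1: [3.0, 0.0, -2.0], 2: [0.1, 0.2, 0.0]}
print(search_order(tracks))
```

The resulting order only governs which track is searched first; any other importance criterion could be substituted for `track_energy` without changing the rest of the flow.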
- Candidate pulses are searched for in the first-priority importance track (S1725). If the number of pulses to be encoded in the first-priority track is Q1 (Q1 is an integer greater than 0), a predetermined number q1 (q1 is an integer greater than or equal to 0) more pulses than the number of pulses to be encoded, that is, Q1 + q1 pulses, can be retrieved as candidate pulses of the track.
- the selection of the pulse to be encoded may be performed according to the method determined in S1705 based on the feature of the target signal.
- the encoding target pulse may be selected without considering the relationship with the pulse selected in the other track.
- an encoding target pulse may be selected based on an absolute value of the pulse.
- Candidate pulses are searched for in the second-priority importance track (S1735). If the number of pulses to be encoded in the second-priority importance track is Q2 (Q2 is an integer greater than 0), a predetermined number q2 (q2 is an integer greater than or equal to 0) more pulses than the number of pulses to be encoded, that is, Q2 + q2 pulses, can be retrieved as candidate pulses of the track.
- the number q2 of pulses additionally searched in the second priority track may be equal to or greater than the number q1 of pulses additionally searched in the first priority track.
- Pulses to be encoded are selected among the pulses searched in the second priority importance track based on the positional relationship with the pulse selected in the first priority importance track (S1745).
- The selection of the pulse to be encoded in the second-priority track may be performed according to the method determined in S1705 based on the feature of the target signal. For example, if it is instructed to select a pulse adjacent to the pulse selected in the first-priority track, the pulse may be selected by the method according to the example of FIG. 13. When instructed to select a pulse at a position away from the pulse selected in the first-priority track, the pulse may be selected by the method according to the example of FIG. 15.
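The two-stage flow for a lower-priority track (retrieve Q + q candidates, then keep Q of them according to their position relative to the higher-priority track) can be sketched as follows. It assumes pulses are `(position, amplitude)` tuples, that candidates are ranked by absolute amplitude, and that closeness is measured to the nearest selected pulse of the higher-priority track; the exact procedures of the FIG. 13 and FIG. 15 examples are not reproduced, and the function names are hypothetical.

```python
def search_candidates(track, n_encode, n_extra):
    """Return the n_encode + n_extra pulses with the largest absolute
    amplitude, largest first."""
    ranked = sorted(track, key=lambda pa: abs(pa[1]), reverse=True)
    return ranked[:n_encode + n_extra]

def select_targets(candidates, higher_positions, n_encode, mode):
    """Keep n_encode candidates, preferring pulses nearest to ('adjacent')
    or farthest from ('apart') the pulses selected in the higher track."""
    def nearest_gap(pa):
        return min(abs(pa[0] - h) for h in higher_positions)
    # The sort is stable: among equal gaps the larger-amplitude candidate,
    # which comes first in `candidates`, is kept first.
    ranked = sorted(candidates, key=nearest_gap, reverse=(mode == "apart"))
    return ranked[:n_encode]

track2 = [(1, 0.9), (5, -0.8), (9, 0.7), (13, 0.1)]
cands = search_candidates(track2, n_encode=1, n_extra=2)   # Q2 = 1, q2 = 2
print(select_targets(cands, higher_positions=[8], n_encode=1, mode="adjacent"))
print(select_targets(cands, higher_positions=[8], n_encode=1, mode="apart"))
```

Setting `n_extra` to 0 reduces the search to picking the largest pulses directly, which corresponds to the variant where only the encoding target pulses are searched.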
- the pulse having the largest absolute value may be selected in the second priority track.
- For the third-priority track and the subsequent tracks, candidate pulses may likewise be searched and the pulses to be encoded selected from among the candidate pulses.
- In this way the candidate pulses for each track are searched sequentially in order of importance and the encoding target pulses are selected, until finally candidate pulses are searched for in the lowest priority track (S1750).
- Assuming that a track pair consists of k tracks, if the number of pulses to be encoded in the lowest priority track is Qk (Qk is an integer greater than 0), a predetermined number qk (qk is an integer greater than or equal to 0) more pulses than the number of pulses to be encoded, that is, Qk + qk pulses, can be retrieved as candidate pulses of the lowest priority track.
- the number qk of pulses additionally searched in the lowest priority track may be equal to or greater than the number qk-1 of pulses additionally searched in the previous rank importance track (k-1 rank priority track).
- Pulses to be encoded in the lowest priority track may be selected based on the positional relationship with the pulse selected in the previous rank priority track (k-1 rank priority track) (S1760).
- the selection of the pulse to be encoded in the k rank importance track may be performed according to the method determined in S1705 based on the feature of the target signal. For example, if the k-1 rank importance track is instructed to select a pulse adjacent to the selected pulse, the pulse may be selected by the method according to the example of FIG. In addition, when instructed to select a pulse at a position away from the selected pulse in the k-1 rank importance track, the pulse may be selected by the method according to the example of FIG. 15.
- the maximum sine wave can be selected by the number of maximum sine waves (pulses) searched in each track (S1765).
- the SWB encoder and / or the SWB decoder may search for the maximum sine wave (pulse) by the number of pulses to be encoded for each track and select it as an encoding / quantization target pulse without considering the pulse waves searched for in other tracks.
- the position / size / sign is quantized with respect to the selected pulses (maximum sine wave) (S1770).
- the position of the pulse can be changed to quantize information indicating the sign of the selected pulse.
- The position change makes it possible to transmit only one sign bit per track. If the two selected pulses in the track have the same sign, the pulse with the larger absolute value is placed in the front position. If the two pulses have different signs, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
- the position, magnitude and / or sign of the pulse is quantized
- a search for a predetermined number more pulses as the candidate pulses than the number of encoding target pulses in the first priority track is described, but the present invention is not limited thereto.
- the number of pulses larger than the number of encoding target pulses may not be searched as candidate pulses, but only the pulses that are encoding targets (quantization targets) may be searched.
- the P1 pulses may be searched for in the priority track. In this case, step S1725 may not be performed.
- the searching and selection of candidate pulses is performed for each track constituting the track pair, but the present invention is not limited thereto.
- The encoding target pulse may also be selected, for each track, based on the positional relationship with the pulse selected in the higher-priority track.
- the selection of the pulse in each track may be performed according to the method determined in S1705 based on the characteristics of the target signal. In this case, the same method may be applied to each track, or different methods may be applied.
- a candidate pulse equal to the number of encoding target pulses may be selected (for example, the encoding target pulse is searched instead of the candidate pulse search).
- In order to determine the importance of a track, the tracks are divided into upper-importance and lower-importance tracks based on the energy of each track.
- the present invention is not limited thereto.
- other criteria may be applied in addition to energy.
- the tracks of the track pairs may be searched for pulses in the same manner as described with reference to FIG. 17 to determine an encoding target pulse and quantize the information of the pulses.
- steps of FIG. 17 may be applied to all tracks in order so that an encoding target pulse may be determined in all tracks of a target signal.
- the encoding target pulse can be searched according to the present invention.
- Encoding and decoding methods performed by the CELP mode are the same as described with reference to FIGS. 2 and 4.
- candidate pulses were searched using the absolute values of pulses based on Equation 2.
- Candidate pulses may be selected based on a convolution value with the impulse response of the LPC synthesis filter. For example, in the current track, a pulse that minimizes the mean square error (MSE) between the target signal and the value obtained by convolving the pulse with the impulse response may be searched as a candidate pulse.
- FIG. 18 is a flowchart schematically illustrating a method for searching / selecting a pulse based on a CELP mode in the present invention.
- FIG. 18 may be performed by the core encoder in the encoder of FIG. 2 and / or the core decoder in the decoder of FIG. 4.
- the encoder and / or the decoder will be described as performing each step of FIG. 18.
- a target signal is first generated (S1800).
- the generated target signal may be a signal passed through a weighting filter or a signal after an adaptive codebook search in the CELP mode, that is, a new signal from which the influence of the adaptive codebook is removed from the audio signal.
- When the CELP mode is applied, the target signal may be the signal obtained by excluding, from (1) the audio signal, (2) the signal synthesized from the coded adaptive codebook.
- track-specific energy of tracks constituting the track pair with respect to the target signal is calculated (S1805).
- A track pair may consist of tracks that have the same step (pulse interval) and whose pulse positions are adjacent to the pulse positions of the neighboring track.
- the track-specific energy can be used as a criterion for determining in what order the tracks are to be searched.
- Here the energy of each track is used as the reference, but characteristics other than energy may be calculated and used as the reference for determining the search order.
- the calculated track-specific energies are compared (S1810). By comparing the energy of the tracks, a track with higher energy may be determined as a track of high importance. Therefore, the track with the highest energy among the tracks constituting the track pair can be searched first as the first track. The track with the second highest energy is then determined as the second rank track to be searched for the second time, and can be determined up to the lowest rank track according to the energy magnitude.
- The determined rank applies only to the pulse search; when the retrieved pulses are quantized and the bitstream is constructed, processing may proceed in the original track order.
- the MSE is calculated for each pulse with respect to the first priority track having the highest importance level (S1815). For each pulse position of the first track, a pulse whose MSE is minimum for the target signal is selected as a candidate pulse of the first track using a convolution value with the impulse response.
- the MSE for the target signal may be an MSE (Mean Square Error) between a value of the target signal and a value obtained by convolving a candidate pulse with an impulse response.
- the codebook can be used in the process of obtaining the MSE.
- the codebook specifies where in the track there may be pulses.
- In the first track, only the pulse for which the MSE is being calculated is set: a unit-amplitude signal (for example, a signal of magnitude 1) is placed only at the position of the target pulse, and the pulse size is set to 0 at the positions of the other pulses.
- A predetermined number of pulses that minimize the MSE for the target signal are selected (S1820). Unlike the MDCT-based case, for every pulse in the track the MSE between the target signal and the convolution of that pulse with the impulse response is obtained, and a predetermined number of pulses can be selected for the first track in order of increasing MSE. That is, the predetermined number of pulses may be selected starting from the smallest difference from the target signal.
- If the number of pulses to be encoded in the first-rank track is C1 (C1 is an integer greater than 0), C1 pulses can be selected, starting from the pulse with the smallest MSE for the target signal and proceeding in order of increasing MSE.
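The first-track search by MSE can be sketched in plain Python. This is only an illustration under stated assumptions: every trial pulse has a fixed amplitude of +1 (a real codebook search would also handle sign and gain), the convolution is truncated to the frame length, and the helper names `convolve`, `mse` and `search_first_track` are hypothetical.

```python
def convolve(x, h):
    """Direct-form linear convolution of two lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def mse(a, b):
    """Mean square error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(b)

def search_first_track(target, h, track_positions, n_select):
    """Try a unit pulse at each allowed position of the track, convolve it
    with the impulse response h, and keep the n_select positions whose
    synthesized signal is closest to the target (smallest MSE first)."""
    scored = []
    for pos in track_positions:
        excitation = [0.0] * len(target)
        excitation[pos] = 1.0  # unit pulse only at the position under test
        synth = convolve(excitation, h)[:len(target)]
        scored.append((mse(synth, target), pos))
    scored.sort()
    return [pos for _, pos in scored[:n_select]]

h = [1.0, 0.5]                                      # toy impulse response
target = [0.0, 0.0, 0.0, 0.0, 0.9, 0.4, 0.0, 0.0]   # toy target signal
print(search_first_track(target, h, [0, 2, 4, 6], n_select=1))
```

The list `track_positions` stands in for the codebook, which specifies where in the track pulses may exist.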
- the position of the selected pulses in the first rank track is fixed, and the MSE for the target signal is calculated at the positions of the pulses in the second rank track (S1825).
- the MSE between the convolution value and the target signal in the second rank track can be calculated for each pulse.
- a predetermined number more pulses than the number of pulses to be encoded may be selected as candidate pulses of the second rank track.
- Here, the pulses selected in the first-rank track are present at their positions, and among the pulses of the second-rank track only the pulse that is the current MSE calculation target is present (a unit-size pulse is set only at the position of the MSE calculation target pulse and the pulse size at the other positions is set to 0), and convolution with the impulse response is then performed. In this way, the MSE between the convolution value and the target signal in the second-rank track may be calculated for each pulse in consideration of the pulses selected in the first-rank track.
- pulses are selected in the second rank track by the number of pulses to be encoded in the second rank track.
- the number of pulses to be encoded in the second rank track is C2 (C2 is an integer greater than 0) and additionally c2 (c2 is an integer of 0 or more) pulses are searched.
- the C2 + c2 second rank track pulses are convolved with the impulse response, respectively, along with the C1 first rank track pulses.
- C2 pulses may be selected as encoding target pulses of the second rank track in the order in which the MSE between each pulse of the second rank track convolved with the pulses of the first rank track and the target signal is small.
- For the third-rank track and the subsequent tracks, candidate pulses may likewise be searched and the pulses to be encoded selected from among them. For example, after searching the third-rank track for a predetermined number more candidate pulses than the number of pulses to be encoded, the MSE between the target signal and the convolution value may be calculated with the pulses selected in the first-rank and second-rank tracks included.
- the encoding target pulse may be selected based on the MSE value calculated in consideration of the pulses selected in the first rank track and the second rank track.
- the candidate pulses for each track are sequentially searched according to the importance, the pulses to be encoded are selected, and the candidate pulses are searched up to the lowest priority track.
- With the pulses of the higher-rank tracks fixed, the MSE is calculated at each pulse position of the lowest-rank track (S1835). Assuming that a track pair consists of k tracks, if the number of pulses to be encoded in the lowest priority track (k-rank importance track) is Ck (Ck is an integer greater than 0), a predetermined number ck (ck is an integer greater than or equal to 0) more pulses than that, that is, Ck + ck pulses, can be retrieved as candidate pulses of the lowest priority track. For example, the MSE between the convolution value and the target signal in the lowest-rank track can be calculated for each pulse by setting only the pulse for which the MSE is being calculated in the lowest-rank track and performing convolution with the impulse response. For the lowest-rank track, Ck + ck candidate pulses can be retrieved in order of increasing MSE.
- Here, the pulses selected in the rank-1 to rank-(k-1) tracks are present at their positions, and among the pulses of the rank-k (lowest-rank) track only the pulse currently subject to the MSE calculation is set (unit pulses are placed at the positions of the pulses selected in the rank-1 to rank-(k-1) tracks and at the position of the current MSE calculation target pulse in the rank-k track, and the pulse size at all other positions is set to 0), and convolution with the impulse response is then performed. In this way, the MSE between the convolution value and the target signal in the lowest-rank track may be calculated for each pulse in consideration of the pulses selected in the previous-rank tracks.
- Among the candidate pulses of the lower-energy track, pulses may be selected by comparing the MSEs calculated in consideration of the pulses of the higher-energy tracks that have already been searched (S1840). That is, in the lowest-rank track, as many pulses as are to be encoded in that track are selected in order of increasing MSE, where the MSE is calculated in consideration of the previous-rank tracks.
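The rank-by-rank search with previously selected pulses held fixed can be sketched as follows. The text leaves some room for interpretation, so this is one possible reading: candidates (C + c per track) are found with the trial pulse alone, and the final C pulses are chosen with the pulses of the higher-rank tracks included in the excitation. Unit amplitudes, disjoint track positions, and all function names are assumptions.

```python
def convolve(x, h):
    """Direct-form linear convolution of two lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(b)

def score(target, h, positions, fixed):
    """MSE for each trial position, with unit pulses at `fixed` positions
    and at the trial position, zero elsewhere. Smallest MSE first."""
    scored = []
    for pos in positions:
        excitation = [0.0] * len(target)
        for f in fixed:
            excitation[f] = 1.0
        excitation[pos] = 1.0
        synth = convolve(excitation, h)[:len(target)]
        scored.append((mse(synth, target), pos))
    return sorted(scored)

def sequential_search(target, h, tracks_in_rank_order, n_select, n_extra):
    """Search tracks from highest to lowest rank, fixing selected pulses."""
    fixed, result = [], []
    for positions in tracks_in_rank_order:
        # Stage 1: C + c candidates, trial pulse considered on its own.
        cands = [p for _, p in score(target, h, positions, [])][:n_select + n_extra]
        # Stage 2: C targets, higher-rank pulses kept in the excitation.
        chosen = [p for _, p in score(target, h, cands, fixed)][:n_select]
        result.append(chosen)
        fixed.extend(chosen)
    return result

h = [1.0, 0.5]
target = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0]
even, odd = [0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]
print(sequential_search(target, h, [even, odd], n_select=1, n_extra=1))
```

The returned lists follow the search rank; for quantization and bitstream construction they would be mapped back to the original track order.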
- Information of the pulses selected in the entire track is quantized (S1845).
- the information of the quantized pulses may include at least one of a position of the pulse, a magnitude of the pulse, and a sign of the pulse.
- the position of the pulse can be changed to quantize information indicating the sign of the selected pulse.
- The position change makes it possible to transmit only one sign bit per track. If the two selected pulses in the track have the same sign, the pulse with the larger absolute value is placed in the front position. If the two pulses have different signs, the pulse with the smaller absolute value is placed in the front position.
- the position of the pulse may or may not change depending on whether the signs of the two selected pulses within the same track are the same or different.
- the position, magnitude and / or sign of the pulse is quantized
- the searching and selection of candidate pulses is performed for each track constituting the track pair, but the present invention is not limited thereto.
- A convolution with the impulse response may also be obtained, for each track, by including the pulses selected from the higher-importance tracks.
- the base layer (layer 6) to which the CELP mode is applied at least two or more tracks have a structure in which track pairs can be configured. Therefore, in the example of FIG. Need not be performed.
- A track pair may not exist, and thus the existence of a track pair may be determined before comparing the energy of each track. If no track pair exists, the encoding target pulses may be selected, for each track, in order of increasing MSE between the convolution of each pulse and the original signal.
- the method of FIG. 18 may also be applied to the encoders of FIGS. 1 and 16.
- The SWB encoder 130 of FIG. 1 may be switched to a CELP-based enhancement layer unit, and the MDCT-based enhancement layer unit 1630 of FIG. 16 may likewise be switched to a CELP-based enhancement layer unit.
- the enhancement layer unit may process higher layers of layer 6 or more that process the SWB signal.
- layers 6 or more layers may be processed based on MDCT
- layers 6 or more layers may be processed based on CELP.
- steps of FIG. 18 may be applied to all tracks in order so that the encoding target pulse may be determined in all tracks of the target signal.
- the determination of the candidate pulse is referred to as 'search', and the determination of the encoding target pulse is referred to as 'selection'.
- The present invention is not limited thereto, and 'search' and 'selection' may be used interchangeably.
- candidate pulses may be retrieved or selected.
- FIG. 19 is a flowchart schematically illustrating an example of an audio signal encoding method according to the present invention.
- the encoder determines an encoding target pulse (S1910).
- the encoder may determine the importance of the tracks constituting the track pair according to the track-specific energy of the audio signal, and determine the encoding target pulse by searching for the pulses from the track of high importance.
- the encoder may select (1) a pulse at a position adjacent to a pulse selected as an encoding target pulse in a track of higher importance among tracks constituting the track pair as an encoding target pulse of the current track. Also, the encoder may select (2) the pulse of the position furthest from the pulse selected as the encoding target pulse in the track of the higher importance among the tracks constituting the track pair as the encoding target pulse of the current track.
- the encoder may search for the pulses to be encoded in the tracks of the most significant importance (search for the same number of pulses as the pulses to be encoded), and may search for a predetermined number more pulses than the number of the pulses to be encoded in the tracks below the most significant importance.
- The pulses may be searched in descending order of absolute value, that is, starting from the pulse with the largest absolute value.
- the magnitude of the absolute value may be determined based on Equation 2.
- the encoder may select, as the encoding target pulses of the current track, pulses adjacent to the pulse selected as the encoding target pulse in the track of the higher importance among the retrieved pulses as described above.
- the encoder may select, as the encoding target pulse of the current track, a pulse that is farthest from the pulse selected as the encoding target pulse in the track of the higher importance among the retrieved pulses.
- When there are a plurality of combinations of selectable pulses, the encoding target pulses can be selected starting from the largest absolute value among the pulses found in the current track.
- the encoding target pulses may be selected in the order of the largest absolute value among the pulses searched in the current track.
- the method of (1) is as described in detail in the examples of FIGS. 12 and 13.
- the method of (2) is as described in detail in the examples of FIGS. 14 and 15.
- the encoder may select the encoding target pulse based on the positional relationship with the pulse selected as the encoding target pulse in the track of the higher importance among the tracks constituting the track pair.
- the selection criteria of the encoding target pulse may be adaptively determined.
- the encoder selects, as the encoding target pulse of the current track, pulses adjacent to the pulse selected as the encoding target pulse in the track of higher importance among the tracks constituting the track pair.
- Alternatively, pulses that are separated from the pulse selected as the encoding target pulse in the track of higher importance among the tracks constituting the track pair may be selected as the encoding target pulse of the current track.
- If the tracks constituting the track pair are tonal, the encoder may select, as the encoding target pulse of the current track, pulses adjacent to the pulse selected as the encoding target pulse in the track of higher importance among the tracks constituting the track pair. If the tracks constituting the track pair are not tonal, pulses that are separated from the pulse selected as the encoding target pulse in the track of higher importance may be selected as the encoding target pulse of the current track.
- the related method is as described in detail in the example of FIG. 17.
- The encoder may also select the encoding target pulses of the current track in order of increasing mean square error (MSE) between the audio signal and the convolution with the impulse response, where the convolution is based on the pulses selected as encoding target pulses in the track of higher importance.
- the convolution may be a convolution of a pulse selected as a pulse to be encoded in a track of higher importance and one of the pulses searched in the current track with an impulse response.
- Convolution may also be used in the process of searching for pulses in each track.
- the convolution may be a convolution of an impulse response and a specific pulse in the current track.
- Pulses can be retrieved as candidate pulses of the current track in order of increasing MSE between this convolution and the audio signal.
- the related method is as described in detail in the example of FIG. 18.
- the position of the pulse in the track can be changed in consideration of the sign of the pulse.
- the content is as described above.
- the encoder quantizes the selected encoding target pulse (S1920). Quantized pulses may be encoded and transmitted or stored in a bitstream.
- FIG. 20 is a flowchart schematically illustrating an example of an audio signal decoding method according to the present invention.
- the decoder generates a pulse for an audio signal (S2010).
- the pulse for the audio signal may be derived from the received audio data based on dequantization.
- the pulses were searched or selected from the tracks having the highest importance among the tracks constituting the track pair in the audio signal.
- the pulses at positions adjacent to the pulse selected as the encoding target pulse in the track of higher importance may be pulses for the current track.
- the related contents are the same as those described in detail with reference to FIGS. 12 and 13.
- the pulses farthest from the pulse selected as the encoding target pulse in the track of the higher importance may be pulses for the current track.
- the related contents are the same as those described with reference to FIGS. 14 and 15.
- Whether the pulses correspond to (1) or to (2) may be adaptively determined according to the characteristics of the audio signal.
- the contents are the same as those described in the example of FIG. 17.
- The pulses corresponding to (2) may be pulses selected in ascending order of the mean square error (MSE) between the 'convolution' and the 'audio signal', where the convolution is taken with the impulse response and is based on the pulses selected as encoding target pulses in the track of higher importance.
- the convolution may be a convolution of one pulse among pulses selected as encoding target pulses in a track of higher importance and pulses searched in the current track.
- The pulses retrieved in the current track may be pulses selected in ascending order of the MSE between the audio signal and their convolution with the impulse response.
- the position of the pulse in the track may be a position changed in consideration of the sign of the pulse.
- the content is as described above.
- the decoder may reconstruct the audio signal based on the generated pulses (S2020).
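
The MSE-based ranking of candidate pulses described above can be illustrated with a minimal sketch. It is not the procedure of FIG. 18, which is not reproduced in this section. It assumes unit-amplitude signed pulses, a hypothetical track given as a list of sample positions, and hypothetical helper names (`convolve`, `mse`, `rank_candidates`). Each candidate pulse of the current track is combined with the pulses already fixed in the track of higher importance, the result is convolved with the impulse response, and the candidates are sorted by MSE against the target signal, smallest first.

```python
from typing import List, Sequence, Tuple


def convolve(x: Sequence[float], h: Sequence[float], n: int) -> List[float]:
    """Linear convolution of x with h, truncated to the first n samples."""
    y = [0.0] * n
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # skip empty positions of the sparse excitation
        for j, hj in enumerate(h):
            if i + j >= n:
                break
            y[i + j] += xi * hj
    return y


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean square error between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def rank_candidates(
    target: Sequence[float],
    h: Sequence[float],
    fixed_pulses: Sequence[Tuple[int, float]],
    track_positions: Sequence[int],
    signs: Sequence[float] = (1.0, -1.0),
) -> List[Tuple[float, int, float]]:
    """Return (mse, position, sign) tuples sorted by ascending MSE.

    fixed_pulses holds (position, amplitude) pairs already chosen in the
    track of higher importance (an assumption of this sketch).
    """
    n = len(target)
    results = []
    for pos in track_positions:
        for sign in signs:
            excitation = [0.0] * n
            for p, amp in fixed_pulses:
                excitation[p] += amp
            excitation[pos] += sign  # candidate pulse in the current track
            synth = convolve(excitation, h, n)
            results.append((mse(synth, target), pos, sign))
    results.sort(key=lambda r: r[0])  # smallest MSE first
    return results


# Toy example: every value below is made up for illustration.
h = [1.0, 0.5]
target = convolve([1.0, 0, 0, 0, 0, -1.0, 0, 0], h, 8)
ranked = rank_candidates(target, h, fixed_pulses=[(0, 1.0)],
                         track_positions=[1, 3, 5, 7])
best_mse, best_pos, best_sign = ranked[0]
```

In this toy setup the target was synthesized from a pulse at position 0 and a negative pulse at position 5, so the first entry of `ranked` is the candidate at position 5 with negative sign. A real codec would typically use a perceptually weighted target and a faster correlation-based search rather than this exhaustive loop.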
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201261691275P | 2012-08-21 | 2012-08-21 | |
| US61/691,275 | 2012-08-21 | | |
| US201361842396P | 2013-07-03 | 2013-07-03 | |
| US61/842,396 | 2013-07-03 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014030928A1 (fr) | 2014-02-27 |
Family
ID=50150168
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2013/007505, Ceased, WO2014030928A1 (fr) | 2012-08-21 | 2013-08-21 | Audio signal encoding method, audio signal decoding method, and apparatus implementing the methods |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2014030928A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2018525606A (ja) * | 2015-07-30 | 2018-09-06 | Koninklijke Philips N.V. | Laser sensor for particle density detection |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20030009325A1 (en) * | 1998-01-22 | 2003-01-09 | Raif Kirchherr | Method for signal controlled switching between different audio coding schemes |
| US20040064310A1 (en) * | 2001-05-07 | 2004-04-01 | Yutaka Banba | Sub-band adaptive differential pulse code modulation/encoding apparatus, sub-band adaptive differential pulse code modulation/encoding method, wireless transmission system, sub-band adaptive differential pulse code modulation/decoding apparatus, sub-band adaptive differential pulse code modulation/decoding method, and wirel |
| KR20100086032A (ko) * | 2007-11-06 | 2010-07-29 | Nokia Corporation | Audio coding apparatus and method thereof |
| KR20100093504A (ko) * | 2009-02-16 | 2010-08-25 | Electronics and Telecommunications Research Institute | Method and apparatus for encoding and decoding an audio signal using adaptive sinusoidal pulse coding |
| WO2011087332A2 (fr) * | 2010-01-15 | 2011-07-21 | LG Electronics Inc. | Method and apparatus for processing an audio signal |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| JP4950210B2 (ja) | Audio compression | |
| CA2853987C (fr) | Scalable compressed audio bit stream; encoder/decoder using a hierarchical filterbank and multichannel joint coding | |
| CN101518083B (zh) | Method and system for encoding and/or decoding an audio signal using bandwidth extension and stereo coding | |
| EP2212884B1 (fr) | Encoder | |
| Ravelli et al. | Union of MDCT bases for audio coding | |
| KR20100085994A (ko) | Scalable speech and audio encoding using combinatorial encoding of the MDCT spectrum | |
| CN101849258A (zh) | Technique for encoding/decoding of codebook indices for quantized MDCT spectrum in scalable speech and audio codecs | |
| JP2006189836A (ja) | Wideband speech encoding system and wideband speech decoding system, high-band speech encoding and high-band speech decoding apparatus, and methods thereof | |
| KR19990077753A (ko) | Audio signal encoding apparatus, audio signal decoding apparatus, and audio signal encoding/decoding apparatus | |
| EP1441330B1 (fr) | Method and device for encoding/decoding audio signals based on time/frequency correlation | |
| KR102048076B1 (ko) | Speech signal encoding method, speech signal decoding method, and apparatus using the same | |
| JP5629319B2 (ja) | Apparatus and method for efficiently encoding quantization parameters of spectral coefficient coding | |
| WO2014030928A1 (fr) | Audio signal encoding method, audio signal decoding method, and apparatus implementing the methods | |
| US20100292986A1 (en) | encoder | |
| WO2009022193A2 (fr) | Encoder | |
| US8924202B2 (en) | Audio signal coding system and method using speech signal rotation prior to lattice vector quantization | |
| US20100280830A1 (en) | Decoder | |
| RU2409874C9 (ru) | Audio signal compression | |
| WO2026012380A1 (fr) | Audio processing method, apparatus and system, electronic device, storage medium, and computer program product | |
| CN101685637A (zh) | Audio encoding method and apparatus, and audio decoding method and apparatus | |
| Vasilache | Entropic encoding of lattice codevectors based on product code indexing | |
| WO2008114078A1 (fr) | Encoder | |
| HK1144851A (en) | Technique for encoding/decoding of codebook indices for quantized mdct spectrum in scalable speech and audio codecs | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13830902; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 13830902; Country of ref document: EP; Kind code of ref document: A1 |