HK1241131B - Method, apparatus and computer readable medium for decoding HOA audio signals
- Publication number: HK1241131B (HK18100478.1A)
- Authority: HK (Hong Kong)
- Prior art keywords: channel, hoa, rotation, signal, decoding
Description
This application is a divisional of patent application No. 201380036698.6, filed on July 16, 2013, entitled "Method and device for encoding multi-channel HOA audio signals for noise reduction, and method and device for decoding multi-channel HOA audio signals for noise reduction".
Technical Field
The present invention relates to a method and an apparatus for encoding multi-channel Higher Order Ambisonics audio signals for noise reduction, and to a method and an apparatus for decoding multi-channel Higher Order Ambisonics audio signals for noise reduction.
Background Art
Higher Order Ambisonics (HOA) is a multi-channel sound field representation [4], and HOA signals are multi-channel audio signals. Playing back certain multi-channel audio signal representations, in particular HOA representations, on a particular loudspeaker setup requires a special rendering, which usually involves a matrixing operation. After decoding, the Ambisonics signals are "matrixed", i.e. mapped to new audio signals that correspond to, for example, the actual spatial positions of the loudspeakers. Usually there is a high cross-correlation between the individual channels.
A problem is that the coding noise is observed to increase after the matrixing operation. The reason appears to be unknown in the prior art. The effect also occurs when the HOA signals are transformed to the spatial domain, e.g. by a Discrete Spherical Harmonics Transform (DSHT), prior to compression with perceptual coders.
A usual method for compressing Higher Order Ambisonics audio signal representations is to apply independent perceptual coders to the individual Ambisonics coefficient channels [7]. In particular, the perceptual coders only consider coding noise masking effects that occur within each individual single-channel signal. However, such effects are typically non-linear. If such single channels are matrixed into new signals, noise unmasking is likely to occur. This effect also occurs when the Higher Order Ambisonics signals are transformed to the spatial domain by the Discrete Spherical Harmonics Transform prior to compression with perceptual coders [8].
The transmission or storage of such multi-channel audio signal representations usually demands appropriate multi-channel compression techniques. Usually, a channel-independent perceptual decoding is performed before finally matrixing the $I$ decoded signals $\hat{x}_i$, $i = 1, \ldots, I$, into $J$ new signals $\hat{y}_j$, $j = 1, \ldots, J$. The term matrixing means adding or mixing the decoded signals in a weighted manner. All decoded signals and all new signals are arranged in vectors according to
$$\hat{x} := [\hat{x}_1, \ldots, \hat{x}_I]^T, \qquad \hat{y} := [\hat{y}_1, \ldots, \hat{y}_J]^T$$
The term "matrixing" originates from the fact that $\hat{y}$ is mathematically obtained from $\hat{x}$ through the matrix operation
$$\hat{y} = A\,\hat{x}$$
where $A$ denotes a mixing matrix composed of mixing weights. The terms "mixing" and "matrixing" are used synonymously herein. Mixing/matrixing serves the purpose of rendering audio signals for any particular loudspeaker setup. The particular individual loudspeaker setup on which the matrix depends, and thus the matrix that is used for matrixing during rendering, is usually not known at the perceptual coding stage.
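The matrixing step can be illustrated with a minimal Python sketch. The helper name `matrix_signals`, the 2×3 mixing matrix and the sample values are hypothetical and chosen only for illustration; in practice the matrix depends on the loudspeaker setup.

```python
def matrix_signals(mixing_matrix, decoded):
    """Mix I decoded samples into J new samples (y = A x).

    mixing_matrix: J rows of I mixing weights (hypothetical values below).
    decoded: list of I decoded samples for one time index.
    """
    return [sum(w * x for w, x in zip(row, decoded)) for row in mixing_matrix]


# Hypothetical example: I = 3 decoded channels mixed into J = 2 new signals.
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
x_hat = [2.0, 4.0, 6.0]
y_hat = matrix_signals(A, x_hat)
print(y_hat)  # [4.0, 8.0]
```

Each output sample is a weighted sum of all input channels, which is why noise that was masked in an individual channel can become audible after mixing.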
Summary of the Invention
The present invention provides an improvement in encoding and/or decoding multi-channel Higher Order Ambisonics audio signals so as to obtain noise reduction. In particular, the invention provides a way to suppress coding noise de-masking for 3D audio rate compression.
The invention describes technologies for an adaptive Discrete Spherical Harmonics Transform (aDSHT) that minimizes (undesired) noise unmasking effects. Further, it is described how the aDSHT can be integrated within a compressive coder architecture. The technology described is particularly advantageous at least for HOA signals. One advantage of the invention is that the amount of side information to be transmitted is reduced. In principle, only a rotation axis and a rotation angle need to be transmitted. The DSHT sampling grid can be signalled indirectly by the number of channels transmitted. This amount of side information is very small compared to other approaches, such as the Karhunen-Loève transform (KLT), where more than half of the correlation matrix needs to be transmitted.
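The difference in side information can be made concrete with a rough count. The sketch below rests on two assumptions that are not stated in the text: "more than half of the correlation matrix" is taken to mean the upper triangle including the diagonal of a symmetric matrix with one row per HOA channel, and the rotation is taken to be described by three values (an axis given by two angles, plus the rotation angle). The channel count $(N+1)^2$ for HOA order $N$ is the 3D coefficient count used later in the description.

```python
def hoa_channels(order):
    """Number of 3D HOA coefficient channels for a given order N."""
    return (order + 1) ** 2


def klt_side_values(num_channels):
    """Assumed KLT side information: upper triangle incl. diagonal."""
    return num_channels * (num_channels + 1) // 2


# Assumption: axis as two angles plus one rotation angle.
ROTATION_SIDE_VALUES = 3

for order in (1, 2, 3, 4):
    channels = hoa_channels(order)
    print(order, channels, klt_side_values(channels), ROTATION_SIDE_VALUES)
```

Under these assumptions the KLT side information grows quadratically with the number of channels, while the rotation parameters stay constant.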
According to one embodiment of the invention, a method for encoding multi-channel HOA audio signals for noise reduction comprises the steps of: decorrelating the channels using an inverse adaptive DSHT, the inverse adaptive DSHT comprising a rotation operation and an inverse DSHT (iDSHT), with the rotation operation rotating the spatial sampling grid of the iDSHT; perceptually encoding each of the decorrelated channels; encoding rotation information, the rotation information comprising parameters defining said rotation operation; and transmitting or storing the perceptually encoded audio channels and the encoded rotation information. The step of decorrelating the channels using an inverse adaptive DSHT is in principle a spatial encoding step.
According to one embodiment of the invention, a method for decoding coded multi-channel HOA audio signals with reduced noise comprises the steps of: receiving encoded multi-channel HOA audio signals and channel rotation information; decompressing the received data, wherein perceptual decoding is used; spatially decoding each channel using an adaptive DSHT (aDSHT); correlating the perceptually and spatially decoded channels, wherein a rotation of the spatial sampling grid of the aDSHT according to said rotation information is performed; and matrixing the correlated perceptually and spatially decoded channels, wherein reproducible audio signals mapped to loudspeaker positions are obtained.
An apparatus for encoding multi-channel HOA audio signals is disclosed. An apparatus for decoding multi-channel HOA audio signals is also disclosed.
In one aspect, a computer readable medium has executable instructions that cause a computer to perform a method for encoding comprising the steps disclosed above, or a method for decoding comprising the steps disclosed above. Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the figures.
Brief Description of the Drawings
Exemplary embodiments of the invention are described with reference to the accompanying drawings, in which:
Fig. 1 shows a known encoder and decoder for rate compression of a block of M coefficients;
Fig. 2 shows a known encoder and decoder that transform an HOA signal into the spatial domain using a conventional DSHT (Discrete Spherical Harmonics Transform) and a conventional inverse DSHT;
Fig. 3 shows an encoder and decoder that transform an HOA signal into the spatial domain using an adaptive DSHT and an adaptive inverse DSHT;
Fig. 4 shows a test signal;
Fig. 5 shows examples of spherical sampling positions for a codebook used in encoder and decoder building blocks;
Fig. 6 shows signal-adaptive DSHT building blocks (pE and pD);
Fig. 7 shows a first embodiment of the invention;
Fig. 8 shows a flow chart of the encoding process and the decoding process; and
Fig. 9 shows a second embodiment of the invention.
Detailed Description
Fig. 2 shows a known system in which an HOA signal is transformed into the spatial domain using an inverse DSHT. The signal is transformed using the iDSHT 21, rate compressed E1 and decompressed D1, and re-transformed to the coefficient domain S24 using the DSHT 24. In contrast, Fig. 3 shows a system according to one embodiment of the invention: the DSHT processing blocks of the known solution are replaced by processing blocks 31, 34 that control an inverse adaptive DSHT and an adaptive DSHT, respectively. Side information SI is transmitted within the bitstream bs. The system comprises elements of an apparatus for encoding multi-channel HOA audio signals and elements of an apparatus for decoding multi-channel HOA audio signals.
In one embodiment, an apparatus ENC for encoding multi-channel HOA audio signals for noise reduction includes a decorrelator 31 for decorrelating the channels B using an inverse adaptive DSHT (iaDSHT), the inverse adaptive DSHT including a rotation operation unit 311 and an inverse DSHT (iDSHT) 310. The rotation operation unit rotates the spatial sampling grid of the iDSHT. The decorrelator 31 provides decorrelated channels Wsd and side information SI that includes rotation information. Further, the apparatus includes a perceptual encoder 32 for perceptually encoding each of the decorrelated channels Wsd, and a side information encoder for encoding the rotation information. The rotation information comprises parameters defining said rotation operation. The perceptual encoder 32 provides perceptually encoded audio channels and the encoded rotation information, thus reducing the data rate. Finally, the apparatus for encoding comprises interface means 320 for creating a bitstream bs from the perceptually encoded audio channels and the encoded side information, and for transmitting or storing the bitstream bs.
An apparatus DEC for decoding multi-channel HOA audio signals with reduced noise includes interface means 330 for receiving encoded multi-channel HOA audio signals and channel rotation information, and a decompression module 33 for decompressing the received data, which includes a perceptual decoder for perceptually decoding each channel. The decompression module 33 provides recovered perceptually decoded channels W'sd and recovered side information SI'. Further, the apparatus for decoding includes a correlator 34 for correlating the perceptually decoded channels W'sd using an adaptive DSHT (aDSHT), wherein a DSHT and a rotation of the spatial sampling grid of the DSHT according to said rotation information are performed, and a mixer MX for matrixing the correlated perceptually decoded channels, wherein reproducible audio signals mapped to loudspeaker positions are obtained. At least the aDSHT can be performed in a DSHT unit 340 within the correlator 34. In one embodiment, the rotation of the spatial sampling grid is done in a grid rotation unit 341, which in principle re-calculates the original DSHT sampling points. In another embodiment, the rotation is performed within the DSHT unit 340.
A mathematical model that defines and describes unmasking is given in the following. Assume a given discrete-time multi-channel signal consisting of $I$ channels $x_i(m)$, $i = 1, \ldots, I$, where $m$ denotes the time sample index. The individual signals may be real or complex valued. Consider a frame of $M$ samples beginning with the time sample index $m_{\mathrm{START}} + 1$, in which the individual signals are assumed to be stationary. The corresponding samples are arranged within the matrix $X$ according to
$$X := [x(m_{\mathrm{START}}+1), \ldots, x(m_{\mathrm{START}}+M)] \tag{1}$$
where
$$x(m) := [x_1(m), \ldots, x_I(m)]^T \tag{2}$$
with $(\cdot)^T$ denoting transposition. The corresponding empirical correlation matrix is given by
$$\Sigma_X := X X^H \tag{3}$$
where $(\cdot)^H$ denotes the joint complex conjugation and transposition.
Now assume that the multi-channel signal frame is coded, thereby introducing coding error noise at reconstruction. Thus, the matrix of the reconstructed frame samples, denoted by $\hat{X}$, is composed of the true sample matrix $X$ and a coding noise component $E$ according to
$$\hat{X} = X + E \tag{4}$$
where
$$E := [e(m_{\mathrm{START}}+1), \ldots, e(m_{\mathrm{START}}+M)] \tag{5}$$
and
$$e(m) := [e_1(m), \ldots, e_I(m)]^T \tag{6}$$
Since each channel is assumed to have been coded independently, the coding noise signals $e_i(m)$ can be assumed to be independent of each other for $i = 1, \ldots, I$. Exploiting this property and the assumption that the noise signals are zero-mean, the empirical correlation matrix of the noise signals is given by a diagonal matrix:
$$\Sigma_E = \operatorname{diag}(\sigma^2_{e_1}, \ldots, \sigma^2_{e_I}) \tag{7}$$
Here, $\operatorname{diag}(\cdot)$ denotes a diagonal matrix with the empirical noise signal powers $\sigma^2_{e_i}$, $i = 1, \ldots, I$, on its diagonal (equation (8)).
A further essential assumption is that the coding is performed such that a predefined signal-to-noise ratio (SNR) is satisfied for each channel. Without loss of generality, the predefined SNR is assumed to be equal for each channel, i.e.
$$\mathrm{SNR}_x = \frac{\sigma^2_{x_i}}{\sigma^2_{e_i}} \quad \text{for all } i = 1, \ldots, I \tag{9}$$
where $\sigma^2_{x_i}$ denotes the empirical power of the $i$-th signal $x_i(m)$, i.e. the $i$-th diagonal element of $\Sigma_X$ (equation (10)).
From now on, consider the matrixing of the reconstructed signals into $J$ new signals $y_j(m)$, $j = 1, \ldots, J$. Without introducing any coding error, the sample matrix of the matrixed signals may be expressed as
$$Y = A X \tag{11}$$
where $A \in \mathbb{C}^{J \times I}$ denotes the mixing matrix and where
$$Y := [y(m_{\mathrm{START}}+1), \ldots, y(m_{\mathrm{START}}+M)] \tag{12}$$
with
$$y(m) := [y_1(m), \ldots, y_J(m)]^T \tag{13}$$
However, due to the coding noise, the sample matrix of the matrixed signals is given by
$$\hat{Y} = Y + N \tag{14}$$
where $N$ is the matrix containing the samples of the matrixed noise signals. It can be expressed as
$$N = A E \tag{15}$$
$$N = [n(m_{\mathrm{START}}+1), \ldots, n(m_{\mathrm{START}}+M)] \tag{16}$$
where
$$n(m) := [n_1(m), \ldots, n_J(m)]^T \tag{17}$$
is the vector of all matrixed noise signals at time sample index $m$.
Using equation (11), the empirical correlation matrix of the matrixed noise-free signals can be formulated as
$$\Sigma_Y = A\, \Sigma_X A^H \tag{18}$$
Thus, the empirical power of the $j$-th matrixed noise-free signal, which is the $j$-th element on the diagonal of $\Sigma_Y$, may be written as
$$\sigma^2_{y_j} = a_j^H\, \Sigma_X\, a_j \tag{19}$$
where $a_j$ is the $j$-th column of $A^H$ according to
$$A^H = [a_1, \ldots, a_J] \tag{20}$$
Similarly, with equation (15), the empirical correlation matrix of the matrixed noise signals can be written as
$$\Sigma_N = A\, \Sigma_E A^H \tag{21}$$
The empirical power of the $j$-th matrixed noise signal, which is the $j$-th element on the diagonal of $\Sigma_N$, is given by
$$\sigma^2_{n_j} = a_j^H\, \Sigma_E\, a_j \tag{22}$$
Consequently, the empirical SNR of the matrixed signals, defined by
$$\mathrm{SNR}_{y_j} := \frac{\sigma^2_{y_j}}{\sigma^2_{n_j}} \tag{23}$$
can be reformulated using equations (19) and (22) as
$$\mathrm{SNR}_{y_j} = \frac{a_j^H\, \Sigma_X\, a_j}{a_j^H\, \Sigma_E\, a_j} \tag{24}$$
By decomposing $\Sigma_X$ into its diagonal and non-diagonal components as
$$\Sigma_X = \Sigma_{X,D} + \Sigma_{X,NG} \tag{25}$$
with
$$\Sigma_{X,D} := \operatorname{diag}(\sigma^2_{x_1}, \ldots, \sigma^2_{x_I}) \tag{26}$$
and
$$\Sigma_{X,NG} := \Sigma_X - \Sigma_{X,D} \tag{27}$$
and by exploiting the property
$$\Sigma_{X,D} = \mathrm{SNR}_x \cdot \Sigma_E \tag{28}$$
which results from assumptions (7) and (9) with an SNR that is constant over all channels, the desired expression for the empirical SNR of the matrixed signals is finally obtained:
$$\mathrm{SNR}_{y_j} = \mathrm{SNR}_x \left(1 + \frac{a_j^H\, \Sigma_{X,NG}\, a_j}{a_j^H\, \Sigma_{X,D}\, a_j}\right) \tag{29}$$
From this expression it can be seen that this SNR is obtained from the predefined SNR, $\mathrm{SNR}_x$, by multiplication with a term that depends on the diagonal and non-diagonal components of the signal correlation matrix $\Sigma_X$. In particular, if the signals $x_i(m)$ are uncorrelated with each other, so that $\Sigma_{X,NG}$ becomes a zero matrix, the empirical SNR of the matrixed signals is equal to the predefined SNR, i.e.
$$\mathrm{SNR}_{y_j} = \mathrm{SNR}_x \quad \text{for all } j = 1, \ldots, J, \ \text{if } \Sigma_{X,NG} = 0_{I \times I} \tag{30}$$
where $0_{I \times I}$ denotes a zero matrix with $I$ rows and $I$ columns. Conversely, if the signals $x_i(m)$ are correlated, the empirical SNR of the matrixed signals may deviate from the predefined SNR. In the worst case, $\mathrm{SNR}_{y_j}$ can be much lower than $\mathrm{SNR}_x$. This phenomenon is referred to herein as noise unmasking at matrixing.
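The effect can be checked numerically with a small Python sketch. It evaluates the ratio of the matrixed signal power $a_j^H \Sigma_X a_j$ (the diagonal of (18)) to the matrixed noise power $a_j^H \Sigma_E a_j$ (the diagonal of (21)) for one row of the mixing matrix. All numbers are hypothetical: two real-valued channels, a per-channel SNR of 100 (20 dB), and mixing weights that form the difference of the two channels.

```python
def quad_form(a, S):
    """Return a^T S a for a real vector a and a real square matrix S."""
    n = len(a)
    return sum(a[i] * S[i][j] * a[j] for i in range(n) for j in range(n))


def matrixed_snr(a, sigma_x, sigma_e):
    """Empirical SNR of one matrixed signal: signal power over noise power."""
    return quad_form(a, sigma_x) / quad_form(a, sigma_e)


# Hypothetical two-channel example with a per-channel SNR of 100 (20 dB).
SIGMA_E = [[0.01, 0.0], [0.0, 0.01]]        # independent coding noise
SIGMA_X_UNCORR = [[1.0, 0.0], [0.0, 1.0]]   # uncorrelated channels
SIGMA_X_CORR = [[1.0, 0.9], [0.9, 1.0]]     # strongly correlated channels
a = [1.0, -1.0]                             # mixing weights: difference signal

snr_uncorr = matrixed_snr(a, SIGMA_X_UNCORR, SIGMA_E)
snr_corr = matrixed_snr(a, SIGMA_X_CORR, SIGMA_E)
print(snr_uncorr)  # about 100: the predefined SNR is preserved
print(snr_corr)    # about 10: the SNR drops by roughly 10 dB
```

In the correlated case the signal components largely cancel in the difference while the independent noise powers add, which is exactly the situation in which the non-diagonal part of $\Sigma_X$ lowers the SNR.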
The following section gives a brief introduction to Higher Order Ambisonics (HOA) and defines the signals to be processed (data rate compression).
Higher Order Ambisonics (HOA) is based on the description of a sound field within a compact area of interest, which is assumed to be free of sound sources. In that case, the spatio-temporal behaviour of the sound pressure $p(t, x)$ at time $t$ and position $x = [r, \theta, \phi]^T$ (in spherical coordinates) within the area of interest is physically fully determined by the homogeneous wave equation. It can be shown that the Fourier transform of the sound pressure with respect to time, i.e.
$$P(\omega, x) = \mathcal{F}_t\{p(t, x)\} \tag{31}$$
where $\omega$ denotes the angular frequency, can be expanded into a series of spherical harmonics (SHs) according to [10]:
$$P(k c_s, x) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} A_n^m(k)\, j_n(k r)\, Y_n^m(\theta, \phi) \tag{32}$$
In equation (32), $c_s$ denotes the speed of sound and $k = \omega / c_s$ the angular wave number. Further, $j_n(\cdot)$ denotes the spherical Bessel function of the first kind and order $n$, and $Y_n^m(\cdot)$ denotes the spherical harmonic (SH) of order $n$ and degree $m$. The complete information about the sound field is actually contained within the sound field coefficients $A_n^m(k)$.
It should be noted that the SHs are complex-valued functions in general. However, by an appropriate linear combination of them it is possible to obtain real-valued functions and to perform the expansion with respect to these functions.
Related to the pressure sound field description in equation (32), a source field can be defined as
$$D(k c_s, \Omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} B_n^m(k)\, Y_n^m(\Omega) \tag{33}$$
where the source field or amplitude density [9] $D(k c_s, \Omega)$ depends on the angular wave number and the angular direction $\Omega = [\theta, \phi]^T$. A source field can consist of far-field/near-field, discrete/continuous sources [1]. According to [1], the source field coefficients $B_n^m$ are related to the sound field coefficients $A_n^m$ by equation (34), in which $h_n^{(2)}$ is the spherical Hankel function of the second kind and $r_s$ is the source distance from the origin.
Signals in the HOA domain can be represented in the frequency domain or in the time domain as the inverse Fourier transform of the source field or sound field coefficients. The following description assumes the use of a time-domain representation of a finite number of source field coefficients (equation (35)).
The finite number means that the infinite series in (33) is truncated at $n = N$. The truncation corresponds to a spatial bandwidth limitation. The number of coefficients (or HOA channels) is given by
$$O_{3D} = (N+1)^2 \quad \text{for 3D} \tag{36}$$
or by $O_{2D} = 2N + 1$ for 2D-only descriptions. The coefficients comprise the audio information of one time sample $m$ for later reproduction by loudspeakers. They can be stored or transmitted and are thus subject to data rate compression. A single time sample $m$ of the coefficients can be represented by a vector $b(m)$ with $O_{3D}$ elements (equation (37)), and a block of $M$ time samples by the matrix $B$:
$$B := [b(m_{\mathrm{START}}+1), b(m_{\mathrm{START}}+2), \ldots, b(m_{\mathrm{START}}+M)] \tag{38}$$
Two-dimensional representations of sound fields can be derived by an expansion with circular harmonics. This can be seen as a special case of the general description presented above, using a fixed inclination, a different weighting of the coefficients and a set reduced to $O_{2D}$ coefficients ($m = \pm n$). Thus, all of the following considerations also apply to 2D representations; the term "sphere" then needs to be replaced by the term "circle".
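The coefficient counts can be reproduced by enumerating the (order, degree) index pairs. This is a small sketch; the function names are hypothetical, and the enumeration order (by increasing $n$, then increasing $m$) is an assumption made only for the illustration.

```python
def sh_indices_3d(order):
    """All (n, m) pairs with 0 <= n <= order and -n <= m <= n."""
    return [(n, m) for n in range(order + 1) for m in range(-n, n + 1)]


def sh_indices_2d(order):
    """Reduced set for 2D: keep only the pairs with m = +n or m = -n."""
    return [(n, m) for (n, m) in sh_indices_3d(order) if abs(m) == n]


N = 2
print(len(sh_indices_3d(N)))  # 9, i.e. (N + 1) ** 2
print(len(sh_indices_2d(N)))  # 5, i.e. 2 * N + 1
print(sh_indices_2d(N))       # [(0, 0), (1, -1), (1, 1), (2, -2), (2, 2)]
```

For orders 1 to 4 the 3D count gives 4, 9, 16 and 25 channels.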
The following describes the transform from the HOA coefficient domain to a channel-based spatial domain and vice versa. Equation (33) can be rewritten using time-domain HOA coefficients for $L_{Sd}$ discrete spatial sample positions $\Omega_l = [\theta_l, \phi_l]^T$, $l = 1, \ldots, L_{Sd}$, on the unit sphere (equation (39)).
Assuming $L_{Sd} = (N+1)^2$ spherical sample positions $\Omega_l$, this can be rewritten in vector notation for an HOA data block $B$:
$$W = \Psi_i B \tag{40}$$
where $W := [w(m_{\mathrm{START}}+1), w(m_{\mathrm{START}}+2), \ldots, w(m_{\mathrm{START}}+M)]$ and $w(m)$ represents a single time sample of the $L_{Sd}$-channel signal. The matrix $\Psi_i$ has $L_{Sd}$ rows and $O_{3D}$ columns and is built from the spherical harmonics evaluated at the sample positions $\Omega_l$. If the spherical sample positions are selected very regularly, a matrix $\Psi_f$ exists with
$$\Psi_f \Psi_i = I \tag{41}$$
where $I$ is the $O_{3D} \times O_{3D}$ identity matrix. Then the transform corresponding to equation (40) can be defined as
$$B = \Psi_f W \tag{42}$$
Equation (42) transforms $L_{Sd}$ spherical signals into the coefficient domain and can be rewritten as a forward transform:
$$B = \mathrm{DSHT}\{W\} \tag{43}$$
where $\mathrm{DSHT}\{\cdot\}$ denotes the Discrete Spherical Harmonics Transform. The corresponding inverse transform transforms $O_{3D}$ coefficient signals into the spatial domain to form $L_{Sd}$ channel-based signals, and equation (40) becomes
$$W = \mathrm{iDSHT}\{B\} \tag{44}$$
This definition of the Discrete Spherical Harmonics Transform is sufficient for the considerations regarding data rate compression of HOA data here, because the starting point is a given block of coefficients $B$ and only the case $B = \mathrm{DSHT}\{\mathrm{iDSHT}\{B\}\}$ is of interest. A more rigorous definition of the Discrete Spherical Harmonics Transform is given in [2]. Suitable spherical sample positions for the DSHT, and procedures to derive such positions, can be reviewed in [3], [4], [6], [5]. Examples of sampling grids are shown in Fig. 5.
In particular, Fig. 5 shows examples of spherical sampling positions for a codebook used in the encoder and decoder building blocks pE, pD, namely for $L_{Sd} = 4$ in Fig. 5 a), for $L_{Sd} = 9$ in Fig. 5 b), for $L_{Sd} = 16$ in Fig. 5 c) and for $L_{Sd} = 25$ in Fig. 5 d).
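The round trip $B = \mathrm{DSHT}\{\mathrm{iDSHT}\{B\}\}$ can be tried out for the smallest case, $N = 1$ with $L_{Sd} = 4$. The following Python sketch is an illustration under stated assumptions, not the transform of the patent: it uses real-valued first-order spherical harmonics with an assumed scaling of $[1, \sqrt{3}y, \sqrt{3}z, \sqrt{3}x]$, and a regular tetrahedron as the sampling grid, for which the inverse-transform matrix has orthogonal columns so that $\Psi_f$ is simply its transpose divided by 4. The coefficient block `B` holds arbitrary example values.

```python
import math


def real_sh_order1(direction):
    """Real first-order SH vector for a unit direction (assumed scaling)."""
    x, y, z = direction
    s = math.sqrt(3.0)
    return [1.0, s * y, s * z, s * x]


def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


r = 1.0 / math.sqrt(3.0)
GRID = [(r, r, r), (r, -r, -r), (-r, r, -r), (-r, -r, r)]  # tetrahedron

# Inverse-transform matrix: one row of SH values per sample position.
PSI_I = [real_sh_order1(d) for d in GRID]
# Forward-transform matrix; valid here because PSI_I^T PSI_I = 4 I.
PSI_F = [[v / 4.0 for v in row] for row in transpose(PSI_I)]


def idsht(B):
    """Coefficient domain -> spatial domain: W = PSI_I B."""
    return matmul(PSI_I, B)


def dsht(W):
    """Spatial domain -> coefficient domain: B = PSI_F W."""
    return matmul(PSI_F, W)


B = [[1.0, 0.5], [0.2, -0.3], [0.0, 0.7], [-0.4, 0.1]]  # 4 coefficients x 2 samples
B_back = dsht(idsht(B))
max_error = max(abs(B_back[i][j] - B[i][j]) for i in range(4) for j in range(2))
print(max_error < 1e-12)  # True
```

For less regular grids or higher orders, $\Psi_f$ would have to be obtained differently (for example by a pseudo-inverse); the simple transpose only works because of the symmetry of this particular grid.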
The following describes rate compression of Higher Order Ambisonics coefficient data and noise unmasking. First, a test signal is defined to highlight some properties that are used below.
A single far-field source located in direction $\Omega_s$ is represented by a vector $g = [g(1), \ldots, g(M)]^T$ of $M$ discrete time samples and can be represented by a block of HOA coefficients by encoding:
$$B_g = y\, g^T \tag{45}$$
where the matrix $B_g$ is analogous to equation (38) and the encoding vector $y$ is composed of the conjugate complex spherical harmonics evaluated at direction $\Omega_s$ (the conjugation has no effect if real-valued SHs are used). The test signal can be seen as the simplest case of an HOA signal. More complex signals consist of a superposition of many such signals.
Considering direct compression of the HOA channels, the following shows why noise unmasking occurs when the HOA coefficient channels are compressed. Direct compression and decompression of the $O_{3D}$ coefficient channels of an actual block of HOA data $B$ introduces coding noise $E$ analogous to equation (4):
$$\hat{B} = B + E \tag{46}$$
A constant $\mathrm{SNR}_B$ is assumed, as in equation (9). To replay this signal over loudspeakers, the signal needs to be rendered. The process can be described by
$$\hat{W} = A \hat{B} \tag{47}$$
with the decoding matrix $A \in \mathbb{C}^{L \times O_{3D}}$ (and $A^H = [a_1, \ldots, a_L]$) and the matrix $\hat{W} \in \mathbb{C}^{L \times M}$ holding the $M$ time samples of the $L$ loudspeaker signals. This is analogous to (14). Applying all the considerations above, the SNR of loudspeaker channel $l$ can be described (analogous to equation (29)) by
$$\mathrm{SNR}_{w_l} = \mathrm{SNR}_B \left(1 + \frac{a_l^H\, \Sigma_{B,NG}\, a_l}{a_l^H\, \Sigma_{B,D}\, a_l}\right) \tag{48}$$
where $\Sigma_{B,D}$ is the diagonal matrix whose $o$-th diagonal element is $\sigma^2_{B_o}$, and $\Sigma_{B,NG}$ holds the non-diagonal elements of
$$\Sigma_B = B B^H \tag{49}$$
Because the decoding matrix $A$ should not be influenced (it should be possible to decode for arbitrary loudspeaker layouts), the matrix $\Sigma_B$ needs to become diagonal to obtain $\mathrm{SNR}_{w_l} = \mathrm{SNR}_B$. With equations (45) and (49) ($B = B_g$), $\Sigma_B = y\, g^H g\, y^H = c\, y y^H$ becomes non-diagonal, with the constant scalar value $c = g^T g$. Compared to $\mathrm{SNR}_B$, the signal-to-noise ratio at the loudspeaker channels, $\mathrm{SNR}_{w_l}$, decreases. Since neither the source signal $g$ nor the loudspeaker layout is usually known at the encoding stage, a direct lossy compression of the coefficient channels can lead to uncontrollable unmasking effects, especially for low data rates.
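That the correlation matrix of the test signal is far from diagonal can be verified with a short Python sketch. It builds $B_g = y g^T$ for a single source, forms $B_g B_g^T$ (real-valued case) and compares it with $c\, y y^T$. The first-order real spherical harmonics with the scaling $[1, \sqrt{3}y, \sqrt{3}z, \sqrt{3}x]$, the source direction and the sample values are all assumptions made for this illustration.

```python
import math


def real_sh_order1(direction):
    """Real first-order SH vector for a unit direction (assumed scaling)."""
    x, y, z = direction
    s = math.sqrt(3.0)
    return [1.0, s * y, s * z, s * x]


def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def correlation(B):
    """Empirical correlation matrix B B^T of a real matrix (rows = channels)."""
    return [[sum(p * q for p, q in zip(row_i, row_j)) for row_j in B]
            for row_i in B]


source_dir = (0.6, 0.0, 0.8)   # hypothetical unit vector
g = [1.0, -2.0, 0.5, 3.0]      # hypothetical source samples, M = 4

y_enc = real_sh_order1(source_dir)   # encoding vector
B_g = outer(y_enc, g)                # test signal block: B_g = y g^T
sigma_B = correlation(B_g)

c = sum(v * v for v in g)            # c = g^T g
expected = [[c * yi * yj for yj in y_enc] for yi in y_enc]
largest_off_diagonal = max(abs(sigma_B[i][j])
                           for i in range(4) for j in range(4) if i != j)
print(largest_off_diagonal > 0.0)  # True: sigma_B is not diagonal
```

Because $y y^T$ has rank one, its off-diagonal entries vanish only if at most one entry of $y$ is non-zero, which does not happen for the coefficient-domain encoding vector of a general direction.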
The following describes why noise unmasking occurs when HOA coefficients are compressed in the spatial domain after applying the DSHT.
Prior to compression, the current block $B$ of HOA coefficient data is transformed into the spatial domain using the inverse spherical harmonic transform $W = \Psi_i B$ given above:
$$W_{Sd} = \Psi_i B \tag{50}$$
where the inverse transform matrix $\Psi_i$ is related to the $L_{Sd} \geq O_{3D}$ spatial sample positions and $W_{Sd}$ is the spatial signal matrix. These signals are compressed and decompressed, and quantization noise is added (analogous to equation (4)):
$$\hat{W}_{Sd} = W_{Sd} + E \tag{51}$$
with the coding noise component $E$ according to equation (5). Again, an SNR that is constant for all spatial channels, $\mathrm{SNR}_{Sd}$, is assumed. The signal is transformed to the coefficient domain, equation (42), using the transform matrix $\Psi_f$, which has property (41): $\Psi_f \Psi_i = I$. The new block of coefficients becomes
$$\hat{B} = \Psi_f \hat{W}_{Sd} \tag{52}$$
These signals are rendered to $L$ loudspeaker signals $\hat{W}$ by applying the decoding matrix $A_D$: $\hat{W} = A_D \hat{B}$. Using (52) and $A = A_D \Psi_f$, this can be rewritten as
$$\hat{W} = A_D \Psi_f \hat{W}_{Sd} = A\, \hat{W}_{Sd} \tag{53}$$
Here, $A$ becomes a mixing matrix with $A \in \mathbb{C}^{L \times L_{Sd}}$. Equation (53) should be seen as analogous to equation (14). Again applying all the considerations above, the SNR of loudspeaker channel $l$ can be described (analogous to equation (29)) by
$$\mathrm{SNR}_{w_l} = \mathrm{SNR}_{Sd} \left(1 + \frac{a_l^H\, \Sigma_{W_{Sd},NG}\, a_l}{a_l^H\, \Sigma_{W_{Sd},D}\, a_l}\right) \tag{54}$$
where $\Sigma_{W_{Sd},D}$ is the diagonal matrix whose $l$-th diagonal element is $\sigma^2_{Sd_l}$, and $\Sigma_{W_{Sd},NG}$ holds the non-diagonal elements of
$$\Sigma_{W_{Sd}} = W_{Sd} W_{Sd}^H \tag{55}$$
Because $A_D$ should never be influenced (it should be possible to render to arbitrary loudspeaker layouts), and hence there is no way to influence $A$, $\Sigma_{W_{Sd}}$ needs to become nearly diagonal to keep the desired SNR. Using the simple test signal from equation (45) ($B = B_g$), $\Sigma_{W_{Sd}}$ becomes
$$\Sigma_{W_{Sd}} = c\, \Psi_i\, y y^H\, \Psi_i^H \tag{56}$$
with $c = g^T g$ constant. With a fixed spherical harmonics transform ($\Psi_i$, $\Psi_f$ fixed), $\Sigma_{W_{Sd}}$ can become diagonal only in very rare cases. Worse, as described above, the term $a_l^H \Sigma_{W_{Sd},NG}\, a_l / a_l^H \Sigma_{W_{Sd},D}\, a_l$ depends on the spatial properties of the coefficient signals. Thus, low-rate lossy compression of HOA coefficients in the spherical domain can lead to a decrease of the SNR and to uncontrollable unmasking effects.
The basic idea of the present invention is to minimize noise unmasking by using an adaptive DSHT (aDSHT), which is composed of a rotation of the spatial sampling grid of the DSHT, related to the spatial properties of the HOA input signal, and the DSHT itself.
A signal-adaptive DSHT (aDSHT) with a number of spherical positions L_Sd matching the number of HOA coefficients O_3D (36) is described below. First, a default spherical sample grid is selected, as in the conventional non-adaptive DSHT. For a block of M time samples, the spherical sample grid is rotated such that the logarithm of the term
(sum over l and j of |Σ_WSd(l, j)|) − (sum over l of |σ²_Sd,l|), with l, j = 1, ..., L_Sd
is minimized, where |Σ_WSd(l, j)| are the absolute values of the elements of Σ_WSd (with matrix row index l and column index j) and σ²_Sd,l are the diagonal elements of Σ_WSd. This is equivalent to minimizing the off-diagonal term of equation (54).
Visually, as illustrated in Fig. 4, this process corresponds to rotating the spherical sampling grid of the DSHT such that a single spatial sample position matches the strongest source direction. Using the simple test signal from equation (45) (B = B_g), it can be shown that the term W_Sd of equation (55) becomes a vector in which all elements but one are close to zero. Consequently, Σ_WSd becomes near-diagonal and the desired SNR, SNR_Sd, can be kept.
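A minimal sketch of the rotation criterion follows. It assumes that the minimized quantity is the logarithm of the summed absolute off-diagonal elements of W_Sd W_Sd^T for a real-valued block; the base-10 logarithm, the small `eps` guard and the toy signal blocks are assumptions made for illustration only.

```python
import math

def covariance(w):
    """Return W * W^T for a real-valued block W (channels x samples)."""
    n = len(w)
    return [[sum(x * y for x, y in zip(w[l], w[j])) for j in range(n)]
            for l in range(n)]

def log_offdiag_cost(w, eps=1e-12):
    """Log of (sum of |all elements| - sum of |diagonal elements|)."""
    cov = covariance(w)
    total = sum(abs(v) for row in cov for v in row)
    diag = sum(abs(cov[l][l]) for l in range(len(cov)))
    return math.log10(total - diag + eps)  # eps guards against log(0)

# One dominant source spread over two spatial channels (unrotated grid) ...
spread = [[1.0, -1.0, 1.0, -1.0],
          [0.8, -0.8, 0.8, -0.8],
          [0.1, 0.0, -0.1, 0.0]]
# ... versus the same source captured by a single channel (rotated grid).
focused = [[1.3, -1.3, 1.3, -1.3],
           [0.0, 0.1, 0.0, -0.1],
           [0.1, 0.0, -0.1, 0.0]]

cost_spread = log_offdiag_cost(spread)
cost_focused = log_offdiag_cost(focused)
```

A rotation search would evaluate this cost for candidate rotations of the grid and keep the rotation with the lowest value; in this toy data the focused block has the lower cost.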
Fig. 4 shows the test signal B_g transformed to the spatial domain. In Fig. 4a), the default sampling grid is used, and in Fig. 4b), the rotated grid of the aDSHT is used. The related values of the spatial channels (in dB) are shown by the color/gray-scale of the Voronoi cells around the corresponding sample positions. Each cell of the spatial structure represents a sampling point, and the lightness/darkness of the cell represents the signal strength. As can be seen in Fig. 4b), the strongest source direction was found and the sampling grid was rotated such that one of the sides (i.e. a single spatial sample position) matches the strongest source direction. This side is shown in white (corresponding to the strong source direction), while the other sides are dark (corresponding to weak source directions). In Fig. 4a), i.e. before the rotation, no side matches the strongest source direction, and several sides are more or less gray, which means that an audio signal of considerable (but not maximum) strength is received at the respective sampling points.
The main building blocks of the aDSHT used within the compression encoder and decoder are described below.
Details of the encoder and decoder processing building blocks pE and pD are shown in Fig. 6. Both blocks hold the same codebook of spherical sampling position grids that form the basis of the DSHT. Initially, the number of coefficients O_3D is used to select a basis grid with L_Sd = O_3D positions in block pE, according to the common codebook. L_Sd must be transmitted to block pD for initialization, so that the same basis sampling position grid is selected, as indicated in Fig. 3. The basis sampling grid is described by a matrix of directions Ω_l, l = 1, ..., L_Sd, where Ω_l = [θ_l, φ_l]^T defines a position on the unit sphere. As described above, Fig. 5 shows examples of basis grids.
The input to the rotation finding block (building block "find best rotation") 320 is the coefficient matrix B. This building block is responsible for rotating the basis sampling grid such that the value of equation (57) is minimized. The rotation is represented in "axis-angle" form, and the compressed axis ψ_rot and rotation angle φ_rot related to this rotation are output by this building block as side information SI. The rotation axis ψ_rot can be described by a unit vector from the origin to a position on the unit sphere. In spherical coordinates, this can be expressed by two angles, ψ_rot = [θ_axis, φ_axis]^T, with an implicit radius of one that does not need to be transmitted. The three angles θ_axis, φ_axis and φ_rot are quantized and entropy coded, with a special escape pattern that signals the reuse of previously used values, to create the side information SI.
The building block "Build Ψ_i" 330 decodes the rotation axis and the rotation angle, applies this rotation to the basis sampling grid to derive the rotated grid, and outputs the iDSHT matrix Ψ_i, which is derived from the vectors of the rotated grid.
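Applying an axis-angle rotation to a sampling grid can be sketched with Rodrigues' rotation formula. The sketch assumes that θ denotes inclination and φ azimuth; the function names and the two-point grid are hypothetical, and the construction of Ψ_i from the rotated grid is not shown.

```python
import math

def sph_to_cart(theta, phi):
    """Unit vector for inclination theta and azimuth phi (radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def rotate(v, axis, angle):
    """Rotate vector v about the unit vector axis by angle (Rodrigues)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1.0 - c)
                 for i in range(3))

def rotate_grid(grid, theta_axis, phi_axis, phi_rot):
    """Apply one axis-angle rotation to every grid direction."""
    axis = sph_to_cart(theta_axis, phi_axis)
    return [rotate(p, axis, phi_rot) for p in grid]

# Two hypothetical grid points on the equator, rotated 90 degrees about z.
base_grid = [sph_to_cart(math.pi / 2, 0.0),
             sph_to_cart(math.pi / 2, math.pi / 2)]
rotated = rotate_grid(base_grid, 0.0, 0.0, math.pi / 2)
```

Because the axis is a unit vector, only the two axis angles and the rotation angle are needed to reproduce the same rotated grid at the decoder.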
In the building block "iDSHT" 310, the actual block B of HOA coefficient data is transformed to the spatial domain by W_Sd = Ψ_i B.
The building block "Build Ψ_f" 350 of the decoding processing block pD receives and decodes the rotation axis and the rotation angle and applies this rotation to the basis sampling grid to derive the rotated grid. The iDSHT matrix Ψ_i is derived from the vectors of the rotated grid, and the DSHT matrix Ψ_f = Ψ_i^(-1) is calculated on the decoding side.
In the building block "DSHT" 340 within the decoder processing block 34, the actual block of spatial domain data Ŵ_Sd is transformed back into a block of coefficient domain data: B̂ = Ψ_f Ŵ_Sd.
Various advantageous embodiments, including the overall architecture of the compression codec, are described below. The first embodiment uses a single aDSHT. The second embodiment uses multiple aDSHTs in spectral bands.
The first ("basic") embodiment is shown in Fig. 7. The HOA time samples with index m of the O_3D coefficient channels b(m) are first stored in a buffer 71 to form blocks of M samples with time index μ. B(μ) is transformed to the spatial domain using the adaptive iDSHT in building block pE 72, as described above. The spatial signal block W_Sd(μ) is input to L_Sd audio compression mono encoders 73, such as AAC or mp3 encoders, or to a single AAC multi-channel encoder (L_Sd channels). The bitstream S73 consists of multiplexed frames of the multiple encoder bitstream frames with integrated side information SI, or of a single multi-channel bitstream into which the side information SI is integrated, preferably as auxiliary data.
In one embodiment, the respective compression decoder building block comprises a demultiplexer D1 for demultiplexing the bitstream S73 into L_Sd bitstreams and the side information SI, and for feeding the bitstreams to L_Sd mono decoders, which decode them into L_Sd spatial audio channels with M samples to form the block Ŵ_Sd(μ); Ŵ_Sd(μ) and SI are then fed to pD. In another embodiment, in which the bitstream is not multiplexed, the compression decoder building block comprises a receiver 74 for receiving the bitstream and decoding it into an L_Sd-channel signal Ŵ_Sd(μ), unpacking SI, and feeding Ŵ_Sd(μ) and SI to pD.
In the decoder processing block pD 75, Ŵ_Sd(μ) is transformed to the coefficient domain using the adaptive DSHT with SI to form a block of HOA signals B(μ), which are stored in a buffer 76 to be de-framed to form the time signal of coefficients b(m).
Under certain conditions, the first embodiment described above may have two drawbacks. First, due to changes in the spatial signal distribution, there can be blocking artifacts between consecutive blocks (i.e. from block μ to block μ+1). Second, there can be more than one strong signal at the same time, in which case the decorrelation effect of the aDSHT is rather small.
Both drawbacks are addressed in the second embodiment, which operates in the frequency domain. The aDSHT is applied to scale factor band data, which combine the data of multiple frequency bands. Blocking artifacts are avoided by the overlapping blocks of the time-frequency transform (TFT) with overlap-add (OLA) processing. An improved signal decorrelation can be achieved by applying the invention within J spectral bands, at the cost of an increased data rate overhead for transmitting the side information SI_j.
Some more details of the second embodiment, shown in Fig. 9, are described below. Each coefficient channel of the signal b(m) is subject to a time-frequency transform (TFT) 912. An example of a widely used TFT is the Modified Discrete Cosine Transform (MDCT). In a TFT framing unit 911, 50 % overlapping data blocks (block index μ) are constructed. A TFT block transform unit 912 performs the block transform. In a spectral banding unit 913, the TFT frequency bands are combined to form J new spectral bands and related signals, where K_j denotes the number of frequency coefficients in band j. These spectral bands are processed in a plurality of processing blocks 914. For each of these spectral bands, there is one processing block pE_j that creates the spatial signals and the side information SI_j. The spectral bands may match the spectral bands of the lossy audio compression method (such as AAC/mp3 scale factor bands), or have a coarser granularity. In the latter case, the channel-independent lossy audio compression without TFT, block 915, needs to rearrange the banding. The processing block 914 acts like an L_Sd-channel audio encoder in the frequency domain that allocates a constant bit rate to each audio channel. The bitstream is formatted in a bitstream packing block 916.
The decoder receives or stores the bitstream (at least portions thereof), unpacks it 921, and feeds the audio data to the multi-channel audio decoder 922 for channel-independent audio decoding without TFT, and the side information SI_j to a plurality of decoding processing blocks pD_j 923. The audio decoder 922 decodes the audio information and formats the J spectral band signals as input to the decoding processing blocks pD_j 923, where these signals are transformed to the HOA coefficient domain. In the de-banding block 924, the J spectral bands are regrouped to match the banding of the TFT. They are transformed to the time domain in the iTFT and OLA block 925, which uses overlap-add (OLA) processing of the overlapping blocks. Finally, in the TFT de-framing block 926, the output of the iTFT and OLA block 925 is de-framed to create the output time signal.
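The 50 % overlapping framing and the overlap-add reconstruction can be sketched as follows. The sketch deliberately leaves out the transform itself (for example an MDCT) and the per-band processing; it only shows that a sine window applied at the framing stage and again at the overlap-add stage reconstructs the interior samples. The block length and the test signal are arbitrary assumptions.

```python
import math

def sine_window(n):
    """Sine window; its square sums to one at 50 % overlap."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]

def frame(signal, n):
    """Split into windowed blocks of length n with hop size n // 2."""
    w, hop = sine_window(n), n // 2
    return [[signal[s + i] * w[i] for i in range(n)]
            for s in range(0, len(signal) - n + 1, hop)]

def overlap_add(blocks, n):
    """Window each block again and add it at its hop position."""
    w, hop = sine_window(n), n // 2
    out = [0.0] * (hop * (len(blocks) - 1) + n)
    for b, block in enumerate(blocks):
        for i in range(n):
            out[b * hop + i] += block[i] * w[i]
    return out

x = [math.sin(0.3 * k) for k in range(64)]   # arbitrary test signal
y = overlap_add(frame(x, 16), 16)            # matches x except at the edges
```

Only the first and last half block are not fully reconstructed, because they are covered by a single window. Since neighboring blocks are cross-faded, a change of the rotation from one block to the next does not produce a hard discontinuity.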
The present invention is based on the finding that the SNR increase results from cross-correlation between the channels. Perceptual coders only consider the coding noise masking effects that occur within each individual single-channel signal. However, such effects are typically non-linear. Thus, noise unmasking is likely to occur when such single channels are matrixed into new signals. This is the reason why coding noise is normally increased after the matrixing operation.
The invention proposes to decorrelate the channels by an adaptive Discrete Spherical Harmonics Transform (aDSHT) that minimizes the unwanted noise unmasking effects. The aDSHT is integrated within the compression encoder and decoder architecture. It is adaptive because it includes a rotation operation that adjusts the spatial sampling grid of the DSHT to the spatial properties of the HOA input signal. The aDSHT comprises the adaptive rotation and an actual, conventional DSHT. The actual DSHT is a matrix that can be constructed as described in the prior art. The adaptive rotation is applied to this matrix, which minimizes the inter-channel correlation and therefore minimizes the SNR increase after matrixing. The rotation axis and angle are found by an automated search operation, not analytically. The rotation axis and angle are encoded and transmitted in order to enable re-correlation after decoding and before matrixing, where the inverse adaptive DSHT (iaDSHT) is used.
In one embodiment, a time-frequency transform (TFT) and spectral banding are performed, and the aDSHT/iaDSHT is applied to each spectral band independently.
Fig. 8a) shows a flow chart of a method for encoding multi-channel HOA audio signals for noise reduction in one embodiment of the invention. Fig. 8b) shows a flow chart of a method for decoding multi-channel HOA audio signals for noise reduction in one embodiment of the invention.
In the embodiment shown in Fig. 8a), a method for encoding multi-channel HOA audio signals for noise reduction comprises the steps of: decorrelating 81 the channels using an inverse adaptive DSHT, the inverse adaptive DSHT comprising a rotation operation and an inverse DSHT 812, wherein the rotation operation rotates 811 the spatial sampling grid of the iDSHT; perceptually encoding 82 each of the decorrelated channels; encoding 83 rotation information (as side information SI), the rotation information comprising parameters that define the rotation operation; and transmitting or storing 84 the perceptually encoded audio channels and the encoded rotation information.
In one embodiment, the inverse adaptive DSHT comprises the steps of: selecting an initial default spherical sample grid; determining the strongest source direction; and, for a block of M time samples, rotating the spherical sample grid such that a single spatial sample position matches the strongest source direction.
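One possible way to obtain an axis-angle rotation that moves a grid point onto a given source direction is sketched below. It is a geometric shortcut chosen for illustration, assuming the strongest source direction is already known; the text states that the rotation is found by an automated search, which is not what this sketch does. All function names and the example grid are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def closest_grid_point(grid, source_dir):
    """Grid direction with the smallest angle to source_dir."""
    d = normalize(source_dir)
    return max(grid, key=lambda p: dot(normalize(p), d))

def alignment_rotation(grid_point, source_dir):
    """Axis (unit vector) and angle rotating grid_point onto source_dir."""
    p, d = normalize(grid_point), normalize(source_dir)
    angle = math.acos(max(-1.0, min(1.0, dot(p, d))))
    axis = cross(p, d)
    if dot(axis, axis) < 1e-24:
        # p and d are (anti-)parallel: pick any axis perpendicular to p.
        helper = (1.0, 0.0, 0.0) if abs(p[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = cross(p, helper)
    return normalize(axis), angle

grid = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
strongest = (2.0, 1.0, 0.0)   # hypothetical dominant source direction
axis, angle = alignment_rotation(closest_grid_point(grid, strongest), strongest)
```

Choosing the grid point that is already closest to the source keeps the rotation angle small, which may help to limit changes of the grid between consecutive blocks.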
In one embodiment, the spherical sample grid is rotated such that the logarithm of the term
(sum over l and j of |Σ_WSd(l, j)|) − (sum over l of |σ²_Sd,l|), with l, j = 1, ..., L_Sd
is minimized, where |Σ_WSd(l, j)| are the absolute values of the elements of Σ_WSd (with matrix row index l and column index j) and σ²_Sd,l are the diagonal elements of Σ_WSd, where Σ_WSd = W_Sd W_Sd^H, W_Sd is a matrix of size (number of audio channels) × (number of samples per processing block), and W_Sd is the result of the aDSHT.
In the embodiment shown in Fig. 8b), a method for decoding coded multi-channel HOA audio signals with reduced noise comprises the steps of: receiving 85 encoded multi-channel HOA audio signals and channel rotation information (within the side information SI); decompressing 86 the received data, wherein perceptual decoding is used; spatially decoding 87 each channel using an adaptive DSHT, wherein a DSHT 872 and a rotation 871 of the spatial sampling grid of the DSHT according to the rotation information are performed, and wherein the perceptually decoded channels are re-correlated; and matrixing 88 the re-correlated perceptually decoded channels, wherein reproducible audio signals mapped to loudspeaker positions are obtained.
In one embodiment, the adaptive DSHT comprises the steps of: selecting an initial default spherical sample grid for the adaptive DSHT; and, for a block of M time samples, rotating the spherical sample grid according to the rotation information.
In one embodiment, the rotation information is a spatial vector with three components. Note that the rotation axis ψ_rot can be described by a unit vector.
In one embodiment, the rotation information is a vector composed of three angles θ_axis, φ_axis and φ_rot, where θ_axis and φ_axis define the rotation axis in spherical coordinates with an implicit radius of one, and φ_rot defines the rotation angle around this axis.
In one embodiment, the angles are quantized and entropy coded, with an escape pattern (i.e. a dedicated bit pattern) that signals (i.e. indicates) the reuse of previous values, in order to create the side information (SI).
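The quantization of the three angles with an escape pattern for reused values might look roughly like the sketch below. The uniform quantizer, the 256 levels and the `ESCAPE` marker are assumptions made for illustration; the actual bit patterns and the entropy coding stage are not specified here and are not modeled.

```python
import math

ESCAPE = "REUSE"  # hypothetical marker standing in for a reserved bit pattern

def quantize(angle, span, levels, wrap):
    """Uniformly quantize an angle from [0, span] to an integer index."""
    if wrap:
        # Periodic angle: the end of the span maps back onto index 0.
        return int(round(angle / span * levels)) % levels
    # Non-periodic angle: both end points get their own index.
    return int(round(angle / span * (levels - 1)))

def encode_side_info(angles, previous, levels=256):
    """Return (symbol, indices); symbol is ESCAPE if the indices repeat."""
    theta_axis, phi_axis, phi_rot = angles
    indices = (quantize(theta_axis, math.pi, levels, wrap=False),
               quantize(phi_axis, 2 * math.pi, levels, wrap=True),
               quantize(phi_rot, 2 * math.pi, levels, wrap=True))
    symbol = ESCAPE if indices == previous else indices
    return symbol, indices

# First block sends the indices, second block has (almost) the same rotation.
first, state = encode_side_info((0.5, 1.0, 2.0), None)
second, state = encode_side_info((0.5004, 1.0, 2.0), state)
```

In this sketch the second block quantizes to the same indices as the first, so only the escape marker would need to be sent for it.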
In one embodiment, an apparatus for encoding multi-channel HOA audio signals for noise reduction comprises: a decorrelator for decorrelating the channels using an inverse adaptive DSHT, the inverse adaptive DSHT comprising a rotation operation and an inverse DSHT (iDSHT), wherein the rotation operation rotates the spatial sampling grid of the iDSHT; a perceptual encoder for perceptually encoding each of the decorrelated channels; a side information encoder for encoding rotation information, the rotation information comprising parameters that define the rotation operation; and an interface for transmitting or storing the perceptually encoded audio channels and the encoded rotation information.
In one embodiment, an apparatus for decoding multi-channel HOA audio signals with reduced noise comprises: interface means 330 for receiving encoded multi-channel HOA audio signals and channel rotation information; a decompression module 33 for decompressing the received data by using a perceptual decoder for perceptually decoding each channel; a correlator 34 for re-correlating the perceptually decoded channels, wherein a DSHT and a rotation of the spatial sampling grid of the DSHT according to the rotation information are performed; and a mixer for matrixing the correlated perceptually decoded channels, wherein reproducible audio signals mapped to loudspeaker positions are obtained. In principle, the correlator 34 acts as a spatial decoder.
In one embodiment, an apparatus for decoding multi-channel HOA audio signals with reduced noise comprises: interface means 330 for receiving encoded multi-channel HOA audio signals and channel rotation information; a decompression module 33 for decompressing the received data with a perceptual decoder for perceptually decoding each channel; a correlator 34 for correlating the perceptually decoded channels using an aDSHT, wherein a DSHT and a rotation of the spatial sampling grid of the DSHT according to the rotation information are performed; and a mixer MX for matrixing the correlated perceptually decoded channels, wherein reproducible audio signals mapped to loudspeaker positions are obtained.
In one embodiment, the adaptive DSHT in the apparatus for decoding comprises means for selecting an initial default spherical sample grid for the adaptive DSHT, rotation processing means for rotating, for a block of M time samples, the default spherical sample grid according to the rotation information, and transform processing means for performing the DSHT on the rotated spherical sample grid.
In one embodiment, the correlator 34 in the apparatus for decoding comprises a plurality of spatial decoding units 922 for simultaneously spatially decoding each channel using an adaptive DSHT, and further comprises a de-banding unit 924 for performing spectral de-banding and an iTFT and OLA unit 925 for performing an inverse time-frequency transform with overlap-add processing, wherein the de-banding unit provides its output to the iTFT and OLA unit.
In all embodiments, the term "reduced noise" relates at least to an avoidance of coding noise unmasking.
Perceptual coding of audio signals means a coding that is adapted to the human perception of audio. It should be noted that when audio signals are perceptually coded, quantization is usually performed not on the broadband audio signal samples, but in individual frequency bands that are related to human perception. Hence, the ratio between the signal power and the quantization noise may vary between the individual frequency bands. Thus, perceptual coding usually comprises a reduction of redundancy and/or irrelevancy information, while spatial coding usually relates to the spatial relation among the channels.
The technique described above can be considered as an alternative to decorrelation using the Karhunen-Loève transform (KLT). One advantage of the present invention is a strong reduction in the amount of side information, which comprises just three angles. The KLT requires the coefficients of the block correlation matrix as side information, and thus considerably more data. Furthermore, the technique disclosed herein allows the rotation to be tweaked (or fine-tuned) in order to reduce transition artifacts when proceeding to the next processing block. This is beneficial for the compression quality of the subsequent perceptual coding.
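The difference in side information volume can be made concrete with a rough count per processing block. The sketch assumes that the KLT side information consists of the unique entries of a symmetric O_3D × O_3D block correlation matrix, with O_3D = (N + 1)² for HOA order N, and it ignores quantization and entropy coding; the function names are illustrative.

```python
def hoa_channels(order):
    """Number of 3D HOA coefficient channels, O_3D = (N + 1)^2."""
    return (order + 1) ** 2

def klt_side_values(order):
    """Unique entries of a symmetric O_3D x O_3D block correlation matrix."""
    o = hoa_channels(order)
    return o * (o + 1) // 2

ADSHT_SIDE_VALUES = 3  # theta_axis, phi_axis and the rotation angle

# Map HOA order -> (aDSHT values, KLT values) per processing block.
comparison = {n: (ADSHT_SIDE_VALUES, klt_side_values(n)) for n in range(1, 5)}
```

Under these assumptions, an order-4 signal with 25 channels needs 325 correlation values per block for the KLT, compared with three angles for the aDSHT.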
Table 1 provides a direct comparison between the aDSHT and the KLT. Although some similarities exist, the aDSHT provides significant advantages over the KLT.
Table 1: Comparison of aDSHT vs. KLT
While there has been shown, described, and pointed out fundamental novel features of the present invention as applied to preferred embodiments thereof, it will be understood that various omissions, substitutions and changes in the apparatus and method described, in the form and details of the devices disclosed, and in their operation, may be made by those skilled in the art without departing from the spirit of the present invention. It is expressly intended that all combinations of those elements that perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated.
It will be understood that the present invention has been described purely by way of example, and modifications of detail can be made without departing from the scope of the invention.
Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.
Features may, where appropriate, be implemented in hardware, software, or a combination of the two. Connections may, where applicable, be implemented as wireless connections or wired, not necessarily direct or dedicated, connections.
Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.
Cited references
[1] T. D. Abhayapala. Generalized framework for spherical microphone arrays: Spatial and frequency decomposition. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), (accepted) Vol. X, pp., April 2008, Las Vegas, USA.
[2] James R. Driscoll and Dennis M. Healy Jr. Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics, 15:202-250, 1994.
[3] Fliege. Integration nodes for the sphere. http://www.personal.soton.ac.uk/jf1w07/nodes/nodes.html
[4] Fliege and Ulrike Maier. A two-stage approach for computing cubature formulae for the sphere. Technical report, Fachbereich Mathematik, Universität Dortmund, 1999.
[5] R. H. Hardin and N. J. A. Sloane. Webpage: Spherical designs, spherical t-designs. http://www2.research.att.com/-njas/sphdesigns
[6] R. H. Hardin and N. J. A. Sloane. McLaren's improved snub cube and other new spherical designs in three dimensions. Discrete and Computational Geometry, 15:429-441, 1996.
[7] Erik Hellerud, Ian Burnett, Audun Solvang, and U. Peter Svensson. Encoding higher order Ambisonics with AAC. 124th AES Convention, Amsterdam, May 2008.
[8] Peter Jax, Jan-Mark Batke, Johannes Boehm, and Sven Kordon. Perceptual coding of HOA signals in spatial domain. European patent application EP2469741A1 (PD100051).
[9] Boaz Rafaely. Plane-wave decomposition of the sound field on a sphere by spherical convolution. J. Acoust. Soc. Am., 4(116):2149-2157, October 2004.
[10] Earl G. Williams. Fourier Acoustics, Applied Mathematical Sciences, Vol. 93. Academic Press, 1999.
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP12305861.2 | 2012-07-16 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1241131A1 (en) | 2018-06-01 |
| HK1241131B (en) | 2021-09-30 |