CN107886960B - A kind of audio signal reconstruction method and device - Google Patents
- Publication number
- CN107886960B (application CN201610877571.2A)
- Authority
- CN
- China
- Prior art keywords
- audio signal
- transform coefficient
- coefficient corresponding
- sparse transform
- sparse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/005—Correction of errors induced by the transmission channel, if related to the coding algorithm
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Stereophonic System (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Embodiments of the present invention disclose an audio signal reconstruction method and device. The method includes: acquiring compressed data corresponding to at least two audio signals; performing inverse quantization on the compressed data corresponding to the at least two audio signals to obtain measurement data corresponding to the at least two audio signals; acquiring a measurement matrix and, according to the measurement data corresponding to the at least two audio signals and the measurement matrix, jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals; and performing an inverse sparse transform on the sparse transform coefficients corresponding to the at least two audio signals to obtain the at least two audio signals. Adopting the embodiments of the present invention can improve the quality of the audio signals.
Description
Technical Field
The present invention relates to the field of communication technologies, and in particular to an audio signal reconstruction method and device.
Background
With the development of communication technology, demand for high-quality audio-visual experiences keeps growing, and more and more scenarios require high-quality reconstruction of sound-field information, such as remote conferencing, films, and large-scale online games. In recent years, compressed sensing (CS) theory, which takes full account of signal sparsity and uses the structural features of a signal to compress and reconstruct it, has become a research hotspot in signal processing. CS theory points out that, because the fundamental goal of signal compression is to remove redundant components, the compressed representation (that is, the compressed data) of a signal that is itself redundant can be acquired directly, without sampling a large amount of useless signal content. When CS theory is applied to audio signals, sampling and compression can therefore be merged into one step, which greatly simplifies the compression process and, in a certain sense, breaks through the bottleneck of the Shannon sampling theorem, making the acquisition of high-resolution audio signals possible.
A traditional audio signal reconstruction method is as follows: acquire the compressed data corresponding to an audio signal; perform inverse quantization on the compressed data to obtain the measurement data corresponding to the audio signal; and reconstruct the audio signal according to the measurement data and a measurement matrix. The measurement data obtained by inverse quantization define an underdetermined system of equations. Because an underdetermined system has at least two solutions, and the reconstructed audio signal is an arbitrary one of them, the similarity between the reconstructed audio signal and the originally captured audio signal is low, and the quality of the reconstructed audio signal is poor.
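To illustrate why an underdetermined system cannot pin down a unique signal, the following minimal Python sketch uses a hypothetical 2×3 measurement matrix (the values and the names `PHI`, `x_sparse`, and `x_dense` are made up for illustration and do not come from the patent). It shows two different coefficient vectors that produce identical measurement data.

```python
def measure(phi, x):
    """Return y = phi * x, with the matrix given as a list of rows."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

# Hypothetical measurement matrix: 2 measurements of a length-3 vector.
PHI = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]

x_sparse = [0.0, 0.0, 2.0]  # one non-zero coefficient
x_dense = [2.0, 2.0, 0.0]   # a different vector with two non-zero coefficients

# Both vectors explain the same measurements, so the measurements alone
# cannot tell a decoder which one was the original.
y_sparse = measure(PHI, x_sparse)
y_dense = measure(PHI, x_dense)
print(y_sparse, y_dense)
```

Additional information, such as a sparsity assumption or a prior taken from a correlated signal, is what lets a decoder prefer one candidate over the other.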
Summary of the Invention
The present application provides an audio signal reconstruction method and device that can improve the quality of audio signals.
A first aspect provides an audio signal reconstruction method. The method includes: acquiring compressed data corresponding to at least two audio signals; performing inverse quantization on the compressed data corresponding to the at least two audio signals to obtain measurement data corresponding to the at least two audio signals; acquiring a measurement matrix and, according to the measurement data corresponding to the at least two audio signals and the measurement matrix, jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals; and performing an inverse sparse transform on the sparse transform coefficients corresponding to the at least two audio signals to obtain the at least two audio signals.
In a specific implementation, after acquiring the compressed data corresponding to the at least two audio signals, the terminal may perform inverse quantization on that compressed data to obtain the measurement data corresponding to the at least two audio signals. The terminal may also acquire a measurement matrix, jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data and the measurement matrix, and perform an inverse sparse transform on those coefficients to obtain the at least two audio signals, which can improve the quality of the at least two audio signals.
In the above technical solution, optionally, the at least two audio signals may include a first audio signal and a second audio signal. In that case, the terminal jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix may specifically be:
Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix.
In the above technical solution, optionally, the first audio signal may correspond to a first channel and the second audio signal to a second channel, the first and second audio signals being audio signals captured in the same time period. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
According to a first amplitude in the sparse transform coefficients corresponding to the first audio signal, determining a second amplitude in the prior sparse transform coefficients corresponding to the second audio signal; using the second amplitude as the prior for the amplitude in the sparse transform coefficients corresponding to the second audio signal; and calculating the amplitude in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix. The ratio of the first amplitude to the second amplitude equals the ratio of the logarithm of the distance between the sound source and the microphone corresponding to the first audio signal to the logarithm of the distance between the sound source and the microphone corresponding to the second audio signal.
In the above technical solution, optionally, the first audio signal may correspond to a first channel and the second audio signal to a second channel, the first and second audio signals being audio signals captured in the same time period. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
According to a first phase in the sparse transform coefficients corresponding to the first audio signal, determining a second phase in the prior sparse transform coefficients corresponding to the second audio signal; using the second phase as the prior for the phase in the sparse transform coefficients corresponding to the second audio signal; and calculating the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix. The ratio of the first phase to the second phase equals the ratio of the distance between the sound source and the microphone corresponding to the first audio signal to the distance between the sound source and the microphone corresponding to the second audio signal.
In the above technical solution, optionally, the first audio signal may correspond to a first channel and the second audio signal to a second channel, the first and second audio signals being audio signals captured in the same time period. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
Determining a first frequency in the sparse transform coefficients corresponding to the first audio signal as a second frequency in the prior sparse transform coefficients corresponding to the second audio signal; using the second frequency as the prior for the frequency in the sparse transform coefficients corresponding to the second audio signal; and calculating the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
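The three cross-channel relationships above (amplitude ratio equal to the ratio of log distances, phase ratio equal to the ratio of distances, frequency carried over unchanged) can be sketched as a single helper. This is only an illustration under stated assumptions: the function name `cross_channel_prior` and its parameters are hypothetical, the text does not state the distance unit or logarithm base (the base cancels in the ratio, the unit does not), and a distance of exactly 1 is rejected because its logarithm is zero.

```python
import math

def cross_channel_prior(amp1, phase1, freq1, d1, d2):
    """Derive prior (amplitude, phase, frequency) for channel 2 from channel 1.

    d1, d2 are the distances from the sound source to the microphones of
    channel 1 and channel 2 (hypothetical inputs, same unit for both).
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("distances must be positive")
    if d1 == 1.0:
        raise ValueError("log(d1) is zero, amplitude ratio is undefined")
    # amp1 / amp2 = log(d1) / log(d2)  =>  amp2 = amp1 * log(d2) / log(d1)
    amp2 = amp1 * math.log(d2) / math.log(d1)
    # phase1 / phase2 = d1 / d2        =>  phase2 = phase1 * d2 / d1
    phase2 = phase1 * d2 / d1
    # The frequency of channel 1 is used directly as the prior for channel 2.
    freq2 = freq1
    return amp2, phase2, freq2

# Example with made-up values: microphones at distances 10 and 100.
print(cross_channel_prior(2.0, 0.5, 440.0, 10.0, 100.0))
```

The returned values are only priors. In the method described above, the actual amplitude, phase, and frequency of the second signal are still calculated from its own measurement data and the measurement matrix.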
In the above technical solution, optionally, the first audio signal and the second audio signal may correspond to the same channel, the first and second audio signals being audio signals captured in different time periods. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
According to a first amplitude in the sparse transform coefficients corresponding to the first audio signal, determining a second amplitude in the prior sparse transform coefficients corresponding to the second audio signal; using the second amplitude as the prior for the amplitude in the sparse transform coefficients corresponding to the second audio signal; and calculating the amplitude in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix. For audio signals of the same channel captured in different time periods, the amplitude in the corresponding sparse transform coefficients is linearly related to the sequence number of the frame corresponding to each audio signal.
In the above technical solution, optionally, the first audio signal and the second audio signal may correspond to the same channel, the first and second audio signals being audio signals captured in different time periods. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
Determining a first phase in the sparse transform coefficients corresponding to the first audio signal as a second phase in the prior sparse transform coefficients corresponding to the second audio signal; using the second phase as the prior for the phase in the sparse transform coefficients corresponding to the second audio signal; and calculating the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
In the above technical solution, optionally, the first audio signal and the second audio signal may correspond to the same channel, the first and second audio signals being audio signals captured in different time periods. In that case, the terminal calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix may specifically be:
According to a first frequency in the sparse transform coefficients corresponding to the first audio signal, determining a second frequency in the prior sparse transform coefficients corresponding to the second audio signal; using the second frequency as the prior for the frequency in the sparse transform coefficients corresponding to the second audio signal; and calculating the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix. The first frequency and the second frequency have an intersection, and the frequencies in the intersection are obtained by randomly selecting from the frequencies in the first frequency.
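The same-channel, different-period priors above can be sketched in a similar way. The text only says that amplitude is linear in the frame sequence number, that the phase is carried over, and that part of the frequency set is chosen at random from the earlier frame. The concrete choices below are assumptions for illustration: extrapolating the line from two earlier, consecutive frames, and the hypothetical `keep` parameter that fixes how many frequencies are retained.

```python
import random

def amplitude_prior_linear(amp_two_back, amp_one_back):
    """Extrapolate the next frame's amplitude on a straight line.

    Assumes three consecutive, equally spaced frame numbers, so that
    a[n] = 2 * a[n-1] - a[n-2].
    """
    return 2.0 * amp_one_back - amp_two_back

def phase_prior(phase_prev):
    """The earlier frame's phase is used directly as the prior."""
    return phase_prev

def frequency_prior(freqs_prev, keep, rng):
    """Randomly keep `keep` of the earlier frame's frequencies as the prior.

    The kept frequencies form the intersection between the earlier frame's
    frequency set and the prior frequency set of the current frame.
    """
    candidates = sorted(freqs_prev)
    return sorted(rng.sample(candidates, keep))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(amplitude_prior_linear(1.0, 1.5))
print(frequency_prior([3, 7, 11, 20], 2, rng))
```

A decoder would then pass these priors, together with the current frame's measurement data and the measurement matrix, to its reconstruction algorithm.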
In the above technical solution, optionally, the first audio signal and the second audio signal are audio signals captured in adjacent time periods.
In the above technical solution, optionally, the at least two audio signals may include a third audio signal. In that case, the terminal jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix may specifically be:
Calculating the sparse transform coefficients corresponding to the third audio signal according to preset initial sparse transform coefficients, the measurement data corresponding to the third audio signal, and the measurement matrix.
A second aspect provides a computer storage medium that stores a program. When executed, the program performs all or some of the steps of the audio signal reconstruction method provided in the first aspect of the embodiments of the present application.
A third aspect provides an audio signal reconstruction apparatus. The apparatus may include a compressed data acquisition module, an inverse quantization module, a joint reconstruction module, and an inverse sparse transform module, and may be used to implement some or all of the steps of the first aspect.
A fourth aspect provides a terminal including a processor and a memory, where the processor may be used to implement some or all of the steps of the first aspect.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1A is a schematic diagram of a microphone array provided in an embodiment of the present invention;
FIG. 1B is a schematic diagram of passing parameter information provided in an embodiment of the present invention;
FIG. 1C is a schematic diagram of reconstructing an audio signal provided in an embodiment of the present invention;
FIG. 1D is a schematic diagram of sampling an audio signal provided in an embodiment of the present invention;
FIG. 1E is a schematic flowchart of reconstructing an audio signal provided in an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an audio signal reconstruction method provided in an embodiment of the present invention;
FIG. 3 is a schematic flowchart of an audio signal reconstruction method provided in another embodiment of the present invention;
FIG. 4 is a schematic flowchart of an audio signal compression method provided in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an audio signal reconstruction apparatus provided in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a terminal provided in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the drawings in the embodiments of the present invention.
An embodiment of the present invention provides an audio signal reconstruction method. A terminal can acquire compressed data corresponding to at least two audio signals; perform inverse quantization on that compressed data to obtain measurement data corresponding to the at least two audio signals; acquire a measurement matrix; jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data and the measurement matrix; and perform an inverse sparse transform on those coefficients to obtain the at least two audio signals, which can improve the quality of the audio signals.
In a specific implementation, the terminal may capture audio signals from a sound source in parallel through an array of multiple microphones. The audio signal captured by any one microphone serves as the audio signal of one channel, and the audio signals of different channels are captured by different microphones. The array may be linear, ring-shaped, star-shaped, and so on, which is not limited by the embodiments of the present invention. Taking the schematic diagram of the microphone array shown in FIG. 1A as an example, the microphones (for example Mic_1, Mic_2, Mic_3…Mic_N) may be arranged to the right of the sound source in a linear array, with equal distances between the microphones. The audio signal captured by each microphone can be expressed as x(t) = c·cos(2πft + θ), where c is the amplitude, f the frequency, and θ the phase. The audio signals captured by the different microphones are correlated in the spatial domain, and the audio signal captured by any one microphone is correlated in the time and frequency domains. The terminal can therefore sample the audio signal of each channel and process it as follows to obtain the compressed data of each channel: apply a sparse transform (for example FFT or MDCT); take sensing measurements by multiplying the resulting sparse transform coefficients by a random measurement matrix; and add to the resulting sensing measurement values the noise vector corresponding to the environment in which the audio signal was captured.
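The encoder-side steps described above (sparse transform, multiplication by a random measurement matrix, addition of a noise vector) can be sketched as follows. This is a minimal pure-Python illustration under assumptions: a naive DFT stands in for the FFT, the measurement matrix has Gaussian entries, and the frame length, measurement count, noise level, and all function and variable names are hypothetical rather than taken from the patent.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform, used here as the sparse transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def random_measurement_matrix(m, n, rng):
    """M x N matrix with Gaussian entries (an assumed choice of random matrix)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]

def compress(x, phi, noise):
    """Sensing measurement: y = phi * (sparse transform coefficients) + noise."""
    coeffs = dft(x)
    return [sum(p * c for p, c in zip(row, coeffs)) + w
            for row, w in zip(phi, noise)]

rng = random.Random(0)
N, M = 32, 12  # frame length and number of measurements (M < N)
# One channel of the form x(t) = c*cos(2*pi*f*t + theta), sampled at N points.
x = [0.7 * math.cos(2 * math.pi * 4 * t / N + 0.3) for t in range(N)]
phi = random_measurement_matrix(M, N, rng)
noise = [rng.gauss(0.0, 1e-3) for _ in range(M)]
y = compress(x, phi, noise)
print(len(x), "samples ->", len(y), "measurements")
```

A pure tone occupies only two DFT bins, which is the kind of sparsity that makes it plausible to recover the frame from fewer measurements than samples.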
When audio signals need to be reconstructed, the terminal can acquire the compressed data corresponding to at least two audio signals; perform inverse quantization on that compressed data to obtain the measurement data corresponding to the at least two audio signals; acquire a measurement matrix; jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data and the measurement matrix; and perform an inverse sparse transform on those coefficients to obtain the at least two audio signals.
It should be noted that, in the embodiments of the present invention, the terminal that compressively samples the audio signals and the terminal that reconstructs them may be the same terminal. For example, after sampling the audio signal of each channel, the terminal applies the sparse transform, sensing measurement, and related processing to obtain compressed data. When the terminal needs to reconstruct the audio signals, it can perform inverse quantization on the compressed data corresponding to at least two audio signals to obtain the corresponding measurement data, jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data and the measurement matrix, and perform an inverse sparse transform on those coefficients to obtain the at least two audio signals. Optionally, the terminal that compressively samples the audio signals and the terminal that reconstructs them may be different terminals. For example, after a first terminal samples the audio signal of each channel, it applies the sparse transform, sensing measurement, and related processing to obtain compressed data, and may send the compressed data to a second terminal. When the second terminal needs to reconstruct the audio signals, it can predict the parameter information of a channel for a given frame, and reconstruct the audio signal of that channel for that frame according to the channel's compressed sampling data for the frame and the parameter information.
Refer to FIG. 2, which is a schematic flowchart of an audio signal reconstruction method provided in an embodiment of the present invention. As shown in the figure, the audio signal reconstruction method in this embodiment may include at least the following steps:
S201. Acquire compressed data corresponding to at least two audio signals.
The terminal may acquire compressed data corresponding to at least two audio signals. In a specific implementation, the terminal may acquire the compressed data corresponding to the at least two audio signals from its own memory. For example, after capturing audio signals, the terminal compressively samples them to obtain compressed data and stores the compressed data in the memory. As another example, after another terminal captures audio signals, that terminal compressively samples them to obtain compressed data and sends the compressed data to this terminal, which stores the compressed data in the memory.
S202. Perform inverse quantization on the compressed data corresponding to the at least two audio signals to obtain measurement data corresponding to the at least two audio signals.
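The text does not say which quantizer produces the compressed data, so the following sketch of step S202 assumes a plain uniform scalar quantizer with a hypothetical step size. It only illustrates the idea that inverse quantization maps stored integer indices back to approximate measurement values.

```python
def quantize(values, step):
    """Encoder side (assumed): map each measurement to an integer index."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Decoder side (S202): map integer indices back to measurement values."""
    return [i * step for i in indices]

step = 0.05                         # hypothetical quantization step
measurements = [0.12, -0.37, 1.0]   # made-up measurement data
indices = quantize(measurements, step)
restored = dequantize(indices, step)
print(indices, restored)
```

With a uniform quantizer, each restored value differs from the original by at most half a step, and this quantization error is part of the noise that the reconstruction stage has to tolerate.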
S203. Acquire a measurement matrix, and jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix.
The terminal may acquire a measurement matrix and, according to the measurement data corresponding to the at least two audio signals and the measurement matrix, jointly reconstruct the sparse transform coefficients corresponding to the at least two audio signals. The measurement matrix may be a random measurement matrix.
Optionally, the at least two audio signals may include a first audio signal and a second audio signal. The terminal may then calculate the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix. In a specific implementation, the terminal may use the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix as the inputs of a preset reconstruction algorithm (for example AMP or GAMP); the output of the preset reconstruction algorithm is the sparse transform coefficients corresponding to the second audio signal.
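The idea of feeding the first signal's coefficients into the reconstruction of the second signal can be sketched with a much simpler solver than AMP or GAMP. The code below swaps in iterative soft-thresholding (ISTA), warm-started from the prior, for real-valued coefficients. It is not the patent's algorithm: the function names, the regularization weight `lam`, the iteration count, and the test data are all assumptions made for illustration.

```python
import random

def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal step for an L1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def residual(phi, y, x):
    """Return y - phi * x."""
    return [yi - sum(p * xj for p, xj in zip(row, x)) for row, yi in zip(phi, y)]

def objective(phi, y, x, lam):
    """0.5 * ||y - phi x||^2 + lam * ||x||_1."""
    r = residual(phi, y, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(v) for v in x)

def ista_warm_start(phi, y, prior, lam=0.01, iters=200):
    """Run ISTA starting from the prior coefficients instead of from zero."""
    # 1 / (squared Frobenius norm) is a safe, conservative step size because
    # the squared Frobenius norm bounds the squared spectral norm from above.
    step = 1.0 / sum(p * p for row in phi for p in row)
    x = list(prior)
    for _ in range(iters):
        r = residual(phi, y, x)
        # Correlate the residual with each column of phi (phi^T r).
        corr = [sum(phi[i][j] * r[i] for i in range(len(y))) for j in range(len(x))]
        x = [soft_threshold(xj + step * cj, step * lam) for xj, cj in zip(x, corr)]
    return x

rng = random.Random(1)
phi = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
truth = [0.0, 0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 0.0]   # made-up second-signal coefficients
y = [sum(p * v for p, v in zip(row, truth)) for row in phi]
prior = [0.8 * v for v in truth]                     # stand-in for the first signal's coefficients
estimate = ista_warm_start(phi, y, prior)
print(objective(phi, y, prior, 0.01), objective(phi, y, estimate, 0.01))
```

With this step size the objective never increases from one iteration to the next, so the estimate fits the second signal's measurements at least as well as the prior did. How close it gets to the true coefficients depends on the problem and is not guaranteed by this sketch.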
可选的,音频信号存在空域相关性,当第一音频信号对应第一通道,第二音频信号对应第二通道,且第一音频信号与第二音频信号为在同一时段采集获得的音频信号,则终端可以根据第一音频信号对应的稀疏变换系数中的第一幅度,确定第二音频信号对应的先验稀疏变换系数中的第二幅度,并将第二幅度作为第二音频信号对应的稀疏变换系数中幅度的先验,根据第二音频信号对应的测量数据以及测量矩阵,计算第二音频信号对应的稀疏变换系数中的幅度。其中,第一幅度与第二幅度之比为第一音频信号对应的麦克风距离声源的距离的对数与第二音频信号对应的麦克风距离声源的距离的对数之比。Optionally, the audio signal has a spatial correlation, when the first audio signal corresponds to the first channel, the second audio signal corresponds to the second channel, and the first audio signal and the second audio signal are audio signals acquired during the same period of collection, Then the terminal can determine the second amplitude in the prior sparse transformation coefficient corresponding to the second audio signal according to the first amplitude in the sparse transformation coefficient corresponding to the first audio signal, and use the second amplitude as the sparse transformation coefficient corresponding to the second audio signal. A priori of the amplitude in the transform coefficient, according to the measurement data corresponding to the second audio signal and the measurement matrix, to calculate the amplitude in the sparse transform coefficient corresponding to the second audio signal. The ratio of the first amplitude to the second amplitude is the ratio of the logarithm of the distance between the microphone corresponding to the first audio signal and the sound source and the logarithm of the distance between the microphone corresponding to the second audio signal and the sound source.
Optionally, the audio signals are spatially correlated. When the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and both signals were captured in the same time period, the terminal may determine a second phase in the prior sparse transform coefficients of the second audio signal from a first phase in the sparse transform coefficients of the first audio signal. Using the second phase as the prior for the phase in the second signal's sparse transform coefficients, the terminal calculates that phase from the measurement data of the second audio signal and the measurement matrix. The ratio of the first phase to the second phase equals the ratio of the distance between the first signal's microphone and the sound source to the distance between the second signal's microphone and the sound source.
Optionally, the audio signals are spatially correlated. When the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and both signals were captured in the same time period, the terminal may take a first frequency in the sparse transform coefficients of the first audio signal as a second frequency in the prior sparse transform coefficients of the second audio signal. Using the second frequency as the prior for the frequency in the second signal's sparse transform coefficients, the terminal calculates that frequency from the measurement data of the second audio signal and the measurement matrix.
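The three spatial relationships above (log-distance ratio for amplitude, distance ratio for phase, identical frequencies) can be sketched as follows. The function name `spatial_prior` and its argument names are hypothetical. The sketch assumes both distances are positive and that the first distance is not 1 in the chosen unit, since its logarithm would then be zero.

```python
import math

def spatial_prior(first_amp, first_phase, first_freqs, d1, d2):
    """Derive prior amplitude, phase and frequencies for the second channel.

    d1, d2: distances from the first and second microphone to the source.
    first_amp / second_amp     == log(d1) / log(d2)
    first_phase / second_phase == d1 / d2
    The frequencies (support) are carried over unchanged.
    """
    second_amp = first_amp * math.log(d2) / math.log(d1)
    second_phase = first_phase * d2 / d1
    second_freqs = list(first_freqs)
    return second_amp, second_phase, second_freqs
```

For example, `spatial_prior(2.0, 0.3, [5, 7, 10], 10.0, 100.0)` doubles the amplitude, because log(100)/log(10) = 2, and scales the phase by 10.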
Optionally, the audio signals are correlated in the time domain. When the first audio signal and the second audio signal correspond to the same channel and were captured in different time periods, the terminal may determine a second amplitude in the prior sparse transform coefficients of the second audio signal from a first amplitude in the sparse transform coefficients of the first audio signal. Using the second amplitude as the prior for the amplitude in the second signal's sparse transform coefficients, the terminal calculates that amplitude from the measurement data of the second audio signal and the measurement matrix. For audio signals of the same channel in different time periods, the amplitude in the sparse transform coefficients is linearly related to the index of the corresponding frame.
Optionally, the audio signals are correlated in the time domain. When the first audio signal and the second audio signal correspond to the same channel and were captured in different time periods, the terminal may take a first phase in the sparse transform coefficients of the first audio signal as a second phase in the prior sparse transform coefficients of the second audio signal. Using the second phase as the prior for the phase in the second signal's sparse transform coefficients, the terminal calculates that phase from the measurement data of the second audio signal and the measurement matrix.
Optionally, the audio signals are correlated in the time domain. When the first audio signal and the second audio signal correspond to the same channel and were captured in different time periods, the terminal may determine a second frequency in the prior sparse transform coefficients of the second audio signal from a first frequency in the sparse transform coefficients of the first audio signal. Using the second frequency as the prior for the frequency in the second signal's sparse transform coefficients, the terminal calculates that frequency from the measurement data of the second audio signal and the measurement matrix. The first frequency and the second frequency have an intersection, and the frequencies in the intersection are obtained by randomly selecting from the frequencies in the first frequency.
Optionally, when the first audio signal and the second audio signal correspond to the same channel, they may be audio signals captured in adjacent time periods. For example, the first audio signal may be captured in a first time period and the second audio signal in a second time period, where the first time period may be earlier than the second.
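The time-domain relationships for a single channel (amplitude linear in the frame index, identical phase, randomly retained frequencies) can be sketched as below. The patent states only that the amplitude is linear in the frame index and that the shared frequencies are chosen at random. The `slope` and `keep_prob` parameters, the independent per-frequency selection, and the function name `temporal_prior` are assumptions made for illustration.

```python
import random

def temporal_prior(first_amp, first_phase, first_freqs, frame1, frame2,
                   slope=0.0, keep_prob=0.8, rng=None):
    """Derive prior amplitude, phase and frequencies for a later frame.

    slope: assumed change in amplitude per frame (hypothetical parameter).
    keep_prob: assumed probability that each frequency of the first frame
    is retained in the intersection (hypothetical parameter).
    """
    rng = rng or random.Random(0)
    # Amplitude: linear in the frame index.
    second_amp = first_amp + slope * (frame2 - frame1)
    # Phase: carried over unchanged.
    second_phase = first_phase
    # Frequencies: a random selection from the first frame's frequencies.
    shared = [f for f in first_freqs if rng.random() < keep_prob]
    return second_amp, second_phase, shared
```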
Optionally, the at least two audio signals may include a third audio signal, and the terminal may calculate the sparse transform coefficients of the third audio signal from preset initial sparse transform coefficients, the measurement data of the third audio signal, and the measurement matrix.
Taking the flow for reconstructing audio signals shown in FIG. 1E as an example, the terminal may obtain the compressed sampled data of different channels in different frames and inverse-quantize it to obtain the measurement data of the audio signals of different channels in different frames. When the designated channel is the first channel (for example, i = 1) and the designated frame is the start frame, the terminal may initialize the initial sparse transform coefficients. When the transfer direction of the sparse transform coefficients is the forward direction (for example, FB = 1), the terminal may calculate the sparse transform coefficients of the first channel's audio signal in the start frame from the initial sparse transform coefficients, the measurement data of that audio signal, and the measurement matrix.
From the sparse transform coefficients of the first channel's audio signal in the start frame, the terminal may further determine the prior sparse transform coefficients of the second channel's (for example, i++) audio signal in the start frame. Using them as the prior for the sparse transform coefficients of the second channel's audio signal in the start frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix, and continues in this way until the sparse transform coefficients of the last channel's (for example, i = k) audio signal in the start frame are obtained.
After obtaining the sparse transform coefficients of the last channel's audio signal in the start frame, the terminal may switch the transfer direction of the sparse transform coefficients to the reverse direction (for example, FB ≠ 1). From the sparse transform coefficients of the last channel's audio signal in the start frame, it determines the prior sparse transform coefficients of the audio signal, in the start frame, of the channel preceding the last channel (for example, i--). Using them as the prior for the sparse transform coefficients of that preceding channel's audio signal in the start frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix, and continues in this way until the sparse transform coefficients of the first channel's audio signal in the start frame are obtained.
After obtaining the sparse transform coefficients of the first channel's audio signal in the start frame, the terminal may use them to determine the prior sparse transform coefficients of the first channel's audio signal in the second frame. Using them as the prior for the sparse transform coefficients of the first channel's audio signal in the second frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix.
After calculating the sparse transform coefficients of the first channel's audio signal in the second frame, the terminal may use them to determine the prior sparse transform coefficients of the second channel's (for example, i++) audio signal in the second frame. Using them as the prior for the sparse transform coefficients of the second channel's audio signal in the second frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix, and continues in this way until the sparse transform coefficients of the last channel's (for example, i = k) audio signal in the second frame are obtained.
After obtaining the sparse transform coefficients of the last channel's audio signal in the second frame, the terminal may switch the transfer direction of the sparse transform coefficients to the reverse direction (for example, FB ≠ 1). From the sparse transform coefficients of the last channel's audio signal in the second frame, it determines the prior sparse transform coefficients of the audio signal, in the second frame, of the channel preceding the last channel (for example, i--). Using them as the prior for the sparse transform coefficients of that preceding channel's audio signal in the second frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix, and continues in this way until the sparse transform coefficients of the first channel's audio signal in the second frame are obtained.
After obtaining the sparse transform coefficients of the first channel's audio signal in the second frame, the terminal may use them to determine the prior sparse transform coefficients of the first channel's audio signal in the third frame. Using them as the prior for the sparse transform coefficients of the first channel's audio signal in the third frame, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix.
Similarly, the terminal can calculate the sparse transform coefficients of the first channel's audio signal in the last frame in the manner described above. Assume the last frame is frame t. The terminal may then determine, from the sparse transform coefficients of the first channel's audio signal in frame t, the prior sparse transform coefficients of the first channel's audio signal in frame t-1. Using them as the prior for the sparse transform coefficients of the first channel's audio signal in frame t-1, the terminal calculates those coefficients from the measurement data of that audio signal and the measurement matrix. Next, from the sparse transform coefficients of the first channel's audio signal in frame t-1, the terminal may determine the prior sparse transform coefficients of the first channel's audio signal in frame t-2, use them as the prior for the sparse transform coefficients of the first channel's audio signal in frame t-2, and calculate those coefficients from the measurement data of that audio signal and the measurement matrix, continuing in this way until the sparse transform coefficients of the first channel's audio signal in the start frame are obtained.
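The order in which priors are passed between channels and frames in the walkthrough above can be summarised in a short sketch. It only enumerates the schedule and performs no reconstruction. The function name `reconstruction_schedule` is hypothetical, and the sketch assumes that every frame, including the last, receives both a forward and a reverse sweep over the channels.

```python
def reconstruction_schedule(num_channels, num_frames):
    """List the (frame, channel, prior_source) steps, using 1-based indices.

    prior_source is None when the preset initial coefficients are used,
    otherwise the (frame, channel) pair whose coefficients act as prior.
    """
    steps = []
    for frame in range(1, num_frames + 1):
        # First channel: initial coefficients in the start frame,
        # otherwise the first channel of the previous frame.
        steps.append((frame, 1, None if frame == 1 else (frame - 1, 1)))
        # Forward sweep over channels (FB = 1): 2, 3, ..., k.
        for ch in range(2, num_channels + 1):
            steps.append((frame, ch, (frame, ch - 1)))
        # Reverse sweep over channels (FB != 1): k-1, ..., 1.
        for ch in range(num_channels - 1, 0, -1):
            steps.append((frame, ch, (frame, ch + 1)))
    # Reverse sweep over frames for the first channel: t-1, ..., 1.
    for frame in range(num_frames - 1, 0, -1):
        steps.append((frame, 1, (frame + 1, 1)))
    return steps
```

With three channels and two frames, the schedule starts at `(1, 1, None)` and ends with `(1, 1, (2, 1))`, the reverse pass from the second frame back to the start frame.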
S204. Perform a sparse inverse transform on the sparse transform coefficients of the at least two audio signals to obtain the at least two audio signals.
The terminal may perform a sparse inverse transform on the sparse transform coefficients of the at least two audio signals to obtain the at least two audio signals. For example, the terminal may apply an inverse time-frequency transform such as the IMDCT or the IFFT to the sparse transform coefficients of the at least two audio signals.
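As an illustration of the inverse time-frequency step, the sketch below uses a direct inverse discrete Fourier transform written with the standard library. It computes the same result as an IFFT, only more slowly, and it does not cover the IMDCT. The function names `dft` and `inverse_dft` are chosen for this example, and a real-valued time-domain frame is assumed.

```python
import cmath

def dft(frame):
    """Forward DFT of one frame (time domain to frequency domain)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(coeffs):
    """Inverse DFT: recover a real-valued frame from its coefficients."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]
```

Applying `inverse_dft` to the output of `dft` returns the original frame up to floating-point rounding.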
In the audio signal reconstruction method shown in FIG. 2, the terminal obtains the compressed data of at least two audio signals, inverse-quantizes it to obtain the measurement data of the at least two audio signals, obtains the measurement matrix, jointly reconstructs the sparse transform coefficients of the at least two audio signals from the measurement data and the measurement matrix, and performs a sparse inverse transform on those coefficients to obtain the at least two audio signals. This can improve the quality of the audio signals.
Refer to FIG. 3, which is a schematic flowchart of an audio signal reconstruction method provided in an embodiment of the present invention. As shown in the figure, the audio signal reconstruction method in this embodiment may include at least the following steps:
S301. The first terminal samples the captured audio signals of the channels; the sampled audio signals of the channels were captured in the same time period.
The first terminal may capture audio signals through a number of microphones. The audio signals captured by the microphones have the same frequencies, that is, the same support set. A support set may include at least one support, and each support indicates whether the frequency of the audio signal equals a specified frequency. For example, if the frequencies of the audio signal captured by each microphone include 5 Hz, 7 Hz and 10 Hz, the support set of the audio signal captured by each microphone is 0000101001. The first to fourth "0" indicate that the frequency of the audio signal is not 1 Hz, 2 Hz, 3 Hz or 4 Hz, respectively; the first "1" indicates that the frequency is 5 Hz; the fifth "0" indicates that the frequency is not 6 Hz; the second "1" indicates that the frequency is 7 Hz; the sixth and seventh "0" indicate that the frequency is not 8 Hz or 9 Hz, respectively; and the third "1" indicates that the frequency is 10 Hz.
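The support-set encoding in this example can be reproduced with a few lines. The function name `support_string` and the 1 Hz grid running from 1 Hz to `max_hz` are assumptions taken from the example, not a general definition.

```python
def support_string(frequencies_hz, max_hz):
    """Encode which integer frequencies 1..max_hz are present as a 0/1 string."""
    present = set(frequencies_hz)
    return "".join("1" if hz in present else "0" for hz in range(1, max_hz + 1))

print(support_string([5, 7, 10], 10))  # prints 0000101001
```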
The captured audio signals are correlated in the frequency domain. For example, the audio signal of each channel can be expressed as follows:
$x(t) = c\cos(2\pi f + \theta)$
where t denotes the frame of the channel's audio signal, x(t) denotes the channel's audio signal in frame t, c denotes the amplitude of that signal, f denotes its frequency, and θ denotes its phase. The first terminal can determine that the audio signal of any channel consists of a finite number of frequency components, and that the frequencies of audio signals of the same channel captured in different time periods are sparse.
The captured audio signals are also spatially correlated. For example, for the audio signals of the channels captured in the same time period, the amplitude in the sparse transform coefficients decays logarithmically with the capture distance of the channel's audio signal, the phase in the sparse transform coefficients varies linearly with that capture distance, and the supports in the sparse transform coefficients are the same.
The captured audio signals are also correlated in the time domain. For example, for audio signals of the same channel captured in different time periods, the amplitudes are linearly correlated, the phases are the same, and the supports follow a Bernoulli distribution.
The first terminal may treat the audio signal captured by any one microphone as the audio signal of one channel and sample the captured audio signals of the channels, obtaining the audio signals of the channels captured in the same time period. For example, the first terminal obtains by sampling the audio signals of the channels in the start frame, or in the second frame, or in the last frame, and so on.
S302. The first terminal performs a sparse transform on the sampled audio signals of the channels to obtain sparse transform coefficients.
The first terminal may perform a sparse transform on the sampled audio signals of the channels to obtain sparse transform coefficients, that is, apply a time-frequency transform to the sampled audio signals of the channels captured in the same time period. For example, the first terminal may use the FFT algorithm or the MDCT algorithm for this transform.
S303. The first terminal multiplies a preset measurement matrix by the sparse transform coefficients to obtain sensing measurement values.
After obtaining the sparse transform coefficients of the audio signals of the channels captured in the same time period, the first terminal may multiply the preset measurement matrix by the sparse transform coefficients to obtain the sensing measurement values. The preset measurement matrix has fewer rows than columns, which enables sub-Nyquist sampling, so that the audio signal can be recovered without distortion when it is reconstructed.
S304. The first terminal adds a noise vector to the sensing measurement values to obtain compressed data.
After obtaining the sensing measurement values, the first terminal may apply matrix quantization to them to obtain compressed sampled data. For example, the first terminal may obtain the noise vector corresponding to the capture environment of the audio signals of the channels and add it to the sensing measurement values to obtain the compressed data.
For example, the processing by which the first terminal obtains compressed data from the audio signals of the channels in the same frame can be expressed by the following formula:
where $Y_{m\times N}$ denotes the compressed sampled data of N channels and m frames, $A_{m\times n}$ denotes the preset measurement matrix, the term that $A_{m\times n}$ multiplies denotes the sparse transform coefficients, W denotes the noise vector corresponding to the capture environment of the audio signals of the channels, $X_{n\times N}$ denotes the audio signals of N channels and m frames, and m < n.
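Steps S303 and S304 for a single channel and frame can be sketched as follows. The sketch assumes real-valued sparse transform coefficients (as an MDCT would produce), a Gaussian random measurement matrix, and Gaussian noise. None of these choices is specified by the patent, and the function names and the `noise_std` parameter are hypothetical.

```python
import random

def random_measurement_matrix(m, n, rng=None):
    """Build an m-by-n Gaussian measurement matrix with m < n (assumed design)."""
    if m >= n:
        raise ValueError("the measurement matrix must have fewer rows than columns")
    rng = rng or random.Random(0)
    return [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(n)] for _ in range(m)]

def compress_frame(coeffs, A, noise_std=0.0, rng=None):
    """Multiply the measurement matrix by the coefficients, then add noise."""
    rng = rng or random.Random(0)
    return [sum(a * s for a, s in zip(row, coeffs)) + rng.gauss(0.0, noise_std)
            for row in A]
```

Because the matrix has fewer rows than columns, each frame of n coefficients is reduced to m measurements, which is where the compression comes from.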
In a specific implementation, the first terminal may sample the audio signals of the channels in the start frame, perform a sparse transform on them to obtain sparse transform coefficients, multiply the preset measurement matrix by those coefficients to obtain sensing measurement values, and add the noise vector to the sensing measurement values to obtain the compressed data of the channels in the start frame. The first terminal may then process the audio signals of the channels in the second frame in the same way (sampling, sparse transform, multiplication by the preset measurement matrix, and addition of the noise vector) to obtain the compressed data of the channels in the second frame, then do the same for the third frame, and so on until the compressed data of the channels in the last frame is obtained.
S305. The first terminal sends the compressed data to the second terminal.
After obtaining the compressed data of the channels, the first terminal may send the compressed sampled data of the channels to the second terminal. Optionally, the first terminal may send the compressed sampled data of the channels in different frames to the second terminal after it has obtained that data for the different frames.
S306, the second terminal determines, according to the sparse transform coefficient corresponding to the audio signal of one channel at a specified frame, the prior sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame.
S307, the second terminal obtains the sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame according to the prior sparse transform coefficient corresponding to that audio signal, the measurement data corresponding to that audio signal, and the measurement matrix.
Optionally, the second terminal may use the prior sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame, the measurement data corresponding to that audio signal, and the measurement matrix as the input of a Bayesian algorithm to obtain the sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame.
Exemplarily, the Bayesian algorithm can be expressed as follows:
Where p(x, θ, c, s | y) denotes the probability of x, θ, c, and s given y; x denotes the sparse transform coefficient corresponding to the audio signal; θ, c, and s denote the phase, the amplitude, and the frequency in that sparse transform coefficient, respectively; and y denotes the measurement data corresponding to the audio signal. T denotes the frame length of the audio signal of each channel, M denotes the number of rows of the measurement matrix, and N denotes the number of columns of the measurement matrix. p(y_m^(t) | x^(t)) denotes the probability of y_m^(t) given x^(t), where y_m^(t) denotes the measurement data corresponding to the audio signal of the m-th channel at the t-th frame and x^(t) denotes the sparse transform coefficient corresponding to the audio signal at the t-th frame. p(x_n^(t) | θ_n^(t), c_n^(t), s_n) denotes the probability of x_n^(t) given θ_n^(t), c_n^(t), and s_n, where θ_n^(t) denotes the phase in the sparse transform coefficient corresponding to the audio signal of the n-th channel at the t-th frame, c_n^(t) denotes the amplitude in that sparse transform coefficient, and s_n denotes the frequency in the sparse transform coefficient corresponding to the audio signal of the n-th channel. p(θ_n^(t) | θ_n^(t-1)) denotes the probability of θ_n^(t) given θ_n^(t-1), where θ_n^(t-1) denotes the phase in the sparse transform coefficient corresponding to the audio signal of the n-th channel at the (t-1)-th frame. p(c_n^(t) | c_n^(t-1)) denotes the probability of c_n^(t) given c_n^(t-1), where c_n^(t-1) denotes the amplitude in the sparse transform coefficient corresponding to the audio signal of the n-th channel at the (t-1)-th frame. p(s_n) denotes the probability of the frequency in the sparse transform coefficient corresponding to the audio signal of the n-th channel.
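The typeset formula itself is not reproduced in this text. Based only on the factors defined above, one factorization consistent with those definitions would be the following. It is a reconstruction offered as an assumption, not the patent's own formula: the proportionality sign, the index ranges, and the reading of the t = 1 terms as initial priors are all assumed. Here y_m^(t) is the measurement data, x_n^(t) the sparse transform coefficient, and θ_n^(t), c_n^(t), s_n the phase, amplitude, and frequency.

```latex
p(x,\theta,c,s \mid y) \;\propto\;
\prod_{t=1}^{T}\left[
  \prod_{m=1}^{M} p\!\left(y_m^{(t)} \mid x^{(t)}\right)
  \prod_{n=1}^{N} p\!\left(x_n^{(t)} \mid \theta_n^{(t)}, c_n^{(t)}, s_n\right)
  p\!\left(\theta_n^{(t)} \mid \theta_n^{(t-1)}\right)
  p\!\left(c_n^{(t)} \mid c_n^{(t-1)}\right)
\right]
\prod_{n=1}^{N} p(s_n)
```

Read this way, each factor links one measurement or one coefficient to its neighbours, which is the structure that message passing between adjacent channels and adjacent frames relies on.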
Message passing may be performed in a forward direction and in a reverse direction. In forward passing, the prior sparse transform coefficient corresponding to the audio signal of the next channel at a specified frame is determined according to the sparse transform coefficient corresponding to the audio signal of one channel at the specified frame, and the sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame is then obtained according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the next channel at the specified frame, and the measurement matrix. In reverse passing, the prior sparse transform coefficient corresponding to the audio signal of the previous channel at the specified frame is determined according to the sparse transform coefficient corresponding to the audio signal of one channel at the specified frame, and the sparse transform coefficient corresponding to the audio signal of the previous channel at the specified frame is then obtained according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the previous channel at the specified frame, and the measurement matrix.
Taking the schematic diagram of parameter information passing shown in FIG. 1B as an example, the figure shows, for the t-th frame, the measurement data corresponding to the audio signals of the first channel and the m-th channel, the sparse transform coefficients corresponding to the audio signals of the n-th channel and the k-th channel, and the frequency, phase, and amplitude in each of those sparse transform coefficients. For example, from the sparse transform coefficient corresponding to the audio signal of the n-th channel at the t-th frame, the second terminal can obtain the frequency, the phase, and the amplitude in that coefficient. Then, according to the audio signal time-domain near-field algorithm and based on this frequency, phase, and amplitude, the second terminal determines the prior sparse transform coefficient corresponding to the audio signal of the n-th channel at the (t+1)-th frame, and obtains the sparse transform coefficient corresponding to the audio signal of the n-th channel at the (t+1)-th frame, for example the phase in that coefficient, according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the n-th channel at the (t+1)-th frame, and the measurement matrix. As another example, the second terminal determines, according to the sparse transform coefficient corresponding to the audio signal of the n-th channel at the (t+1)-th frame, the prior sparse transform coefficient corresponding to the audio signal of the n-th channel at the t-th frame, and obtains the sparse transform coefficient corresponding to the audio signal of the n-th channel at the t-th frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the n-th channel at the t-th frame, and the measurement matrix.
It should be noted that, in the embodiments of the present invention, the sparse transform coefficients corresponding to the audio signals of the same channel collected in different time periods are passed within that channel, which can improve the redundancy of the channel.
Taking the schematic diagram of audio signal reconstruction shown in FIG. 1C as an example, the frame length of the collected audio signals is t and the number of channels is K. The second terminal may preset an initial sparse transform coefficient corresponding to the audio signal of the first channel (Chann 1) at the starting frame (Frame 1), and calculate the sparse transform coefficient corresponding to the audio signal of the first channel at the starting frame according to that initial sparse transform coefficient, the measurement data corresponding to the audio signal of the first channel at the starting frame, and the measurement matrix. Further, the second terminal may use the sparse transform coefficient corresponding to the audio signal of the first channel at the starting frame as the input of the audio signal spatial-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the second channel (Chann 2) at the starting frame, and calculate the sparse transform coefficient corresponding to the audio signal of the second channel at the starting frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the second channel at the starting frame, and the measurement matrix. Further, the second terminal may use the sparse transform coefficient corresponding to the audio signal of the second channel at the starting frame as the input of the audio signal spatial-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the third channel (Chann 3) at the starting frame, and calculate the sparse transform coefficient corresponding to the audio signal of the third channel at the starting frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the third channel at the starting frame, and the measurement matrix.
Optionally, after obtaining the sparse transform coefficient corresponding to the audio signal of the K-th channel (Chann K) at the starting frame, the second terminal may use it as the input of the audio signal spatial-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the (K-1)-th channel at the starting frame, and calculate the sparse transform coefficient corresponding to the audio signal of the (K-1)-th channel at the starting frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the (K-1)-th channel at the starting frame, and the measurement matrix, and so on, until the sparse transform coefficient corresponding to the audio signal of the first channel at the starting frame is obtained.
Optionally, after obtaining the sparse transform coefficient corresponding to the audio signal of the first channel at the starting frame, the second terminal may use it as the input of the audio signal time-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the first channel at the second frame, and calculate the sparse transform coefficient corresponding to the audio signal of the first channel at the second frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the first channel at the second frame, and the measurement matrix. Further, the second terminal may use the sparse transform coefficient corresponding to the audio signal of the first channel at the second frame as the input of the audio signal time-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the first channel at the third frame, and calculate the sparse transform coefficient corresponding to the audio signal of the first channel at the third frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the first channel at the third frame, and the measurement matrix, and so on, until the sparse transform coefficient corresponding to the audio signal of the first channel at the t-th frame (Frame t) is obtained.
Optionally, after obtaining the sparse transform coefficient corresponding to the audio signal of the first channel at the t-th frame, the second terminal may use it as the input of the audio signal time-domain near-field algorithm to obtain the prior sparse transform coefficient corresponding to the audio signal of the first channel at the (t-1)-th frame (Frame t-1), and calculate the sparse transform coefficient corresponding to the audio signal of the first channel at the (t-1)-th frame according to that prior sparse transform coefficient, the measurement data corresponding to the audio signal of the first channel at the (t-1)-th frame, and the measurement matrix, and so on, until the sparse transform coefficient corresponding to the audio signal of the first channel at the first frame is obtained.
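The order of the passes described for FIG. 1C can be summarized in a short sketch. It lists only the (source, target) pairs in the order given in the text. The function name `passing_schedule` is hypothetical, the reverse and temporal passes are marked optional in the text but are all included here, and the temporal passes are shown for the first channel only because that is the only channel the text walks through.

```python
def passing_schedule(num_channels, num_frames):
    """Return (source, target) pairs of (channel, frame) indices, both 1-based."""
    steps = []
    # Spatial forward pass at the starting frame: channel 1 -> 2 -> ... -> K.
    for ch in range(1, num_channels):
        steps.append(((ch, 1), (ch + 1, 1)))
    # Spatial reverse pass at the starting frame: channel K -> K-1 -> ... -> 1.
    for ch in range(num_channels, 1, -1):
        steps.append(((ch, 1), (ch - 1, 1)))
    # Temporal forward pass on channel 1: frame 1 -> 2 -> ... -> t.
    for fr in range(1, num_frames):
        steps.append(((1, fr), (1, fr + 1)))
    # Temporal reverse pass on channel 1: frame t -> t-1 -> ... -> 1.
    for fr in range(num_frames, 1, -1):
        steps.append(((1, fr), (1, fr - 1)))
    return steps
```

At each step the coefficient already reconstructed at the source supplies the prior for the target, which is then reconstructed from that prior, its own measurement data, and the measurement matrix.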
The audio signal spatial-domain near-field algorithm may specifically include the following: for the audio signals of the channels collected in the same time period, the amplitude in the corresponding sparse transform coefficient attenuates logarithmically with the acquisition distance of the channel's audio signal, the phase in the corresponding sparse transform coefficient varies linearly with the acquisition distance of the channel's audio signal, and the supports of the corresponding sparse transform coefficients are the same.
The audio signal time-domain near-field algorithm may specifically include the following: for the audio signals of the same channel collected in different time periods, the amplitudes in the corresponding sparse transform coefficients are linearly correlated, the phases in the corresponding sparse transform coefficients are the same, and the supports of the corresponding sparse transform coefficients follow a Bernoulli distribution.
S308, the second terminal performs an inverse sparse transform on the sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame to obtain the audio signal of the next channel at the specified frame.
In the audio signal reconstruction method shown in FIG. 3, the first terminal samples the audio signals of the channels collected in the same time period, and performs a sparse transform and sensing measurement on the sampled audio signals to obtain compressed data. After acquiring the compressed data sent by the first terminal, the second terminal determines, according to the sparse transform coefficient corresponding to the audio signal of one channel at a specified frame, the prior sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame; obtains the sparse transform coefficient corresponding to the audio signal of the next channel at the specified frame according to that prior sparse transform coefficient, the measurement data corresponding to that audio signal, and the measurement matrix; and performs an inverse sparse transform on that sparse transform coefficient to obtain the audio signal of the next channel at the specified frame. This can improve the quality of the audio signal.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of an audio signal compression method provided in another embodiment of the present invention. As shown in the figure, the audio signal compression method in this embodiment may include at least the following steps:
S401, the terminal collects audio signals of multiple channels.
S402, the terminal obtains, through cosine window sampling, the audio signals of the channels collected in the same time period.
The terminal may obtain the audio signals of the channels collected in the same time period through cosine window sampling. Taking the schematic diagram of audio signal sampling shown in FIG. 1D as an example, the terminal collects audio signals of K channels, and the frame length of each collected audio signal is t. The terminal may then obtain, through cosine window sampling, the audio signals of the K channels at the starting frame; optionally, it may obtain the audio signals of the K channels at the t-th frame, and so on.
Optionally, after obtaining the audio signals of the channels collected in the same time period through cosine window sampling, the terminal may store them in a preset buffer area. In addition, when the data volume of the audio signals in the preset buffer area exceeds the capacity of the preset buffer area, the terminal may obtain the storage time of each audio signal and delete the audio signal with the earliest storage time.
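The buffer behavior described above can be sketched as follows. This is a minimal illustration, assuming that insertion order stands in for the recorded storage time and that capacity is counted in bytes; the class name `FrameBuffer` and its methods are hypothetical.

```python
from collections import OrderedDict


class FrameBuffer:
    """Byte-bounded buffer that deletes the entry stored earliest when full."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> payload, kept in storage order

    def store(self, key, payload):
        # Re-storing a key replaces the old payload and refreshes its position.
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = payload
        self.used_bytes += len(payload)
        # Delete the earliest-stored entries until the data fits again.
        while self.used_bytes > self.capacity_bytes and self._entries:
            _, oldest = self._entries.popitem(last=False)
            self.used_bytes -= len(oldest)

    def keys(self):
        return list(self._entries)
```

A key could be, for example, a (channel, frame) pair, so that the oldest sampled frames are the first to be dropped.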
S403, the terminal performs a sparse transform on the sampled audio signals of the channels to obtain sparse transform coefficients.
S404, the terminal multiplies the preset measurement matrix by the sparse transform coefficients to obtain sensing measurement values.
S405, the terminal adds the noise vector to the sensing measurement values to obtain compressed data.
S406, the terminal determines whether the sampled audio signals are located in the last frame.
The terminal may determine whether the sampled audio signals of the channels are located in the last frame. When the sampled audio signals are not located in the last frame, the terminal may proceed to step S407; when they are located in the last frame, the terminal may proceed to step S408.
S407, when the sampled audio signals are not located in the last frame, the terminal shifts the cosine window.
For example, the terminal obtains the audio signals of the channels at the starting frame through cosine window sampling and obtains the compressed sampled data of the channels at the starting frame. When the sampled audio signals are not located in the last frame, the terminal may shift the cosine window and perform steps S402 to S405 again; that is, it samples the audio signals of the channels at the second frame and applies the sparse transform, sensing measurement, and related processing to obtain the compressed data of the channels at the second frame. When the sampled audio signals are still not located in the last frame, the terminal may shift the cosine window again, sample the audio signals of the channels at the third frame, and process them in the same way to obtain the compressed data of the channels at the third frame, and so on, until the compressed data of the channels at the last frame is obtained.
S408, when the sampled audio signals are located in the last frame, the terminal outputs an end flag.
When the audio signals obtained by the terminal through cosine window sampling are located in the last frame, the terminal may output an end flag, which triggers the terminal to send the compressed data of the channels at the different frames to other terminals. Optionally, the terminal may send the compressed data to other terminals each time it has processed the audio signals of the channels at one frame and obtained the compressed data of the channels at that frame.
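The loop formed by steps S402 to S408 can be sketched for a single channel as follows. This is only an illustration under stated assumptions: the exact window shape is not given in the text, so a sine-shaped window is used as one common form of cosine window; the hop of half a frame is an assumed shift; and `cosine_window`, `windowed_frames`, `compress_stream`, and `process_frame` are hypothetical names.

```python
import math


def cosine_window(length):
    """Sine-shaped window, assumed here as the cosine window."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def windowed_frames(signal, frame_len):
    """Yield (frame_index, windowed_frame, is_last) using a half-frame shift."""
    hop = frame_len // 2
    window = cosine_window(frame_len)
    starts = list(range(0, len(signal) - frame_len + 1, hop))
    for idx, start in enumerate(starts):
        chunk = signal[start:start + frame_len]
        frame = [w * s for w, s in zip(window, chunk)]
        yield idx, frame, idx == len(starts) - 1


def compress_stream(signal, frame_len, process_frame):
    """Apply process_frame to every frame and report the end flag."""
    outputs, end_flag = [], False
    for _, frame, is_last in windowed_frames(signal, frame_len):
        outputs.append(process_frame(frame))  # stands in for S403 to S405
        if is_last:
            end_flag = True  # S408: the last frame has been processed
        # Otherwise the window is shifted by one hop (S407) on the next iteration.
    return outputs, end_flag
```

In a multi-channel setting the same loop would run over every channel for each window position before the window is shifted.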
Exemplarily, the terminal may capture a sound source with an array of 32 microphones to obtain audio signals of 32 channels, where the frame length of the captured audio signal of each channel is 16384. The cosine windows overlap by 50%. The preset measurement matrix may be a 5641*16384 normalized Gaussian random matrix, the noise vector corresponding to the acquisition environment of the audio signals may be white Gaussian noise with a mean of 0 and a variance of 1, and the data volume of the preset buffer area is 16384*32 B. The compression ratio achieved when the terminal processes an audio signal into compressed sampled data is 1:3; that is, when the data volume of the audio signal of one channel at a specified frame is 300 KB, the data volume of the compressed sampled data obtained by processing that audio signal is 100 KB.
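The figures in this example can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative.

```python
rows, cols = 5641, 16384            # measurement matrix dimensions
channels = 32                       # microphones in the array

# Measurements kept per frame relative to the frame length: about 0.344,
# which is consistent with the stated compression ratio of roughly 1:3.
measurement_fraction = rows / cols

# Size of the preset buffer area as stated: 16384 * 32 = 524288.
buffer_size = cols * channels

# A 300 KB frame compressed at 1:3 yields 100 KB.
compressed_kb = 300 / 3
```

The measurement matrix therefore keeps slightly more than one third of the values of each frame, which matches the 300 KB to 100 KB example.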
In the audio signal compression method shown in FIG. 4, the terminal obtains, through cosine window sampling, the audio signals of the channels collected in the same time period, performs a sparse transform on the sampled audio signals to obtain sparse transform coefficients, multiplies the preset measurement matrix by the sparse transform coefficients to obtain sensing measurement values, and adds the noise vector to the sensing measurement values to obtain compressed sampled data. The terminal then shifts the cosine window to obtain the audio signals of the channels at the frame following the specified frame, and from them the compressed data of the channels at that following frame. In this way an effective sparse model can be established and the audio signals can be compressed.
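Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an audio signal reconstruction device provided in an embodiment of the present invention. The audio signal reconstruction device may include at least a compressed data acquisition module 501, an inverse quantization module 502, a joint reconstruction module 503, and an inverse sparse transform module 504, where:
the compressed data acquisition module 501 is configured to acquire compressed data corresponding to at least two audio signals;
the inverse quantization module 502 is configured to perform inverse quantization on the compressed data corresponding to the at least two audio signals to obtain measurement data corresponding to the at least two audio signals;
the joint reconstruction module 503 is configured to acquire a measurement matrix and jointly reconstruct, according to the measurement data corresponding to the at least two audio signals and the measurement matrix, the sparse transform coefficients corresponding to the at least two audio signals;
the inverse sparse transform module 504 is configured to perform an inverse sparse transform on the sparse transform coefficients corresponding to the at least two audio signals to obtain the at least two audio signals.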
Optionally, the at least two audio signals include a first audio signal and a second audio signal, and the joint reconstruction module 503 is specifically configured to:
calculate the sparse transform coefficient corresponding to the second audio signal according to the sparse transform coefficient corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix.
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals collected in the same time period. When calculating the sparse transform coefficient corresponding to the second audio signal according to the sparse transform coefficient corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine, according to a first amplitude in the sparse transform coefficient corresponding to the first audio signal, a second amplitude in the prior sparse transform coefficient corresponding to the second audio signal, where the ratio of the first amplitude to the second amplitude is the ratio of the logarithm of the distance between the sound source and the microphone corresponding to the first audio signal to the logarithm of the distance between the sound source and the microphone corresponding to the second audio signal;
use the second amplitude as the prior of the amplitude in the sparse transform coefficient corresponding to the second audio signal, and calculate the amplitude in the sparse transform coefficient corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals captured in the same time period. When calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine, according to a first phase in the sparse transform coefficients corresponding to the first audio signal, a second phase in the prior sparse transform coefficients corresponding to the second audio signal, where the ratio of the first phase to the second phase is the ratio of the distance from the microphone corresponding to the first audio signal to the sound source to the distance from the microphone corresponding to the second audio signal to the sound source; and
use the second phase as a prior for the phase in the sparse transform coefficients corresponding to the second audio signal, and calculate the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
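The phase relationship above can be sketched in the same way. The function below applies phi1 / phi2 = d1 / d2 to obtain prior phases for the second channel. The names are hypothetical, and the sketch assumes phases in radians and does not wrap the result to any interval, since the text does not specify a wrapping convention.

```python
def prior_phases(first_phases, d1, d2):
    """Derive prior phases for channel 2 from channel 1 phases.

    Applies phi1 / phi2 = d1 / d2, i.e. phi2 = phi1 * d2 / d1, where d1 and
    d2 are the microphone-to-source distances (hypothetical inputs).
    No phase wrapping is applied; that choice is an assumption.
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("distances must be positive")
    scale = d2 / d1
    return [p * scale for p in first_phases]
```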
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals captured in the same time period. When calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine a first frequency in the sparse transform coefficients corresponding to the first audio signal as a second frequency in the prior sparse transform coefficients corresponding to the second audio signal; and
use the second frequency as a prior for the frequency in the sparse transform coefficients corresponding to the second audio signal, and calculate the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. When calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine, according to a first amplitude in the sparse transform coefficients corresponding to the first audio signal, a second amplitude in the prior sparse transform coefficients corresponding to the second audio signal, where the amplitudes in the sparse transform coefficients corresponding to audio signals of the same channel in different time periods have a linear relationship with the sequence numbers of the frames corresponding to those audio signals; and
use the second amplitude as a prior for the amplitude in the sparse transform coefficients corresponding to the second audio signal, and calculate the amplitude in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
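A minimal sketch of the linear frame-index relationship follows. The text states only that amplitude is linear in the frame sequence number; the per-coefficient slopes passed in here are a hypothetical parameter (for example, estimated from earlier frames), and the function name is likewise an assumption.

```python
def extrapolate_amplitudes(first_amplitudes, first_frame, second_frame, slopes):
    """Predict prior amplitudes for a later frame of the same channel.

    Assumes amplitude(n) = amplitude(first_frame) + slope * (n - first_frame)
    for each coefficient, where n is the frame sequence number.
    `slopes` holds one hypothetical slope per coefficient.
    """
    if len(first_amplitudes) != len(slopes):
        raise ValueError("need exactly one slope per amplitude")
    frame_gap = second_frame - first_frame
    return [a + s * frame_gap for a, s in zip(first_amplitudes, slopes)]
```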
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. When calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine a first phase in the sparse transform coefficients corresponding to the first audio signal as a second phase in the prior sparse transform coefficients corresponding to the second audio signal; and
use the second phase as a prior for the phase in the sparse transform coefficients corresponding to the second audio signal, and calculate the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. When calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
determine, according to a first frequency in the sparse transform coefficients corresponding to the first audio signal, a second frequency in the prior sparse transform coefficients corresponding to the second audio signal, where the first frequency and the second frequency have an intersection, and the frequencies in the intersection are obtained by randomly selecting from the frequencies in the first frequency; and
use the second frequency as a prior for the frequency in the sparse transform coefficients corresponding to the second audio signal, and calculate the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
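The random selection of shared frequencies can be illustrated with a short sketch. It draws a subset of the first frame's frequencies to serve as the intersection carried into the second frame's prior. How many frequencies to keep is not stated in the text, so `keep_count`, the optional seeded generator, and the function name are all assumptions.

```python
import random


def prior_frequencies(first_frequencies, keep_count, rng=None):
    """Randomly pick frequencies from the first frame as the shared prior set.

    `keep_count` (hypothetical) is the size of the intersection between the
    first and second frequency sets. `rng` is an optional random.Random
    instance so that the selection can be made reproducible.
    """
    if rng is None:
        rng = random.Random()
    pool = sorted(set(first_frequencies))
    if not 0 <= keep_count <= len(pool):
        raise ValueError("keep_count must be between 0 and the number of frequencies")
    return sorted(rng.sample(pool, keep_count))
```

Any frequencies of the second signal outside this shared set would still have to come from its own measurement data.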
Optionally, the first audio signal and the second audio signal are audio signals captured in adjacent time periods.
Optionally, the at least two audio signals include a third audio signal. When jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix, the joint reconstruction module 503 is specifically configured to:
calculate the sparse transform coefficients corresponding to the third audio signal according to preset initial sparse transform coefficients, the measurement data corresponding to the third audio signal, and the measurement matrix.
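This passage does not name a particular solver. As one possible illustration, the sketch below uses iterative soft thresholding (ISTA), a standard compressed-sensing technique that is swapped in here as an assumption, to estimate sparse coefficients x from measurement data y and a measurement matrix A, starting from a preset initial estimate x0. It assumes real-valued coefficients and minimizes 0.5 * ||y - A x||^2 + lam * ||x||_1; the names `ista`, `soft_threshold`, `lam`, and `iterations` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(y, A, x0, lam=0.1, iterations=200):
    """Estimate sparse coefficients x such that A x is close to y.

    y: measurement data (length m), A: measurement matrix (m rows, n columns),
    x0: preset initial coefficients (length n), used as the starting point.
    """
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # A^T A, so its reciprocal is a conservative, convergent step size.
    frob_sq = sum(a * a for row in A for a in row)
    if frob_sq == 0:
        raise ValueError("measurement matrix must not be all zeros")
    step = 1.0 / frob_sq
    x = list(x0)
    for _ in range(iterations):
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        gradient = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] + step * gradient[j], step * lam) for j in range(n)]
    return x
```

Passing a prior derived from another signal as `x0` instead of a preset vector gives a warm start, which is one simple way to let an earlier reconstruction inform a later one.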
Specifically, the audio signal reconstruction apparatus described in the embodiments of the present invention may be used to implement part or all of the procedures in the audio signal reconstruction method embodiments described with reference to FIG. 2 to FIG. 4.
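Refer to FIG. 6, which is a schematic structural diagram of a terminal according to an embodiment of the present invention. As shown in FIG. 6, the terminal may include a processor 601, a memory 602, and a network interface 603. The processor 601 is connected to the memory 602 and the network interface 603; for example, the processor 601 may be connected to the memory 602 and the network interface 603 through a bus.
The processor 601 may be a central processing unit (CPU), a network processor (NP), or the like.
The memory 602 may be specifically used to store audio signals of multiple channels and the like. The memory 602 may include a volatile memory, such as a random-access memory (RAM); the memory may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory may also include a combination of the above types of memory.
The network interface 603 is used to communicate with other terminals, for example to receive compressed data sent by other terminals. The network interface 603 may optionally include a standard wired interface, a wireless interface (such as a Wi-Fi interface), and the like.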
acquiring compressed data corresponding to at least two audio signals;
performing inverse quantization on the compressed data corresponding to the at least two audio signals to obtain measurement data corresponding to the at least two audio signals;
acquiring a measurement matrix, and jointly reconstructing sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix; and
performing an inverse sparse transform on the sparse transform coefficients corresponding to the at least two audio signals to obtain the at least two audio signals.
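The four operations above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the embodiment's implementation: it assumes a uniform scalar dequantizer with a known step size, takes the reconstruction solver and the inverse sparse transform as caller-supplied functions, and models joint reconstruction by seeding each signal's solve with the previous signal's coefficients. All names (`reconstruct_audio`, `quant_step`, `solve`, `inverse_transform`) are hypothetical.

```python
def reconstruct_audio(compressed, quant_step, matrix, initial, solve, inverse_transform):
    """Rebuild audio signals from per-signal lists of quantization indices.

    compressed: one list of integer quantization indices per audio signal.
    quant_step: step size of the assumed uniform quantizer.
    matrix: measurement matrix shared by all signals.
    initial: preset initial sparse coefficients for the first signal.
    solve(y, matrix, prior): returns sparse coefficients for measurements y.
    inverse_transform(coeffs): maps sparse coefficients back to samples.
    """
    signals = []
    prior = initial
    for indices in compressed:
        # Inverse quantization: recover the measurement data.
        measurements = [q * quant_step for q in indices]
        # Reconstruction: solve for sparse coefficients, seeded by the prior.
        coeffs = solve(measurements, matrix, prior)
        # Inverse sparse transform: return to the signal domain.
        signals.append(inverse_transform(coeffs))
        # Joint reconstruction: this result becomes the next signal's prior.
        prior = coeffs
    return signals
```

Keeping the solver and the transform as parameters leaves the choice of reconstruction algorithm and sparse basis open, since the sketch does not assume either.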
Optionally, the at least two audio signals include a first audio signal and a second audio signal, and jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix includes:
calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix.
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals captured in the same time period. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining, according to a first amplitude in the sparse transform coefficients corresponding to the first audio signal, a second amplitude in the prior sparse transform coefficients corresponding to the second audio signal, where the ratio of the first amplitude to the second amplitude is the ratio of the logarithm of the distance from the microphone corresponding to the first audio signal to the sound source to the logarithm of the distance from the microphone corresponding to the second audio signal to the sound source; and
using the second amplitude as a prior for the amplitude in the sparse transform coefficients corresponding to the second audio signal, and calculating the amplitude in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals captured in the same time period. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining, according to a first phase in the sparse transform coefficients corresponding to the first audio signal, a second phase in the prior sparse transform coefficients corresponding to the second audio signal, where the ratio of the first phase to the second phase is the ratio of the distance from the microphone corresponding to the first audio signal to the sound source to the distance from the microphone corresponding to the second audio signal to the sound source; and
using the second phase as a prior for the phase in the sparse transform coefficients corresponding to the second audio signal, and calculating the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal corresponds to a first channel, the second audio signal corresponds to a second channel, and the first audio signal and the second audio signal are audio signals captured in the same time period. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining a first frequency in the sparse transform coefficients corresponding to the first audio signal as a second frequency in the prior sparse transform coefficients corresponding to the second audio signal; and
using the second frequency as a prior for the frequency in the sparse transform coefficients corresponding to the second audio signal, and calculating the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining, according to a first amplitude in the sparse transform coefficients corresponding to the first audio signal, a second amplitude in the prior sparse transform coefficients corresponding to the second audio signal, where the amplitudes in the sparse transform coefficients corresponding to audio signals of the same channel in different time periods have a linear relationship with the sequence numbers of the frames corresponding to those audio signals; and
using the second amplitude as a prior for the amplitude in the sparse transform coefficients corresponding to the second audio signal, and calculating the amplitude in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining a first phase in the sparse transform coefficients corresponding to the first audio signal as a second phase in the prior sparse transform coefficients corresponding to the second audio signal; and
using the second phase as a prior for the phase in the sparse transform coefficients corresponding to the second audio signal, and calculating the phase in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal correspond to the same channel, and the first audio signal and the second audio signal are audio signals captured in different time periods. Calculating the sparse transform coefficients corresponding to the second audio signal according to the sparse transform coefficients corresponding to the first audio signal, the measurement data corresponding to the second audio signal, and the measurement matrix includes:
determining, according to a first frequency in the sparse transform coefficients corresponding to the first audio signal, a second frequency in the prior sparse transform coefficients corresponding to the second audio signal, where the first frequency and the second frequency have an intersection, and the frequencies in the intersection are obtained by randomly selecting from the frequencies in the first frequency; and
using the second frequency as a prior for the frequency in the sparse transform coefficients corresponding to the second audio signal, and calculating the frequency in the sparse transform coefficients corresponding to the second audio signal according to the measurement data corresponding to the second audio signal and the measurement matrix.
Optionally, the first audio signal and the second audio signal are audio signals captured in adjacent time periods.
Optionally, the at least two audio signals include a third audio signal, and jointly reconstructing the sparse transform coefficients corresponding to the at least two audio signals according to the measurement data corresponding to the at least two audio signals and the measurement matrix includes:
calculating the sparse transform coefficients corresponding to the third audio signal according to preset initial sparse transform coefficients, the measurement data corresponding to the third audio signal, and the measurement matrix.
Specifically, the terminal described in the embodiments of the present invention may be used to implement part or all of the procedures in the audio signal reconstruction method embodiments described with reference to FIG. 2 to FIG. 4.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of the different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless otherwise expressly and specifically defined.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random-access memory, a read-only memory, an erasable programmable read-only memory, an optical fiber device, and a portable compact disc read-only memory. In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array, a field-programmable gate array, and so on.
In addition, the modules in the embodiments of the present invention may be implemented in the form of hardware or in the form of software functional modules. If an integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.
Claims (22)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610877571.2A CN107886960B (en) | 2016-09-30 | 2016-09-30 | A kind of audio signal reconstruction method and device |
PCT/CN2017/103534 WO2018059409A1 (en) | 2016-09-30 | 2017-09-26 | Audio signal reconstruction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610877571.2A CN107886960B (en) | 2016-09-30 | 2016-09-30 | A kind of audio signal reconstruction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107886960A (en) | 2018-04-06 |
CN107886960B (en) | 2020-12-01 |
Family
ID=61763647
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610877571.2A Active CN107886960B (en) | 2016-09-30 | 2016-09-30 | A kind of audio signal reconstruction method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107886960B (en) |
WO (1) | WO2018059409A1 (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101281749A (en) * | 2008-05-22 | 2008-10-08 | 上海交通大学 | Scalable Speech and Tone Joint Coding Apparatus and Decoding Apparatus |
CN102138177A (en) * | 2008-07-30 | 2011-07-27 | 法国电信 | Reconstruction of multi-channel audio data |
WO2012023864A1 (en) * | 2010-08-20 | 2012-02-23 | Industrial Research Limited | Surround sound system |
US8489403B1 (en) * | 2010-08-25 | 2013-07-16 | Foundation For Research and Technology—Institute of Computer Science ‘FORTH-ICS’ | Apparatuses, methods and systems for sparse sinusoidal audio processing and transmission |
WO2014046916A1 (en) * | 2012-09-21 | 2014-03-27 | Dolby Laboratories Licensing Corporation | Layered approach to spatial audio coding |
US20140310010A1 (en) * | 2011-11-14 | 2014-10-16 | Electronics And Telecommunications Research Institute | Apparatus for encoding and apparatus for decoding supporting scalable multichannel audio signal, and method for apparatuses performing same |
CN104934038A (en) * | 2015-06-09 | 2015-09-23 | 天津大学 | Spatial audio encoding-decoding method based on sparse expression |
CN105229729A (en) * | 2013-05-24 | 2016-01-06 | 杜比国际公司 | Audio Encoders and Decoders |
CN105659320A (en) * | 2013-10-21 | 2016-06-08 | 杜比国际公司 | Audio encoder and decoder |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6006179A (en) * | 1997-10-28 | 1999-12-21 | America Online, Inc. | Audio codec using adaptive sparse vector quantization with subband vector classification |
SG149871A1 (en) * | 2004-03-01 | 2009-02-27 | Dolby Lab Licensing Corp | Multichannel audio coding |
US8190425B2 (en) * | 2006-01-20 | 2012-05-29 | Microsoft Corporation | Complex cross-correlation parameters for multi-channel audio |
CN101155440A (en) * | 2007-09-17 | 2008-04-02 | 昊迪移通(北京)技术有限公司 | Three-dimensional around sound effect technology aiming at double-track audio signal |
CN102080001A (en) * | 2009-12-01 | 2011-06-01 | 济南开发区星火科学技术研究所 | Rare earth naphthenate-containing octane number improver for gasoline |
CN103152298B (en) * | 2013-03-01 | 2015-07-22 | 哈尔滨工业大学 | Blind signal reconstruction method based on distribution-type compressed sensing system |
CN105393304B (en) * | 2013-05-24 | 2019-05-28 | 杜比国际公司 | Audio encoding and decoding methods, media, and audio encoders and decoders |
EP2830334A1 (en) * | 2013-07-22 | 2015-01-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Multi-channel audio decoder, multi-channel audio encoder, methods, computer program and encoded audio representation using a decorrelation of rendered audio signals |
CN103745727A (en) * | 2013-12-25 | 2014-04-23 | 南京邮电大学 | Compressed sensing method of noise-containing voice signal |
CN103944578B (en) * | 2014-03-28 | 2017-08-22 | 电子科技大学 | A kind of reconstructing method of multi signal |
EP2980795A1 (en) * | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoding and decoding using a frequency domain processor, a time domain processor and a cross processor for initialization of the time domain processor |
CN105828266A (en) * | 2016-03-11 | 2016-08-03 | 苏州奇梦者网络科技有限公司 | Signal processing method and system for microphone array |
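Non-Patent Citations (2)
Title |
---|
Single-channel and multi-channel sinusoidal audio coding using compressed sensing; Anthony Griffin et al.; IEEE Transactions on Audio, Speech, and Language Processing; 2010-11-09; full text * |
Research on compressed sensing algorithms and their audio applications (压缩传感算法研究与音频应用); Zhang Wen (张雯); China Master's Theses Full-text Database, Information Science and Technology; 2012-08-15; full text * |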
Also Published As
Publication number | Publication date |
---|---|
WO2018059409A1 (en) | 2018-04-05 |
CN107886960A (en) | 2018-04-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||