CN111837119A

CN111837119A - A Sound Signal Separation Method Based on Semi-Non-Negative Matrix Decomposition

Info

Publication number: CN111837119A
Application number: CN201980012799.7A
Authority: CN
Inventors: 韩威; 周松斌; 刘伟鑫; 李昌; 刘忆森; 邱泽帆
Original assignee: Guangdong Institute of Intelligent Manufacturing
Current assignee: Guangdong Institute of Intelligent Manufacturing
Priority date: 2019-05-09
Filing date: 2019-05-09
Publication date: 2020-10-27
Anticipated expiration: 2039-05-09
Also published as: CN111837119B; WO2020223952A1

Abstract

A sound signal separation method based on semi-non-negative matrix decomposition, comprising: calculating the Fourier transform of a single-channel mixed sound signal and calculating its spectrum (S1); performing semi-non-negative matrix decomposition on the spectrum of the mixed signal (S2); obtaining an initial spectrum of each source signal from the spectrum of the mixed signal (S3); performing semi-non-negative matrix decomposition on the initial spectrum of each source signal (S4); obtaining the spectrum of each source signal from the spectrum of the mixed signal, the feature matrix and coefficient matrix of the mixed-signal spectrum, and the feature matrix and coefficient matrix of each source signal (S5); obtaining the Fourier transform of each source signal from the Fourier transform of the mixed signal and the spectrum of that source signal (S6); and performing an inverse Fourier transform on the Fourier transform of each source signal to recover the source signal, thereby separating it from the single-channel mixed sound signal (S7). The method can separate single-channel mixed sound signals formed from source sound signals that overlap in the frequency domain and are not necessarily independent of one another.

Description

A Sound Signal Separation Method Based on Semi-Non-Negative Matrix Decomposition

Technical Field

The invention relates to the technical field of single-channel mixed sound signal separation, and in particular to a sound signal separation method based on semi-non-negative matrix decomposition.

Background

Acoustic technology has been widely studied and applied in medical diagnosis, product quality inspection, equipment condition monitoring, mechanical performance testing, acoustic event classification and other fields. Because the sound field of the application environment can be complex, the collected sound signal is often a mixture of the target sound and environmental noise. The target sound therefore usually has to be extracted from the mixed sound signal before any further signal processing and analysis. In addition, constraints such as size, equipment cost and installation may mean that only one sound sensor can be installed, or that installing only one is preferable, so the target sound signal has to be extracted from the single-channel mixed sound signal collected by a single microphone. Single-channel mixed sound signal separation is a common way of solving this task.

The Chinese invention patent "Underdetermined Sound Signal Separation Method and Device Based on Hilbert Transform", filed by the Third Design and Research Institute of the Machinery Industry, proposes using the Hilbert transform to separate underdetermined sound signals; it does not state that it targets single-channel mixed sound signals, and the method is not applicable when the sources overlap in the frequency domain. The Chinese invention patent "Multi-Vehicle Acoustic Signal Separation Method Based on Particle Filtering in Wireless Sensor Networks", filed by the Microsystems Institute Branch of the Jiaxing Center of the Chinese Academy of Sciences, and the Chinese invention patent "Power Equipment Fault Sound Detection Method Based on the Joint Approximate Diagonalization Blind Source Separation Algorithm", filed by Shandong University, both separate multi-channel mixed sound signals collected by a microphone array.

The paper "Blind Source Separation Method for Single-Channel Signals Based on Variational Mode Decomposition" by Wang Kang et al. of Anhui University of Science and Technology proposes a blind source separation method for single-channel signals: variational mode decomposition is first used to raise the dimension of the single-channel observed signal and to estimate the number of source signals, and blind source separation is then performed. The method involves a dimension-raising operation (that is, mapping the single-channel signal to a multi-channel signal) and is not applicable when the sources overlap in the frequency domain.

The paper "Low-Complexity Blind Separation Algorithm for Single-Channel Co-Frequency Mixed Signals Based on SIC" by Guo Yiming et al. of the PLA Information Engineering University uses oversampling to construct multi-channel conditions and then a channel matrix, and applies a successive interference cancellation algorithm to blindly separate single-channel co-frequency mixed signals. This method also maps the single-channel signal to a multi-channel signal before separation, is strongly affected by the delay difference and by the oversampling factor of the received signal, and has demodulation blind zones.

The paper "Blind Separation Algorithm for Single-Channel Communication Signals" by Yang Hailan et al. of Jiangnan University presents a blind source separation algorithm for single-channel communication signals based on the Hilbert-Huang transform and independent component analysis. This method is not applicable when the sources overlap in the frequency domain.

The paper "Blind Source Separation of Single-Channel Mechanical Signals Based on Shift-Invariant Sparse Coding" by Zhu Huijie et al. of the PLA University of Science and Technology proposes a single-channel blind source separation method for mechanical signals whose features recur. The algorithm treats each source signal as a convolution of several bases with coefficients, and adaptively learns matching bases and sparse coefficients from the signal's own features according to its statistical distribution. This method is aimed specifically at mechanical signals with recurring features.

Summary of the Invention

In view of the above problems, a sound signal separation method based on semi-non-negative matrix decomposition is proposed. The method first obtains an initial estimated spectrum of each source signal from the spectrum of the single-channel mixed sound signal according to the main frequency band of that source. It then uses semi-non-negative matrix decomposition to decompose each initial estimated spectrum into an initial feature matrix and an initial coefficient matrix, and uses the correlation between coefficient matrices to obtain the final feature matrix and coefficient matrix of each source signal, which give the spectrum of that source. An inverse Fourier transform then recovers the source signal, which completes the separation of the single-channel mixed sound signal.

To achieve the above object, the present invention adopts the following technical solution:

A sound signal separation method based on semi-non-negative matrix decomposition, comprising the following steps:

S1. A single-channel mixed sound signal S is formed by mixing several independent sound signals S¹, S², …, Sⁿ; calculate the Fourier transform result F of S, and calculate the spectrum X from F;

S2. Perform semi-non-negative matrix decomposition on X to obtain a feature matrix W and a coefficient matrix H;

S3. From X, obtain the initial estimated spectrum of each of the sound signals S¹, S², …, Sⁿ;

S4. Perform semi-non-negative matrix decomposition on each initial estimated spectrum to obtain its corresponding feature matrix and coefficient matrix;

S5. From X, W and H, together with the feature matrices and coefficient matrices obtained in S4, obtain the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ;

S6. From the Fourier transform result F and the spectra X¹, X², …, Xⁿ, obtain the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ;

S7. Perform an inverse Fourier transform on each of F¹, F², …, Fⁿ to obtain the sound signals S¹, S², …, Sⁿ, thereby separating the independent sound signals S¹, S², …, Sⁿ from the single-channel mixed sound signal S.
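As a rough illustration of step S1, the sketch below computes a short-time Fourier transform F of a single-channel signal and takes X = |F| as the spectrum. The frame length, hop size, Hann window and the use of the magnitude are all assumptions made for this example; the text only states that X is calculated from F. The function name `stft` and the synthetic two-tone signal are likewise hypothetical.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Return a complex (frame_len // 2 + 1, n_frames) matrix F.

    Framing and windowing choices here are assumptions, not taken from the text.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # One-sided FFT of every windowed frame (rows = frequency bins, columns = frames).
    return np.fft.rfft(frames, axis=0)

fs = 2000                                   # sampling rate in Hz
t = np.arange(5 * fs) / fs                  # 5 seconds of samples
S = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)  # toy mixture
F = stft(S)                                 # Fourier transform result (p x q)
X = np.abs(F)                               # spectrum, assumed to be the magnitude
```

Keeping the complex matrix F alongside X matters for the later steps, since S6 reuses the elements of F when building the Fourier transform of each source.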

Further, the single-channel mixed sound signal in S1 is a mixed sound signal collected by a single sound collector.

Further, the single-channel mixed sound signal S in S1 is formed by mixing several independent sound signals S¹, S², …, Sⁿ, where n is 2 or 3.

Further, the semi-non-negative matrix decomposition of X in S2, which yields the feature matrix W and the coefficient matrix H, proceeds as follows:

S21. Construct the objective function Γ of the semi-non-negative matrix decomposition according to formula (1);

S22. Initialize the coefficient matrix H, with every element set to a random number in the interval (0, 1);

S23. Calculate the initial value of the feature matrix W as

$W = XH(H^{T}H)^{-1}$ (2)

S24. Update the feature matrix W and the coefficient matrix H by alternating iteration: update W once, then update H once, and repeat in this order. The elements of W are updated with the formula $W = XH(H^{T}H)^{-1}$ (3), and the elements of H are updated with formula (4);

S25. Set a minimum value Γ_min for the objective function Γ and a maximum number of iterations E_max. After each iteration, calculate the value of Γ; when Γ is less than Γ_min or the number of iterations reaches E_max, stop iterating and take the current W and H as the final feature matrix and coefficient matrix;

In formulas (1), (2), (3) and (4), ‖·‖_F denotes the Frobenius norm of a matrix; X is the spectrum; W is the feature matrix; H is the coefficient matrix; H^T is the transpose of H; (H^T H)^-1 is the inverse of H^T H; X^T is the transpose of X; W^T is the transpose of W; (X^T W)⁺ denotes the positive-valued elements of X^T W; (X^T W)⁻ denotes the negative-valued elements of X^T W; (W^T W)⁺ denotes the positive-valued elements of W^T W; and (W^T W)⁻ denotes the negative-valued elements of W^T W.
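Steps S21 to S25 can be sketched as below. Formulas (1) and (4) are not reproduced in this text, so the sketch assumes the widely used semi-NMF formulation: the objective is the squared Frobenius norm of X − WHᵀ, and H is updated with a multiplicative rule built from the positive and negative parts of XᵀW and WᵀW. The function name `semi_nmf`, the use of a pseudo-inverse in place of the plain inverse, and the small constant `eps` are assumptions added for numerical safety.

```python
import numpy as np

def semi_nmf(X, c, gamma_min=1e-4, e_max=500, seed=0):
    """Factor X (p x M) as W @ H.T, with H (M x c) kept non-negative."""
    rng = np.random.default_rng(seed)
    eps = 1e-12                                   # assumption: avoids division by zero

    def pos(A):                                   # positive-valued elements of A
        return (np.abs(A) + A) / 2

    def neg(A):                                   # magnitudes of the negative-valued elements
        return (np.abs(A) - A) / 2

    H = rng.uniform(0.0, 1.0, size=(X.shape[1], c))      # S22: random initial H
    W = X @ H @ np.linalg.pinv(H.T @ H)                   # S23: formula (2)
    gamma = np.inf
    for _ in range(e_max):                                # S25: iteration cap E_max
        W = X @ H @ np.linalg.pinv(H.T @ H)               # S24: formula (3)
        XtW, WtW = X.T @ W, W.T @ W
        num = pos(XtW) + H @ neg(WtW)
        den = neg(XtW) + H @ pos(WtW)
        H = H * np.sqrt((num + eps) / (den + eps))        # assumed form of formula (4)
        gamma = np.linalg.norm(X - W @ H.T, "fro") ** 2   # assumed form of formula (1)
        if gamma < gamma_min:                             # S25: threshold on the objective
            break
    return W, H, gamma
```

Because the H update only multiplies non-negative entries by a non-negative factor, H stays non-negative throughout, while W is free to take either sign, which is what distinguishes semi-NMF from ordinary NMF.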

Further, obtaining from X the initial estimated spectrum of each of the sound signals S¹, S², …, Sⁿ in S3 proceeds as follows:

S31. The main frequency bands of the sound signals S¹, S², …, Sⁿ are f₁, f₂, …, f_n respectively;

S32. Take the parts of X corresponding to the frequency bands f₁, f₂, …, f_n as the initial estimated spectra of the sound signals S¹, S², …, Sⁿ respectively.
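One possible reading of S31 and S32 is sketched below: the rows of X whose bin frequency falls inside a source's main band are kept and all other rows are set to zero, so every initial estimated spectrum keeps the shape of X. Zeroing rather than slicing is an assumption, as are the function name `initial_spectrum` and the 256-point frame; the 2000 Hz sampling rate and the two bands follow the heart and lung sound experiment described in this document.

```python
import numpy as np

def initial_spectrum(X, freqs, band):
    """Keep the rows of X whose frequency lies in band = (lo, hi); zero the rest."""
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return X * mask[:, None]

freqs = np.fft.rfftfreq(256, d=1 / 2000)      # bin frequencies of a 256-point frame
X = np.ones((freqs.size, 4))                  # placeholder spectrum, 4 frames
X_heart0 = initial_spectrum(X, freqs, (0, 100))        # heart sound: <= 100 Hz
X_lung0 = initial_spectrum(X, freqs, (300, np.inf))    # lung sound: >= 300 Hz
```

Bins between the two bands are assigned to neither initial estimate, which is acceptable here because these spectra only seed the decomposition in S4.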

Further, the semi-non-negative matrix decomposition of each initial estimated spectrum in S4, which yields the corresponding feature matrix and coefficient matrix, proceeds as follows:

S41. Construct the objective function Γⁱ (i = 1, 2, …, n) of the semi-non-negative matrix decomposition according to formula (5);

S42. Initialize the coefficient matrix of the i-th initial estimated spectrum, with every element set to a random number in the interval (0, 1);

S43. Calculate the initial value of the corresponding feature matrix according to formula (6);

S44. Update the feature matrix and the coefficient matrix by alternating iteration: update the feature matrix once, then update the coefficient matrix once, and repeat in this order. The elements of the feature matrix are updated with formula (7), and the elements of the coefficient matrix are updated with formula (8);

S45. Set a minimum value for the objective function Γⁱ (i = 1, 2, …, n) and a maximum number of iterations. After each iteration, calculate the value of Γⁱ; when Γⁱ is less than its minimum value or the number of iterations reaches the maximum, stop iterating and take the current matrices as the final feature matrix and coefficient matrix;

In formulas (5), (6), (7) and (8), the symbols play the same roles for the i-th sound signal as the symbols of formulas (1) to (4) do for the mixed signal. They denote, in order: the Frobenius norm of a matrix; the initial estimated spectrum; its feature matrix; its coefficient matrix; the transpose of the coefficient matrix; the inverse of the product of the transposed coefficient matrix and the coefficient matrix; the transpose of the initial estimated spectrum; the transpose of the feature matrix; and the positive-valued and negative-valued elements of the two matrix products that correspond to X^T W and W^T W.

Further, obtaining the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ in S5 from X, W and H, together with the feature matrices and coefficient matrices obtained in S4, proceeds as follows:

S51. Write H = [h₁; h₂; …; h_c], where each h_k (k = 1, 2, …, c) is a row vector of dimension M; likewise, each row of a coefficient matrix obtained in S4 is a row vector of dimension M;

S52. According to formula (9), calculate the correlation coefficient between one row of the coefficient matrix obtained in S4 for a sound signal and h_k. Repeating this for every row of that coefficient matrix gives the correlation coefficient of each row with h_k; the largest of these values is recorded as the maximum correlation coefficient with h_k;

S53. Following step S52, calculate the maximum correlation coefficient between the rows of that coefficient matrix and each row of H, which gives one value per row of H;

S54. Following step S53, do the same for the coefficient matrix of each sound signal in turn, and record the results as Q = [q¹; q²; …; qⁿ], where each qⁱ (i = 1, 2, …, n) is a row vector of dimension c, so that Q is a matrix with n rows and c columns;

S55. The spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ have feature matrices W¹, W², …, Wⁿ and coefficient matrices H¹, H², …, Hⁿ;

S56. If, in the k-th column of Q (k = 1, 2, …, c), the value in the i-th row (i = 1, 2, …, n) is the largest, then the k-th row of the coefficient matrix H, namely h_k, is assigned to Hⁱ, and the k-th column of the feature matrix W = [w₁, w₂, …, w_c], namely w_k, is assigned to Wⁱ;

S57. Find the maximum of every column of Q and apply step S56 to each column, which gives the feature matrices W¹, W², …, Wⁿ and the coefficient matrices H¹, H², …, Hⁿ;

S58. Calculate the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ according to

$X^{i} = W^{i}H^{iT}$ (i = 1, 2, …, n) (10)

In formulas (9) and (10), h_k^T is the transpose of h_k; H^{iT} is the transpose of Hⁱ; Xⁱ (i = 1, 2, …, n) denotes the spectra X¹, X², …, Xⁿ; Wⁱ (i = 1, 2, …, n) denotes the feature matrices W¹, W², …, Wⁿ; and Hⁱ (i = 1, 2, …, n) denotes the coefficient matrices H¹, H², …, Hⁿ.
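The assignment logic of S52 to S58 can be sketched as follows. Formula (9) is not reproduced in this text, so the Pearson correlation from `np.corrcoef` stands in for it as an assumption. The sketch stores coefficient vectors as rows of length M, as in S51, so the product in the last step is written `W_i @ H_i_rows`; this corresponds to formula (10) when Hⁱ is stored with those vectors as columns. The function name `assign_components` is hypothetical.

```python
import numpy as np

def assign_components(W, H_rows, H0_rows_list):
    """Assign each component k of (W, H_rows) to one source.

    W            : (p, c) feature matrix of the mixed spectrum
    H_rows       : (c, M) coefficient rows h_1 ... h_c of the mixed spectrum
    H0_rows_list : list of n arrays, the coefficient rows from S4 for each source
    """
    c, n = H_rows.shape[0], len(H0_rows_list)
    Q = np.zeros((n, c))
    for i, H0 in enumerate(H0_rows_list):
        for k in range(c):
            # S52/S53: best correlation between any row of source i and h_k.
            # Pearson correlation is an assumed stand-in for formula (9).
            Q[i, k] = max(np.corrcoef(h0, H_rows[k])[0, 1] for h0 in H0)
    owner = Q.argmax(axis=0)                    # S56/S57: winning source per column of Q
    spectra = [W[:, owner == i] @ H_rows[owner == i] for i in range(n)]  # S58
    return Q, owner, spectra

# Toy usage with two sources and three components.
t = np.linspace(0, 1, 50)
a, b = np.sin(2 * np.pi * t), t
H_rows = np.stack([a, b, 2 * a])
W = np.eye(3)
Q, owner, spectra = assign_components(W, H_rows, [a[None, :], b[None, :]])
```

Since every component is assigned to exactly one source, the per-source spectra in this sketch add up to the full product of W and the coefficient rows.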

Further, obtaining the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ in S6 from the Fourier transform result F and the spectra X¹, X², …, Xⁿ proceeds as follows:

S61. Let Fⁱ (i = 1, 2, …, n) denote the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ; each Fⁱ is a matrix with p rows and q columns and is initialized as a zero matrix;

S62. F = {F_rk}_p×q is a matrix with p rows and q columns; let Xⁱ (i = 1, 2, …, n) denote the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ; each Xⁱ is a matrix with p rows and q columns;

S63. Compare the elements in row r (r = 1, 2, …, p) and column k (k = 1, 2, …, q) of X¹, X², …, Xⁿ; if the element of Xⁱ (i = 1, 2, …, n) is the largest, then the element of Fⁱ in row r and column k is taken from the corresponding element F_rk of F;

S64. Apply step S63 to every element position of X¹, X², …, Xⁿ, which yields F¹, F², …, Fⁿ, the Fourier transform results of the sound signals S¹, S², …, Sⁿ.
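Steps S61 to S64 amount to a per-bin winner-takes-all mask, which the sketch below illustrates. It assumes that the winning source simply receives the element F_rk unchanged and every other source keeps zero at that position; the function name `split_fourier` is hypothetical.

```python
import numpy as np

def split_fourier(F, spectra):
    """Give each element F[r, k] to the source whose spectrum is largest there."""
    winner = np.argmax(np.stack(spectra), axis=0)      # (p, q) index of the largest X^i
    # S61: each F^i starts from zero; S63/S64: copy F[r, k] where source i wins.
    return [np.where(winner == i, F, 0) for i in range(len(spectra))]

F = np.array([[1 + 1j, 2], [3, 4j]])
X1 = np.array([[5.0, 0.0], [0.0, 5.0]])
X2 = np.ones((2, 2))
F1, F2 = split_fourier(F, [X1, X2])
```

Under this reading each element of F goes to exactly one source, so F¹ + F² + … + Fⁿ reproduces F, and the inverse transform in S7 can be applied to each Fⁱ independently.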

The beneficial effects of the present invention are as follows:

The invention can separate single-channel mixed sound signals formed by mixing source sound signals that overlap in the frequency domain and are not necessarily independent of one another.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described here show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of the sound signal separation method based on semi-non-negative matrix decomposition according to the invention;

Fig. 2 shows the original heart sound signal;

Fig. 3 shows the original lung sound signal;

Fig. 4 shows the mixture of the original heart sound signal and the original lung sound signal;

Fig. 5 compares the separated signals with the original signals.

Detailed Description

To make the objectives, technical solutions and advantages of the invention clearer, the technical solutions are described clearly and completely below with reference to an embodiment. The described embodiment is only one of the possible embodiments of the invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the invention without creative effort fall within the scope of protection of the invention.

Embodiment

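As shown in Fig. 1, the invention provides a sound signal separation method based on semi-non-negative matrix decomposition, comprising the following steps:

S1. A single-channel mixed sound signal S is formed by mixing several independent sound signals S¹, S², …, Sⁿ; calculate the Fourier transform result F of S, and calculate the spectrum X from F;

S2. Perform semi-non-negative matrix decomposition on X to obtain a feature matrix W and a coefficient matrix H;

S3. From X, obtain the initial estimated spectrum of each of the sound signals S¹, S², …, Sⁿ;

S4. Perform semi-non-negative matrix decomposition on each initial estimated spectrum to obtain its corresponding feature matrix and coefficient matrix;

S5. From X, W and H, together with the feature matrices and coefficient matrices obtained in S4, obtain the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ;

S6. From the Fourier transform result F and the spectra X¹, X², …, Xⁿ, obtain the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ;

S7. Perform an inverse Fourier transform on each of F¹, F², …, Fⁿ to obtain the sound signals S¹, S², …, Sⁿ, thereby separating them from the single-channel mixed sound signal S.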

In this embodiment, the single-channel mixed sound signal in S1 is a mixed sound signal collected by a single sound collector.

In this embodiment, the single-channel mixed sound signal S in S1 is formed by mixing several independent sound signals S¹, S², …, Sⁿ, where n is generally 2 or 3.

In this embodiment, the semi-non-negative matrix decomposition of X in S2, which yields the feature matrix W and the coefficient matrix H, proceeds as follows:

S21. Construct the objective function Γ of the semi-non-negative matrix decomposition according to formula (1);

S22. Initialize the coefficient matrix H, with every element set to a random number in the interval (0, 1);

S23. Calculate the initial value of the feature matrix W as

$W = XH(H^{T}H)^{-1}$ (2)

S24. Update W and H by alternating iteration: update W once, then update H once, and repeat in this order. The elements of W are updated with the formula $W = XH(H^{T}H)^{-1}$ (3), and the elements of H are updated with formula (4);

S25. Set a minimum value Γ_min for the objective function Γ and a maximum number of iterations E_max. After each iteration, calculate the value of Γ; when Γ is less than Γ_min or the number of iterations reaches E_max, stop iterating and take the current W and H as the final feature matrix and coefficient matrix.

In formulas (1), (2), (3) and (4), the symbols have the meanings given in the Summary of the Invention.

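In this embodiment, obtaining from X the initial estimated spectrum of each of the sound signals S¹, S², …, Sⁿ in S3 proceeds as follows:

S31. The main frequency bands of the sound signals S¹, S², …, Sⁿ are f₁, f₂, …, f_n respectively;

S32. Take the parts of X corresponding to the frequency bands f₁, f₂, …, f_n as the initial estimated spectra of the sound signals S¹, S², …, Sⁿ respectively.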

In this embodiment, the semi-non-negative matrix decomposition of each initial estimated spectrum in S4, which yields the corresponding feature matrix and coefficient matrix, proceeds as follows:

S41. Construct the objective function Γⁱ (i = 1, 2, …, n) of the semi-non-negative matrix decomposition according to formula (5);

S42. Initialize the coefficient matrix of the i-th initial estimated spectrum, with every element set to a random number in the interval (0, 1);

S43. Calculate the initial value of the corresponding feature matrix according to formula (6);

S44. Update the feature matrix and the coefficient matrix by alternating iteration: update the feature matrix once, then update the coefficient matrix once, and repeat in this order. The elements of the feature matrix are updated with formula (7), and the elements of the coefficient matrix are updated with formula (8);

S45. Set a minimum value for the objective function Γⁱ (i = 1, 2, …, n) and a maximum number of iterations. After each iteration, calculate the value of Γⁱ; when Γⁱ is less than its minimum value or the number of iterations reaches the maximum, stop iterating and take the current matrices as the final feature matrix and coefficient matrix.

In formulas (5), (6), (7) and (8), the symbols play the same roles for the i-th sound signal as the symbols of formulas (1) to (4) do for the mixed signal: the Frobenius norm of a matrix; the initial estimated spectrum; its feature matrix and coefficient matrix; their transposes; the inverse of the product of the transposed coefficient matrix and the coefficient matrix; and the positive-valued and negative-valued elements of the two matrix products that correspond to X^T W and W^T W.

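In this embodiment, obtaining the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ in S5 from X, W and H, together with the feature matrices and coefficient matrices obtained in S4, proceeds as follows:

S51. Write H = [h₁; h₂; …; h_c], where each h_k (k = 1, 2, …, c) is a row vector of dimension M; likewise, each row of a coefficient matrix obtained in S4 is a row vector of dimension M;

S52. According to formula (9), calculate the correlation coefficient between one row of the coefficient matrix obtained in S4 for a sound signal and h_k. Repeating this for every row of that coefficient matrix gives the correlation coefficient of each row with h_k; the largest of these values is recorded as the maximum correlation coefficient with h_k;

S53. Following step S52, calculate the maximum correlation coefficient between the rows of that coefficient matrix and each row of H;

S54. Following step S53, do the same for the coefficient matrix of each sound signal in turn, and record the results as Q = [q¹; q²; …; qⁿ], where each qⁱ (i = 1, 2, …, n) is a row vector of dimension c, so that Q is a matrix with n rows and c columns;

S55. The spectra X¹, X², …, Xⁿ have feature matrices W¹, W², …, Wⁿ and coefficient matrices H¹, H², …, Hⁿ;

S56. If, in the k-th column of Q (k = 1, 2, …, c), the value in the i-th row (i = 1, 2, …, n) is the largest, then h_k, the k-th row of H, is assigned to Hⁱ, and w_k, the k-th column of W, is assigned to Wⁱ;

S57. Find the maximum of every column of Q and apply step S56 to each column, which gives W¹, W², …, Wⁿ and H¹, H², …, Hⁿ;

S58. Calculate the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ according to the formula $X^{i} = W^{i}H^{iT}$ (i = 1, 2, …, n) (10).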

In this embodiment, obtaining the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ in S6 from the Fourier transform result F and the spectra X¹, X², …, Xⁿ proceeds as follows:

S61. Let Fⁱ (i = 1, 2, …, n) denote the Fourier transform results F¹, F², …, Fⁿ; each Fⁱ is a matrix with p rows and q columns and is initialized as a zero matrix (every element of the matrix is 0);

S62. F = {F_rk}_p×q is a matrix with p rows and q columns; each spectrum Xⁱ (i = 1, 2, …, n) is a matrix with p rows and q columns;

S63. Compare the elements in row r (r = 1, 2, …, p) and column k (k = 1, 2, …, q) of X¹, X², …, Xⁿ; if the element of Xⁱ is the largest, then the element of Fⁱ in row r and column k is taken from the corresponding element F_rk of F;

S64. Apply step S63 to every element position of X¹, X², …, Xⁿ, which yields F¹, F², …, Fⁿ.

In this embodiment, the effect of the invention is further illustrated by the following experiment:

1) Experimental data

The experimental data are a heart sound signal and a lung sound signal that are publicly available online. Both signals are sampled at 2000 Hz and last 5 seconds. Mixing the two gives the single-channel mixed sound signal. The main frequency band of the heart sound signal is known to be ≤ 100 Hz, and the main frequency band of the lung sound signal is known to be ≥ 300 Hz. The original heart sound signal is shown in Fig. 2, the original lung sound signal in Fig. 3, and their mixture in Fig. 4.

2) Experimental conditions

The experimental program was written in MATLAB 9.2.0. The minimum value Γ_min of the objective function Γ is set to 10⁻⁴, and the minimum value of each objective function Γⁱ (i = 1, 2, …, n) is also set to 10⁻⁴. The maximum number of iterations E_max is set to 500, and the maximum number of iterations for the decomposition of each initial estimated spectrum is also set to 500.

3) Experimental results

The normalized mean squared error (NMSE) between each separated signal and the corresponding original signal is used to evaluate the separation; the smaller the value, the better. The mixed sound signal is separated with the method of the invention, and the NMSE between each separated signal and its original signal is then calculated according to formula (11). Formula (11) compares the original signal s (in this experiment, the heart sound signal or the lung sound signal) with the corresponding separated signal (the separated heart sound signal or lung sound signal). The separated signals are compared with the original signals in Fig. 5, and the NMSE results are given in Table 1.

Table 1
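The evaluation metric can be computed along the lines of the sketch below. Formula (11) is not reproduced in this text, so the form used here, the squared error energy divided by the energy of the original signal, is an assumption; other NMSE conventions exist. The function name `nmse` and the parameter name `s_hat` for the separated signal are hypothetical.

```python
import numpy as np

def nmse(s, s_hat):
    """Assumed NMSE: sum((s - s_hat)^2) / sum(s^2); smaller is better."""
    s = np.asarray(s, dtype=float)
    s_hat = np.asarray(s_hat, dtype=float)
    return np.sum((s - s_hat) ** 2) / np.sum(s ** 2)
```

With this convention a perfect reconstruction scores 0, and an all-zero estimate scores 1.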

The embodiment described above represents only one implementation of the invention. Its description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. A person of ordinary skill in the art can make modifications and improvements without departing from the concept of the invention, and these all fall within its scope of protection. The scope of protection of this patent is therefore defined by the appended claims.

Claims

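1. A sound signal separation method based on semi-non-negative matrix decomposition, characterized in that the method comprises the following steps:

S1. A single-channel mixed sound signal S is formed by mixing several independent sound signals S¹, S², …, Sⁿ; calculate the Fourier transform result F of S, and calculate the spectrum X from F;

S2. Perform semi-non-negative matrix decomposition on X to obtain a feature matrix W and a coefficient matrix H;

S3. From X, obtain the initial estimated spectrum of each of the sound signals S¹, S², …, Sⁿ;

S4. Perform semi-non-negative matrix decomposition on each initial estimated spectrum to obtain its corresponding feature matrix and coefficient matrix;

S5. From X, W and H, together with the feature matrices and coefficient matrices obtained in S4, obtain the spectra X¹, X², …, Xⁿ of the sound signals S¹, S², …, Sⁿ;

S6. From the Fourier transform result F and the spectra X¹, X², …, Xⁿ, obtain the Fourier transform results F¹, F², …, Fⁿ of the sound signals S¹, S², …, Sⁿ;

S7. Perform an inverse Fourier transform on each of F¹, F², …, Fⁿ to obtain the sound signals S¹, S², …, Sⁿ, thereby separating the independent sound signals S¹, S², …, Sⁿ from the single-channel mixed sound signal S.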

2. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, characterized in that the single-channel mixed sound signal in S1 is a mixed sound signal collected by a single sound collector.

3. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, characterized in that the single-channel mixed sound signal S in S1 is formed by mixing several independent sound signals S¹, S², …, Sⁿ, where n is 2 or 3.

4. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, characterized in that, in S2, performing semi-non-negative matrix decomposition on X to obtain the feature matrix W and the coefficient matrix H is carried out according to the following steps:

S21. Construct the objective function Γ of the semi-non-negative matrix decomposition, given by formula (1);

S22. Initialize the coefficient matrix H, the values of all its elements being random numbers in (0, 1);

S23. Calculate the initial value of the feature matrix W as

W = XH(H^T H)^{-1}    (2)

S24. Alternately update the feature matrix W and the coefficient matrix H: first update W once, then update H once, and repeat this cycle; the formula W = XH(H^T H)^{-1} (3) is used to update the elements of the feature matrix W, and formula (4) is used to update the elements of the coefficient matrix H;

S25. Set the minimum value Γ_min of the objective function Γ and the maximum number of iterations E_max, and calculate the value of Γ after each iterative update. When the value of Γ is less than Γ_min or the number of iterations reaches E_max, stop the iteration and obtain the final feature matrix W and coefficient matrix H;

In formulas (1), (2), (3) and (4),
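The alternating scheme of S21 to S25 can be sketched in NumPy as follows. Formulas (1) and (4) are not reproduced in this text, so the sketch assumes the objective Γ = ‖X − WH^T‖_F² and the multiplicative coefficient update from the semi-NMF literature (Ding, Li and Jordan); these may differ from the patent's own formulas. The name `semi_nmf`, its parameters, the use of a pseudo-inverse in place of the plain inverse, and the small constant `eps` are likewise assumptions made for illustration.

```python
import numpy as np

def semi_nmf(X, c, max_iter=300, gamma_min=1e-10, seed=0):
    """Semi-NMF sketch: X (p x M) ~ W (p x c) @ H.T, with H (M x c) >= 0.
    W is unconstrained in sign; only H is kept non-negative."""
    rng = np.random.default_rng(seed)
    eps = 1e-12                                   # avoids division by zero
    H = rng.uniform(0.0, 1.0, size=(X.shape[1], c))   # S22: random init
    W = X @ H @ np.linalg.pinv(H.T @ H)           # S23: formula (2)
    gamma = np.linalg.norm(X - W @ H.T, "fro") ** 2
    for _ in range(max_iter):                     # S25: at most E_max iterations
        W = X @ H @ np.linalg.pinv(H.T @ H)       # S24: formula (3)
        A = X.T @ W                               # M x c
        B = W.T @ W                               # c x c
        A_pos, A_neg = (np.abs(A) + A) / 2, (np.abs(A) - A) / 2
        B_pos, B_neg = (np.abs(B) + B) / 2, (np.abs(B) - B) / 2
        # Assumed form of formula (4): multiplicative update keeping H >= 0.
        H = H * np.sqrt((A_pos + H @ B_neg) / (A_neg + H @ B_pos + eps))
        gamma = np.linalg.norm(X - W @ H.T, "fro") ** 2   # assumed formula (1)
        if gamma < gamma_min:                     # S25: stop below Γ_min
            break
    return W, H, gamma

# Usage on synthetic data with a mixed-sign feature matrix.
rng = np.random.default_rng(1)
W0 = rng.standard_normal((6, 2))
H0 = rng.uniform(0.0, 1.0, size=(8, 2))
X = W0 @ H0.T
W, H, gamma = semi_nmf(X, c=2)
```

The coefficient matrix is stored here as an M x c array so that the product W H^T in formula (2) is dimensionally consistent.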

5. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, wherein, in S3, obtaining the respective initial estimated spectra of the sound signals S¹, S², ..., Sⁿ according to X is carried out according to the following steps:

S31. The respective main frequency bands of the sound signals S¹, S², ..., Sⁿ are f₁, f₂, ..., fₙ;

6. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, characterized in that, in S4, the semi-non-negative matrix decomposition of each initial estimated spectrum into its feature matrix and coefficient matrix is carried out according to the following steps:

S41. Construct the objective function Γⁱ (i=1,2,...,n) of the semi-non-negative matrix decomposition, given by formula (5);

S42. Initialize the coefficient matrix, the values of all its elements being random numbers in (0, 1);

S43. Calculate the initial value of the feature matrix according to formula (6);

S44. Alternately update the feature matrix and the coefficient matrix: first update the feature matrix once, then update the coefficient matrix once, and repeat this cycle; formula (7) is used to update the elements of the feature matrix, and formula (8) is used to update the elements of the coefficient matrix;

Set the maximum number of iterations; when the value of the objective function falls below its set minimum or the number of iterations reaches the maximum number of iterations, stop the iteration and obtain the final feature matrix and coefficient matrix;

In formulas (5), (6), (7) and (8), ‖·‖_F represents the Frobenius norm of a matrix; the remaining symbols represent the initial estimated spectrum, the feature matrix and the coefficient matrix, the transposes of these matrices, an inverse matrix, and the positive-valued and negative-valued elements of two further matrices.

7. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, characterized in that, in S5, obtaining the respective spectra X¹, X², ..., Xⁿ of the sound signals according to X, W, H and the feature matrices and coefficient matrices obtained in S4 is carried out according to the following steps:

S51. H=[h₁; h₂; ...; h_c], where h_k (k=1,2,...,c) is a row vector of dimension M; each row of the coefficient matrix obtained in S4 for a sound signal is likewise a row vector of dimension M;

S52. According to formula (9), calculate the correlation coefficient between a row of that coefficient matrix and h_k; the correlation coefficient of each of its rows with h_k can thus be calculated, and the maximum of these values is taken as the maximum correlation coefficient;

S53, according to step S52, calculate

S54, according to step S53, calculate respectively

S55. The feature matrices of the respective spectra X¹, X², ..., Xⁿ of the sound signals S¹, S², ..., Sⁿ are W¹, W², ..., Wⁿ, and the coefficient matrices are H¹, H², ..., Hⁿ;

S56. If, in the k-th column (k=1,2,...,c) of Q, the i-th row (i=1,2,...,n) has the largest value, then the k-th row of the coefficient matrix H, namely h_k, is assigned to Hⁱ, and the k-th column of the feature matrix W=[w₁, w₂, ..., w_n], namely w_k, is assigned to Wⁱ;

S57. Calculate the maximum value of each column in Q and proceed according to step S56, thereby obtaining the feature matrices W¹, W², ..., Wⁿ and the coefficient matrices H¹, H², ..., Hⁿ;

S58. According to the formula Xⁱ = Wⁱ (Hⁱ)^T (i=1,2,...,n)    (10), obtain the respective spectra X¹, X², ..., Xⁿ of the sound signals S¹, S², ..., Sⁿ;

In formulas (9) and (10), h_k^T is the transpose of h_k; (Hⁱ)^T is the transpose of Hⁱ; Xⁱ (i=1,2,...,n) represents the spectra X¹, X², ..., Xⁿ; Wⁱ (i=1,2,...,n) represents the feature matrices W¹, W², ..., Wⁿ; Hⁱ (i=1,2,...,n) represents the coefficient matrices H¹, H², ..., Hⁿ.
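The assignment logic of S56 to S58 can be sketched as below. Formula (9) and the definition of Q are not reproduced in this text, so the sketch assumes a normalized inner product as the correlation coefficient and takes Q[i, k] to be the largest correlation between component k of H and any component of the i-th source's coefficient matrix. The names `corr` and `assign_components` are hypothetical, and coefficient vectors are stored as columns of M x c arrays so that Xⁱ = Wⁱ (Hⁱ)^T is dimensionally consistent.

```python
import numpy as np

def corr(a, b):
    """Normalized inner product of two vectors (assumed form of formula (9))."""
    den = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / den) if den > 0 else 0.0

def assign_components(W, H, H_src):
    """W: p x c, H: M x c, H_src[i]: M x c_i coefficient matrix of source i.
    Returns Q (n x c), the owning source of each component, and the spectra."""
    n, c = len(H_src), H.shape[1]
    Q = np.zeros((n, c))
    for i, Hi in enumerate(H_src):
        for k in range(c):
            # Maximum correlation between h_k and any component of source i.
            Q[i, k] = max(corr(Hi[:, j], H[:, k]) for j in range(Hi.shape[1]))
    owner = Q.argmax(axis=0)             # S56/S57: row with the largest value
    spectra = []
    for i in range(n):
        idx = np.flatnonzero(owner == i)
        # S58: X^i = W^i (H^i)^T built from the components assigned to source i.
        spectra.append(W[:, idx] @ H[:, idx].T)
    return Q, owner, spectra

# Usage: two components, each matching one source's coefficient pattern.
H = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
H_src = [np.array([[1.0], [1.0], [0.0], [0.0]]),
         np.array([[0.0], [0.0], [1.0], [1.0]])]
W = np.arange(6.0).reshape(3, 2)
Q, owner, spectra = assign_components(W, H, H_src)
```

Since every component is assigned to exactly one source, the per-source spectra in this sketch always sum to W H^T.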

8. The sound signal separation method based on semi-non-negative matrix decomposition according to claim 1, wherein, in S6, obtaining the respective Fourier transform results F¹, F², ..., Fⁿ of the sound signals S¹, S², ..., Sⁿ according to the Fourier transform result F and the spectra X¹, X², ..., Xⁿ is carried out according to the following steps:

S61. Let Fⁱ (i=1,2,...,n) denote the Fourier transform results F¹, F², ..., Fⁿ; each Fⁱ is a matrix with p rows and q columns and is initialized as a zero matrix;

S62. F = {F_rk}_{p×q} is a matrix with p rows and q columns; Xⁱ (i=1,2,...,n) denotes the spectra X¹, X², ..., Xⁿ, each of which is a matrix with p rows and q columns;

S64. Execute according to step S63, traverse all elements in X ¹ , X ² ,..., X ⁿ , to obtain