CN106504757A - An Adaptive Audio Blind Watermarking Method Based on Auditory Model - Google Patents
An Adaptive Audio Blind Watermarking Method Based on Auditory Model
- Publication number
- CN106504757A CN106504757A CN201610983877.6A CN201610983877A CN106504757A CN 106504757 A CN106504757 A CN 106504757A CN 201610983877 A CN201610983877 A CN 201610983877A CN 106504757 A CN106504757 A CN 106504757A
- Authority
- CN
- China
- Prior art keywords
- watermark
- audio
- embedded
- audio signal
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Technical Field
The present invention relates to the field of information hiding, and in particular to an adaptive audio blind watermarking method based on an auditory model for copyright management of audio works.
Background Art
The rapid development of information technology and computer networks has brought a series of information-security and piracy problems, making copyright protection and content authentication of multimedia content an urgent issue. Digital watermarking has developed rapidly as an effective solution and has become a research hotspot in academia. Because the human ear is highly sensitive, audio watermarking is more challenging than video and image watermarking. Audio watermarking embeds a special identification signal into an original digital audio work to identify its copyright and legitimate users, thereby protecting the copyright of the work. Audio watermarking must satisfy three basic requirements, namely imperceptibility, robustness, and watermark capacity, which conflict with one another; designing a method that achieves the best balance among the three has always been a difficult point in audio watermarking. Current digital watermarking techniques fall roughly into two types: time-domain embedding methods and transform-domain embedding methods. Early methods embedded the watermark in the time domain; such algorithms are simple and easy to implement but have poor robustness and imperceptibility. The mainstream audio watermarking algorithms today work in a transform domain, which requires transforming the audio carrier signal from the time domain to the frequency domain before embedding. Because transform-domain algorithms take the characteristics of the audio carrier and of human hearing into account, they achieve better robustness and imperceptibility than time-domain algorithms. For a time-varying signal such as audio, exploiting the multi-resolution and time-frequency localization properties of wavelet analysis yields better robustness than other transform-domain approaches, and this has become a research focus in audio watermarking. In recent years many researchers have further improved transform-domain algorithms by applying techniques such as compressed sensing, singular value decomposition, quantization index modulation, neural networks, and the masking effect of the human ear.
Although current watermarking algorithms achieve good robustness and imperceptibility, most pay little attention to the embedding capacity. How to design an algorithm that best balances robustness, imperceptibility, and embedding capacity still requires further research and exploration. In addition, most algorithms embed the watermark at fixed frequency points over the whole audio signal, chosen from general experience and experimental results, so the embedding positions are not adaptive.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an adaptive audio blind watermarking method based on an auditory model that offers good robustness and transparency and a large hiding capacity.
The technical scheme adopted by the present invention is an adaptive audio blind watermarking method based on an auditory model, comprising the following steps:
1) Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w:
w = {w(i), 0 ≤ i ≤ M×N};
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits;
2) Compute the human auditory masking threshold;
3) Adaptively select the watermark embedding segments and embedding positions:
The watermark information is embedded into the mid- and low-frequency coefficients of the frequency-domain coefficients X(jw). For each subband, count the mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z); audio segments in which this count exceeds the preset threshold T are selected as embedding segments. Within a selected segment, the components whose energy is below the masking threshold are sorted from large to small, and the first L components F_k = {f_k(i), 1 ≤ i ≤ L} are chosen for watermark embedding;
4) Embed the digital watermark
The L watermark bits are embedded into the selected frequency-domain coefficients by quantization index modulation. The watermarked segment is then passed through the inverse discrete cosine transform and the inverse discrete wavelet packet transform; the resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next carrier segment that satisfies the selection condition until all watermark bits have been embedded;
5) Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark;
6) Extract the digital watermark, comprising:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2), and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3);
(3) Extract the watermark sequence using the following formula:
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
Step 2) comprises:
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width;
(2) Apply the discrete cosine transform to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]},
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical band, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively;
(4) Adjust the energy value S_pz(z) of each subband to S_m(z) = S_pz(z) × B(z),
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) Compute the noise characteristic factor a(z) of each subband: a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise. The masking threshold offset after accounting for the noise characteristic factor is O(z): O(z) = (14.5 + z)a + 5.5(1 - a);
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH).
Step 4) comprises:
Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
The quantization step size is determined for each critical subband according to the noise-masking ratio.
The adaptive audio blind watermarking method based on an auditory model of the present invention embeds the watermark in the combined DWPT-DCT domain and uses the masking threshold to determine the embedding segments and the specific embedding positions. The embedding strength is modulated according to the noise-masking ratio, so that watermark information of comparatively large energy can be embedded without degrading the audio quality. Embedding segments, embedding positions, and embedding strength are all selected adaptively, which improves robustness and embedding capacity while preserving imperceptibility. The invention exploits the auditory characteristics of the human ear more finely and therefore hides the watermark better. The digital watermark is protected by double encryption, Logistic encryption combined with the Arnold transform, which further improves its security. Experiments show that the proposed algorithm not only has good imperceptibility and a large watermark capacity but also resists common signal-processing attacks effectively, giving good robustness.
Brief Description of the Drawings
Fig. 1 is a flow chart of the adaptive audio blind watermarking method based on an auditory model of the present invention;
Fig. 2 is a waveform of the original audio signal in an embodiment of the present invention;
Fig. 3 is a waveform of the watermarked audio signal in an embodiment of the present invention;
Fig. 4 is a waveform of the difference between the watermarked and the original audio signal in an embodiment of the present invention.
Detailed Description
The adaptive audio blind watermarking method based on an auditory model of the present invention is described in detail below with reference to the embodiments and drawings.
As shown in Fig. 1, the adaptive audio blind watermarking method based on an auditory model of the present invention comprises the following steps:
1) Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w.
Let M_0 be the original watermark signal, expressed as M_0 = {m_0(i,j), 0 ≤ i < M, 0 ≤ j < N}, where m_0(i,j) ∈ {0,1} is the pixel value of the binary watermark image. To remove the spatial correlation of the watermark image, the watermark is scrambled with the Arnold transform, giving M_1 = {m_1(i,j), 0 ≤ i < M, 0 ≤ j < N}. After the Arnold transform, the outline of the whole watermark can still be recovered even if part of the watermark is attacked. Moreover, without knowledge of the number of Arnold iterations an attacker cannot restore the watermark image even after extracting the watermark, which further improves security. Since the audio signal is one-dimensional, the Arnold-scrambled watermark is reduced to one dimension: V = {v(k) = m_1(i,j), 1 ≤ i ≤ M, 1 ≤ j ≤ N, k = (i-1)×N + j}.
To further improve the watermark's resistance to attacks, the present invention uses the Logistic equation to generate a chaotic sequence and XORs it with the dimension-reduced watermark sequence v(k). The Logistic-encrypted watermark is expressed as:
w = {w(i), 0 ≤ i ≤ M×N}
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits.
The initial value and parameters of the chaotic sequence and the number of Arnold iterations serve as secret keys; with these keys the watermark sequence can be recovered during extraction.
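As an illustration of this preprocessing, the sketch below scrambles the binary watermark with the Arnold map, flattens it to one dimension, and XORs it with a binarized logistic sequence. It is a minimal sketch rather than the patented implementation: the logistic parameters mu and x0, the 0.5 binarization threshold, and the default number of Arnold iterations are assumptions chosen only for the example.

```python
import numpy as np

def arnold_scramble(img, times):
    """Arnold cat-map scrambling of a square binary image (n x n)."""
    n = img.shape[0]
    out = img.copy()
    for _ in range(times):
        tmp = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                tmp[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = tmp
    return out

def logistic_sequence(length, x0=0.6, mu=3.99):
    """Logistic-map chaotic sequence, binarized at 0.5 (assumed threshold)."""
    bits = np.empty(length, dtype=np.uint8)
    x = x0
    for i in range(length):
        x = mu * x * (1 - x)
        bits[i] = 1 if x > 0.5 else 0
    return bits

def encrypt_watermark(wm_img, arnold_times=10, x0=0.6, mu=3.99):
    """Arnold scrambling -> flatten to 1-D -> XOR with the logistic key stream."""
    v = arnold_scramble(wm_img, arnold_times).flatten()
    return np.bitwise_xor(v, logistic_sequence(v.size, x0, mu))

# Example: a 40x40 binary watermark as in the embodiment below
wm = (np.random.rand(40, 40) > 0.5).astype(np.uint8)
w = encrypt_watermark(wm)
```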
2) Compute the human auditory masking threshold
The human ear exhibits a masking effect: when two or more stimuli whose energies differ by a sufficient amount reach the auditory system at the same time, the weak sound is masked by the strong one and becomes imperceptible. This is called simultaneous (or frequency) masking. If the watermark energy is kept below the masking threshold, the imperceptibility of the watermark is guaranteed. The subband partition of the psychoacoustic model should approximate the critical bands of the human ear as closely as possible, and the multi-resolution property of the wavelet packet decomposition satisfies this requirement, so the present invention computes the masking threshold in the wavelet packet domain. The invention adopts the simple psychoacoustic model proposed in reference [1] and improves it by introducing the wavelet packet transform into the psychoacoustic model, dividing the whole frequency band into 26 subbands of unequal width so that the subband bandwidths approximate the critical bandwidths of the human ear. The calculation steps are as follows:
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width;
(2) Apply the discrete cosine transform (DCT) to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
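As one possible realization of steps (1) and (2), the sketch below uses PyWavelets and SciPy; both libraries are assumptions, since the description names no implementation, and the regrouping of the 256 terminal nodes into the 26 unequal-width subbands described above is omitted.

```python
import pywt
from scipy.fft import dct

def segment_and_transform(audio, frame_len=2048, wavelet='db8', level=8):
    """Split the signal into 2048-sample segments, apply an 8-level wavelet
    packet decomposition with the db8 basis, and DCT the coefficients of
    each terminal node (ordered from low to high frequency)."""
    segments = []
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        wp = pywt.WaveletPacket(frame, wavelet, maxlevel=level)
        nodes = wp.get_level(level, order='freq')   # leaves, low to high frequency
        segments.append([dct(node.data, norm='ortho') for node in nodes])
    return segments
```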
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]}
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical subband, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively.
(4) From the perceptual characteristics of the human ear, the signal in a subband is affected by the signals in adjacent subbands; the energy value S_pz(z) of each subband is therefore adjusted to
S_m(z) = S_pz(z) × B(z)
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) The masking threshold depends on the noise characteristics of the audio. Compute the noise characteristic factor a(z) of each subband:
a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise.
By the masking properties of the human ear, when a pure tone masks noise the masking threshold drops by about (14.5 + z) dB, and when noise masks a pure tone it drops by 5.5 dB. The masking threshold offset after accounting for the noise characteristic factor is: O(z) = (14.5 + z)a + 5.5(1 - a)
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH). The invention adaptively determines the watermark embedding segments, the specific embedding positions, and the embedding strength from this masking threshold.
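The Bark mapping and the per-subband threshold of steps (3) to (5) can be sketched as follows. The spectral-flatness reference of -60 dB is Johnston's value from reference [1] and the absolute-threshold table is assumed to be supplied by the caller; neither constant is stated explicitly in this description, so both are assumptions.

```python
import numpy as np

def bark_index(f_hz):
    """Frequency in Hz -> Bark subband index (formula of step (3))."""
    return int(round(13 * np.arctan(0.76 * f_hz / 1000.0)
                     + 3.5 * np.arctan((f_hz / 7500.0) ** 2)))

def masking_threshold(X, freqs, abs_thr, sfm_db_max=-60.0):
    """Per-subband masking threshold Thr(z) following steps (3)-(5).
    X       : frequency-domain coefficients of one segment
    freqs   : frequency in Hz associated with each coefficient
    abs_thr : absolute hearing threshold per Bark band (assumed given)"""
    power = np.abs(np.asarray(X)) ** 2
    bands = np.array([bark_index(f) for f in freqs])
    thr = np.zeros(bands.max() + 1)
    for z in range(thr.size):
        sel = bands == z
        if not sel.any():
            continue
        S_pz = power[sel].sum()                              # subband energy
        B = 15.91 + 7.5 * (z + 0.474) + 17.5 * np.sqrt(1 + (z + 0.474) ** 2)
        S_m = S_pz * B                                       # spread energy
        G = np.exp(np.mean(np.log(power[sel] + 1e-12)))      # geometric mean
        A = np.mean(power[sel]) + 1e-12                      # arithmetic mean
        a = min(10 * np.log10(G / A) / sfm_db_max, 1.0)      # tonality factor a(z)
        O = (14.5 + z) * a + 5.5 * (1 - a)                   # offset O(z) in dB
        T = (S_m * 10 ** (-O / 10.0)) / sel.sum()            # per-sample threshold
        thr[z] = max(T, abs_thr[z])                          # Thr(z) = max(T, TH)
    return thr
```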
3) Adaptively select the watermark embedding segments and positions according to the auditory masking threshold:
Audio is a non-stationary signal, and different segments have different sensitivity to noise; embedding the watermark over the whole signal would inevitably lower its signal-to-noise ratio. The present invention therefore selects the embedding segments to improve watermark performance. As for the embedding positions, most existing algorithms only give a rough embedding range based on the masking effect of the human ear and do not identify precisely which frequency positions are hard to perceive. The proposed algorithm uses the frequency-domain masking effect to determine the embedding positions exactly and thus achieves better imperceptibility.
According to the auditory model, if a frame of audio contains enough frequency components that can be masked, the frame has low noise sensitivity and can be used to hide watermark information. The mid- and low-frequency coefficients carry most of the energy of the audio signal, while common audio processing (compression, filtering, etc.) mostly affects the high-frequency part. For robustness, the present invention therefore embeds the watermark into the mid- and low-frequency coefficients of X(jw) (Bark bands 0 to 12). For each subband, the number of mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z) is counted, and segments in which this count exceeds the preset threshold T are selected as embedding segments. The frequency-domain masking effect shows that the larger the magnitude of a frequency-domain coefficient, the more easily the components on either side of it are masked. After a segment has been selected, the algorithm therefore sorts the components whose energy is below the masking threshold from large to small and selects the first L components F_k = {f_k(i), 1 ≤ i ≤ L} for embedding.
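A sketch of this selection rule follows, assuming the coefficients of a candidate segment, their Bark indices, and the per-band thresholds from the previous step are available; the defaults T = 61 and L = 46 are the values used in the preferred embodiment below.

```python
import numpy as np

def select_embedding_positions(X, bands, thr, T=61, L=46, max_bark=12):
    """Return the indices of the L coefficients chosen for embedding, or None
    if the segment has too few maskable mid/low-frequency components."""
    maskable = [i for i in range(len(X))
                if bands[i] <= max_bark and np.abs(X[i]) ** 2 < thr[bands[i]]]
    if len(maskable) <= T:
        return None                     # segment rejected, not used for embedding
    # larger coefficients mask their neighbours more easily: sort by magnitude
    maskable.sort(key=lambda i: np.abs(X[i]), reverse=True)
    return maskable[:L]
```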
4) Embed the digital watermark
The watermark is embedded into the selected frequency-domain coefficients by quantization index modulation, as follows:
Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
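The quantization rule itself appears only as an image in the original text. The sketch below therefore uses a standard dithered (even/odd) quantization-index-modulation rule as a stand-in; it likewise keeps the quantization error within 0.5Δ, but the exact rounding convention of the patent is not confirmed and should be treated as an assumption.

```python
import numpy as np

def qim_embed(coeffs, bits, delta):
    """Dithered QIM: shift by w*delta/2, quantize to the delta lattice, shift
    back.  The quantization error of each coefficient is at most delta/2."""
    c = np.asarray(coeffs, dtype=float)
    b = np.asarray(bits, dtype=float)
    return delta * np.round((c - b * delta / 2) / delta) + b * delta / 2
```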
The choice of quantization step size governs the trade-off between imperceptibility and robustness: a smaller step gives better imperceptibility but weaker robustness, while a larger step resists attacks better but is easier to perceive. The present invention uses the noise-masking ratio of the psychoacoustic model to determine the embedding strength adaptively for each critical subband.
Reference [2] analysed the impact of watermark embedding on audio quality by setting different noise-masking ratios for the watermarked signal. The experimental results show that if the noise-masking ratio (NMR) of the watermarked audio is at most -5 dB, the distortion is imperceptible. Based on this, the present invention designs a method that determines the watermark embedding strength adaptively from the noise-masking ratio.
The quantization step size is determined for each critical subband according to the noise-masking ratio. In the present invention, the quantization noise of a critical subband is taken as the maximum quantization noise introduced by the frequency bins within that subband, and the noise-masking ratio of each critical subband is:
NMR(i) = E_ns(i) - T_hr(i)
where i is the critical subband index and j is the frequency bin in which the watermark is embedded; E_ns(i) and NMR(i) are the noise energy and the noise-masking ratio of the i-th critical subband; E_0(i,j) is the energy of the original audio signal; E_w(i,j) is the energy of the watermarked audio signal; and T_hr(i) is the masking threshold of the i-th critical subband. By the quantization strategy, the quantization error of each coefficient does not exceed Δ/2, i.e. e_q ∈ [0, Δ/2).
Since |f_k*(i) - f_k(i)| ≤ Δ/2, we have E_ns ≤ Δ^2/4. To keep the distortion of the signal within a critical subband imperceptible, the noise-masking-ratio condition must be satisfied; from this the bound on the watermark quantization step size follows, and the present invention takes the step size at this bound.
It follows that the watermark embedding strength varies with the masking threshold: the larger the masking threshold, the more easily noise is masked, so a larger quantization step can be used. The embedding strength thus adapts to the masking effect.
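Reading the -5 dB NMR requirement together with E_ns ≤ Δ^2/4 gives one plausible closed form for the per-subband step size. Because the patent's own step-size formula is shown only as an image, the constant below is an assumption derived from that bound, not the confirmed formula.

```python
import numpy as np

def quantization_step(thr_subband, nmr_db=-5.0):
    """Largest step for which the worst-case quantization noise (delta^2 / 4)
    stays at least |nmr_db| dB below the subband masking threshold."""
    return 2.0 * np.sqrt(thr_subband * 10 ** (nmr_db / 10.0))
```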
This process is repeated until all L watermark bits have been embedded; the frequency-domain coefficients of the watermarked segment are then passed through the inverse discrete cosine transform and the discrete wavelet packet reconstruction. The resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next segment that satisfies the selection condition until all watermark bits have been embedded.
5) Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark.
6) Extract the digital watermark. The extraction method used in the present invention is blind, i.e. the original audio carrier is not needed. It comprises:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2), and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3); since the watermark perturbs the host signal only slightly, the error in the masking threshold before and after embedding is negligible;
(3) Extract the watermark sequence using the following formula (a sketch of this rule is given after this list):
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
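A sketch of the extraction side, paired with the dithered QIM embedding sketched earlier; the parity rule and the reuse of logistic_sequence and the Arnold parameters from the embedding-side sketch are assumptions rather than the patented formulas.

```python
import numpy as np

def qim_extract(coeffs, delta):
    """Recover one bit per coefficient from the parity of the half-step index
    (counterpart of the dithered QIM embedding rule sketched earlier)."""
    idx = np.round(2 * np.asarray(coeffs, dtype=float) / delta).astype(int)
    return (idx % 2).astype(np.uint8)

def decrypt_watermark(w_bits, shape, arnold_times=10, x0=0.6, mu=3.99):
    """XOR with the same logistic key stream, reshape to 2-D, undo Arnold."""
    key_stream = logistic_sequence(w_bits.size, x0, mu)   # from the earlier sketch
    v = np.bitwise_xor(w_bits, key_stream).reshape(shape)
    n = shape[0]
    out = v.copy()
    # inverse of (x, y) -> (x + y, x + 2y) mod n is (x, y) -> (2x - y, y - x) mod n
    for _ in range(arnold_times):
        tmp = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                tmp[(2 * x - y) % n, (y - x) % n] = out[x, y]
        out = tmp
    return out
```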
A preferred embodiment is given below:
1. A 40×40 binary image is chosen as the watermark. Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w:
w = {w(i), 0 ≤ i ≤ M×N}
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits.
2. Compute the human auditory masking threshold;
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width. Table 1 lists the wavelet packet coefficient nodes and frequency boundaries of each subband; it shows that the subband bandwidths approximate the critical bandwidths of the human ear.
Table 1. Subband decomposition of the wavelet packet transform
(2) Apply the discrete cosine transform to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]}
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical subband, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively;
(4) Adjust the energy value S_pz(z) of each subband to S_m(z) = S_pz(z) × B(z),
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) Compute the noise characteristic factor a(z) of each subband: a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise. The masking threshold offset after accounting for the noise characteristic factor is:
O(z) = (14.5 + z)a + 5.5(1 - a).
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH).
3. Adaptively select the watermark embedding segments and positions according to the masking threshold:
The watermark information is embedded into the mid- and low-frequency coefficients of X(jw). For each subband, count the mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z); audio segments in which this count exceeds 61 are selected as embedding segments. Within a selected segment, the components whose energy is below the masking threshold are sorted from large to small, and the first 46 components F_k = {f_k(i), 1 ≤ i ≤ L} are chosen for embedding.
4. Embed the digital watermark
The 46 watermark bits are embedded into the selected frequency-domain coefficients by quantization index modulation. Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
The invention uses the noise-masking ratio of the psychoacoustic model to determine the embedding strength adaptively for each critical subband, and the quantization step size is taken accordingly.
The watermarked segment is then passed through the inverse discrete cosine transform and the inverse discrete wavelet packet transform; the resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next segment that satisfies the selection condition until all watermark bits have been embedded.
5. Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark.
6. Extract the digital watermark, comprising:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2, and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3;
(3) Extract the watermark sequence using the following formula:
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
To test the performance of the invention, three carrier audio files of different types were selected: pop music, classical music, and rock music, all sampled at 44.1 kHz with 16-bit quantization in mono format. Taking the pop-music item as an example, Figs. 2, 3, and 4 show the waveforms of the original audio signal, the watermarked audio signal, and their difference. The waveform of the watermarked signal differs very little from that of the original, and essentially no distortion of the audio quality is noticeable after embedding. To evaluate the method further, the proposed DWPT-DCT watermarking method was compared with a traditional watermarking method based on a psychoacoustic model and with the currently popular DWT-SVD method in terms of imperceptibility, watermark capacity, and robustness.
(1) Imperceptibility comparison
The signal-to-noise ratio (SNR) and the average segmental signal-to-noise ratio (SegSNR) are used to evaluate the effect of the watermark on the host signal, defined as follows:
where S'(i) is a coefficient after watermark embedding, S(i) is the original audio coefficient, K is the number of audio frames used for embedding, and N is the number of samples per frame. SegSNR is the mean per-frame SNR of the watermarked audio; compared with SNR it better reflects the influence of the watermark information on the original signal. The formulas show that the SNR is determined by how much the frequency-domain coefficients change. If a coefficient is modified by an increment e, the introduced noise energy is e^2; by the quantization principle the quantization noise of a coefficient is uniformly distributed, with a known expected value. Denoting the energy of a frame by E_g, the upper bound on the SNR of a watermarked frame can be estimated accordingly.
Combining this with the expression for the quantization step size, the upper bound on the per-frame SNR of the watermarked audio is:
SNR ≤ 10 lg{4E_g/(T_hr·L)} + 5
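The two quality measures can be computed as below; the framing into 2048-sample segments follows the embodiment, and since the SNR and SegSNR formulas appear in the original only as images, the usual definitions are used here.

```python
import numpy as np

def snr_db(original, watermarked):
    """Overall signal-to-noise ratio of the watermarked signal in dB."""
    o = np.asarray(original, dtype=float)
    noise = np.asarray(watermarked, dtype=float) - o
    return 10 * np.log10(np.sum(o ** 2) / np.sum(noise ** 2))

def seg_snr_db(original, watermarked, frame_len=2048):
    """Mean per-frame SNR over the frames that actually carry watermark noise."""
    vals = []
    for s in range(0, len(original) - frame_len + 1, frame_len):
        o = np.asarray(original[s:s + frame_len], dtype=float)
        w = np.asarray(watermarked[s:s + frame_len], dtype=float)
        e = np.sum((w - o) ** 2)
        if e > 0:                        # skip frames that were left untouched
            vals.append(10 * np.log10(np.sum(o ** 2) / e))
    return float(np.mean(vals))
```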
For the pop, classical, and rock items, the measured SNRs are 34.23 dB, 29.62 dB, and 32.81 dB, and the mean segmental SNRs are 32.58 dB, 28.26 dB, and 30.95 dB, respectively. The experiments show that the segmental SNRs obtained with this method stay below the theoretical upper bound and above 20 dB. On average, the SNR of the invention is 7-8 dB higher than that of reference [3] and 1-4 dB higher than that of reference [4].
(2) Watermark capacity analysis
The watermark embedding capacity is computed as the number of embedded watermark bits divided by the time needed to embed them, where N_w = 40×40 is the number of embedded bits and T is the time required to complete the embedding. At a watermark embedding rate of 100%, the capacities for the different carrier types are 689.5 bps for pop music, 576.7 bps for classical music, and 668.9 bps for rock music. Reference [3] embeds 1 bit of watermark per 1600 samples, giving a capacity of only 27.56 bps; reference [4] embeds 8 bits per 512 samples, giving 689.06 bps. The embedding capacity of the invention is therefore far higher than that of the method in [3] and comparable to that of the method in [4].
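As a worked example of this formula: at the reported 689.5 bps for pop music, embedding the N_w = 40×40 = 1600 watermark bits occupies roughly 1600/689.5 ≈ 2.3 s of host audio.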
(3) Watermark robustness analysis
To test the watermark's resistance to common signal processing, the watermarked audio signal was subjected to the following attacks: (1) additive noise: white Gaussian noise added to the digital audio signal in the time domain at an SNR of 20 dB; (2) resampling: downsampling to 22.05 kHz followed by upsampling back to 44.1 kHz; (3) requantization: quantizing the audio from 16 bits to 8 bits and back to 16 bits; (4) MP3 compression: compressing and then decompressing the audio at a bit rate of 128 kbps; (5) low-pass filtering with a cut-off frequency of 8 kHz. The resistance of the proposed method to these operations was compared with that of references [3] and [4], using the same host music (pop) and the same watermark image for all three methods; the NC values measured under the condition that the watermark remains imperceptible are listed in Table 3. The table shows that with a watermark capacity comparable to that of [4], the invention is more robust; and although its capacity is far higher than that of [3], its robustness is no worse than that of [3].
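The attack battery (apart from MP3 recompression, which needs an external codec) and the NC measure can be scripted as below. SciPy is an assumed dependency, and the normalized-correlation formula is the usual one, since the description does not define it.

```python
import numpy as np
from scipy.signal import butter, lfilter, resample_poly

def add_awgn(x, snr_db=20):
    """Additive white Gaussian noise at the given SNR (dB)."""
    noise = np.random.randn(len(x))
    noise *= np.sqrt(np.sum(x ** 2) / np.sum(noise ** 2) / 10 ** (snr_db / 10))
    return x + noise

def resample_attack(x):
    """44.1 kHz -> 22.05 kHz -> 44.1 kHz."""
    return resample_poly(resample_poly(x, 1, 2), 2, 1)

def requantize_attack(x):
    """16-bit -> 8-bit -> 16-bit requantization of samples scaled to [-1, 1]."""
    return np.round(x * 127) / 127

def lowpass_attack(x, fs=44100, cutoff=8000):
    """6th-order Butterworth low-pass filter with an 8 kHz cut-off."""
    b, a = butter(6, cutoff / (fs / 2))
    return lfilter(b, a, x)

def nc(w_orig, w_extracted):
    """Normalized correlation between original and extracted watermark bits."""
    w1 = np.asarray(w_orig, dtype=float).ravel()
    w2 = np.asarray(w_extracted, dtype=float).ravel()
    return float(np.sum(w1 * w2) / np.sqrt(np.sum(w1 ** 2) * np.sum(w2 ** 2)))
```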
Table 2. Robustness of different music types against attacks
Table 3. Robustness of different methods against attacks
The related references are as follows:
1. Johnston J D. Transform coding of audio signals using perceptual noise criteria[J]. IEEE Journal on Selected Areas in Communications, 1988, 6(2): 314-323.
2. Arnold M. Quality evaluation of watermarked audio tracks[J]. Proceedings of SPIE - The International Society for Optical Engineering, 2002, 4675: 91-101.
3. Cai Y M, Guo W Q, Ding H Y. An audio blind watermarking scheme based on DWT-SVD[J]. Journal of Software, 2013, 8(7): 1801-1808.
4. Li Rong, Wang Hongxia, Zhao Pengjun. Adaptive audio digital watermarking algorithm based on a psychoacoustic model[C]. National Conference on Information Hiding and Multimedia Information Security, 2010.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610983877.6A CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610983877.6A CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106504757A true CN106504757A (en) | 2017-03-15 |
Family
ID=58323879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610983877.6A Pending CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106504757A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875954A (en) * | 2017-03-27 | 2017-06-20 | 中国农业大学 | The speech hiding circuit structure and its control method of a kind of anti vocoder treatment |
CN110163787A (en) * | 2019-04-26 | 2019-08-23 | 江苏信实云安全技术有限公司 | Digital audio Robust Blind Watermarking Scheme embedding grammar based on dual-tree complex wavelet transform |
CN110223273A (en) * | 2019-05-16 | 2019-09-10 | 天津大学 | A kind of image repair evidence collecting method of combination discrete cosine transform and neural network |
WO2020073508A1 (en) * | 2018-10-12 | 2020-04-16 | 平安科技(深圳)有限公司 | Method and device for adding and extracting audio watermark, electronic device and medium |
CN111091841A (en) * | 2019-12-12 | 2020-05-01 | 天津大学 | An audio watermarking algorithm for identity authentication based on deep learning |
CN111292756A (en) * | 2020-01-19 | 2020-06-16 | 成都嗨翻屋科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN111755018A (en) * | 2020-05-14 | 2020-10-09 | 华南理工大学 | Audio Hiding Method and Device Based on Wavelet Transform and Quantized Embedding Key |
CN112364386A (en) * | 2020-10-21 | 2021-02-12 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN113362835A (en) * | 2020-03-05 | 2021-09-07 | 杭州网易云音乐科技有限公司 | Audio watermark processing method and device, electronic equipment and storage medium |
CN113506580A (en) * | 2021-04-28 | 2021-10-15 | 合肥工业大学 | Method and system for audio watermarking resistant to arbitrary cutting and ripping |
CN113704707A (en) * | 2021-08-26 | 2021-11-26 | 湖南天河国云科技有限公司 | Block chain-based audio tamper-proof method and device |
CN113782041A (en) * | 2021-09-14 | 2021-12-10 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio frequency-to-frequency domain |
CN114758660A (en) * | 2022-04-18 | 2022-07-15 | 中国银行股份有限公司 | A kind of bank exclusive audio copyright protection method and device |
CN116052695A (en) * | 2022-10-28 | 2023-05-02 | 陕西师范大学 | A wave operation-based audio auditory cipher method, system and device |
CN116312577A (en) * | 2023-05-22 | 2023-06-23 | 海底鹰深海科技股份有限公司 | Speech processing apparatus, speech encryption method, and speech decryption method |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080215333A1 (en) * | 1996-08-30 | 2008-09-04 | Ahmed Tewfik | Embedding Data in Audio and Detecting Embedded Data in Audio |
US20030223593A1 (en) * | 2002-06-03 | 2003-12-04 | Lopez-Estrada Alex A. | Perceptual normalization of digital audio signals |
CN101038771A (en) * | 2006-03-18 | 2007-09-19 | 辽宁师范大学 | Novel method of digital watermarking for protecting literary property of music works |
CN101122996A (en) * | 2007-09-12 | 2008-02-13 | 北京大学 | Method and device for watermark embedding and extraction of digital image |
CN101271568A (en) * | 2008-05-16 | 2008-09-24 | 山东大学 | Iterative Adaptive Quantization Index Modulation Watermarking Method Based on Vision Model |
CN101308566A (en) * | 2008-06-02 | 2008-11-19 | 西安电子科技大学 | Digital image watermarking method against geometric attack based on contourlet transform |
CN101493928A (en) * | 2009-02-10 | 2009-07-29 | 国网信息通信有限公司 | Digital watermarking embedding, extracting and quantizing step size coordinating factor optimizing method and device |
CN102142255A (en) * | 2010-07-08 | 2011-08-03 | 北京三信时代信息公司 | Method for embedding and extracting digital watermark in audio signal |
CN103208288A (en) * | 2013-03-13 | 2013-07-17 | 漳州职业技术学院 | Dual encryption based discrete wavelet transform-discrete cosine transform (DWT-DCT) domain audio public watermarking algorithm |
EP2787503A1 (en) * | 2013-04-05 | 2014-10-08 | Movym S.r.l. | Method and system of audio signal watermarking |
CN104658541A (en) * | 2013-11-25 | 2015-05-27 | 哈尔滨恒誉名翔科技有限公司 | Digital watermarking system based on discrete wavelet transform |
CN104795071A (en) * | 2015-04-18 | 2015-07-22 | 广东石油化工学院 | Blind audio watermark embedding and watermark extraction processing method |
Non-Patent Citations (4)
Title |
---|
Hwai-Tsu Hu et al.: "Exploiting Psychoacoustic Properties to Achieve ...", 2013 International Conference on Information Science and Applications *
M. Arnold: "Subjective and Objective Quality Evaluation of Watermarked Audio Tracks", Second International Conference on Web Delivering of Music *
王舜 (Wang Shun): "Research on Synchronized Audio Blind Watermarking Based on Wavelet Packet Transform and Auditory Masking", China Master's Theses Full-text Database, Information Science and Technology Series *
祁薇 (Qi Wei): "Digital Audio Watermarking Theory and Research Implementation", China Master's Theses Full-text Database, Information Science and Technology Series *
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875954A (en) * | 2017-03-27 | 2017-06-20 | 中国农业大学 | Speech hiding circuit structure resistant to vocoder processing and its control method |
WO2020073508A1 (en) * | 2018-10-12 | 2020-04-16 | 平安科技(深圳)有限公司 | Method and device for adding and extracting audio watermark, electronic device and medium |
CN110163787A (en) * | 2019-04-26 | 2019-08-23 | 江苏信实云安全技术有限公司 | Digital audio Robust Blind Watermarking Scheme embedding grammar based on dual-tree complex wavelet transform |
CN110163787B (en) * | 2019-04-26 | 2023-02-28 | 江苏水印科技有限公司 | Audio digital robust blind watermark embedding method based on dual-tree complex wavelet transform |
CN110223273A (en) * | 2019-05-16 | 2019-09-10 | 天津大学 | Image repair evidence collecting method combining discrete cosine transform and neural network |
CN110223273B (en) * | 2019-05-16 | 2023-04-07 | 天津大学 | Image restoration evidence obtaining method combining discrete cosine transform and neural network |
CN111091841B (en) * | 2019-12-12 | 2022-09-30 | 天津大学 | Identity authentication audio watermarking algorithm based on deep learning |
CN111091841A (en) * | 2019-12-12 | 2020-05-01 | 天津大学 | An audio watermarking algorithm for identity authentication based on deep learning |
CN111292756A (en) * | 2020-01-19 | 2020-06-16 | 成都嗨翻屋科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN111292756B (en) * | 2020-01-19 | 2023-05-26 | 成都潜在人工智能科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN113362835B (en) * | 2020-03-05 | 2024-06-07 | 杭州网易云音乐科技有限公司 | Audio watermarking method, device, electronic equipment and storage medium |
CN113362835A (en) * | 2020-03-05 | 2021-09-07 | 杭州网易云音乐科技有限公司 | Audio watermark processing method and device, electronic equipment and storage medium |
CN111755018A (en) * | 2020-05-14 | 2020-10-09 | 华南理工大学 | Audio Hiding Method and Device Based on Wavelet Transform and Quantized Embedding Key |
CN111755018B (en) * | 2020-05-14 | 2023-08-22 | 华南理工大学 | Audio hiding method and device based on wavelet transform and quantized embedded key |
CN112364386B (en) * | 2020-10-21 | 2022-04-26 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN112364386A (en) * | 2020-10-21 | 2021-02-12 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN113506580A (en) * | 2021-04-28 | 2021-10-15 | 合肥工业大学 | Method and system for audio watermarking resistant to arbitrary cutting and ripping |
CN113506580B (en) * | 2021-04-28 | 2024-05-07 | 合肥工业大学 | Audio watermarking method and system capable of resisting random cutting and transcription |
CN113704707A (en) * | 2021-08-26 | 2021-11-26 | 湖南天河国云科技有限公司 | Block chain-based audio tamper-proof method and device |
CN113782041A (en) * | 2021-09-14 | 2021-12-10 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio frequency-to-frequency domain |
CN113782041B (en) * | 2021-09-14 | 2023-08-15 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio variable frequency domain |
CN114758660A (en) * | 2022-04-18 | 2022-07-15 | 中国银行股份有限公司 | Bank-exclusive audio copyright protection method and device |
CN116052695A (en) * | 2022-10-28 | 2023-05-02 | 陕西师范大学 | A wave operation-based audio auditory cipher method, system and device |
CN116312577A (en) * | 2023-05-22 | 2023-06-23 | 海底鹰深海科技股份有限公司 | Speech processing apparatus, speech encryption method, and speech decryption method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106504757A (en) | An Adaptive Audio Blind Watermarking Method Based on Auditory Model | |
Wang et al. | A norm-space, adaptive, and blind audio watermarking algorithm by discrete wavelet transform | |
CN101271690B (en) | Audio spread-spectrum watermark processing method for protecting audio data | |
Bhat K et al. | A new audio watermarking scheme based on singular value decomposition and quantization | |
Elshazly et al. | Secure and robust high quality DWT domain audio watermarking algorithm with binary image | |
Dhar et al. | Digital watermarking scheme based on fast Fourier transformation for audio copyright protection | |
Chauhan et al. | A survey: Digital audio watermarking techniques and applications | |
Attari et al. | Robust audio watermarking algorithm based on DWT using Fibonacci numbers | |
Dhar et al. | A new audio watermarking system using discrete fourier transform for copyright protection | |
Gopalan | A unified audio and image steganography by spectrum modification | |
Zhang et al. | Robust Audio Watermarking Based on Extended Improved Spread Spectrum with Perceptual Masking. | |
Avci et al. | A new information hiding method for audio signals | |
Attari et al. | Robust and transparent audio watermarking based on spread spectrum in wavelet domain | |
Dhar et al. | Audio watermarking in transform domain based on singular value decomposition and quantization | |
Liu et al. | Adaptive audio steganography scheme based on wavelet packet energy | |
Khalil et al. | Informed audio watermarking based on adaptive carrier modulation | |
Attari et al. | Robust and blind audio watermarking in wavelet domain | |
Li et al. | Analysis on unit maximum capacity of orthogonal multiple watermarking for multimedia signals in B5G wireless communications | |
El-Khamy et al. | Chaos-based image hiding scheme between silent intervals of high quality audio signals using feature extraction and image bits spreading | |
Sharma et al. | Robust image watermarking technique using contourlet transform and optimized edge detection algorithm | |
Patil et al. | Performance evaluation of digital audio watermarking based on discrete wavelet transform for ownership protection | |
Li et al. | A novel audio watermarking in wavelet domain | |
Patil et al. | Audio watermarking: A way to copyright protection | |
Dhavale et al. | Lossless audio watermarking based on the alpha statistic modulation | |
Zhao et al. | Quantization index modulation audio watermarking system using a psychoacoustic model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-03-15