CN106504757A - An Adaptive Audio Blind Watermarking Method Based on Auditory Model - Google Patents
An Adaptive Audio Blind Watermarking Method Based on Auditory Model
- Publication number
- CN106504757A CN106504757A CN201610983877.6A CN201610983877A CN106504757A CN 106504757 A CN106504757 A CN 106504757A CN 201610983877 A CN201610983877 A CN 201610983877A CN 106504757 A CN106504757 A CN 106504757A
- Authority
- CN
- China
- Prior art keywords
- watermark
- audio
- embedded
- audio signal
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Technical Field
The present invention relates to the field of information hiding, and in particular to an adaptive audio blind watermarking method based on an auditory model for copyright management of audio works.
Background Art
The rapid development of information technology and computer networks has brought a series of information-security and piracy problems, making copyright protection and content authentication of multimedia content an urgent issue. Digital watermarking has developed rapidly as an effective solution and has become a research hotspot in academia. Because the human ear is highly sensitive, audio watermarking is more challenging than video and image watermarking. Audio watermarking embeds a special identification signal into an original digital audio work to identify its copyright and legitimate users, thereby protecting the copyright of the work. Audio watermarking must satisfy three basic requirements, namely imperceptibility, robustness, and watermark capacity, which conflict with one another; designing a method that achieves the best balance among the three has always been a difficult point in audio watermarking. Current digital watermarking techniques fall roughly into two types: time-domain embedding methods and transform-domain embedding methods. Early methods embedded the watermark in the time domain; such algorithms are simple and easy to implement but have poor robustness and imperceptibility. The mainstream audio watermarking algorithms today work in a transform domain, which requires transforming the audio carrier signal from the time domain to the frequency domain before embedding. Because transform-domain algorithms take the characteristics of the audio carrier and of human hearing into account, they achieve better robustness and imperceptibility than time-domain algorithms. For a time-varying signal such as audio, exploiting the multi-resolution and time-frequency localization properties of wavelet analysis yields better robustness than other transform-domain approaches, and this has become a research focus in audio watermarking. In recent years many researchers have further improved transform-domain algorithms by applying techniques such as compressed sensing, singular value decomposition, quantization index modulation, neural networks, and the masking effect of the human ear.
Although current watermarking algorithms achieve good robustness and imperceptibility, most pay little attention to the embedding capacity. How to design an algorithm that best balances robustness, imperceptibility, and embedding capacity still requires further research and exploration. In addition, most algorithms embed the watermark at fixed frequency points over the whole audio signal, chosen from general experience and experimental results, so the embedding positions are not adaptive.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an adaptive audio blind watermarking method based on an auditory model that offers good robustness and transparency and a large hiding capacity.
The technical scheme adopted by the present invention is an adaptive audio blind watermarking method based on an auditory model, comprising the following steps:
1) Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w:
w = {w(i), 0 ≤ i ≤ M×N};
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits;
2) Compute the human auditory masking threshold;
3) Adaptively select the watermark embedding segments and embedding positions:
The watermark information is embedded into the mid- and low-frequency coefficients of the frequency-domain coefficients X(jw). For each subband, count the mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z); audio segments in which this count exceeds the preset threshold T are selected as embedding segments. Within a selected segment, the components whose energy is below the masking threshold are sorted from large to small, and the first L components F_k = {f_k(i), 1 ≤ i ≤ L} are chosen for watermark embedding;
4) Embed the digital watermark
The L watermark bits are embedded into the selected frequency-domain coefficients by quantization index modulation. The watermarked segment is then passed through the inverse discrete cosine transform and the inverse discrete wavelet packet transform; the resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next carrier segment that satisfies the selection condition until all watermark bits have been embedded;
5) Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark;
6) Extract the digital watermark, comprising:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2), and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3);
(3) Extract the watermark sequence using the following formula:
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
Step 2) comprises:
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width;
(2) Apply the discrete cosine transform to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]},
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical band, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively;
(4) Adjust the energy value S_pz(z) of each subband to S_m(z) = S_pz(z) × B(z),
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) Compute the noise characteristic factor a(z) of each subband: a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise. The masking threshold offset after accounting for the noise characteristic factor is O(z): O(z) = (14.5 + z)a + 5.5(1 - a);
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH).
Step 4) comprises:
Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
The quantization step size is determined for each critical subband according to the noise-masking ratio.
The adaptive audio blind watermarking method based on an auditory model of the present invention embeds the watermark in the combined DWPT-DCT domain and uses the masking threshold to determine the embedding segments and the specific embedding positions. The embedding strength is modulated according to the noise-masking ratio, so that watermark information of comparatively large energy can be embedded without degrading the audio quality. Embedding segments, embedding positions, and embedding strength are all selected adaptively, which improves robustness and embedding capacity while preserving imperceptibility. The invention exploits the auditory characteristics of the human ear more finely and therefore hides the watermark better. The digital watermark is protected by double encryption, Logistic encryption combined with the Arnold transform, which further improves its security. Experiments show that the proposed algorithm not only has good imperceptibility and a large watermark capacity but also resists common signal-processing attacks effectively, giving good robustness.
Brief Description of the Drawings
Fig. 1 is a flow chart of the adaptive audio blind watermarking method based on an auditory model of the present invention;
Fig. 2 is a waveform of the original audio signal in an embodiment of the present invention;
Fig. 3 is a waveform of the watermarked audio signal in an embodiment of the present invention;
Fig. 4 is a waveform of the difference between the watermarked and the original audio signal in an embodiment of the present invention.
Detailed Description
The adaptive audio blind watermarking method based on an auditory model of the present invention is described in detail below with reference to the embodiments and drawings.
As shown in Fig. 1, the adaptive audio blind watermarking method based on an auditory model of the present invention comprises the following steps:
1) Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w.
Let M_0 be the original watermark signal, expressed as M_0 = {m_0(i,j), 0 ≤ i < M, 0 ≤ j < N}, where m_0(i,j) ∈ {0,1} is the pixel value of the binary watermark image. To remove the spatial correlation of the watermark image, the watermark is scrambled with the Arnold transform, giving M_1 = {m_1(i,j), 0 ≤ i < M, 0 ≤ j < N}. After the Arnold transform, the outline of the whole watermark can still be recovered even if part of the watermark is attacked. Moreover, without knowledge of the number of Arnold iterations an attacker cannot restore the watermark image even after extracting the watermark, which further improves security. Since the audio signal is one-dimensional, the Arnold-scrambled watermark is reduced to one dimension: V = {v(k) = m_1(i,j), 1 ≤ i ≤ M, 1 ≤ j ≤ N, k = (i-1)×N + j}.
To further improve the watermark's resistance to attacks, the present invention uses the Logistic equation to generate a chaotic sequence and XORs it with the dimension-reduced watermark sequence v(k). The Logistic-encrypted watermark is expressed as:
w = {w(i), 0 ≤ i ≤ M×N}
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits.
The initial value and parameters of the chaotic sequence and the number of Arnold iterations serve as secret keys; with these keys the watermark sequence can be recovered during extraction.
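As an illustration of this preprocessing, the sketch below scrambles the binary watermark with the Arnold map, flattens it to one dimension, and XORs it with a binarized logistic sequence. It is a minimal sketch rather than the patented implementation: the logistic parameters mu and x0, the 0.5 binarization threshold, and the default number of Arnold iterations are assumptions chosen only for the example.

```python
import numpy as np

def arnold_scramble(img, times):
    """Arnold cat-map scrambling of a square binary image (n x n)."""
    n = img.shape[0]
    out = img.copy()
    for _ in range(times):
        tmp = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                tmp[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = tmp
    return out

def logistic_sequence(length, x0=0.6, mu=3.99):
    """Logistic-map chaotic sequence, binarized at 0.5 (assumed threshold)."""
    bits = np.empty(length, dtype=np.uint8)
    x = x0
    for i in range(length):
        x = mu * x * (1 - x)
        bits[i] = 1 if x > 0.5 else 0
    return bits

def encrypt_watermark(wm_img, arnold_times=10, x0=0.6, mu=3.99):
    """Arnold scrambling -> flatten to 1-D -> XOR with the logistic key stream."""
    v = arnold_scramble(wm_img, arnold_times).flatten()
    return np.bitwise_xor(v, logistic_sequence(v.size, x0, mu))

# Example: a 40x40 binary watermark as in the embodiment below
wm = (np.random.rand(40, 40) > 0.5).astype(np.uint8)
w = encrypt_watermark(wm)
```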
2) Compute the human auditory masking threshold
The human ear exhibits a masking effect: when two or more stimuli whose energies differ by a sufficient amount reach the auditory system at the same time, the weak sound is masked by the strong one and becomes imperceptible. This is called simultaneous (or frequency) masking. If the watermark energy is kept below the masking threshold, the imperceptibility of the watermark is guaranteed. The subband partition of the psychoacoustic model should approximate the critical bands of the human ear as closely as possible, and the multi-resolution property of the wavelet packet decomposition satisfies this requirement, so the present invention computes the masking threshold in the wavelet packet domain. The invention adopts the simple psychoacoustic model proposed in reference [1] and improves it by introducing the wavelet packet transform into the psychoacoustic model, dividing the whole frequency band into 26 subbands of unequal width so that the subband bandwidths approximate the critical bandwidths of the human ear. The calculation steps are as follows:
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width;
(2) Apply the discrete cosine transform (DCT) to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
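As one possible realization of steps (1) and (2), the sketch below uses PyWavelets and SciPy; both libraries are assumptions, since the description names no implementation, and the regrouping of the 256 terminal nodes into the 26 unequal-width subbands described above is omitted.

```python
import pywt
from scipy.fft import dct

def segment_and_transform(audio, frame_len=2048, wavelet='db8', level=8):
    """Split the signal into 2048-sample segments, apply an 8-level wavelet
    packet decomposition with the db8 basis, and DCT the coefficients of
    each terminal node (ordered from low to high frequency)."""
    segments = []
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        wp = pywt.WaveletPacket(frame, wavelet, maxlevel=level)
        nodes = wp.get_level(level, order='freq')   # leaves, low to high frequency
        segments.append([dct(node.data, norm='ortho') for node in nodes])
    return segments
```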
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]}
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical subband, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively.
(4) From the perceptual characteristics of the human ear, the signal in a subband is affected by the signals in adjacent subbands; the energy value S_pz(z) of each subband is therefore adjusted to
S_m(z) = S_pz(z) × B(z)
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) The masking threshold depends on the noise characteristics of the audio. Compute the noise characteristic factor a(z) of each subband:
a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise.
By the masking properties of the human ear, when a pure tone masks noise the masking threshold drops by about (14.5 + z) dB, and when noise masks a pure tone it drops by 5.5 dB. The masking threshold offset after accounting for the noise characteristic factor is: O(z) = (14.5 + z)a + 5.5(1 - a)
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH). The invention adaptively determines the watermark embedding segments, the specific embedding positions, and the embedding strength from this masking threshold.
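The Bark mapping and the per-subband threshold of steps (3) to (5) can be sketched as follows. The spectral-flatness reference of -60 dB is Johnston's value from reference [1] and the absolute-threshold table is assumed to be supplied by the caller; neither constant is stated explicitly in this description, so both are assumptions.

```python
import numpy as np

def bark_index(f_hz):
    """Frequency in Hz -> Bark subband index (formula of step (3))."""
    return int(round(13 * np.arctan(0.76 * f_hz / 1000.0)
                     + 3.5 * np.arctan((f_hz / 7500.0) ** 2)))

def masking_threshold(X, freqs, abs_thr, sfm_db_max=-60.0):
    """Per-subband masking threshold Thr(z) following steps (3)-(5).
    X       : frequency-domain coefficients of one segment
    freqs   : frequency in Hz associated with each coefficient
    abs_thr : absolute hearing threshold per Bark band (assumed given)"""
    power = np.abs(np.asarray(X)) ** 2
    bands = np.array([bark_index(f) for f in freqs])
    thr = np.zeros(bands.max() + 1)
    for z in range(thr.size):
        sel = bands == z
        if not sel.any():
            continue
        S_pz = power[sel].sum()                              # subband energy
        B = 15.91 + 7.5 * (z + 0.474) + 17.5 * np.sqrt(1 + (z + 0.474) ** 2)
        S_m = S_pz * B                                       # spread energy
        G = np.exp(np.mean(np.log(power[sel] + 1e-12)))      # geometric mean
        A = np.mean(power[sel]) + 1e-12                      # arithmetic mean
        a = min(10 * np.log10(G / A) / sfm_db_max, 1.0)      # tonality factor a(z)
        O = (14.5 + z) * a + 5.5 * (1 - a)                   # offset O(z) in dB
        T = (S_m * 10 ** (-O / 10.0)) / sel.sum()            # per-sample threshold
        thr[z] = max(T, abs_thr[z])                          # Thr(z) = max(T, TH)
    return thr
```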
3) Adaptively select the watermark embedding segments and positions according to the auditory masking threshold:
Audio is a non-stationary signal, and different segments have different sensitivity to noise; embedding the watermark over the whole signal would inevitably lower its signal-to-noise ratio. The present invention therefore selects the embedding segments to improve watermark performance. As for the embedding positions, most existing algorithms only give a rough embedding range based on the masking effect of the human ear and do not identify precisely which frequency positions are hard to perceive. The proposed algorithm uses the frequency-domain masking effect to determine the embedding positions exactly and thus achieves better imperceptibility.
According to the auditory model, if a frame of audio contains enough frequency components that can be masked, the frame has low noise sensitivity and can be used to hide watermark information. The mid- and low-frequency coefficients carry most of the energy of the audio signal, while common audio processing (compression, filtering, etc.) mostly affects the high-frequency part. For robustness, the present invention therefore embeds the watermark into the mid- and low-frequency coefficients of X(jw) (Bark bands 0 to 12). For each subband, the number of mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z) is counted, and segments in which this count exceeds the preset threshold T are selected as embedding segments. The frequency-domain masking effect shows that the larger the magnitude of a frequency-domain coefficient, the more easily the components on either side of it are masked. After a segment has been selected, the algorithm therefore sorts the components whose energy is below the masking threshold from large to small and selects the first L components F_k = {f_k(i), 1 ≤ i ≤ L} for embedding.
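A sketch of this selection rule follows, assuming the coefficients of a candidate segment, their Bark indices, and the per-band thresholds from the previous step are available; the defaults T = 61 and L = 46 are the values used in the preferred embodiment below.

```python
import numpy as np

def select_embedding_positions(X, bands, thr, T=61, L=46, max_bark=12):
    """Return the indices of the L coefficients chosen for embedding, or None
    if the segment has too few maskable mid/low-frequency components."""
    maskable = [i for i in range(len(X))
                if bands[i] <= max_bark and np.abs(X[i]) ** 2 < thr[bands[i]]]
    if len(maskable) <= T:
        return None                     # segment rejected, not used for embedding
    # larger coefficients mask their neighbours more easily: sort by magnitude
    maskable.sort(key=lambda i: np.abs(X[i]), reverse=True)
    return maskable[:L]
```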
4) Embed the digital watermark
The watermark is embedded into the selected frequency-domain coefficients by quantization index modulation, as follows:
Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
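The quantization rule itself appears only as an image in the original text. The sketch below therefore uses a standard dithered (even/odd) quantization-index-modulation rule as a stand-in; it likewise keeps the quantization error within 0.5Δ, but the exact rounding convention of the patent is not confirmed and should be treated as an assumption.

```python
import numpy as np

def qim_embed(coeffs, bits, delta):
    """Dithered QIM: shift by w*delta/2, quantize to the delta lattice, shift
    back.  The quantization error of each coefficient is at most delta/2."""
    c = np.asarray(coeffs, dtype=float)
    b = np.asarray(bits, dtype=float)
    return delta * np.round((c - b * delta / 2) / delta) + b * delta / 2
```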
The choice of quantization step size governs the trade-off between imperceptibility and robustness: a smaller step gives better imperceptibility but weaker robustness, while a larger step resists attacks better but is easier to perceive. The present invention uses the noise-masking ratio of the psychoacoustic model to determine the embedding strength adaptively for each critical subband.
Reference [2] analysed the impact of watermark embedding on audio quality by setting different noise-masking ratios for the watermarked signal. The experimental results show that if the noise-masking ratio (NMR) of the watermarked audio is at most -5 dB, the distortion is imperceptible. Based on this, the present invention designs a method that determines the watermark embedding strength adaptively from the noise-masking ratio.
The quantization step size is determined for each critical subband according to the noise-masking ratio. In the present invention, the quantization noise of a critical subband is taken as the maximum quantization noise introduced by the frequency bins within that subband, and the noise-masking ratio of each critical subband is:
NMR(i) = E_ns(i) - T_hr(i)
where i is the critical subband index and j is the frequency bin in which the watermark is embedded; E_ns(i) and NMR(i) are the noise energy and the noise-masking ratio of the i-th critical subband; E_0(i,j) is the energy of the original audio signal; E_w(i,j) is the energy of the watermarked audio signal; and T_hr(i) is the masking threshold of the i-th critical subband. By the quantization strategy, the quantization error of each coefficient does not exceed Δ/2, i.e. e_q ∈ [0, Δ/2).
Since |f_k*(i) - f_k(i)| ≤ Δ/2, we have E_ns ≤ Δ^2/4. To keep the distortion of the signal within a critical subband imperceptible, the noise-masking-ratio condition must be satisfied; from this the bound on the watermark quantization step size follows, and the present invention takes the step size at this bound.
It follows that the watermark embedding strength varies with the masking threshold: the larger the masking threshold, the more easily noise is masked, so a larger quantization step can be used. The embedding strength thus adapts to the masking effect.
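Reading the -5 dB NMR requirement together with E_ns ≤ Δ^2/4 gives one plausible closed form for the per-subband step size. Because the patent's own step-size formula is shown only as an image, the constant below is an assumption derived from that bound, not the confirmed formula.

```python
import numpy as np

def quantization_step(thr_subband, nmr_db=-5.0):
    """Largest step for which the worst-case quantization noise (delta^2 / 4)
    stays at least |nmr_db| dB below the subband masking threshold."""
    return 2.0 * np.sqrt(thr_subband * 10 ** (nmr_db / 10.0))
```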
This process is repeated until all L watermark bits have been embedded; the frequency-domain coefficients of the watermarked segment are then passed through the inverse discrete cosine transform and the discrete wavelet packet reconstruction. The resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next segment that satisfies the selection condition until all watermark bits have been embedded.
5) Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark.
6) Extract the digital watermark. The extraction method used in the present invention is blind, i.e. the original audio carrier is not needed. It comprises:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2), and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3); since the watermark perturbs the host signal only slightly, the error in the masking threshold before and after embedding is negligible;
(3) Extract the watermark sequence using the following formula (a sketch of this rule is given after this list):
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
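A sketch of the extraction side, paired with the dithered QIM embedding sketched earlier; the parity rule and the reuse of logistic_sequence and the Arnold parameters from the embedding-side sketch are assumptions rather than the patented formulas.

```python
import numpy as np

def qim_extract(coeffs, delta):
    """Recover one bit per coefficient from the parity of the half-step index
    (counterpart of the dithered QIM embedding rule sketched earlier)."""
    idx = np.round(2 * np.asarray(coeffs, dtype=float) / delta).astype(int)
    return (idx % 2).astype(np.uint8)

def decrypt_watermark(w_bits, shape, arnold_times=10, x0=0.6, mu=3.99):
    """XOR with the same logistic key stream, reshape to 2-D, undo Arnold."""
    key_stream = logistic_sequence(w_bits.size, x0, mu)   # from the earlier sketch
    v = np.bitwise_xor(w_bits, key_stream).reshape(shape)
    n = shape[0]
    out = v.copy()
    # inverse of (x, y) -> (x + y, x + 2y) mod n is (x, y) -> (2x - y, y - x) mod n
    for _ in range(arnold_times):
        tmp = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                tmp[(2 * x - y) % n, (y - x) % n] = out[x, y]
        out = tmp
    return out
```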
A preferred embodiment is given below:
1. A 40×40 binary image is chosen as the watermark. Apply the Arnold transform, dimension reduction, and a chaotic transform to the watermark signal to obtain the watermark information w:
w = {w(i), 0 ≤ i ≤ M×N}
where w(i) denotes the encrypted watermark sequence and M×N is the total number of watermark bits.
2. Compute the human auditory masking threshold;
(1) Let the original audio signal be A = {a(i), 1 ≤ i < N}, where N is the number of samples and a(i) is the audio signal. The signal is divided into segments of 2048 samples each; the k-th segment of the original audio is denoted A(k). An 8-level wavelet packet transform with the db8 wavelet basis is applied to each segment, dividing the 0-22 kHz band into 26 subbands of unequal width. Table 1 lists the wavelet packet coefficient nodes and frequency boundaries of each subband; it shows that the subband bandwidths approximate the critical bandwidths of the human ear.
Table 1. Subband decomposition of the wavelet packet transform
(2) Apply the discrete cosine transform to the wavelet packet coefficients of each subband to obtain the frequency-domain coefficients X(jw):
X(jw) = DCT(w_i(k))
where w_i(k) is the k-th wavelet packet coefficient in the i-th subband after wavelet packet decomposition and X(jw) denotes the frequency-domain coefficients after the DCT;
(3) Map the frequency-domain coefficients X(jw) to the Bark domain:
z = round{13 arctan(0.76f/1000) + 3.5 arctan[(f/7500)^2]}
where f is the frequency and z is the Bark subband index.
Compute the energy value S_pz(z) of each subband:
where |X(jw)|^2 is the power at the samples of the critical subband, and h_bz and l_bz are the upper and lower boundary frequencies of each subband, respectively;
(4) Adjust the energy value S_pz(z) of each subband to S_m(z) = S_pz(z) × B(z),
where B(z) is the spreading function, B(z) = 15.91 + 7.5(z + 0.474) + 17.5[1 + (z + 0.474)^2]^(1/2);
(5) Compute the noise characteristic factor a(z) of each subband: a(z) = min[10 lg(G/A)/S_max, 1]
where A is the arithmetic mean and G the geometric mean of the power spectrum; a(z) = 1 when the audio signal is a pure tone and a(z) = 0 for white noise. The masking threshold offset after accounting for the noise characteristic factor is:
O(z) = (14.5 + z)a + 5.5(1 - a).
Compute the actual masking threshold T(z) of each subband, divide T(z) by the number of samples in the subband, compare the quotient with the absolute masking threshold, and take the larger of the two as the final masking threshold T_hr(z): T_hr(z) = max(T, TH).
3. Adaptively select the watermark embedding segments and positions according to the masking threshold:
The watermark information is embedded into the mid- and low-frequency coefficients of X(jw). For each subband, count the mid- and low-frequency components whose energy S_pz(z) lies below the masking threshold T_hr(z); audio segments in which this count exceeds 61 are selected as embedding segments. Within a selected segment, the components whose energy is below the masking threshold are sorted from large to small, and the first 46 components F_k = {f_k(i), 1 ≤ i ≤ L} are chosen for embedding.
4. Embed the digital watermark
The 46 watermark bits are embedded into the selected frequency-domain coefficients by quantization index modulation. Let the quantized frequency-domain coefficients of the k-th audio segment be F_k = {f_k*(i), 1 ≤ i ≤ L}; the quantization rule is:
where w is the watermark information, f_k*(i) is the quantized coefficient of the k-th audio segment, f_k(i) is the frequency-domain coefficient of the original audio, and Δ is the quantization step size. By the quantization principle, the maximum quantization error is 0.5Δ.
The invention uses the noise-masking ratio of the psychoacoustic model to determine the embedding strength adaptively for each critical subband, and the quantization step size is taken accordingly.
The watermarked segment is then passed through the inverse discrete cosine transform and the inverse discrete wavelet packet transform; the resulting audio signal is denoted A*(k), and A(k) is replaced by A*(k) to complete the watermark embedding for one audio segment. Embedding then continues in the next segment that satisfies the selection condition until all watermark bits have been embedded.
5. Recombine the audio segments
All audio segments, with and without embedded watermark, are recombined into an audio signal containing the complete watermark.
6. Extract the digital watermark, comprising:
(1) Segment the watermarked audio signal and apply the wavelet packet transform as in step 2, and compute the auditory masking threshold of each segment;
(2) Locate the watermark embedding segments and embedding positions as in step 3;
(3) Extract the watermark sequence using the following formula:
where f_k*(i) is the frequency-domain coefficient of the k-th segment of the audio signal under test and Δ* is the quantization step size of that segment;
(4) Apply dimension expansion, the inverse Arnold transform, and Logistic decryption to the extracted sequence to obtain the final watermark image.
To test the performance of the invention, three carrier audio files of different types were selected: pop music, classical music, and rock music, all sampled at 44.1 kHz with 16-bit quantization in mono format. Taking the pop-music item as an example, Figs. 2, 3, and 4 show the waveforms of the original audio signal, the watermarked audio signal, and their difference. The waveform of the watermarked signal differs very little from that of the original, and essentially no distortion of the audio quality is noticeable after embedding. To evaluate the method further, the proposed DWPT-DCT watermarking method was compared with a traditional watermarking method based on a psychoacoustic model and with the currently popular DWT-SVD method in terms of imperceptibility, watermark capacity, and robustness.
(1) Imperceptibility comparison
The signal-to-noise ratio (SNR) and the average segmental signal-to-noise ratio (SegSNR) are used to evaluate the effect of the watermark on the host signal, defined as follows:
where S'(i) is a coefficient after watermark embedding, S(i) is the original audio coefficient, K is the number of audio frames used for embedding, and N is the number of samples per frame. SegSNR is the mean per-frame SNR of the watermarked audio; compared with SNR it better reflects the influence of the watermark information on the original signal. The formulas show that the SNR is determined by how much the frequency-domain coefficients change. If a coefficient is modified by an increment e, the introduced noise energy is e^2; by the quantization principle the quantization noise of a coefficient is uniformly distributed, with a known expected value. Denoting the energy of a frame by E_g, the upper bound on the SNR of a watermarked frame can be estimated accordingly.
Combining this with the expression for the quantization step size, the upper bound on the per-frame SNR of the watermarked audio is:
SNR ≤ 10 lg{4E_g/(T_hr·L)} + 5
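The two quality measures can be computed as below; the framing into 2048-sample segments follows the embodiment, and since the SNR and SegSNR formulas appear in the original only as images, the usual definitions are used here.

```python
import numpy as np

def snr_db(original, watermarked):
    """Overall signal-to-noise ratio of the watermarked signal in dB."""
    o = np.asarray(original, dtype=float)
    noise = np.asarray(watermarked, dtype=float) - o
    return 10 * np.log10(np.sum(o ** 2) / np.sum(noise ** 2))

def seg_snr_db(original, watermarked, frame_len=2048):
    """Mean per-frame SNR over the frames that actually carry watermark noise."""
    vals = []
    for s in range(0, len(original) - frame_len + 1, frame_len):
        o = np.asarray(original[s:s + frame_len], dtype=float)
        w = np.asarray(watermarked[s:s + frame_len], dtype=float)
        e = np.sum((w - o) ** 2)
        if e > 0:                        # skip frames that were left untouched
            vals.append(10 * np.log10(np.sum(o ** 2) / e))
    return float(np.mean(vals))
```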
For the pop, classical, and rock items, the measured SNRs are 34.23 dB, 29.62 dB, and 32.81 dB, and the mean segmental SNRs are 32.58 dB, 28.26 dB, and 30.95 dB, respectively. The experiments show that the segmental SNRs obtained with this method stay below the theoretical upper bound and above 20 dB. On average, the SNR of the invention is 7-8 dB higher than that of reference [3] and 1-4 dB higher than that of reference [4].
(2) Watermark capacity analysis
The watermark embedding capacity is computed as the number of embedded watermark bits divided by the time needed to embed them, where N_w = 40×40 is the number of embedded bits and T is the time required to complete the embedding. At a watermark embedding rate of 100%, the capacities for the different carrier types are 689.5 bps for pop music, 576.7 bps for classical music, and 668.9 bps for rock music. Reference [3] embeds 1 bit of watermark per 1600 samples, giving a capacity of only 27.56 bps; reference [4] embeds 8 bits per 512 samples, giving 689.06 bps. The embedding capacity of the invention is therefore far higher than that of the method in [3] and comparable to that of the method in [4].
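As a worked example of this formula: at the reported 689.5 bps for pop music, embedding the N_w = 40×40 = 1600 watermark bits occupies roughly 1600/689.5 ≈ 2.3 s of host audio.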
(3) Watermark robustness analysis
To test the watermark's resistance to common signal processing, the watermarked audio signal was subjected to the following attacks: (1) additive noise: white Gaussian noise added to the digital audio signal in the time domain at an SNR of 20 dB; (2) resampling: downsampling to 22.05 kHz followed by upsampling back to 44.1 kHz; (3) requantization: quantizing the audio from 16 bits to 8 bits and back to 16 bits; (4) MP3 compression: compressing and then decompressing the audio at a bit rate of 128 kbps; (5) low-pass filtering with a cut-off frequency of 8 kHz. The resistance of the proposed method to these operations was compared with that of references [3] and [4], using the same host music (pop) and the same watermark image for all three methods; the NC values measured under the condition that the watermark remains imperceptible are listed in Table 3. The table shows that with a watermark capacity comparable to that of [4], the invention is more robust; and although its capacity is far higher than that of [3], its robustness is no worse than that of [3].
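The attack battery (apart from MP3 recompression, which needs an external codec) and the NC measure can be scripted as below. SciPy is an assumed dependency, and the normalized-correlation formula is the usual one, since the description does not define it.

```python
import numpy as np
from scipy.signal import butter, lfilter, resample_poly

def add_awgn(x, snr_db=20):
    """Additive white Gaussian noise at the given SNR (dB)."""
    noise = np.random.randn(len(x))
    noise *= np.sqrt(np.sum(x ** 2) / np.sum(noise ** 2) / 10 ** (snr_db / 10))
    return x + noise

def resample_attack(x):
    """44.1 kHz -> 22.05 kHz -> 44.1 kHz."""
    return resample_poly(resample_poly(x, 1, 2), 2, 1)

def requantize_attack(x):
    """16-bit -> 8-bit -> 16-bit requantization of samples scaled to [-1, 1]."""
    return np.round(x * 127) / 127

def lowpass_attack(x, fs=44100, cutoff=8000):
    """6th-order Butterworth low-pass filter with an 8 kHz cut-off."""
    b, a = butter(6, cutoff / (fs / 2))
    return lfilter(b, a, x)

def nc(w_orig, w_extracted):
    """Normalized correlation between original and extracted watermark bits."""
    w1 = np.asarray(w_orig, dtype=float).ravel()
    w2 = np.asarray(w_extracted, dtype=float).ravel()
    return float(np.sum(w1 * w2) / np.sqrt(np.sum(w1 ** 2) * np.sum(w2 ** 2)))
```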
Table 2. Robustness of different music types against attacks
Table 3. Robustness of different methods against attacks
The related references are as follows:
1. Johnston J D. Transform coding of audio signals using perceptual noise criteria[J]. IEEE Journal on Selected Areas in Communications, 1988, 6(2): 314-323.
2. Arnold M. Quality evaluation of watermarked audio tracks[J]. Proceedings of SPIE - The International Society for Optical Engineering, 2002, 4675: 91-101.
3. Cai Y M, Guo W Q, Ding H Y. An audio blind watermarking scheme based on DWT-SVD[J]. Journal of Software, 2013, 8(7): 1801-1808.
4. Li Rong, Wang Hongxia, Zhao Pengjun. Adaptive audio digital watermarking algorithm based on a psychoacoustic model[C]. National Conference on Information Hiding and Multimedia Information Security, 2010.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610983877.6A CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610983877.6A CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106504757A true CN106504757A (en) | 2017-03-15 |
Family
ID=58323879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610983877.6A Pending CN106504757A (en) | 2016-11-09 | 2016-11-09 | An Adaptive Audio Blind Watermarking Method Based on Auditory Model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106504757A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875954A (en) * | 2017-03-27 | 2017-06-20 | 中国农业大学 | The speech hiding circuit structure and its control method of a kind of anti vocoder treatment |
CN110163787A (en) * | 2019-04-26 | 2019-08-23 | 江苏信实云安全技术有限公司 | Digital audio Robust Blind Watermarking Scheme embedding grammar based on dual-tree complex wavelet transform |
CN110223273A (en) * | 2019-05-16 | 2019-09-10 | 天津大学 | A kind of image repair evidence collecting method of combination discrete cosine transform and neural network |
WO2020073508A1 (en) * | 2018-10-12 | 2020-04-16 | 平安科技(深圳)有限公司 | Method and device for adding and extracting audio watermark, electronic device and medium |
CN111091841A (en) * | 2019-12-12 | 2020-05-01 | 天津大学 | An audio watermarking algorithm for identity authentication based on deep learning |
CN111292756A (en) * | 2020-01-19 | 2020-06-16 | 成都嗨翻屋科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN111755018A (en) * | 2020-05-14 | 2020-10-09 | 华南理工大学 | Audio Hiding Method and Device Based on Wavelet Transform and Quantized Embedding Key |
CN112364386A (en) * | 2020-10-21 | 2021-02-12 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN113362835A (en) * | 2020-03-05 | 2021-09-07 | 杭州网易云音乐科技有限公司 | Audio watermark processing method and device, electronic equipment and storage medium |
CN113506580A (en) * | 2021-04-28 | 2021-10-15 | 合肥工业大学 | Method and system for audio watermarking resistant to arbitrary cutting and ripping |
CN113704707A (en) * | 2021-08-26 | 2021-11-26 | 湖南天河国云科技有限公司 | Block chain-based audio tamper-proof method and device |
CN113782041A (en) * | 2021-09-14 | 2021-12-10 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio frequency-to-frequency domain |
CN114758660A (en) * | 2022-04-18 | 2022-07-15 | 中国银行股份有限公司 | A kind of bank exclusive audio copyright protection method and device |
CN116052695A (en) * | 2022-10-28 | 2023-05-02 | 陕西师范大学 | A wave operation-based audio auditory cipher method, system and device |
CN116312577A (en) * | 2023-05-22 | 2023-06-23 | 海底鹰深海科技股份有限公司 | Speech processing apparatus, speech encryption method, and speech decryption method |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080215333A1 (en) * | 1996-08-30 | 2008-09-04 | Ahmed Tewfik | Embedding Data in Audio and Detecting Embedded Data in Audio |
US20030223593A1 (en) * | 2002-06-03 | 2003-12-04 | Lopez-Estrada Alex A. | Perceptual normalization of digital audio signals |
CN101038771A (en) * | 2006-03-18 | 2007-09-19 | 辽宁师范大学 | Novel method of digital watermarking for protecting literary property of music works |
CN101122996A (en) * | 2007-09-12 | 2008-02-13 | 北京大学 | Method and device for watermark embedding and extraction of digital image |
CN101271568A (en) * | 2008-05-16 | 2008-09-24 | 山东大学 | Iterative Adaptive Quantization Index Modulation Watermarking Method Based on Vision Model |
CN101308566A (en) * | 2008-06-02 | 2008-11-19 | 西安电子科技大学 | Digital image watermarking method against geometric attack based on contourlet transform |
CN101493928A (en) * | 2009-02-10 | 2009-07-29 | 国网信息通信有限公司 | Digital watermarking embedding, extracting and quantizing step size coordinating factor optimizing method and device |
CN102142255A (en) * | 2010-07-08 | 2011-08-03 | 北京三信时代信息公司 | Method for embedding and extracting digital watermark in audio signal |
CN103208288A (en) * | 2013-03-13 | 2013-07-17 | 漳州职业技术学院 | Dual encryption based discrete wavelet transform-discrete cosine transform (DWT-DCT) domain audio public watermarking algorithm |
EP2787503A1 (en) * | 2013-04-05 | 2014-10-08 | Movym S.r.l. | Method and system of audio signal watermarking |
CN104658541A (en) * | 2013-11-25 | 2015-05-27 | 哈尔滨恒誉名翔科技有限公司 | Digital watermarking system based on discrete wavelet transform |
CN104795071A (en) * | 2015-04-18 | 2015-07-22 | 广东石油化工学院 | Blind audio watermark embedding and watermark extraction processing method |
Non-Patent Citations (4)
Title |
---|
Hwai-Tsu Hu et al.: "Exploiting Psychoacoustic Properties to Achieve ...", 2013 International Conference on Information Science and Applications *
M. Arnold: "Subjective and Objective Quality Evaluation of Watermarked Audio Tracks", Second International Conference on Web Delivering of Music *
王舜 (Wang Shun): "Research on Synchronized Audio Blind Watermarking Based on Wavelet Packet Transform and Auditory Masking", China Master's Theses Full-text Database, Information Science and Technology Series *
祁薇 (Qi Wei): "Digital Audio Watermarking Theory and Research Implementation", China Master's Theses Full-text Database, Information Science and Technology Series *
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875954A (en) * | 2017-03-27 | 2017-06-20 | 中国农业大学 | Speech hiding circuit structure resistant to vocoder processing and its control method |
WO2020073508A1 (en) * | 2018-10-12 | 2020-04-16 | 平安科技(深圳)有限公司 | Method and device for adding and extracting audio watermark, electronic device and medium |
CN110163787A (en) * | 2019-04-26 | 2019-08-23 | 江苏信实云安全技术有限公司 | Digital audio Robust Blind Watermarking Scheme embedding grammar based on dual-tree complex wavelet transform |
CN110163787B (en) * | 2019-04-26 | 2023-02-28 | 江苏水印科技有限公司 | Audio digital robust blind watermark embedding method based on dual-tree complex wavelet transform |
CN110223273A (en) * | 2019-05-16 | 2019-09-10 | 天津大学 | Image repair evidence collecting method combining discrete cosine transform and neural network |
CN110223273B (en) * | 2019-05-16 | 2023-04-07 | 天津大学 | Image restoration evidence obtaining method combining discrete cosine transform and neural network |
CN111091841B (en) * | 2019-12-12 | 2022-09-30 | 天津大学 | Identity authentication audio watermarking algorithm based on deep learning |
CN111091841A (en) * | 2019-12-12 | 2020-05-01 | 天津大学 | An audio watermarking algorithm for identity authentication based on deep learning |
CN111292756A (en) * | 2020-01-19 | 2020-06-16 | 成都嗨翻屋科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN111292756B (en) * | 2020-01-19 | 2023-05-26 | 成都潜在人工智能科技有限公司 | Compression-resistant audio silent watermark embedding and extracting method and system |
CN113362835B (en) * | 2020-03-05 | 2024-06-07 | 杭州网易云音乐科技有限公司 | Audio watermarking method, device, electronic equipment and storage medium |
CN113362835A (en) * | 2020-03-05 | 2021-09-07 | 杭州网易云音乐科技有限公司 | Audio watermark processing method and device, electronic equipment and storage medium |
CN111755018A (en) * | 2020-05-14 | 2020-10-09 | 华南理工大学 | Audio Hiding Method and Device Based on Wavelet Transform and Quantized Embedding Key |
CN111755018B (en) * | 2020-05-14 | 2023-08-22 | 华南理工大学 | Audio hiding method and device based on wavelet transform and quantized embedded key |
CN112364386B (en) * | 2020-10-21 | 2022-04-26 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN112364386A (en) * | 2020-10-21 | 2021-02-12 | 天津大学 | Audio tampering detection and recovery method combining compressed sensing and DWT |
CN113506580A (en) * | 2021-04-28 | 2021-10-15 | 合肥工业大学 | Method and system for audio watermarking resistant to arbitrary cutting and ripping |
CN113506580B (en) * | 2021-04-28 | 2024-05-07 | 合肥工业大学 | Audio watermarking method and system capable of resisting random cutting and transcription |
CN113704707A (en) * | 2021-08-26 | 2021-11-26 | 湖南天河国云科技有限公司 | Block chain-based audio tamper-proof method and device |
CN113782041A (en) * | 2021-09-14 | 2021-12-10 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio frequency-to-frequency domain |
CN113782041B (en) * | 2021-09-14 | 2023-08-15 | 随锐科技集团股份有限公司 | Method for embedding and positioning watermark based on audio variable frequency domain |
CN114758660A (en) * | 2022-04-18 | 2022-07-15 | 中国银行股份有限公司 | Bank-exclusive audio copyright protection method and device |
CN116052695A (en) * | 2022-10-28 | 2023-05-02 | 陕西师范大学 | A wave operation-based audio auditory cipher method, system and device |
CN116312577A (en) * | 2023-05-22 | 2023-06-23 | 海底鹰深海科技股份有限公司 | Speech processing apparatus, speech encryption method, and speech decryption method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106504757A (en) | An Adaptive Audio Blind Watermarking Method Based on Auditory Model | |
Wang et al. | A norm-space, adaptive, and blind audio watermarking algorithm by discrete wavelet transform | |
CN101271690B (en) | Audio spread-spectrum watermark processing method for protecting audio data | |
Bhat K et al. | A new audio watermarking scheme based on singular value decomposition and quantization | |
Elshazly et al. | Secure and robust high quality DWT domain audio watermarking algorithm with binary image | |
Dhar et al. | Digital watermarking scheme based on fast Fourier transformation for audio copyright protection | |
Chauhan et al. | A survey: Digital audio watermarking techniques and applications | |
Attari et al. | Robust audio watermarking algorithm based on DWT using Fibonacci numbers | |
Dhar et al. | A new audio watermarking system using discrete fourier transform for copyright protection | |
Gopalan | A unified audio and image steganography by spectrum modification | |
Zhang et al. | Robust Audio Watermarking Based on Extended Improved Spread Spectrum with Perceptual Masking. | |
Avci et al. | A new information hiding method for audio signals | |
Attari et al. | Robust and transparent audio watermarking based on spread spectrum in wavelet domain | |
Dhar et al. | Audio watermarking in transform domain based on singular value decomposition and quantization | |
Liu et al. | Adaptive audio steganography scheme based on wavelet packet energy | |
Khalil et al. | Informed audio watermarking based on adaptive carrier modulation | |
Attari et al. | Robust and blind audio watermarking in wavelet domain | |
Li et al. | Analysis on unit maximum capacity of orthogonal multiple watermarking for multimedia signals in B5G wireless communications | |
El-Khamy et al. | Chaos-based image hiding scheme between silent intervals of high quality audio signals using feature extraction and image bits spreading | |
Sharma et al. | Robust image watermarking technique using contourlet transform and optimized edge detection algorithm | |
Patil et al. | Performance evaluation of digital audio watermarking based on discrete wavelet transform for ownership protection | |
Li et al. | A novel audio watermarking in wavelet domain | |
Patil et al. | Audio watermarking: A way to copyright protection | |
Dhavale et al. | Lossless audio watermarking based on the alpha statistic modulation | |
Zhao et al. | Quantization index modulation audio watermarking system using a psychoacoustic model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-03-15