Disclosure of Invention
The present invention is directed to a method for steganalysis of small samples based on feature enhancement and sample expansion, which overcomes the above-mentioned drawbacks of the prior art.
The aim of the invention can be achieved by the following technical scheme:
The invention provides a few-sample steganalysis method based on feature enhancement and sample expansion, which comprises the following steps:
Acquiring a steganographic image set generated by an unknown steganographic algorithm;
Extracting features of each steganographic image in the steganographic image set to obtain a plurality of steganographic noise feature images corresponding to each steganographic image;
screening the steganographic noise feature maps of each steganographic image with a normalized feature saliency function to obtain the salient feature maps of each steganographic image;
Inputting the saliency feature images into a pre-trained variational self-encoder model to generate an embedding probability image of each steganographic image in a steganographic image set;
performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps to generate pseudo-stego samples;
Fine-tuning the pre-trained steganalysis model on the pseudo-stego samples combined with the steganographic image set to obtain a target steganalysis network;
Inputting a test sample from the unknown set into the target steganalysis network, and judging from the output label whether the test sample hides secret information.
Further, the feature extraction for each steganographic image in the steganographic image set specifically includes:
Extracting high-frequency noise features from the steganographic image through a spatial rich model (SRM); specifically, convolving the steganographic image with a plurality of high-pass filter kernels to obtain residual feature maps of the image, according to the formula:
S_(i,j) = X_i ∗ s_j
wherein S_(i,j) represents the residual feature map extracted by the spatial rich model (SRM) method, obtained by convolving the i-th steganographic image X_i with the j-th high-pass filter kernel s_j, and ∗ denotes the convolution operation;
Convolving the steganographic image with Gabor filters of different directions and scales to extract texture and edge features of the steganographic image, according to the formula:
G_(i,j) = X_i ∗ g_j
wherein G_(i,j) represents the texture and edge feature map extracted by the Gabor filters, obtained by convolving the i-th steganographic image X_i with the j-th Gabor filter kernel g_j;
The residual feature maps S_(i,j) and the texture and edge feature maps G_(i,j), extracted by the SRM and Gabor filters respectively, collectively serve as the steganographic noise feature maps of steganographic image i.
Further, the filtering the steganographic noise feature map of each steganographic image by using the normalized feature saliency function specifically includes:
Calculating the representativeness of each steganographic noise feature map with the normalized feature saliency function A_NES(x), built from the following components:
G_M = (1/(H·W)) Σ sqrt(Gx² + Gy²), C_rms = sqrt((1/(H·W)) Σ_(i,j) (I_ij − Ī)²), H(p) = −Σ_(i=1..n) p_i log p_i
wherein G_M represents the average gradient magnitude of the pixels of the steganographic noise feature map computed with the Sobel operator, H and W are the height and width of the feature map, Gx and Gy are the gradient approximations in the horizontal and vertical directions, C_rms represents the root-mean-square contrast of the feature map, I_ij is the gray value of pixel (i, j) and Ī the mean gray value of the feature map, H(p) represents the image entropy, used to measure the complexity of the image, p_i is the frequency of the i-th gray level and n the number of gray levels, Norm represents the min–max normalization process, and A_NES(x) is the normalized feature saliency value of the steganographic noise feature map, obtained by combining the normalized components;
According to the normalized feature saliency value A_NES(x), steganographic noise feature maps whose saliency value exceeds a preset threshold ε are selected as the salient feature maps of the steganographic image, according to the formula:
x̂_(i,j) = x_(i,j), if A_NES(x_(i,j)) > ε
wherein x̂_(i,j) represents the j-th salient feature map of the i-th steganographic image retained after normalized-feature-saliency threshold screening, x_(i,j) is the j-th steganographic noise feature map of steganographic image i, and A_NES(x_(i,j)) is the normalized feature saliency value of x_(i,j).
Further, the variational autoencoder model comprises an encoder E(x, z) and a decoder D(z, x); the encoder maps an input salient feature map x̂_(i,j) to a latent variable z in a low-dimensional latent space, and the decoder reconstructs the original image from z. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ².
Further, inputting the salient feature maps into the pre-trained variational autoencoder model to generate the embedding probability map of each steganographic image in the steganographic image set specifically includes:
inputting all salient feature maps x̂_(i,j) of steganographic image i into the pre-trained variational autoencoder model, whose encoder outputs, for each salient feature map x̂_(i,j), the corresponding mean vector μ and natural logarithm of the variance, logvar = ln σ²;
sampling noise from the standard normal distribution by the reparameterization trick, based on the mean vector μ and the log-variance logvar = ln σ², to obtain the latent space vector z_(i,j);
calculating, from the latent space vectors z_(i,j) of all salient feature maps of steganographic image i, the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i;
based on the calculated Gaussian mean μ_i and covariance matrix Σ_i of steganographic image i, generating R latent variables z_(i,r) of steganographic image i by a uniform sampling method, inputting each latent variable z_(i,r) into the decoder of the variational autoencoder model, and generating the embedding probability map P_(i,r) corresponding to z_(i,r), according to the formula:
P_(i,r) = Decoder(z_(i,r))
wherein Decoder represents the decoder operation of the variational autoencoder model.
Further, the generation formula of the hidden space vector z (i,j) is as follows:
z_(i,j) = μ + eps × σ, so that z_(i,j) ~ N(μ, σ²)
wherein μ is the mean vector output by the encoder of the variational autoencoder model, σ = exp(logvar/2) is the standard deviation, i.e., the exponential of half the log-variance, eps is noise sampled from the standard normal distribution N(0, I), and z_(i,j) is the latent space vector extracted from the salient feature map x̂_(i,j) by the encoder of the variational autoencoder model.
Further, calculating the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i from the latent space vectors z_(i,j) of all its salient feature maps proceeds according to:
μ_i = (1/J) Σ_(j=1..J) z_(i,j), Σ_i = (1/J) Σ_(j=1..J) (z_(i,j) − μ_i)(z_(i,j) − μ_i)ᵀ
wherein z_(i,j) is the latent space vector mapped by the encoder from the j-th salient feature map x̂_(i,j) of the i-th steganographic image, μ_i is the Gaussian mean over all salient feature maps of the i-th steganographic image, Σ_i is the covariance matrix of the i-th steganographic image, and J is the number of salient feature maps of the i-th steganographic image.
Further, generating the R latent variables z_(i,r) of steganographic image i by uniform sampling from the calculated Gaussian mean μ_i and covariance matrix Σ_i specifically includes:
taking the Gaussian mean μ_i of steganographic image i and the covariance matrix Σ_i to set the upper and lower limits of a uniform distribution, and generating R latent space vectors z_(i,r), where r denotes the r-th generated latent variable, according to the formula:
z_(i,r) ~ U( μ_i − Φ⁻¹(α)·σ_i , μ_i + Φ⁻¹(α)·σ_i ), with σ_i = sqrt(diag(Σ_i))
wherein Φ⁻¹ represents the inverse cumulative distribution function of the standard normal distribution, α is a probability threshold used to control the sampling range of the generated latent variables, and U represents the uniform distribution.
Further, performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps specifically includes:
inputting each embedding probability map P_(i,r) into the Otsu thresholding method to obtain the segmentation threshold T_(i,r), and determining the segmentation result P̂_(i,r) by the following formula:
P̂_(i,r)(x, y) = 0 if P_(i,r)(x, y) < T_(i,r), and 1 otherwise
wherein P_(i,r)(x, y) represents the pixel value at position (x, y) of the r-th embedding probability map of the i-th steganographic image, T_(i,r) is the segmentation threshold computed by the Otsu thresholding method, and P̂_(i,r)(x, y) is the binary pixel value of the segmented map at position (x, y), equal to 0 if the original pixel value is below the threshold and 1 otherwise;
generating a random number matrix W = (W(x, y))_(H×W) satisfying W(x, y) ~ U(0, 1), and performing the embedding modification based on the segmentation result P̂_(i,r) and the random number matrix W according to the following rule:
wherein p₊₁(x, y) and p₋₁(x, y) represent the probabilities of the pixel at position (x, y) being modified to +1 and −1 respectively, N_±1 represents the number of pixels in the image modified to +1 or −1, Σ P̂_(i,r)(x, y) is the sum of all pixel values of the segmented probability map, and H and W are the height and width of the image respectively;
the random number matrix W is adjusted by the following embedding mapping rule to obtain the embedding modification matrix M_(i,r)(x, y):
wherein M_(i,r)(x, y) is the value at position (x, y) of the r-th embedding modification map of the i-th steganographic image, and W(x, y) is the random value at position (x, y) in the random number matrix W;
adding the embedding modification matrix M_(i,r) to the non-steganographic image X_us to obtain the pseudo-stego sample X′_us, according to the formula:
X′_us = X_us + M_(i,r)
wherein X_us is the non-steganographic (cover) image and X′_us is the pseudo-stego sample image after the embedding modification.
Further, the loss function of the variational autoencoder model is:
L_VAE = L_rec + L_KL + λ·R_cluster
wherein L_VAE is the loss function of the variational autoencoder model; the first term L_rec is the reconstruction loss, the mean square error between the input salient feature map x̂_(i,j) and the reconstructed output of the decoder; the second term L_KL is the Kullback–Leibler divergence loss, which measures the discrepancy between the posterior distribution q(z|x) ~ N(μ, σ²) and the standard normal prior p(z) ~ N(0, I); the third term R_cluster is a clustering regularization term that encourages the latent variables to form clearer cluster structures in the latent space; μ and σ are the mean and standard deviation output by the encoder; λ is a hyperparameter; and tr(Σ_i), the trace of the covariance matrix, represents the total variance of the latent variables.
Compared with the prior art, the invention has the following advantages:
(1) By generating embedding probability maps and combining the pseudo-stego samples with the original steganographic image set, the invention strengthens the model's ability to learn under few-sample conditions. The method effectively overcomes the challenge of limited steganographic data, enhances the generalization capability of the steganalysis model, and significantly improves adaptability and detection accuracy in small-sample scenarios.
(2) The invention extracts high-frequency noise, texture, and edge features from a steganographic image through a spatial rich model (SRM) and Gabor filters to generate the steganographic noise feature maps of the image. This effectively captures the subtle changes introduced into a steganographic image and improves the recognition of steganographic noise. The salient feature maps are then screened out with the normalized feature saliency function, further improving the distinguishability of the image features.
(3) The invention adopts a pre-trained variational autoencoder model: the encoder maps the salient feature maps to latent space vectors in a low-dimensional latent space, and embedding probability maps are generated from these latent vectors. This not only improves the image reconstruction capability but also strengthens the model's recognition of steganographic images, so that the model maintains high accuracy even when the steganographic algorithm is unknown.
(4) The invention effectively enhances steganalysis capability by fine-tuning a pre-trained steganalysis model when only a small number of steganographic image samples are available. Training on the pseudo-stego samples together with the salient feature maps greatly reduces the dependence on large amounts of labeled data and improves applicability and flexibility in practical applications.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
Example 1:
This embodiment improves the detection capability of an existing steganalysis model against an unknown steganographic algorithm, despite scarce samples of that algorithm, through stego-sample augmentation and hierarchical training of the model. As shown in FIG. 1, for a 2-way K-shot few-sample steganalysis task, assume there are K pairs of cover and steganographic images produced by an unknown steganographic algorithm. First, high-frequency stego-noise maps are extracted from these sample pairs using diversified high-pass filter kernels. Based on these noise maps, a generative model is constructed and trained, and a large number of diversified pseudo stego-noise maps are generated through a series of operations. Then the pseudo-stego samples and the real samples are combined to fine-tune the pre-trained steganalysis model, yielding the target steganalysis network. Finally, a test sample from the unknown set is input into the target steganalysis network, and the output label indicates whether the test sample hides secret information.
The technical problem addressed by this embodiment is that, in practical steganalysis tasks, the detection capability of existing deep learning models degrades because the steganographic algorithm is unknown and labeled samples are scarce. It is therefore an object of the present invention to optimize and improve the detection capability of the model by synthesizing pseudo-stego samples and combining feature enhancement techniques with a few-shot learning strategy.
The embodiment provides a few-sample steganalysis method based on feature enhancement and sample expansion. As shown in FIG. 2, high-frequency noise features in an image are captured with SRM and Gabor filter kernels, the most representative feature maps are screened out with the normalized feature saliency index to train the VAE model, and pseudo-stego samples are generated. The pre-trained model is then adjusted at coarse and fine granularity together with the true stego samples to optimize model performance. In addition, the invention is plug-and-play and highly general: the structure of the steganalysis network need not be changed. The method greatly improves the detection accuracy against an unknown steganographic algorithm with only a small amount of labeled data, and also performs well in detecting adversarial steganographic samples. It specifically comprises the following steps:
Acquiring a steganographic image set generated by an unknown steganographic algorithm;
Extracting features of each steganographic image in the steganographic image set to obtain a plurality of steganographic noise feature images corresponding to each steganographic image;
screening the steganographic noise feature maps of each steganographic image with a normalized feature saliency function to obtain the salient feature maps of each steganographic image;
Inputting the saliency feature images into a pre-trained variational self-encoder model to generate an embedding probability image of each steganographic image in a steganographic image set;
performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps to generate pseudo-stego samples;
Fine-tuning the pre-trained steganalysis model on the pseudo-stego samples combined with the steganographic image set to obtain a target steganalysis network;
Inputting a test sample from the unknown set into the target steganalysis network, and judging from the output label whether the test sample hides secret information.
Further, the feature extraction for each steganographic image in the steganographic image set specifically includes:
Extracting high-frequency noise features from the steganographic image through a spatial rich model (SRM); specifically, convolving the steganographic image with a plurality of high-pass filter kernels to obtain residual feature maps of the image, according to the formula:
S_(i,j) = X_i ∗ s_j
wherein S_(i,j) represents the residual feature map extracted by the spatial rich model (SRM) method, obtained by convolving the i-th steganographic image X_i with the j-th high-pass filter kernel s_j, and ∗ denotes the convolution operation;
Convolving the steganographic image with Gabor filters of different directions and scales to extract texture and edge features of the steganographic image, according to the formula:
G_(i,j) = X_i ∗ g_j
wherein G_(i,j) represents the texture and edge feature map extracted by the Gabor filters, obtained by convolving the i-th steganographic image X_i with the j-th Gabor filter kernel g_j;
The residual feature maps S_(i,j) and the texture and edge feature maps G_(i,j), extracted by the SRM and Gabor filters respectively, collectively serve as the steganographic noise feature maps of steganographic image i.
In steganography, embedding secret information can be viewed as adding very weak noise to the cover image; the modification perturbs the image only slightly at the pixel level. Unlike methods that model the image content directly, the spatial rich model (SRM) focuses on analyzing the noise components (i.e., noise residuals) of the image. Since prediction errors among local pixels reflect neighborhood correlations, the SRM extracts various types of features through multiple sub-models, better describing how steganographic operations disturb the various local correlations among pixels. The SRM method extracts spatial feature information of an image by building different high-pass filter kernels and computing residuals. These residuals are then truncated and quantized, and steganalysis features are computed through co-occurrence matrices. The resulting co-occurrence matrices are grouped into 7 classes, and FIG. 3 shows 30 SRM filter kernels covering these 7 classes. These high-pass filter kernels focus on extracting the embedding artifacts introduced by steganography, enabling richer steganographic features. Accordingly, m of the above filter kernels are selected, all kernel sizes are unified to (5, 5) by zero-padding, and each is convolved with the steganographic image X_i generated by the i-th unknown steganographic algorithm.
In image processing, 2D Gabor filters are commonly used for texture analysis and are particularly suited to detecting content of a specific direction and frequency in an image. By choosing a specific Gabor function, Gabor filters for multi-scale, multi-directional feature extraction can be designed. FIG. 4 visualizes 2D Gabor filter kernels of size (8, 8) with different parameters. We set four direction parameters (i.e., θ ∈ {0, π/4, π/2, 3π/4}), set the scale parameter σ to 0.5, 0.6, 0.7, and 0.8, and specify the phase offset parameter accordingly. Furthermore, we subtract the kernel mean from each 2D Gabor filter element so that the filter has zero mean. Similarly, to obtain diversified Gabor-rich features, we generate n Gabor filter kernels and convolve each with the steganographic image X_i generated by the unknown steganographic algorithm.
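The SRM and Gabor filtering described above can be sketched in NumPy as follows. The kernel shown is the well-known 5×5 "KV" SRM residual kernel (one of the 30 kernels mentioned); the Gabor parameterization (wavelength lam, aspect ratio 1) is an assumption for illustration, and conv2d_same is a plain stand-in for a library convolution:

```python
import numpy as np

# One classic 5x5 SRM high-pass residual kernel (the "KV" kernel); the
# method selects m of the 30 kernels and zero-pads smaller ones to (5, 5).
KV = np.array([[-1,  2,  -2,  2, -1],
               [ 2, -6,   8, -6,  2],
               [-2,  8, -12,  8, -2],
               [ 2, -6,   8, -6,  2],
               [-1,  2,  -2,  2, -1]], dtype=float) / 12.0

def gabor_kernel(sigma, theta, lam=2.0, psi=0.0, size=8):
    """Zero-mean 2D Gabor kernel (wavelength lam and aspect ratio 1 are
    assumed values; the text fixes size (8, 8) and subtracts the mean)."""
    y, x = np.mgrid[0:size, 0:size] - size / 2.0 + 0.5
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = (np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
         * np.cos(2.0 * np.pi * xr / lam + psi))
    return g - g.mean()  # zero the filter mean, as in the text

def conv2d_same(img, k):
    """Same-size 2D convolution with zero padding (pure-NumPy stand-in)."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    kf = k[::-1, ::-1]  # flip kernel for true convolution
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kf)
    return out
```

Because every SRM kernel sums to zero, flat image regions produce zero residual, which is why the residual maps isolate the weak stego noise from the image content.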
Further, the filtering the steganographic noise feature map of each steganographic image by using the normalized feature saliency function specifically includes:
Calculating the representativeness of each steganographic noise feature map with the normalized feature saliency function A_NES(x), built from the following components:
G_M = (1/(H·W)) Σ sqrt(Gx² + Gy²), C_rms = sqrt((1/(H·W)) Σ_(i,j) (I_ij − Ī)²), H(p) = −Σ_(i=1..n) p_i log p_i
wherein G_M represents the average gradient magnitude of the pixels of the steganographic noise feature map computed with the Sobel operator, H and W are the height and width of the feature map, Gx and Gy are the gradient approximations in the horizontal and vertical directions, C_rms represents the root-mean-square contrast of the feature map, I_ij is the gray value of pixel (i, j) and Ī the mean gray value of the feature map, H(p) represents the image entropy, used to measure the complexity of the image, p_i is the frequency of the i-th gray level and n the number of gray levels, Norm represents the min–max normalization process, and A_NES(x) is the normalized feature saliency value of the steganographic noise feature map, obtained by combining the normalized components;
According to the normalized feature saliency value A_NES(x), steganographic noise feature maps whose saliency value exceeds a preset threshold ε are selected as the salient feature maps of the steganographic image, according to the formula:
x̂_(i,j) = x_(i,j), if A_NES(x_(i,j)) > ε
wherein x̂_(i,j) represents the j-th salient feature map of the i-th steganographic image retained after normalized-feature-saliency threshold screening, x_(i,j) is the j-th steganographic noise feature map of steganographic image i, and A_NES(x_(i,j)) is the normalized feature saliency value of x_(i,j).
This embodiment proposes training and guiding a variational autoencoder (VAE) with representative samples to generate more such samples. These representative samples carry the key features of certain classes of steganographic images more prominently, which helps construct the latent-space prototypes of such steganographic features more accurately. Because the steganographic features differ considerably between the stego-noise maps in the stego feature set, we introduce the normalized feature saliency function A_NES(x) to compute the representativeness of each sample and pick out the most representative ones. After the normalized feature saliency value of each class of images is obtained, a threshold can be set to filter out samples with insignificant steganographic features, yielding a set of samples with strongly representative features. This method not only identifies and selects the most representative samples effectively but also simulates the feature distribution of steganographic images in the latent space more accurately.
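A minimal sketch of the saliency screening described above. The three components (gradient magnitude, RMS contrast, entropy) follow the definitions in the text; how they are aggregated into A_NES is not fully specified, so the equal-weight average of the min–max-normalized components used here is an assumption, and nes_components / select_salient are illustrative helper names:

```python
import numpy as np

def nes_components(fmap):
    """Raw saliency components of one noise feature map: average Sobel
    gradient magnitude G_M, RMS contrast C_rms, and gray-level entropy H."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = fmap.shape
    gm = 0.0
    for i in range(1, h - 1):            # interior pixels only, for brevity
        for j in range(1, w - 1):
            win = fmap[i - 1:i + 2, j - 1:j + 2]
            gm += np.hypot(np.sum(win * kx), np.sum(win * ky))
    gm /= h * w
    c_rms = np.sqrt(np.mean((fmap - fmap.mean()) ** 2))
    _, counts = np.unique(np.round(fmap).astype(int), return_counts=True)
    p = counts / counts.sum()
    entropy = float(-np.sum(p * np.log2(p)))
    return gm, c_rms, entropy

def select_salient(fmaps, eps):
    """Min-max normalize each component across the maps, average them into
    an A_NES score (equal weighting is an assumption), and keep the maps
    whose score exceeds the threshold eps."""
    comp = np.array([nes_components(f) for f in fmaps])
    rng = comp.max(axis=0) - comp.min(axis=0)
    rng[rng == 0] = 1.0                  # guard constant columns
    scores = ((comp - comp.min(axis=0)) / rng).mean(axis=1)
    return [f for f, s in zip(fmaps, scores) if s > eps], scores
```

A flat residual map scores zero on all three components, so it is discarded, while a textured map with spread-out gray levels survives the threshold.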
Further, the variational autoencoder model comprises an encoder E(x, z) and a decoder D(z, x); the encoder maps an input salient feature map x̂_(i,j) to a latent variable z in a low-dimensional latent space, and the decoder reconstructs the original image from z. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ².
Further, inputting the salient feature maps into the pre-trained variational autoencoder model to generate the embedding probability map of each steganographic image in the steganographic image set specifically includes:
inputting all salient feature maps x̂_(i,j) of steganographic image i into the pre-trained variational autoencoder model, whose encoder outputs, for each salient feature map x̂_(i,j), the corresponding mean vector μ and natural logarithm of the variance, logvar = ln σ²;
sampling noise from the standard normal distribution by the reparameterization trick, based on the mean vector μ and the log-variance logvar = ln σ², to obtain the latent space vector z_(i,j);
calculating, from the latent space vectors z_(i,j) of all salient feature maps of steganographic image i, the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i;
based on the calculated Gaussian mean μ_i and covariance matrix Σ_i of steganographic image i, generating R latent variables z_(i,r) of steganographic image i by a uniform sampling method, inputting each latent variable z_(i,r) into the decoder of the variational autoencoder model, and generating the embedding probability map P_(i,r) corresponding to z_(i,r), according to the formula:
P_(i,r) = Decoder(z_(i,r))
wherein Decoder represents the decoder operation of the variational autoencoder model.
Through the above steps, a rich set of steganographic noise feature maps is obtained, and representative samples are screened out using the steganographic feature saliency index. These samples enable us to train a probability-map generation model that produces a number of embedding probability maps for each steganographic image in the unknown set. However, because the sample size is limited, directly training GANs or diffusion models is difficult, so we adopt a simple and efficient architecture based on a variational autoencoder (VAE) to design the probability-map generation network.
As shown in FIG. 5, the VAE architecture comprises an encoder E(x, z) and a decoder D(z, x): the encoder maps an input two-dimensional image to a latent variable z in the low-dimensional latent space, and the decoder reconstructs the original image. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ². Since the latent variables are random, sampling directly from the probability distribution would prevent efficient gradient computation by backpropagation; instead, noise eps is sampled from a standard normal distribution and the reparameterization method is applied, so the sampled latent variable can be expressed as z = μ + eps × σ ~ N(μ, σ²). In the decoder, the latent variable passes through two fully connected layers and a batch-normalization layer with leaky ReLU activations, is then upsampled by three residual modules and a transposed-convolution module, and a three-channel tensor is output. Finally, a sigmoid function is applied and the image is resized to the input size by bilinear interpolation.
Further, the generation formula of the hidden space vector z (i,j) is as follows:
z_(i,j) = μ + eps × σ, so that z_(i,j) ~ N(μ, σ²)
wherein μ is the mean vector output by the encoder of the variational autoencoder model, σ = exp(logvar/2) is the standard deviation, i.e., the exponential of half the log-variance, eps is noise sampled from the standard normal distribution N(0, I), and z_(i,j) is the latent space vector extracted from the salient feature map x̂_(i,j) by the encoder of the variational autoencoder model.
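The reparameterized sampling above can be sketched in a few lines of NumPy; the function name is illustrative:

```python
import numpy as np

def reparameterize(mu, logvar, rng=None):
    """z = mu + eps * sigma with eps ~ N(0, I) and sigma = exp(logvar / 2).
    Sampling eps outside the network keeps the path from (mu, logvar) to z
    deterministic, which is what makes it differentiable for training."""
    rng = rng or np.random.default_rng(0)
    sigma = np.exp(0.5 * np.asarray(logvar, dtype=float))
    eps = rng.standard_normal(np.shape(mu))
    return np.asarray(mu, dtype=float) + eps * sigma
```

Repeated draws recover the intended N(μ, σ²) statistics, which is easy to verify empirically.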
Further, calculating the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i from the latent space vectors z_(i,j) of all its salient feature maps proceeds according to:
μ_i = (1/J) Σ_(j=1..J) z_(i,j), Σ_i = (1/J) Σ_(j=1..J) (z_(i,j) − μ_i)(z_(i,j) − μ_i)ᵀ
wherein z_(i,j) is the latent space vector mapped by the encoder from the j-th salient feature map x̂_(i,j) of the i-th steganographic image, μ_i is the Gaussian mean over all salient feature maps of the i-th steganographic image, Σ_i is the covariance matrix of the i-th steganographic image, and J is the number of salient feature maps of the i-th steganographic image.
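The per-image Gaussian prototype reduces to a sample mean and a biased (1/J) covariance over the J latent vectors; a minimal NumPy sketch with an illustrative function name:

```python
import numpy as np

def latent_gaussian(z_list):
    """Gaussian prototype of one steganographic image: the mean over its J
    latent vectors and the (1/J, biased) covariance matrix, matching the
    formulas above."""
    z = np.asarray(z_list, dtype=float)      # shape (J, d)
    mu_i = z.mean(axis=0)
    centered = z - mu_i
    sigma_i = centered.T @ centered / z.shape[0]  # (d, d) covariance
    return mu_i, sigma_i
```

Note the 1/J normalization (rather than NumPy's default 1/(J−1) in np.cov) to match the formula stated above.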
Further, generating the R latent variables z_(i,r) of steganographic image i by uniform sampling from the calculated Gaussian mean μ_i and covariance matrix Σ_i specifically includes:
To ensure that the generated probability maps stay close to the representative latent-space prototype, a probability threshold α is set, and the range of random sampling is controlled by adjusting this threshold. To obtain R probability maps of a given class, R latent vectors z_(i,r) must be generated from the latent space, where R is a constant.
taking the Gaussian mean μ_i of steganographic image i and the covariance matrix Σ_i to set the upper and lower limits of a uniform distribution, and generating R latent space vectors z_(i,r), where r denotes the r-th generated latent variable, according to the formula:
z_(i,r) ~ U( μ_i − Φ⁻¹(α)·σ_i , μ_i + Φ⁻¹(α)·σ_i ), with σ_i = sqrt(diag(Σ_i))
wherein Φ⁻¹ represents the inverse cumulative distribution function of the standard normal distribution, α is a probability threshold used to control the sampling range of the generated latent variables, and U represents the uniform distribution.
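A sketch of this bounded uniform sampling, using the standard library's NormalDist for Φ⁻¹. Treating the covariance through its diagonal only (a per-dimension box around μ_i) is an assumption of this sketch, and the function name is illustrative:

```python
import numpy as np
from statistics import NormalDist

def sample_latents(mu_i, sigma_i, R, alpha, rng=None):
    """Draw R latent vectors from the per-dimension uniform box
    [mu - q*s, mu + q*s], with q = Phi^{-1}(alpha) and s = sqrt(diag(Sigma)).
    Using only the diagonal of the covariance is an assumption here."""
    rng = rng or np.random.default_rng(0)
    q = NormalDist().inv_cdf(alpha)                 # Phi^{-1}(alpha)
    s = np.sqrt(np.diag(np.asarray(sigma_i, dtype=float)))
    mu_i = np.asarray(mu_i, dtype=float)
    return rng.uniform(mu_i - q * s, mu_i + q * s, size=(R, mu_i.size))
```

Raising α widens the sampling box, trading diversity of the generated probability maps against closeness to the latent prototype.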
Further, performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps specifically includes:
inputting each embedding probability map P_(i,r) into the Otsu thresholding method to obtain the segmentation threshold T_(i,r), and determining the segmentation result P̂_(i,r) by the following formula:
P̂_(i,r)(x, y) = 0 if P_(i,r)(x, y) < T_(i,r), and 1 otherwise
wherein P_(i,r)(x, y) represents the pixel value at position (x, y) of the r-th embedding probability map of the i-th steganographic image, T_(i,r) is the segmentation threshold computed by the Otsu thresholding method, and P̂_(i,r)(x, y) is the binary pixel value of the segmented map at position (x, y), equal to 0 if the original pixel value is below the threshold and 1 otherwise;
generating a random number matrix W = (W(x, y))_(H×W) satisfying W(x, y) ~ U(0, 1), and performing the embedding modification based on the segmentation result P̂_(i,r) and the random number matrix W according to the following rule:
wherein p₊₁(x, y) and p₋₁(x, y) represent the probabilities of the pixel at position (x, y) being modified to +1 and −1 respectively, N_±1 represents the number of pixels in the image modified to +1 or −1, Σ P̂_(i,r)(x, y) is the sum of all pixel values of the segmented probability map, and H and W are the height and width of the image respectively;
the random number matrix W is adjusted by the following embedding mapping rules to obtain an embedding modification matrix M (i,r) (x, y):
wherein M^(i,r)(x, y) is the value of the r-th embedding modification image of the i-th steganographic image at position (x, y), and w(x, y) is the random value at position (x, y) in the random number matrix W;
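The mapping rule itself is an image in the original patent, so the sketch below assumes the common ternary embedding-simulator convention: within the segmented region, a pixel is modified to -1 when w(x, y) falls below half its embedding probability and to +1 when w(x, y) exceeds one minus that half, which keeps N_{+1} and N_{-1} approximately equal, as the text requires.

```python
import random

def embedding_modification(seg_map, prob_map, seed=None):
    """Build an embedding modification matrix M from a binary segmentation
    map and the embedding probability map, using a random matrix
    W = (w(x, y)) with w(x, y) ~ U(0, 1).

    Assumption: the common ternary-simulator rule (modify to -1 if
    w < p/2, to +1 if w > 1 - p/2, else leave unchanged), since the
    patent's exact mapping rule is not reproduced in this excerpt.
    """
    rng = random.Random(seed)
    H, Wd = len(seg_map), len(seg_map[0])
    M = [[0] * Wd for _ in range(H)]
    for x in range(H):
        for y in range(Wd):
            if seg_map[x][y] == 0:
                continue                  # pixel outside the embedding region
            p = prob_map[x][y]            # embedding probability at (x, y)
            w = rng.random()              # w(x, y) ~ U(0, 1)
            if w < p / 2:
                M[x][y] = -1
            elif w > 1 - p / 2:
                M[x][y] = +1
    return M

def apply_modification(cover, M):
    """Pseudo-secret-containing sample: X'_us = X_us + M, clipped to 8 bits."""
    return [[max(0, min(255, c + m)) for c, m in zip(crow, mrow)]
            for crow, mrow in zip(cover, M)]
```

Clipping to [0, 255] is an added safeguard for boundary pixel values; the patent's formula states the addition only.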
Adding the embedding modification matrix M^(i,r) to the non-steganographic image X_us to obtain the pseudo-secret-containing sample X'_us, where the formula is:
X'_us = X_us + M^(i,r)
wherein X_us is the non-steganographic image, and X'_us is the pseudo-secret-containing sample image after embedding modification.
After obtaining the embedding probability map through the above steps, to implement the embedding modification, we first apply Otsu thresholding to divide the image into two regions according to the threshold. After the segmentation threshold T for each probability map is determined, an STC embedding simulator is used to generate the embedding modification image. This step first requires creating a random number matrix W = (w(x, y))_{H×W} satisfying w(x, y) ~ U(0, 1). To ensure that the impact of embedding modifications on the image statistics is minimized, the numbers of pixels modified to +1 and to -1 are generally required to be approximately equal. The random number matrix is then adjusted according to the embedding mapping rule to obtain M^(i,r) and complete the embedding modification process. Finally, a pseudo-secret-containing sample is obtained through X'_us = X_us + M^(i,r), and the pseudo-secret-containing samples together with the unknown-set cover images form positive and negative sample pairs for fine-tuning the pre-trained steganalysis model.
Further, we assign the secret-containing feature maps corresponding to each unknown-set steganographic image to one class (i.e., k-shot is divided into k classes), and the VAE loss function used for training on the j-th feature map of the i-th class can be expressed as:
wherein the total expression is the loss function of the variational auto-encoder model; the first term is the reconstruction loss, measuring the mean square error between the input saliency feature map and the decoder's reconstructed output; the second term is the Kullback-Leibler divergence loss, measuring the divergence between the posterior distribution and the standard normal prior p(z) ~ N(0, I); and the third term R_cluster is a cluster regularization term that, unlike the standard normal prior, encourages the latent variables to form clearer cluster structures in the latent space. μ and σ are the mean and standard deviation output by the encoder, λ is a hyper-parameter weighting the regularization term, and tr(Σ_i) is the trace of the covariance matrix, representing the total variance of the latent variables.
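The three-term loss can be sketched for a single sample with a diagonal-Gaussian latent. The reconstruction and KL terms below use their standard closed forms; the cluster regularization term is assumed here to be the total latent variance tr(Σ) = Σ_d σ_d², since its exact expression is given as an image in the original patent.

```python
import math

def vae_loss(x, x_hat, mu, sigma, lam=1.0):
    """Sketch of the three-term VAE loss described above (one sample,
    diagonal-Gaussian latent). lam weights the cluster regularizer."""
    # Reconstruction: mean square error between input feature map and output
    n = len(x)
    l_recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / n

    # KL divergence between N(mu, diag(sigma^2)) and the prior N(0, I),
    # standard closed form: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2)
    l_kl = 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

    # Cluster regularization (assumed form): tr(Sigma) = sum(sigma_d^2)
    r_cluster = sum(s * s for s in sigma)

    return l_recon + l_kl + lam * r_cluster
```

With perfect reconstruction and a latent matching the prior (μ = 0, σ = 1), the first two terms vanish and only the λ-weighted regularizer remains, which is what pushes the posterior tighter than the standard VAE would.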
Example 2:
The datasets used in this example are derived from BOSSbase v1.01, BOWS2, and ALASKA#2. First, we determined the best value of the normalized feature significance threshold ε and demonstrated the effectiveness and broad applicability of the method by conducting experiments on a variety of baseline steganalysis models. Furthermore, we studied the effect of variations in the number of support-set samples on the model's performance in detecting unknown samples. Considering that adversarial samples can mislead the steganalyzer through specific embedding techniques, thus enhancing the security of steganography, we also evaluated the effectiveness of the proposed method in detecting these adversarial steganography algorithms. Experimental results show that the method provides stable performance improvement under different embedding rates and in the face of complex adversarial samples, proving its consistency and reliability. In order to verify the rationality of the proposed feature-enhanced few-sample steganalysis method, we designed a series of comparative experiments, including different methods and parameter settings, and conducted detailed experimental analysis. This series of experiments not only verifies the effectiveness of the method, but also provides a basis for further optimization.
The feature-enhanced steganalysis method architecture based on a small number of steganographic sample pairs was constructed as described in fig. 3.
A probability-map generation model with a VAE as the core architecture is constructed using the PyTorch framework, wherein the encoder's ResNet-152 is initialized with model weights pre-trained for visual recognition on the ImageNet dataset and the remainder is initialized using the He initialization method; the Adam optimizer is used, the learning rate is set to 0.001, and the batch size is set to 50. The hardware configuration used in the invention is as follows: the graphics card is an NVIDIA GeForce RTX 3090 with 24 GB of video memory, the CUDA version is 12.2, the CPU is an Intel(R) Xeon(R) Silver 4314 @ 2.40 GHz 32-core processor, the memory size is 16 GB, and the operating system is Ubuntu 18.04.6 LTS.
To explore the impact of different thresholds ε in the screening formula on the steganalyzer adjustment and to determine the best threshold, we performed experiments on the BOWS2 dataset. We set 10 different thresholds and used the S-UNIWARD steganographic algorithm to train CVTStego-Net as a pre-trained network, with datasets generated at 0.2 bpp and 0.4 bpp embedding rates. To evaluate the model's performance in the unknown domain, we selected the HUGO (highly undetectable steganography) and MiPOD datasets as the unknown sets and randomly extracted 6 cover and steganography sample pairs from them as the support set for few-sample learning. Furthermore, we performed a zero-sample test, i.e., applying the pre-trained model directly to the test set of the unknown set to obtain its detection accuracy, and used it as a baseline for comparison. The experimental results are shown in FIG. 6, which shows the detection performance under different embedding payloads: (a) a payload of 0.2 bpp and (b) a payload of 0.4 bpp. The results show that the detection accuracy corresponding to almost all thresholds is higher than that of the zero-sample test, indicating that the feature enhancement method can improve the model's ability to detect images produced by unknown-set steganography algorithms. However, there are significant differences in performance across thresholds. When the threshold ε is set to 0.5 or 0.6, the detection accuracy is highest. The detection accuracy decreases slowly as the threshold falls below 0.5, while it decreases rapidly once the threshold exceeds 0.6. A low threshold retains too many non-representative samples, degrading the quality of the latent-space prototype and thereby the detection performance.
With the increase of the threshold value, non-representative samples are filtered, so that the steganography features are more concentrated, and the detection accuracy is improved. However, too high a threshold may result in an insufficient number of samples for training, affecting the ability of the model to build a steganographic feature prototype, thereby reducing detection accuracy. Based on the experimental results, we set the normalized feature significance threshold to 0.6, since this threshold can achieve the highest detection accuracy, further verifying the validity of this threshold.
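The screening step these experiments tune can be sketched as follows. The normalized feature-significance function itself is defined earlier in the patent and not reproduced in this excerpt, so the sketch assumes each feature map already carries a significance score normalized to [0, 1] and simply keeps the maps whose score reaches the threshold ε (0.6 being the value found optimal above).

```python
def screen_feature_maps(feature_maps, significance, eps=0.6):
    """Keep only the steganographic-noise feature maps whose normalized
    significance score is at least eps.

    Assumption: `significance` holds scores already normalized to [0, 1]
    by the patent's normalized feature-significance function, which is
    not reproduced here.
    """
    return [fm for fm, s in zip(feature_maps, significance) if s >= eps]

# Hypothetical example: three SRM residual maps with their scores
kept = screen_feature_maps(["srm_1", "srm_2", "srm_3"], [0.2, 0.75, 0.6])
```

Raising ε discards more maps (risking too few training samples), lowering it admits non-representative ones, matching the trade-off described above.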
Based on the above experiments, in which the value of the threshold parameter ε was determined, we selected several excellent prior steganalysis models and pre-trained them using the S-UNIWARD (spatial universal wavelet relative distortion) dataset. Subsequently, we fine-tuned these pre-trained models and recorded the improvement in detection accuracy of each fine-tuned model relative to the unadjusted model (i.e., the zero-sample test). The results are shown in Table 1. By analyzing the experimental data in the table, we can see that the pseudo-secret-containing samples generated by the feature enhancement method significantly improve the cross-domain detection capability of different pre-trained models, by 1% to 2% on average. In particular, when the CVTStego-Net model trained on the S-UNIWARD steganographic dataset was tested with 6 pairs of secret-containing samples generated by the MiPOD (Minimizing the Power of Optimal Detector) steganographic algorithm at 0.2 bpp, the steganalysis detection accuracy of the pre-trained model on the MiPOD test set improved by 2.42%. This result shows that the invention not only has wide applicability, but also achieves significant performance improvement under specific conditions. Therefore, the validity and universality of the few-sample image steganalysis method are demonstrated through sufficient experimental verification.
TABLE 1
We explored the effect of the number of support-set sample pairs on the model's performance in detecting unknown sets. Specifically, five experimental conditions of 2-shot, 4-shot, 6-shot, 8-shot, and 10-shot were set, i.e., the support set contains 2, 4, 6, 8, or 10 labelled positive and negative sample pairs, respectively. Two steganalysis models, CVTStego-Net and GBRAS-Net, were chosen and pre-trained on the S-UNIWARD dataset (payload 0.4 bpp). To evaluate the cross-domain detection capability of the models, the datasets of two steganography algorithms, HUGO and MiPOD, were selected as target-domain datasets. Furthermore, we also tested the zero-sample test accuracy of the two pre-trained models on the target domain as a benchmark for comparison. The experimental results are shown in fig. 7, where fig. 7a uses the HUGO steganographic dataset as the unknown set and fig. 7b uses the MiPOD steganographic dataset as the unknown set. The results show that when the number of support-set sample pairs was 2 or 4, the detection accuracy of CVTStego-Net and GBRAS-Net on the HUGO and MiPOD test sets was reduced compared to the baseline. However, when the number of support-set pairs increases to 6 and above, the detection accuracy of the model adjusted by our proposed method exceeds the baseline level, and the cross-domain detection performance further improves as the number of sample pairs increases. When the number of support-set sample pairs is small, the generated pseudo-secret-containing samples lack diversity, so the steganalysis model cannot fully learn the data characteristics of the target domain during fine-tuning, and the detection performance on unknown-set images is reduced.
In contrast, as the number of support set samples increases, the generated pseudo-secret samples become more diversified, so that the model can better capture the steganographic features of the target domain, and further the generalization capability and the detection capability of the unknown set steganographic images are improved. Based on the above experimental results, it can be concluded that properly increasing the number of pairs of support sets of samples helps to improve the cross-domain detection performance of the model.
Since adversarial samples mislead the steganalyzer by using adversarial embedding techniques, thus enhancing the security of steganography, we explored the performance improvement of the present invention in detecting adversarial steganography algorithms. To this end, we generated MAE and Steg-GMAN adversarial steganography algorithm samples with embedding rates of 0.2 bpp and 0.4 bpp as the unknown sets. The experimental results are shown in FIG. 8, wherein FIG. 8a shows the MAE adversarial steganography algorithm sample set and FIG. 8b shows the Steg-GMAN adversarial steganography algorithm sample set. FAFSL in the figure denotes the model adjusted using the feature-enhancement-based few-sample learning method. AUC (area under the receiver operating characteristic curve) is used to measure detection performance, with a larger AUC value indicating better classification performance. After the pre-trained model is adjusted by the proposed method, its ROC curve is closer to the upper left corner, and the AUC value is significantly higher than the result of the zero-sample transfer test. This shows that the feature-enhancement-based few-sample learning method can effectively improve the model's ability to detect adversarial samples. In particular, the detection performance of the adjusted model on adversarial samples is significantly improved, especially under the high embedding rate (0.4 bpp), where the AUC value increases markedly. This result verifies the effectiveness of the present invention in detecting adversarial steganography attacks. In addition, the experiments also show that the invention provides stable performance improvement under different embedding rates, proving its consistency and reliability when handling complex adversarial samples.
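The AUC used above can be computed without explicitly tracing the ROC curve, via the rank (Mann-Whitney) formulation: it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counting half. A minimal sketch:

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) statistic.

    labels: 1 for secret-containing samples, 0 for cover samples.
    scores: the steganalyzer's output score for each sample.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count positive-over-negative score comparisons; ties count 0.5
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of two stego and two cover samples gives AUC = 1.0
perfect = auc_score([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

In practice a library routine (e.g. scikit-learn's `roc_auc_score`) would be used; the sketch only shows what the metric measures.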
In order to verify the rationality of the proposed feature-enhanced few-sample steganalysis method, we designed experiments comparing various methods and parameters. First, we discuss the impact of data augmentation techniques on model performance. Data augmentation increases the diversity and number of samples by transforming the existing data (such as adding zero-mean Gaussian noise of different intensities, salt-and-pepper noise in different proportions, adjusting image brightness, flipping horizontally and vertically, rotating at different angles, and inverting gray scale), thereby improving the generalization capability of the model. In the experiment, these augmentation operations were applied to the 6 support-set sample pairs, expanding the dataset to 100 times its original size and keeping it consistent with the number of generated pseudo-secret-containing sample pairs. Furthermore, we studied the impact of the cluster regularization term on model performance. Specifically, the cluster regularization term is introduced into the loss function of the VAE model with different regularization coefficients λ; λ = 0 indicates no cluster regularization term, i.e., the standard VAE model. In this way the impact of the cluster regularization term on detection performance is evaluated. Under payloads of 0.2 bpp and 0.4 bpp, the accuracy, false alarm rate, missed detection rate, and F1-score of the model were measured, with HUGO as the target-domain steganography algorithm. Table 2 shows that, compared to the zero-sample test scheme, the detection performance with the data augmentation method is significantly improved at payloads of 0.2 bpp and 0.4 bpp.
Specifically, the accuracy is improved by 0.81% and 0.52%, the false alarm rate is reduced by 0.0079 and 0.0032, the missed detection rate is reduced by 0.0083 and 0.072, and the F1-score is improved by 0.008 and 0.0055. This shows that data augmentation can effectively improve the diversity of a small number of samples and enhance the model's ability to detect unknown steganography algorithms. It was further found that the detection performance of the present invention is superior to the simple data augmentation method in both cases, λ = 0 and λ = 1. In particular, the CVTStego-Net + FAFSL (λ = 1) scheme improves the accuracy by 1.3% over the data augmentation scheme at a payload of 0.2 bpp, and by 0.82% at 0.4 bpp. CVTStego-Net + FAFSL (λ = 1) is further improved at both payloads compared to CVTStego-Net + FAFSL (λ = 0). This shows that the cluster regularization term helps the model aggregate the latent-space distribution of each class of samples more effectively, generating more representative pseudo-secret-containing samples and thus improving the detection performance of the model.
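The augmentation operations compared above (noise, flips, rotation, gray-scale inversion) can be sketched as follows. This is a minimal stdlib-only illustration treating a grayscale image as nested lists of 0-255 integers; a real pipeline would use an image library, and the noise intensity here (σ = 5) is a hypothetical choice, not a value from the patent.

```python
import random

def augment(img, op, rng=None):
    """Apply one named data-augmentation operation to a grayscale image
    given as a list of rows of 0-255 ints."""
    rng = rng or random.Random(0)
    if op == "hflip":                     # horizontal flip
        return [row[::-1] for row in img]
    if op == "vflip":                     # vertical flip
        return img[::-1]
    if op == "rot90":                     # rotate 90 degrees clockwise
        return [list(col) for col in zip(*img[::-1])]
    if op == "invert":                    # gray-scale inversion
        return [[255 - p for p in row] for row in img]
    if op == "gauss":                     # zero-mean Gaussian noise (sigma=5)
        return [[max(0, min(255, round(p + rng.gauss(0, 5)))) for p in row]
                for row in img]
    raise ValueError(f"unknown op: {op}")
```

Cycling a few such operations over the 6 support-set pairs is how the 100x expansion baseline in the comparison would be produced.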
TABLE 2
The above functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.