Disclosure of Invention
The present invention is directed to a method for steganalysis of small samples based on feature enhancement and sample expansion, which overcomes the above-mentioned drawbacks of the prior art.
The aim of the invention can be achieved by the following technical scheme:
The invention provides a few-sample steganalysis method based on feature enhancement and sample expansion, which comprises the following steps:
Acquiring a steganographic image set generated by an unknown steganographic algorithm;
Extracting features of each steganographic image in the steganographic image set to obtain a plurality of steganographic noise feature images corresponding to each steganographic image;
screening the steganographic noise feature maps of each steganographic image with a normalized feature saliency function to obtain the salient feature maps of each steganographic image;
Inputting the saliency feature images into a pre-trained variational self-encoder model to generate an embedding probability image of each steganographic image in a steganographic image set;
performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps to generate pseudo-stego samples;
Fine-tuning the pre-trained steganalysis model on the pseudo-stego samples combined with the steganographic image set to obtain a target steganalysis network;
Inputting a test sample from the unknown set into the target steganalysis network, and judging from the output label whether the test sample hides secret information.
Further, the feature extraction for each steganographic image in the steganographic image set specifically includes:
Extracting high-frequency noise features from the steganographic image through a spatial rich model (SRM); specifically, convolving the steganographic image with a plurality of high-pass filter kernels to obtain residual feature maps of the image, according to the formula:
S_(i,j) = X_i ∗ s_j
wherein S_(i,j) represents the residual feature map extracted by the spatial rich model (SRM) method, obtained by convolving the i-th steganographic image X_i with the j-th high-pass filter kernel s_j, and ∗ denotes the convolution operation;
Convolving the steganographic image with Gabor filters of different directions and scales to extract texture and edge features of the steganographic image, according to the formula:
G_(i,j) = X_i ∗ g_j
wherein G_(i,j) represents the texture and edge feature map extracted by the Gabor filters, obtained by convolving the i-th steganographic image X_i with the j-th Gabor filter kernel g_j;
The residual feature maps S_(i,j) and the texture and edge feature maps G_(i,j), extracted by the SRM and Gabor filters respectively, collectively serve as the steganographic noise feature maps of steganographic image i.
Further, the filtering the steganographic noise feature map of each steganographic image by using the normalized feature saliency function specifically includes:
Calculating the representativeness of each steganographic noise feature map with the normalized feature saliency function A_NES(x), built from the following components:
G_M = (1/(H·W)) Σ sqrt(Gx² + Gy²), C_rms = sqrt((1/(H·W)) Σ_(i,j) (I_ij − Ī)²), H(p) = −Σ_(i=1..n) p_i log p_i
wherein G_M represents the average gradient magnitude of the pixels of the steganographic noise feature map computed with the Sobel operator, H and W are the height and width of the feature map, Gx and Gy are the gradient approximations in the horizontal and vertical directions, C_rms represents the root-mean-square contrast of the feature map, I_ij is the gray value of pixel (i, j) and Ī the mean gray value of the feature map, H(p) represents the image entropy, used to measure the complexity of the image, p_i is the frequency of the i-th gray level and n the number of gray levels, Norm represents the min–max normalization process, and A_NES(x) is the normalized feature saliency value of the steganographic noise feature map, obtained by combining the normalized components;
According to the normalized feature saliency value A_NES(x), steganographic noise feature maps whose saliency value exceeds a preset threshold ε are selected as the salient feature maps of the steganographic image, according to the formula:
x̂_(i,j) = x_(i,j), if A_NES(x_(i,j)) > ε
wherein x̂_(i,j) represents the j-th salient feature map of the i-th steganographic image retained after normalized-feature-saliency threshold screening, x_(i,j) is the j-th steganographic noise feature map of steganographic image i, and A_NES(x_(i,j)) is the normalized feature saliency value of x_(i,j).
Further, the variational autoencoder model comprises an encoder E(x, z) and a decoder D(z, x); the encoder maps an input salient feature map x̂_(i,j) to a latent variable z in a low-dimensional latent space, and the decoder reconstructs the original image from z. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ².
Further, inputting the salient feature maps into the pre-trained variational autoencoder model to generate the embedding probability map of each steganographic image in the steganographic image set specifically includes:
inputting all salient feature maps x̂_(i,j) of steganographic image i into the pre-trained variational autoencoder model, whose encoder outputs, for each salient feature map x̂_(i,j), the corresponding mean vector μ and natural logarithm of the variance, logvar = ln σ²;
sampling noise from the standard normal distribution by the reparameterization trick, based on the mean vector μ and the log-variance logvar = ln σ², to obtain the latent space vector z_(i,j);
calculating, from the latent space vectors z_(i,j) of all salient feature maps of steganographic image i, the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i;
based on the calculated Gaussian mean μ_i and covariance matrix Σ_i of steganographic image i, generating R latent variables z_(i,r) of steganographic image i by a uniform sampling method, inputting each latent variable z_(i,r) into the decoder of the variational autoencoder model, and generating the embedding probability map P_(i,r) corresponding to z_(i,r), according to the formula:
P_(i,r) = Decoder(z_(i,r))
wherein Decoder represents the decoder operation of the variational autoencoder model.
Further, the generation formula of the hidden space vector z (i,j) is as follows:
z_(i,j) = μ + eps × σ, so that z_(i,j) ~ N(μ, σ²)
wherein μ is the mean vector output by the encoder of the variational autoencoder model, σ = exp(logvar/2) is the standard deviation, i.e., the exponential of half the log-variance, eps is noise sampled from the standard normal distribution N(0, I), and z_(i,j) is the latent space vector extracted from the salient feature map x̂_(i,j) by the encoder of the variational autoencoder model.
Further, calculating the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i from the latent space vectors z_(i,j) of all its salient feature maps proceeds according to:
μ_i = (1/J) Σ_(j=1..J) z_(i,j), Σ_i = (1/J) Σ_(j=1..J) (z_(i,j) − μ_i)(z_(i,j) − μ_i)ᵀ
wherein z_(i,j) is the latent space vector mapped by the encoder from the j-th salient feature map x̂_(i,j) of the i-th steganographic image, μ_i is the Gaussian mean over all salient feature maps of the i-th steganographic image, Σ_i is the covariance matrix of the i-th steganographic image, and J is the number of salient feature maps of the i-th steganographic image.
Further, generating the R latent variables z_(i,r) of steganographic image i by uniform sampling from the calculated Gaussian mean μ_i and covariance matrix Σ_i specifically includes:
taking the Gaussian mean μ_i of steganographic image i and the covariance matrix Σ_i to set the upper and lower limits of a uniform distribution, and generating R latent space vectors z_(i,r), where r denotes the r-th generated latent variable, according to the formula:
z_(i,r) ~ U( μ_i − Φ⁻¹(α)·σ_i , μ_i + Φ⁻¹(α)·σ_i ), with σ_i = sqrt(diag(Σ_i))
wherein Φ⁻¹ represents the inverse cumulative distribution function of the standard normal distribution, α is a probability threshold used to control the sampling range of the generated latent variables, and U represents the uniform distribution.
Further, performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps specifically includes:
inputting each embedding probability map P_(i,r) into the Otsu thresholding method to obtain the segmentation threshold T_(i,r), and determining the segmentation result P̂_(i,r) by the following formula:
P̂_(i,r)(x, y) = 0 if P_(i,r)(x, y) < T_(i,r), and 1 otherwise
wherein P_(i,r)(x, y) represents the pixel value at position (x, y) of the r-th embedding probability map of the i-th steganographic image, T_(i,r) is the segmentation threshold computed by the Otsu thresholding method, and P̂_(i,r)(x, y) is the binary pixel value of the segmented map at position (x, y), equal to 0 if the original pixel value is below the threshold and 1 otherwise;
generating a random number matrix W = (W(x, y))_(H×W) satisfying W(x, y) ~ U(0, 1), and performing the embedding modification based on the segmentation result P̂_(i,r) and the random number matrix W according to the following rule:
wherein p₊₁(x, y) and p₋₁(x, y) represent the probabilities of the pixel at position (x, y) being modified to +1 and −1 respectively, N_±1 represents the number of pixels in the image modified to +1 or −1, Σ P̂_(i,r)(x, y) is the sum of all pixel values of the segmented probability map, and H and W are the height and width of the image respectively;
the random number matrix W is adjusted by the following embedding mapping rule to obtain the embedding modification matrix M_(i,r)(x, y):
wherein M_(i,r)(x, y) is the value at position (x, y) of the r-th embedding modification map of the i-th steganographic image, and W(x, y) is the random value at position (x, y) in the random number matrix W;
adding the embedding modification matrix M_(i,r) to the non-steganographic image X_us to obtain the pseudo-stego sample X′_us, according to the formula:
X′_us = X_us + M_(i,r)
wherein X_us is the non-steganographic (cover) image and X′_us is the pseudo-stego sample image after the embedding modification.
Further, the loss function of the variational autoencoder model is:
L_VAE = L_rec + L_KL + λ·R_cluster
wherein L_VAE is the loss function of the variational autoencoder model; the first term L_rec is the reconstruction loss, the mean square error between the input salient feature map x̂_(i,j) and the reconstructed output of the decoder; the second term L_KL is the Kullback–Leibler divergence loss, which measures the discrepancy between the posterior distribution q(z|x) ~ N(μ, σ²) and the standard normal prior p(z) ~ N(0, I); the third term R_cluster is a clustering regularization term that encourages the latent variables to form clearer cluster structures in the latent space; μ and σ are the mean and standard deviation output by the encoder; λ is a hyperparameter; and tr(Σ_i), the trace of the covariance matrix, represents the total variance of the latent variables.
Compared with the prior art, the invention has the following advantages:
(1) By generating embedding probability maps and combining the pseudo-stego samples with the original steganographic image set, the invention strengthens the model's ability to learn under few-sample conditions. The method effectively overcomes the challenge of limited steganographic data, enhances the generalization capability of the steganalysis model, and significantly improves adaptability and detection accuracy in small-sample scenarios.
(2) The invention extracts high-frequency noise, texture, and edge features from a steganographic image through a spatial rich model (SRM) and Gabor filters to generate the steganographic noise feature maps of the image. This effectively captures the subtle changes introduced into a steganographic image and improves the recognition of steganographic noise. The salient feature maps are then screened out with the normalized feature saliency function, further improving the distinguishability of the image features.
(3) The invention adopts a pre-trained variational autoencoder model: the encoder maps the salient feature maps to latent space vectors in a low-dimensional latent space, and embedding probability maps are generated from these latent vectors. This not only improves the image reconstruction capability but also strengthens the model's recognition of steganographic images, so that the model maintains high accuracy even when the steganographic algorithm is unknown.
(4) The invention effectively enhances steganalysis capability by fine-tuning a pre-trained steganalysis model when only a small number of steganographic image samples are available. Training on the pseudo-stego samples together with the salient feature maps greatly reduces the dependence on large amounts of labeled data and improves applicability and flexibility in practical applications.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
Example 1:
This embodiment improves the detection capability of an existing steganalysis model against an unknown steganographic algorithm, despite scarce samples of that algorithm, through stego-sample augmentation and hierarchical training of the model. As shown in FIG. 1, for a 2-way K-shot few-sample steganalysis task, assume there are K pairs of cover and steganographic images produced by an unknown steganographic algorithm. First, high-frequency stego-noise maps are extracted from these sample pairs using diversified high-pass filter kernels. Based on these noise maps, a generative model is constructed and trained, and a large number of diversified pseudo stego-noise maps are generated through a series of operations. Then the pseudo-stego samples and the real samples are combined to fine-tune the pre-trained steganalysis model, yielding the target steganalysis network. Finally, a test sample from the unknown set is input into the target steganalysis network, and the output label indicates whether the test sample hides secret information.
The technical problem addressed by this embodiment is that, in practical steganalysis tasks, the detection capability of existing deep learning models degrades because the steganographic algorithm is unknown and labeled samples are scarce. It is therefore an object of the present invention to optimize and improve the detection capability of the model by synthesizing pseudo-stego samples and combining feature enhancement techniques with a few-shot learning strategy.
The embodiment provides a few-sample steganalysis method based on feature enhancement and sample expansion. As shown in FIG. 2, high-frequency noise features in an image are captured with SRM and Gabor filter kernels, the most representative feature maps are screened out with the normalized feature saliency index to train the VAE model, and pseudo-stego samples are generated. The pre-trained model is then adjusted at coarse and fine granularity together with the true stego samples to optimize model performance. In addition, the invention is plug-and-play and highly general: the structure of the steganalysis network need not be changed. The method greatly improves the detection accuracy against an unknown steganographic algorithm with only a small amount of labeled data, and also performs well in detecting adversarial steganographic samples. It specifically comprises the following steps:
Acquiring a steganographic image set generated by an unknown steganographic algorithm;
Extracting features of each steganographic image in the steganographic image set to obtain a plurality of steganographic noise feature images corresponding to each steganographic image;
screening the steganographic noise feature maps of each steganographic image with a normalized feature saliency function to obtain the salient feature maps of each steganographic image;
Inputting the saliency feature images into a pre-trained variational self-encoder model to generate an embedding probability image of each steganographic image in a steganographic image set;
performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps to generate pseudo-stego samples;
Fine-tuning the pre-trained steganalysis model on the pseudo-stego samples combined with the steganographic image set to obtain a target steganalysis network;
Inputting a test sample from the unknown set into the target steganalysis network, and judging from the output label whether the test sample hides secret information.
Further, the feature extraction for each steganographic image in the steganographic image set specifically includes:
Extracting high-frequency noise features from the steganographic image through a spatial rich model (SRM); specifically, convolving the steganographic image with a plurality of high-pass filter kernels to obtain residual feature maps of the image, according to the formula:
S_(i,j) = X_i ∗ s_j
wherein S_(i,j) represents the residual feature map extracted by the spatial rich model (SRM) method, obtained by convolving the i-th steganographic image X_i with the j-th high-pass filter kernel s_j, and ∗ denotes the convolution operation;
Convolving the steganographic image with Gabor filters of different directions and scales to extract texture and edge features of the steganographic image, according to the formula:
G_(i,j) = X_i ∗ g_j
wherein G_(i,j) represents the texture and edge feature map extracted by the Gabor filters, obtained by convolving the i-th steganographic image X_i with the j-th Gabor filter kernel g_j;
The residual feature maps S_(i,j) and the texture and edge feature maps G_(i,j), extracted by the SRM and Gabor filters respectively, collectively serve as the steganographic noise feature maps of steganographic image i.
In steganography, embedding secret information can be viewed as adding very weak noise to the cover image; the modification perturbs the image only slightly at the pixel level. Unlike methods that model the image content directly, the spatial rich model (SRM) focuses on analyzing the noise components (i.e., noise residuals) of the image. Since prediction errors among local pixels reflect neighborhood correlations, the SRM extracts various types of features through multiple sub-models, better describing how steganographic operations disturb the various local correlations among pixels. The SRM method extracts spatial feature information of an image by building different high-pass filter kernels and computing residuals. These residuals are then truncated and quantized, and steganalysis features are computed through co-occurrence matrices. The resulting co-occurrence matrices are grouped into 7 classes, and FIG. 3 shows 30 SRM filter kernels covering these 7 classes. These high-pass filter kernels focus on extracting the embedding artifacts introduced by steganography, enabling richer steganographic features. Accordingly, m of the above filter kernels are selected, all kernel sizes are unified to (5, 5) by zero-padding, and each is convolved with the steganographic image X_i generated by the i-th unknown steganographic algorithm.
In image processing, 2D Gabor filters are commonly used for texture analysis and are particularly suited to detecting content of a specific direction and frequency in an image. By choosing a specific Gabor function, Gabor filters for multi-scale, multi-directional feature extraction can be designed. FIG. 4 visualizes 2D Gabor filter kernels of size (8, 8) with different parameters. We set four direction parameters (i.e., θ ∈ {0, π/4, π/2, 3π/4}), set the scale parameter σ to 0.5, 0.6, 0.7, and 0.8, and specify the phase offset parameter accordingly. Furthermore, we subtract the kernel mean from each 2D Gabor filter element so that the filter has zero mean. Similarly, to obtain diversified Gabor-rich features, we generate n Gabor filter kernels and convolve each with the steganographic image X_i generated by the unknown steganographic algorithm.
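The SRM and Gabor filtering described above can be sketched in NumPy as follows. The kernel shown is the well-known 5×5 "KV" SRM residual kernel (one of the 30 kernels mentioned); the Gabor parameterization (wavelength lam, aspect ratio 1) is an assumption for illustration, and conv2d_same is a plain stand-in for a library convolution:

```python
import numpy as np

# One classic 5x5 SRM high-pass residual kernel (the "KV" kernel); the
# method selects m of the 30 kernels and zero-pads smaller ones to (5, 5).
KV = np.array([[-1,  2,  -2,  2, -1],
               [ 2, -6,   8, -6,  2],
               [-2,  8, -12,  8, -2],
               [ 2, -6,   8, -6,  2],
               [-1,  2,  -2,  2, -1]], dtype=float) / 12.0

def gabor_kernel(sigma, theta, lam=2.0, psi=0.0, size=8):
    """Zero-mean 2D Gabor kernel (wavelength lam and aspect ratio 1 are
    assumed values; the text fixes size (8, 8) and subtracts the mean)."""
    y, x = np.mgrid[0:size, 0:size] - size / 2.0 + 0.5
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = (np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
         * np.cos(2.0 * np.pi * xr / lam + psi))
    return g - g.mean()  # zero the filter mean, as in the text

def conv2d_same(img, k):
    """Same-size 2D convolution with zero padding (pure-NumPy stand-in)."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    kf = k[::-1, ::-1]  # flip kernel for true convolution
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kf)
    return out
```

Because every SRM kernel sums to zero, flat image regions produce zero residual, which is why the residual maps isolate the weak stego noise from the image content.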
Further, the filtering the steganographic noise feature map of each steganographic image by using the normalized feature saliency function specifically includes:
Calculating the representativeness of each steganographic noise feature map with the normalized feature saliency function A_NES(x), built from the following components:
G_M = (1/(H·W)) Σ sqrt(Gx² + Gy²), C_rms = sqrt((1/(H·W)) Σ_(i,j) (I_ij − Ī)²), H(p) = −Σ_(i=1..n) p_i log p_i
wherein G_M represents the average gradient magnitude of the pixels of the steganographic noise feature map computed with the Sobel operator, H and W are the height and width of the feature map, Gx and Gy are the gradient approximations in the horizontal and vertical directions, C_rms represents the root-mean-square contrast of the feature map, I_ij is the gray value of pixel (i, j) and Ī the mean gray value of the feature map, H(p) represents the image entropy, used to measure the complexity of the image, p_i is the frequency of the i-th gray level and n the number of gray levels, Norm represents the min–max normalization process, and A_NES(x) is the normalized feature saliency value of the steganographic noise feature map, obtained by combining the normalized components;
According to the normalized feature saliency value A_NES(x), steganographic noise feature maps whose saliency value exceeds a preset threshold ε are selected as the salient feature maps of the steganographic image, according to the formula:
x̂_(i,j) = x_(i,j), if A_NES(x_(i,j)) > ε
wherein x̂_(i,j) represents the j-th salient feature map of the i-th steganographic image retained after normalized-feature-saliency threshold screening, x_(i,j) is the j-th steganographic noise feature map of steganographic image i, and A_NES(x_(i,j)) is the normalized feature saliency value of x_(i,j).
This embodiment proposes training and guiding a variational autoencoder (VAE) with representative samples to generate more such samples. These representative samples carry the key features of certain classes of steganographic images more prominently, which helps construct the latent-space prototypes of such steganographic features more accurately. Because the steganographic features differ considerably between the stego-noise maps in the stego feature set, we introduce the normalized feature saliency function A_NES(x) to compute the representativeness of each sample and pick out the most representative ones. After the normalized feature saliency value of each class of images is obtained, a threshold can be set to filter out samples with insignificant steganographic features, yielding a set of samples with strongly representative features. This method not only identifies and selects the most representative samples effectively but also simulates the feature distribution of steganographic images in the latent space more accurately.
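A minimal sketch of the saliency screening described above. The three components (gradient magnitude, RMS contrast, entropy) follow the definitions in the text; how they are aggregated into A_NES is not fully specified, so the equal-weight average of the min–max-normalized components used here is an assumption, and nes_components / select_salient are illustrative helper names:

```python
import numpy as np

def nes_components(fmap):
    """Raw saliency components of one noise feature map: average Sobel
    gradient magnitude G_M, RMS contrast C_rms, and gray-level entropy H."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = fmap.shape
    gm = 0.0
    for i in range(1, h - 1):            # interior pixels only, for brevity
        for j in range(1, w - 1):
            win = fmap[i - 1:i + 2, j - 1:j + 2]
            gm += np.hypot(np.sum(win * kx), np.sum(win * ky))
    gm /= h * w
    c_rms = np.sqrt(np.mean((fmap - fmap.mean()) ** 2))
    _, counts = np.unique(np.round(fmap).astype(int), return_counts=True)
    p = counts / counts.sum()
    entropy = float(-np.sum(p * np.log2(p)))
    return gm, c_rms, entropy

def select_salient(fmaps, eps):
    """Min-max normalize each component across the maps, average them into
    an A_NES score (equal weighting is an assumption), and keep the maps
    whose score exceeds the threshold eps."""
    comp = np.array([nes_components(f) for f in fmaps])
    rng = comp.max(axis=0) - comp.min(axis=0)
    rng[rng == 0] = 1.0                  # guard constant columns
    scores = ((comp - comp.min(axis=0)) / rng).mean(axis=1)
    return [f for f, s in zip(fmaps, scores) if s > eps], scores
```

A flat residual map scores zero on all three components, so it is discarded, while a textured map with spread-out gray levels survives the threshold.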
Further, the variational autoencoder model comprises an encoder E(x, z) and a decoder D(z, x); the encoder maps an input salient feature map x̂_(i,j) to a latent variable z in a low-dimensional latent space, and the decoder reconstructs the original image from z. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ².
Further, inputting the salient feature maps into the pre-trained variational autoencoder model to generate the embedding probability map of each steganographic image in the steganographic image set specifically includes:
inputting all salient feature maps x̂_(i,j) of steganographic image i into the pre-trained variational autoencoder model, whose encoder outputs, for each salient feature map x̂_(i,j), the corresponding mean vector μ and natural logarithm of the variance, logvar = ln σ²;
sampling noise from the standard normal distribution by the reparameterization trick, based on the mean vector μ and the log-variance logvar = ln σ², to obtain the latent space vector z_(i,j);
calculating, from the latent space vectors z_(i,j) of all salient feature maps of steganographic image i, the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i;
based on the calculated Gaussian mean μ_i and covariance matrix Σ_i of steganographic image i, generating R latent variables z_(i,r) of steganographic image i by a uniform sampling method, inputting each latent variable z_(i,r) into the decoder of the variational autoencoder model, and generating the embedding probability map P_(i,r) corresponding to z_(i,r), according to the formula:
P_(i,r) = Decoder(z_(i,r))
wherein Decoder represents the decoder operation of the variational autoencoder model.
Through the above steps, a rich set of steganographic noise feature maps is obtained, and representative samples are screened out using the steganographic feature saliency index. These samples enable us to train a probability-map generation model that produces a number of embedding probability maps for each steganographic image in the unknown set. However, because the sample size is limited, directly training GANs or diffusion models is difficult, so we adopt a simple and efficient architecture based on a variational autoencoder (VAE) to design the probability-map generation network.
As shown in FIG. 5, the VAE architecture comprises an encoder E(x, z) and a decoder D(z, x): the encoder maps an input two-dimensional image to a latent variable z in the low-dimensional latent space, and the decoder reconstructs the original image. The encoder adopts ResNet-152 as the feature extractor with its final fully connected layer removed, followed by two fully connected layers each paired with a batch-normalization layer, converting the feature vector output by ResNet into two 256-dimensional vectors: a mean vector μ and the natural logarithm of the variance, logvar = ln σ². Since the latent variables are random, sampling directly from the probability distribution would prevent efficient gradient computation by backpropagation; instead, noise eps is sampled from a standard normal distribution and the reparameterization method is applied, so the sampled latent variable can be expressed as z = μ + eps × σ ~ N(μ, σ²). In the decoder, the latent variable passes through two fully connected layers and a batch-normalization layer with leaky ReLU activations, is then upsampled by three residual modules and a transposed-convolution module, and a three-channel tensor is output. Finally, a sigmoid function is applied and the image is resized to the input size by bilinear interpolation.
Further, the generation formula of the hidden space vector z (i,j) is as follows:
z_(i,j) = μ + eps × σ, so that z_(i,j) ~ N(μ, σ²)
wherein μ is the mean vector output by the encoder of the variational autoencoder model, σ = exp(logvar/2) is the standard deviation, i.e., the exponential of half the log-variance, eps is noise sampled from the standard normal distribution N(0, I), and z_(i,j) is the latent space vector extracted from the salient feature map x̂_(i,j) by the encoder of the variational autoencoder model.
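The reparameterized sampling above can be sketched in a few lines of NumPy; the function name is illustrative:

```python
import numpy as np

def reparameterize(mu, logvar, rng=None):
    """z = mu + eps * sigma with eps ~ N(0, I) and sigma = exp(logvar / 2).
    Sampling eps outside the network keeps the path from (mu, logvar) to z
    deterministic, which is what makes it differentiable for training."""
    rng = rng or np.random.default_rng(0)
    sigma = np.exp(0.5 * np.asarray(logvar, dtype=float))
    eps = rng.standard_normal(np.shape(mu))
    return np.asarray(mu, dtype=float) + eps * sigma
```

Repeated draws recover the intended N(μ, σ²) statistics, which is easy to verify empirically.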
Further, calculating the Gaussian mean μ_i and the covariance matrix Σ_i of steganographic image i from the latent space vectors z_(i,j) of all its salient feature maps proceeds according to:
μ_i = (1/J) Σ_(j=1..J) z_(i,j), Σ_i = (1/J) Σ_(j=1..J) (z_(i,j) − μ_i)(z_(i,j) − μ_i)ᵀ
wherein z_(i,j) is the latent space vector mapped by the encoder from the j-th salient feature map x̂_(i,j) of the i-th steganographic image, μ_i is the Gaussian mean over all salient feature maps of the i-th steganographic image, Σ_i is the covariance matrix of the i-th steganographic image, and J is the number of salient feature maps of the i-th steganographic image.
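The per-image Gaussian prototype reduces to a sample mean and a biased (1/J) covariance over the J latent vectors; a minimal NumPy sketch with an illustrative function name:

```python
import numpy as np

def latent_gaussian(z_list):
    """Gaussian prototype of one steganographic image: the mean over its J
    latent vectors and the (1/J, biased) covariance matrix, matching the
    formulas above."""
    z = np.asarray(z_list, dtype=float)      # shape (J, d)
    mu_i = z.mean(axis=0)
    centered = z - mu_i
    sigma_i = centered.T @ centered / z.shape[0]  # (d, d) covariance
    return mu_i, sigma_i
```

Note the 1/J normalization (rather than NumPy's default 1/(J−1) in np.cov) to match the formula stated above.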
Further, generating the R latent variables z_(i,r) of steganographic image i by uniform sampling from the calculated Gaussian mean μ_i and covariance matrix Σ_i specifically includes:
To ensure that the generated probability maps stay close to the representative latent-space prototype, a probability threshold α is set, and the range of random sampling is controlled by adjusting this threshold. To obtain R probability maps of a given class, R latent vectors z_(i,r) must be generated from the latent space, where R is a constant.
taking the Gaussian mean μ_i of steganographic image i and the covariance matrix Σ_i to set the upper and lower limits of a uniform distribution, and generating R latent space vectors z_(i,r), where r denotes the r-th generated latent variable, according to the formula:
z_(i,r) ~ U( μ_i − Φ⁻¹(α)·σ_i , μ_i + Φ⁻¹(α)·σ_i ), with σ_i = sqrt(diag(Σ_i))
wherein Φ⁻¹ represents the inverse cumulative distribution function of the standard normal distribution, α is a probability threshold used to control the sampling range of the generated latent variables, and U represents the uniform distribution.
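A sketch of this bounded uniform sampling, using the standard library's NormalDist for Φ⁻¹. Treating the covariance through its diagonal only (a per-dimension box around μ_i) is an assumption of this sketch, and the function name is illustrative:

```python
import numpy as np
from statistics import NormalDist

def sample_latents(mu_i, sigma_i, R, alpha, rng=None):
    """Draw R latent vectors from the per-dimension uniform box
    [mu - q*s, mu + q*s], with q = Phi^{-1}(alpha) and s = sqrt(diag(Sigma)).
    Using only the diagonal of the covariance is an assumption here."""
    rng = rng or np.random.default_rng(0)
    q = NormalDist().inv_cdf(alpha)                 # Phi^{-1}(alpha)
    s = np.sqrt(np.diag(np.asarray(sigma_i, dtype=float)))
    mu_i = np.asarray(mu_i, dtype=float)
    return rng.uniform(mu_i - q * s, mu_i + q * s, size=(R, mu_i.size))
```

Raising α widens the sampling box, trading diversity of the generated probability maps against closeness to the latent prototype.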
Further, performing simulated steganographic embedding on non-steganographic images based on the embedding probability maps specifically includes:
inputting each embedding probability map P_(i,r) into the Otsu thresholding method to obtain the segmentation threshold T_(i,r), and determining the segmentation result P̂_(i,r) by the following formula:
P̂_(i,r)(x, y) = 0 if P_(i,r)(x, y) < T_(i,r), and 1 otherwise
wherein P_(i,r)(x, y) represents the pixel value at position (x, y) of the r-th embedding probability map of the i-th steganographic image, T_(i,r) is the segmentation threshold computed by the Otsu thresholding method, and P̂_(i,r)(x, y) is the binary pixel value of the segmented map at position (x, y), equal to 0 if the original pixel value is below the threshold and 1 otherwise;
generating a random number matrix W = (W(x, y))_(H×W) satisfying W(x, y) ~ U(0, 1), and performing the embedding modification based on the segmentation result P̂_(i,r) and the random number matrix W according to the following rule:
wherein p₊₁(x, y) and p₋₁(x, y) represent the probabilities of the pixel at position (x, y) being modified to +1 and −1 respectively, N_±1 represents the number of pixels in the image modified to +1 or −1, Σ P̂_(i,r)(x, y) is the sum of all pixel values of the segmented probability map, and H and W are the height and width of the image respectively;
the random number matrix W is adjusted by the following embedding mapping rules to obtain an embedding modification matrix M (i,r) (x, y):
wherein M^(i,r)(x, y) is the value of the r-th embedding modification image of the i-th steganographic image at position (x, y), and w(x, y) is the random value at position (x, y) in the random number matrix W;
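The mapping rule itself is an image in the original patent, so the sketch below assumes the common ternary embedding-simulator convention: within the segmented region, a pixel is modified to -1 when w(x, y) falls below half its embedding probability and to +1 when w(x, y) exceeds one minus that half, which keeps N_{+1} and N_{-1} approximately equal, as the text requires.

```python
import random

def embedding_modification(seg_map, prob_map, seed=None):
    """Build an embedding modification matrix M from a binary segmentation
    map and the embedding probability map, using a random matrix
    W = (w(x, y)) with w(x, y) ~ U(0, 1).

    Assumption: the common ternary-simulator rule (modify to -1 if
    w < p/2, to +1 if w > 1 - p/2, else leave unchanged), since the
    patent's exact mapping rule is not reproduced in this excerpt.
    """
    rng = random.Random(seed)
    H, Wd = len(seg_map), len(seg_map[0])
    M = [[0] * Wd for _ in range(H)]
    for x in range(H):
        for y in range(Wd):
            if seg_map[x][y] == 0:
                continue                  # pixel outside the embedding region
            p = prob_map[x][y]            # embedding probability at (x, y)
            w = rng.random()              # w(x, y) ~ U(0, 1)
            if w < p / 2:
                M[x][y] = -1
            elif w > 1 - p / 2:
                M[x][y] = +1
    return M

def apply_modification(cover, M):
    """Pseudo-secret-containing sample: X'_us = X_us + M, clipped to 8 bits."""
    return [[max(0, min(255, c + m)) for c, m in zip(crow, mrow)]
            for crow, mrow in zip(cover, M)]
```

Clipping to [0, 255] is an added safeguard for boundary pixel values; the patent's formula states the addition only.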
Adding the embedding modification matrix M^(i,r) to the non-steganographic image X_us to obtain the pseudo-secret-containing sample X'_us, where the formula is:
X'_us = X_us + M^(i,r)
wherein X_us is the non-steganographic image, and X'_us is the pseudo-secret-containing sample image after embedding modification.
After obtaining the embedding probability map through the above steps, to implement the embedding modification, we first apply Otsu thresholding to divide the image into two regions according to the threshold. After the segmentation threshold T for each probability map is determined, an STC embedding simulator is used to generate the embedding modification image. This step first requires creating a random number matrix W = (w(x, y))_{H×W} satisfying w(x, y) ~ U(0, 1). To ensure that the impact of embedding modifications on the image statistics is minimized, the numbers of pixels modified to +1 and to -1 are generally required to be approximately equal. The random number matrix is then adjusted according to the embedding mapping rule to obtain M^(i,r) and complete the embedding modification process. Finally, a pseudo-secret-containing sample is obtained through X'_us = X_us + M^(i,r), and the pseudo-secret-containing samples together with the unknown-set cover images form positive and negative sample pairs for fine-tuning the pre-trained steganalysis model.
Further, we assign the secret-containing feature maps corresponding to each unknown-set steganographic image to one class (i.e., k-shot is divided into k classes), and the VAE loss function used for training on the j-th feature map of the i-th class can be expressed as:
wherein the total expression is the loss function of the variational auto-encoder model; the first term is the reconstruction loss, measuring the mean square error between the input saliency feature map and the decoder's reconstructed output; the second term is the Kullback-Leibler divergence loss, measuring the divergence between the posterior distribution and the standard normal prior p(z) ~ N(0, I); and the third term R_cluster is a cluster regularization term that, unlike the standard normal prior, encourages the latent variables to form clearer cluster structures in the latent space. μ and σ are the mean and standard deviation output by the encoder, λ is a hyper-parameter weighting the regularization term, and tr(Σ_i) is the trace of the covariance matrix, representing the total variance of the latent variables.
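The three-term loss can be sketched for a single sample with a diagonal-Gaussian latent. The reconstruction and KL terms below use their standard closed forms; the cluster regularization term is assumed here to be the total latent variance tr(Σ) = Σ_d σ_d², since its exact expression is given as an image in the original patent.

```python
import math

def vae_loss(x, x_hat, mu, sigma, lam=1.0):
    """Sketch of the three-term VAE loss described above (one sample,
    diagonal-Gaussian latent). lam weights the cluster regularizer."""
    # Reconstruction: mean square error between input feature map and output
    n = len(x)
    l_recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / n

    # KL divergence between N(mu, diag(sigma^2)) and the prior N(0, I),
    # standard closed form: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2)
    l_kl = 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

    # Cluster regularization (assumed form): tr(Sigma) = sum(sigma_d^2)
    r_cluster = sum(s * s for s in sigma)

    return l_recon + l_kl + lam * r_cluster
```

With perfect reconstruction and a latent matching the prior (μ = 0, σ = 1), the first two terms vanish and only the λ-weighted regularizer remains, which is what pushes the posterior tighter than the standard VAE would.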
Example 2:
The datasets used in this example are derived from BOSSbase v1.01, BOWS2, and ALASKA#2. First, we determined the best value of the normalized feature significance threshold ε and demonstrated the effectiveness and broad applicability of the method by conducting experiments on a variety of baseline steganalysis models. Furthermore, we studied the effect of variations in the number of support-set samples on the model's performance in detecting unknown samples. Considering that adversarial samples can mislead the steganalyzer through specific embedding techniques, thus enhancing the security of steganography, we also evaluated the effectiveness of the proposed method in detecting these adversarial steganography algorithms. Experimental results show that the method provides stable performance improvement under different embedding rates and in the face of complex adversarial samples, proving its consistency and reliability. In order to verify the rationality of the proposed feature-enhanced few-sample steganalysis method, we designed a series of comparative experiments, including different methods and parameter settings, and conducted detailed experimental analysis. This series of experiments not only verifies the effectiveness of the method, but also provides a basis for further optimization.
The feature-enhanced steganalysis method architecture based on a small number of steganographic sample pairs was constructed as described in fig. 3.
A probability-map generation model with a VAE as the core architecture is constructed using the PyTorch framework, wherein the encoder's ResNet-152 is initialized with model weights pre-trained for visual recognition on the ImageNet dataset and the remainder is initialized using the He initialization method; the Adam optimizer is used, the learning rate is set to 0.001, and the batch size is set to 50. The hardware configuration used in the invention is as follows: the graphics card is an NVIDIA GeForce RTX 3090 with 24 GB of video memory, the CUDA version is 12.2, the CPU is an Intel(R) Xeon(R) Silver 4314 @ 2.40 GHz 32-core processor, the memory size is 16 GB, and the operating system is Ubuntu 18.04.6 LTS.
To explore the impact of different thresholds ε in the screening formula on the steganalyzer adjustment and to determine the best threshold, we performed experiments on the BOWS2 dataset. We set 10 different thresholds and used the S-UNIWARD steganographic algorithm to train CVTStego-Net as a pre-trained network, with datasets generated at 0.2 bpp and 0.4 bpp embedding rates. To evaluate the model's performance in the unknown domain, we selected the HUGO (highly undetectable steganography) and MiPOD datasets as the unknown sets and randomly extracted 6 cover and steganography sample pairs from them as the support set for few-sample learning. Furthermore, we performed a zero-sample test, i.e., applying the pre-trained model directly to the test set of the unknown set to obtain its detection accuracy, and used it as a baseline for comparison. The experimental results are shown in FIG. 6, which shows the detection performance under different embedding payloads: (a) a payload of 0.2 bpp and (b) a payload of 0.4 bpp. The results show that the detection accuracy corresponding to almost all thresholds is higher than that of the zero-sample test, indicating that the feature enhancement method can improve the model's ability to detect images produced by unknown-set steganography algorithms. However, there are significant differences in performance across thresholds. When the threshold ε is set to 0.5 or 0.6, the detection accuracy is highest. The detection accuracy decreases slowly as the threshold falls below 0.5, while it decreases rapidly once the threshold exceeds 0.6. A low threshold retains too many non-representative samples, degrading the quality of the latent-space prototype and thereby the detection performance.
With the increase of the threshold value, non-representative samples are filtered, so that the steganography features are more concentrated, and the detection accuracy is improved. However, too high a threshold may result in an insufficient number of samples for training, affecting the ability of the model to build a steganographic feature prototype, thereby reducing detection accuracy. Based on the experimental results, we set the normalized feature significance threshold to 0.6, since this threshold can achieve the highest detection accuracy, further verifying the validity of this threshold.
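The screening step these experiments tune can be sketched as follows. The normalized feature-significance function itself is defined earlier in the patent and not reproduced in this excerpt, so the sketch assumes each feature map already carries a significance score normalized to [0, 1] and simply keeps the maps whose score reaches the threshold ε (0.6 being the value found optimal above).

```python
def screen_feature_maps(feature_maps, significance, eps=0.6):
    """Keep only the steganographic-noise feature maps whose normalized
    significance score is at least eps.

    Assumption: `significance` holds scores already normalized to [0, 1]
    by the patent's normalized feature-significance function, which is
    not reproduced here.
    """
    return [fm for fm, s in zip(feature_maps, significance) if s >= eps]

# Hypothetical example: three SRM residual maps with their scores
kept = screen_feature_maps(["srm_1", "srm_2", "srm_3"], [0.2, 0.75, 0.6])
```

Raising ε discards more maps (risking too few training samples), lowering it admits non-representative ones, matching the trade-off described above.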
Based on the above experiments, in which the value of the threshold parameter ε was determined, we selected several excellent prior steganalysis models and pre-trained them using the S-UNIWARD (spatial universal wavelet relative distortion) dataset. Subsequently, we fine-tuned these pre-trained models and recorded the improvement in detection accuracy of each fine-tuned model relative to the unadjusted model (i.e., the zero-sample test). The results are shown in Table 1. By analyzing the experimental data in the table, we can see that the pseudo-secret-containing samples generated by the feature enhancement method significantly improve the cross-domain detection capability of different pre-trained models, by 1% to 2% on average. In particular, when the CVTStego-Net model trained on the S-UNIWARD steganographic dataset was tested with 6 pairs of secret-containing samples generated by the MiPOD (Minimizing the Power of Optimal Detector) steganographic algorithm at 0.2 bpp, the steganalysis detection accuracy of the pre-trained model on the MiPOD test set improved by 2.42%. This result shows that the invention not only has wide applicability, but also achieves significant performance improvement under specific conditions. Therefore, the validity and universality of the few-sample image steganalysis method are demonstrated through sufficient experimental verification.
TABLE 1
We explored the effect of the number of support-set sample pairs on the model's performance in detecting unknown sets. Specifically, five experimental conditions of 2-shot, 4-shot, 6-shot, 8-shot, and 10-shot were set, i.e., the support set contains 2, 4, 6, 8, or 10 labelled positive and negative sample pairs, respectively. Two steganalysis models, CVTStego-Net and GBRAS-Net, were chosen and pre-trained on the S-UNIWARD dataset (payload 0.4 bpp). To evaluate the cross-domain detection capability of the models, the datasets of two steganography algorithms, HUGO and MiPOD, were selected as target-domain datasets. Furthermore, we also tested the zero-sample test accuracy of the two pre-trained models on the target domain as a benchmark for comparison. The experimental results are shown in fig. 7, where fig. 7a uses the HUGO steganographic dataset as the unknown set and fig. 7b uses the MiPOD steganographic dataset as the unknown set. The results show that when the number of support-set sample pairs was 2 or 4, the detection accuracy of CVTStego-Net and GBRAS-Net on the HUGO and MiPOD test sets was reduced compared to the baseline. However, when the number of support-set pairs increases to 6 and above, the detection accuracy of the model adjusted by our proposed method exceeds the baseline level, and the cross-domain detection performance further improves as the number of sample pairs increases. When the number of support-set sample pairs is small, the generated pseudo-secret-containing samples lack diversity, so the steganalysis model cannot fully learn the data characteristics of the target domain during fine-tuning, and the detection performance on unknown-set images is reduced.
In contrast, as the number of support set samples increases, the generated pseudo-secret samples become more diversified, so that the model can better capture the steganographic features of the target domain, and further the generalization capability and the detection capability of the unknown set steganographic images are improved. Based on the above experimental results, it can be concluded that properly increasing the number of pairs of support sets of samples helps to improve the cross-domain detection performance of the model.
Since adversarial samples mislead the steganalyzer by using adversarial embedding techniques, thus enhancing the security of steganography, we explored the performance improvement of the present invention in detecting adversarial steganography algorithms. To this end, we generated MAE and Steg-GMAN adversarial steganography algorithm samples with embedding rates of 0.2 bpp and 0.4 bpp as the unknown sets. The experimental results are shown in FIG. 8, wherein FIG. 8a shows the MAE adversarial steganography algorithm sample set and FIG. 8b shows the Steg-GMAN adversarial steganography algorithm sample set. FAFSL in the figure denotes the model adjusted using the feature-enhancement-based few-sample learning method. AUC (area under the receiver operating characteristic curve) is used to measure detection performance, with a larger AUC value indicating better classification performance. After the pre-trained model is adjusted by the proposed method, its ROC curve is closer to the upper left corner, and the AUC value is significantly higher than the result of the zero-sample transfer test. This shows that the feature-enhancement-based few-sample learning method can effectively improve the model's ability to detect adversarial samples. In particular, the detection performance of the adjusted model on adversarial samples is significantly improved, especially under the high embedding rate (0.4 bpp), where the AUC value increases markedly. This result verifies the effectiveness of the present invention in detecting adversarial steganography attacks. In addition, the experiments also show that the invention provides stable performance improvement under different embedding rates, proving its consistency and reliability when handling complex adversarial samples.
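The AUC used above can be computed without explicitly tracing the ROC curve, via the rank (Mann-Whitney) formulation: it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counting half. A minimal sketch:

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) statistic.

    labels: 1 for secret-containing samples, 0 for cover samples.
    scores: the steganalyzer's output score for each sample.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count positive-over-negative score comparisons; ties count 0.5
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of two stego and two cover samples gives AUC = 1.0
perfect = auc_score([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

In practice a library routine (e.g. scikit-learn's `roc_auc_score`) would be used; the sketch only shows what the metric measures.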
In order to verify the rationality of the proposed feature-enhanced few-sample steganalysis method, we designed experiments comparing various methods and parameters. First, we discuss the impact of data augmentation techniques on model performance. Data augmentation increases the diversity and number of samples by transforming the existing data (such as adding zero-mean Gaussian noise of different intensities, salt-and-pepper noise in different proportions, adjusting image brightness, flipping horizontally and vertically, rotating at different angles, and inverting gray scale), thereby improving the generalization capability of the model. In the experiment, these augmentation operations were applied to the 6 support-set sample pairs, expanding the dataset to 100 times its original size and keeping it consistent with the number of generated pseudo-secret-containing sample pairs. Furthermore, we studied the impact of the cluster regularization term on model performance. Specifically, the cluster regularization term is introduced into the loss function of the VAE model with different regularization coefficients λ; λ = 0 indicates no cluster regularization term, i.e., the standard VAE model. In this way the impact of the cluster regularization term on detection performance is evaluated. Under payloads of 0.2 bpp and 0.4 bpp, the accuracy, false alarm rate, missed detection rate, and F1-score of the model were measured, with HUGO as the target-domain steganography algorithm. Table 2 shows that, compared to the zero-sample test scheme, the detection performance with the data augmentation method is significantly improved at payloads of 0.2 bpp and 0.4 bpp.
Specifically, the accuracy is improved by 0.81% and 0.52%, the false alarm rate is reduced by 0.0079 and 0.0032, the missed detection rate is reduced by 0.0083 and 0.072, and the F1-score is improved by 0.008 and 0.0055. This shows that data augmentation can effectively improve the diversity of a small number of samples and enhance the model's ability to detect unknown steganography algorithms. It was further found that the detection performance of the present invention is superior to the simple data augmentation method in both cases, λ = 0 and λ = 1. In particular, the CVTStego-Net + FAFSL (λ = 1) scheme improves the accuracy by 1.3% over the data augmentation scheme at a payload of 0.2 bpp, and by 0.82% at 0.4 bpp. CVTStego-Net + FAFSL (λ = 1) is further improved at both payloads compared to CVTStego-Net + FAFSL (λ = 0). This shows that the cluster regularization term helps the model aggregate the latent-space distribution of each class of samples more effectively, generating more representative pseudo-secret-containing samples and thus improving the detection performance of the model.
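The augmentation operations compared above (noise, flips, rotation, gray-scale inversion) can be sketched as follows. This is a minimal stdlib-only illustration treating a grayscale image as nested lists of 0-255 integers; a real pipeline would use an image library, and the noise intensity here (σ = 5) is a hypothetical choice, not a value from the patent.

```python
import random

def augment(img, op, rng=None):
    """Apply one named data-augmentation operation to a grayscale image
    given as a list of rows of 0-255 ints."""
    rng = rng or random.Random(0)
    if op == "hflip":                     # horizontal flip
        return [row[::-1] for row in img]
    if op == "vflip":                     # vertical flip
        return img[::-1]
    if op == "rot90":                     # rotate 90 degrees clockwise
        return [list(col) for col in zip(*img[::-1])]
    if op == "invert":                    # gray-scale inversion
        return [[255 - p for p in row] for row in img]
    if op == "gauss":                     # zero-mean Gaussian noise (sigma=5)
        return [[max(0, min(255, round(p + rng.gauss(0, 5)))) for p in row]
                for row in img]
    raise ValueError(f"unknown op: {op}")
```

Cycling a few such operations over the 6 support-set pairs is how the 100x expansion baseline in the comparison would be produced.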
TABLE 2
The above functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.