CN118967424B - Screen shot image robust watermarking method based on attention mechanism and contrast learning
- Publication number
- CN118967424B CN118967424B CN202411448920.XA CN202411448920A CN118967424B CN 118967424 B CN118967424 B CN 118967424B CN 202411448920 A CN202411448920 A CN 202411448920A CN 118967424 B CN118967424 B CN 118967424B
- Authority
- CN
- China
- Prior art keywords
- feature map
- image
- layer
- discriminator
- watermark
- Prior art date
- Legal status: Active
Classifications
- G06T1/005—Robust watermarking, e.g. average attack or collusion attack resistant
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0475—Generative networks
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06N3/094—Adversarial learning
- G06T9/002—Image coding using neural networks
- G06T2201/005—Image watermarking
Abstract
The invention discloses a robust watermarking method for screen-shot images based on an attention mechanism and contrastive learning. An encoder generates an encoded image containing watermark information; the encoded image and the carrier image are input into a discriminator, which outputs a predicted value; the encoded image undergoes distortion simulation, and the distorted encoded image is input into a decoder that extracts the hidden watermark information; the encoder, discriminator and decoder are trained according to the discriminator's predicted value and a joint loss function; the trained encoder and decoder form a screen-shot watermarking model that encodes and decodes screen-shot images. By optimizing the watermark image encoding process, the method enhances the robustness of the watermark model in real scenes while guaranteeing the invisibility of the encoded image, and better preserves the integrity of the watermark information under screen-shot noise.
Description
Technical Field
The invention relates to screen-shot image watermarking technology, and in particular to a robust screen-shot image watermarking method based on an attention mechanism and contrastive learning.
Background
With the rapid iteration of smart devices, the circulation and spread of digital information have become unprecedentedly convenient. Digital copyright protection has matured alongside fast-developing multimedia technology, but copyright protection in screen-shooting scenarios still has gaps, such as piracy of film and television works, medical data and military secrets. Traditional electronic watermarking technology cannot effectively prevent such malicious behavior, posing new problems and challenges for the field of image watermarking.

Most traditional digital watermarking schemes are designed for electronic-channel propagation: their goal is to keep watermark extraction reliable when the watermarked image faces electronic-channel noise such as Gaussian noise, JPEG compression and color distortion. However, the imaging process in real scenes differs greatly from the distortion model of electronic channels, so watermarks from traditional schemes struggle to resist this distortion. A screen-shooting scene introduces many physical distortions absent from the traditional electronic-channel domain, such as lens distortion, illumination distortion, motion blur and moiré effects. Screen-shooting robust watermarking therefore emerged: its goal is to read the hidden watermark information reliably after the watermarked image displayed on a screen has been photographed, thereby safeguarding user data.

The advent of deep learning has powerfully aided research on screen-shooting robust watermarking. Generative adversarial networks offer a new approach: through adversarial training of the encoder and the discriminator, the embedding positions of the watermark information can be fitted to the image features as closely as possible. In addition, a noise layer simulates specific noise that may occur in real scenes to strengthen the decoder's robustness to that noise. However, the prior art still has the following problems:

1. Insufficient robustness: the watermark extraction rate from encoded images generated by the prior art can still be inadequate after a screen-shooting attack.

2. Insufficient invisibility: the watermark embedding positions in encoded images generated by the prior art fit the image features poorly, so the similarity to the original image is low and a user can easily notice with the naked eye that the image may contain a watermark.

3. Deficient model architecture: prior-art architectures suffer from problems such as mode collapse and vanishing gradients, which cap the attainable model performance and degrade overall results.
Disclosure of Invention
In view of these problems, the invention aims to provide a robust screen-shot image watermarking method based on an attention mechanism and contrastive learning.

The screen-shot image robust watermarking method based on an attention mechanism and contrastive learning disclosed by the invention comprises the following steps:

inputting a carrier image and watermark information into an encoder, and generating an encoded image containing the watermark information with the encoder;

inputting the encoded image and the carrier image into a discriminator, and outputting a predicted value with the discriminator;

performing distortion simulation on the encoded image to obtain a distorted encoded image;

inputting the distorted encoded image into a decoder, and extracting the watermark information hidden in the distorted encoded image;

performing model training on the encoder, the discriminator and the decoder according to the predicted value of the discriminator and the joint loss function, forming a screen-shot watermarking model from the trained encoder and decoder, and encoding and decoding screen-shot images with the screen-shot watermarking model.
Further, before the carrier image and the watermark information are input into the encoder, the method comprises:

randomly drawing n_0 numerical elements from the standard uniform distribution on the interval [0, 1), setting values greater than 0.5 to 1 and values not greater than 0.5 to 0 to form binary watermark ciphertext as the watermark information;

resizing the original carrier image to n_1 × n_1 as the carrier image.
Further, inputting the carrier image and the watermark information into the encoder comprises:

performing three downsampling operations on the carrier image to obtain local feature maps F1, F2 and F3 in turn, with a maximum pooling operation after each downsampling; the local feature map F4 is obtained after the third maximum pooling, and one further global average pooling yields the global feature map F5;

passing the watermark information through a fully connected layer to obtain the watermark tensor M, whose dimensions are the same as those of feature map F6;

splicing the local feature map F4, the feature map F6 and the watermark tensor M in the channel dimension and performing one upsampling to obtain feature map D4;

splicing the local feature map F3, the feature map D4 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and an upsampling layer to obtain feature map D3;

splicing the local feature map F2, the feature map D3 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and an upsampling layer to obtain feature map D2;

splicing the local feature map F1, the feature map D2 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and a 1×1 convolution layer to obtain the encoded image D1.
Further, performing the three downsampling operations on the carrier image comprises:

at each downsampling, passing the carrier image sequentially through a 3×3 convolution layer, a batch normalization layer, a first activation function layer, a 3×3 convolution layer, a batch normalization layer, a second activation function layer and an HWC attention module;

the HWC attention module comprises an HAttention module, a WAttention module and a CAttention module;

in the HAttention module, the input feature map x undergoes an adaptive maximum pooling operation and an adaptive global average pooling operation, each compressing the width to 1, to obtain feature maps max_h and avg_h respectively; max_h passes sequentially through a 1×1 convolution layer, an activation function layer and a 1×1 convolution layer to obtain feature map se(max_h), and avg_h passes through the same sequence to obtain feature map se(avg_h); se(max_h) and se(avg_h) are spliced and passed through an activation function layer to obtain feature map A_h;

in the WAttention module, the input feature map x undergoes adaptive maximum pooling and adaptive global average pooling, each compressing the height to 1, to obtain feature maps max_w and avg_w; max_w passes sequentially through a 1×1 convolution layer, an activation function layer and a 1×1 convolution layer to obtain feature map se(max_w), and avg_w passes through the same sequence to obtain feature map se(avg_w); se(max_w) and se(avg_w) are spliced and passed through an activation function layer to obtain feature map A_w;

in the CAttention module, the input feature map x is compressed to 1 in the channel dimension by an adaptive maximum pooling layer and an adaptive global average pooling layer respectively; the two resulting tensors are spliced in the channel dimension, reduced to one channel by a 1×1 convolution layer, and passed through an activation function layer to obtain feature map A_c;

the input feature map x is multiplied by feature maps A_h and A_w in the height and width dimensions, then multiplied by feature map A_c in the channel dimension, and the result is added to the input feature map x as the output of the HWC attention module.
Further, the discriminator is a spectrally normalized generative adversarial network. The carrier image and the watermarked encoded image are input into the spectrally normalized network, the discriminator outputs true or false, and the discriminator's output is fed back to the encoder. The discriminator loss function L_C and the encoder loss function L_E are calculated as:

L_C = α·L_nce(C, Ē, X_r, X_w) + β·L_gan(C, Ē, X_r, X_w)

L_E = α·L_nce(C̄, E, X_r, X_w) + β·L_gan(C̄, E, X_r, X_w)

where α and β are training hyper-parameters, X_r and X_w denote the carrier image and the encoded image respectively, L_nce and L_gan denote the NCE loss and the hinge loss respectively, C denotes the weight parameters of the discriminator, C̄ the weight parameters of the fixed discriminator, E the weight parameters of the encoder, and Ē the weight parameters of the fixed encoder.
Further, performing distortion simulation on the encoded image comprises:

randomly perturbing the four corners of the encoded image using a perspective transformation and then bilinearly resampling the encoded image to create a perspective-warped image;

applying illumination-distortion and moiré-distortion simulation to the perspective-warped image using an illumination simulation function and a moiré simulation function;

using Gaussian noise to simulate the remaining interference present in real scenes.
Further, the decoder comprises, arranged in sequence, 3 single convolution blocks, 3 residual convolution blocks, 1 single convolution block, 6 residual convolution blocks, 1 single convolution block and 1 fully connected layer.
Further, extracting the watermark information hidden in the distorted encoded image comprises:

calculating the decoder loss function L_D from the decoded watermark information and the original watermark information:

L_D = ||M - D(γ_D, I_n)||^2

where M denotes the original watermark information, M_d = D(γ_D, I_n) denotes the decoded watermark information, γ_D denotes the decoder parameters, I_n denotes the distorted encoded image, and D(γ_D, I_n) denotes the decoder decoding the distorted encoded image.
Further, the joint loss function is composed of the encoder loss function, the discriminator loss function and the decoder loss function:

L = λ_1·L_E + λ_2·L_C + λ_3·L_D

where λ_1, λ_2 and λ_3 are the weight parameters of the corresponding loss functions.
Further, performing model training on the encoder, the discriminator and the decoder according to the predicted value of the discriminator and the joint loss function comprises:

optimizing the parameters of the encoder, the discriminator and the decoder iteratively with an Adam optimizer, setting a maximum number of iterations, and back-propagating with the joint loss function.
Compared with the prior art, the invention has the following notable advantages:

1. The invention improves encoder performance by designing a new multi-branch convolution and HWC attention mechanism module, improving the invisibility of the encoded watermark;

2. The invention alleviates the mode-collapse problem of traditional model architectures through contrastive learning and a multi-discriminator module, raising the upper limit of model training and enhancing the performance of the trained model;

3. By providing a screen-shot distortion simulation layer, the invention simulates several distortions that may occur during real-scene screen shooting and uses this layer to train the decoder, improving the decoder's robustness to screen-shot distortion.
Drawings
FIG. 1 is a block flow diagram of a robust watermarking method for a screen shot image based on attention mechanisms and contrast learning;
FIG. 2 is a schematic diagram of the structure of the HWC attention mechanism module;
FIG. 3 is a schematic diagram of pairing positive and negative samples in a contrast learning process;
FIG. 4 is a moire image under noise interference of different intensities;
fig. 5 is a screen shot image under different angular disturbances.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent.
The screen-shot image robust watermarking method based on an attention mechanism and contrastive learning in this embodiment comprises at least steps 1 to 5; a flowchart is shown in FIG. 1.
Step 1, inputting a carrier image and watermark information into an encoder, and generating an encoded image containing the watermark information with the encoder;

Step 2, inputting the encoded image and the carrier image into a discriminator, and outputting a predicted value with the discriminator;

Step 3, performing distortion simulation on the encoded image to obtain a distorted encoded image;

Step 4, inputting the distorted encoded image into a decoder, and extracting the watermark information hidden in the distorted encoded image;

Step 5, performing model training on the encoder, the discriminator and the decoder according to the predicted value of the discriminator and the joint loss function, forming a screen-shot watermarking model from the trained encoder and decoder, and encoding and decoding screen-shot images with the screen-shot watermarking model.
Further, before the carrier image and the watermark information are input into the encoder, the method comprises:

randomly drawing n_0 numerical elements from the standard uniform distribution on the interval [0, 1), setting values greater than 0.5 to 1 and values not greater than 0.5 to 0 to form binary watermark ciphertext as the watermark information;

resizing the original carrier image to n_1 × n_1 as the carrier image.

In one example, 30 elements may be randomly drawn from the standard uniform distribution on [0, 1), according to the input dimension of the encoder, to form the watermark information to be embedded into the carrier image; n_1 takes the value 128, i.e., the original carrier image is resized to 128×128. The processed watermark information and carrier image are then input into the encoder for encoding, yielding an encoded image containing the watermark information.
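By way of illustration, a minimal PyTorch sketch of this preprocessing follows (PyTorch is the framework used in the experiments below); the function names and the use of torchvision for resizing are assumptions, not part of the disclosed method:

```python
import torch
import torchvision.transforms.functional as TF

def make_watermark(n0: int = 30) -> torch.Tensor:
    # Draw n0 values from the standard uniform distribution on [0, 1)
    # and binarize at 0.5: values greater than 0.5 become 1, the rest 0.
    u = torch.rand(n0)
    return (u > 0.5).float()

def preprocess_carrier(img: torch.Tensor, n1: int = 128) -> torch.Tensor:
    # Resize the original carrier image (C, H, W) to n1 x n1, here 128 x 128.
    return TF.resize(img, [n1, n1])
```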
Further, in step 1, inputting the carrier image and the watermark information into the encoder comprises:

in the encoder, performing three downsampling operations on the carrier image to obtain local feature maps F1, F2 and F3 in turn, with a maximum pooling operation after each downsampling; the local feature map F4 is obtained after the third maximum pooling, and one further global average pooling yields the global feature map F5;

passing the watermark information through a fully connected layer to obtain the watermark tensor M, whose dimensions are the same as those of feature map F6;

splicing the local feature map F4, the feature map F6 and the watermark tensor M in the channel dimension and performing one upsampling to obtain feature map D4;

splicing the local feature map F3, the feature map D4 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and an upsampling layer to obtain feature map D3;

splicing the local feature map F2, the feature map D3 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and an upsampling layer to obtain feature map D2;

splicing the local feature map F1, the feature map D2 and the watermark tensor M in the channel dimension and passing the result through two convolution layers and a 1×1 convolution layer to obtain the encoded image D1.
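The channel-dimension splicing used in steps D4 to D1 above can be sketched as follows; spatially broadcasting the watermark tensor M before concatenation is an assumption, since the text only states that M is spliced with the feature maps:

```python
import torch

def fuse(f_local: torch.Tensor, d_prev: torch.Tensor, m: torch.Tensor) -> torch.Tensor:
    # f_local and d_prev are (B, C, H, W) maps with equal spatial size; m is
    # the (B, C_m) watermark tensor from the fully connected layer. m is
    # broadcast to (B, C_m, H, W) and all three are concatenated channel-wise.
    m_map = m[:, :, None, None].expand(-1, -1, f_local.size(2), f_local.size(3))
    return torch.cat([f_local, d_prev, m_map], dim=1)
```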
Specifically, performing the downsampling operation on the carrier image three times includes:
at each downsampling, the carrier image is sequentially passed through a 3×3 convolution layer, a batch normalization layer, a first activation function layer, a 3×3 convolution layer, a batch normalization layer, a second activation function layer, and a HWC attention module.
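A sketch of one such downsampling block follows; the channel widths, the use of ReLU for both activation function layers, and applying the subsequent max pooling outside the block are assumptions. HWCAttention refers to the module sketched after the HWC description below:

```python
import torch.nn as nn

class DownBlock(nn.Module):
    # 3x3 conv -> BN -> activation -> 3x3 conv -> BN -> activation -> HWC attention.
    # The max pooling that follows each downsampling step is applied outside.
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),            # first activation (assumed ReLU)
            nn.Conv2d(c_out, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),            # second activation (assumed ReLU)
            HWCAttention(c_out),              # defined in the later sketch
        )

    def forward(self, x):
        return self.body(x)
```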
As shown in FIG. 2, the HWC attention module comprises an HAttention module, a WAttention module and a CAttention module.

In the HAttention module, the input feature map x of the HWC attention module undergoes an adaptive maximum pooling operation and an adaptive global average pooling operation, each compressing the width to 1; that is, the feature map size changes from (B, C, H, W) to (B, C, H, 1), where B, C, H and W denote the training batch, channel dimension, height dimension and width dimension respectively. This yields feature maps max_h and avg_h. max_h passes sequentially through a 1×1 convolution layer, a ReLU activation function layer and a 1×1 convolution layer to obtain feature map se(max_h); avg_h passes through the same sequence to obtain feature map se(avg_h). se(max_h) and se(avg_h) are spliced, and feature map A_h is obtained after a sigmoid activation function layer. The number of channels is reduced from C1 to C2 by the first 1×1 convolution layer and restored from C2 to the original C1 by the second 1×1 convolution layer.

In the WAttention module, the input feature map x undergoes adaptive maximum pooling and adaptive global average pooling, each compressing the height to 1; that is, the feature map size changes from (B, C, H, W) to (B, C, 1, W). This yields feature maps max_w and avg_w. max_w passes sequentially through a 1×1 convolution layer, a ReLU activation function layer and a 1×1 convolution layer to obtain feature map se(max_w); avg_w passes through the same sequence to obtain feature map se(avg_w). se(max_w) and se(avg_w) are spliced, and feature map A_w is obtained after a sigmoid activation function layer. The number of channels is reduced from C1 to C2 by the first 1×1 convolution layer and restored from C2 to the original C1 by the second 1×1 convolution layer.

In the CAttention module, the input feature map x is compressed to 1 in the channel dimension by an adaptive maximum pooling layer and an adaptive global average pooling layer respectively; that is, the feature map size changes from (B, C, H, W) to (B, 1, H, W). The two resulting tensors are spliced in the channel dimension, reduced to one channel by a 1×1 convolution layer, and passed through a sigmoid activation function layer to obtain feature map A_c.

The input feature map x is multiplied by feature maps A_h and A_w in the height and width dimensions, then multiplied by feature map A_c in the channel dimension, and the result is added to the input feature map x as the output of the HWC attention module.
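A PyTorch sketch of the HWC attention module under the description above. Two details are assumptions: the channel-reduction ratio of the 1×1 convolutions, and combining the max-pool and average-pool branches by addition (the text says the two se(·) maps are spliced, but a plain concatenation would double the channel count and no reducing layer is described for the H and W branches):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBranch(nn.Module):
    # 1x1 conv (C1 -> C2) -> ReLU -> 1x1 conv (C2 -> C1).
    def __init__(self, c: int, r: int = 8):  # reduction ratio r is an assumption
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, max(c // r, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(c // r, 1), c, 1),
        )

    def forward(self, x):
        return self.body(x)

class HWCAttention(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.se_h_max, self.se_h_avg = SEBranch(c), SEBranch(c)
        self.se_w_max, self.se_w_avg = SEBranch(c), SEBranch(c)
        self.conv_c = nn.Conv2d(2, 1, 1)  # reduces the spliced pooled maps to 1 channel

    def forward(self, x):
        b, c, h, w = x.shape
        # HAttention: compress the width to 1, (B, C, H, W) -> (B, C, H, 1).
        a_h = torch.sigmoid(
            self.se_h_max(F.adaptive_max_pool2d(x, (h, 1)))
            + self.se_h_avg(F.adaptive_avg_pool2d(x, (h, 1))))
        # WAttention: compress the height to 1, (B, C, H, W) -> (B, C, 1, W).
        a_w = torch.sigmoid(
            self.se_w_max(F.adaptive_max_pool2d(x, (1, w)))
            + self.se_w_avg(F.adaptive_avg_pool2d(x, (1, w))))
        # CAttention: compress the channels to 1, splice, reduce with a 1x1 conv.
        max_c = x.max(dim=1, keepdim=True).values
        avg_c = x.mean(dim=1, keepdim=True)
        a_c = torch.sigmoid(self.conv_c(torch.cat([max_c, avg_c], dim=1)))
        # Weight the height/width and channel dimensions, then add the residual input.
        return x * a_h * a_w * a_c + x
```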
In the encoder, the convolution layers use Kaiming normal-distribution weight initialization; the batch normalization layers have their weights initialized to the constant 1 and their biases to the constant 0; and the fully connected layers use normal-distribution weight initialization with a standard deviation of 0.001 and biases initialized to the constant 0.

The main function of the HWC module is to let the encoder apply attention in the three dimensions of width, height and channel through the sub-modules HAttention, WAttention and CAttention respectively. This strengthens the perception of features of the input carrier image, better determines the watermark embedding positions, and improves the quality of the encoded image. In addition, the weight initialization is tuned to prevent vanishing gradients, exploding gradients and similar conditions from disrupting normal training.
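This initialization scheme can be expressed as a standard module hook; zero bias for the convolution layers is an assumption, and the remaining rules follow the description above:

```python
import torch.nn as nn

def init_weights(m: nn.Module) -> None:
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight)      # Kaiming normal for conv layers
        if m.bias is not None:
            nn.init.zeros_(m.bias)             # zero conv bias (assumption)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1.0)       # BN weights to constant 1
        nn.init.constant_(m.bias, 0.0)         # BN biases to constant 0
    elif isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.001)   # FC: normal with std 0.001
        nn.init.zeros_(m.bias)                 # FC biases to constant 0

# encoder.apply(init_weights)  # applied to the whole encoder
```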
The discriminator is a spectrally normalized generative adversarial network (SNGAN). The carrier image and the watermarked encoded image are input into the spectrally normalized network, the discriminator outputs true or false, and the discriminator's output is fed back to the encoder. The discriminator loss function L_C and the encoder loss function L_E are calculated as:

L_C = α·L_nce(C, Ē, X_r, X_w) + β·L_gan(C, Ē, X_r, X_w)

L_E = α·L_nce(C̄, E, X_r, X_w) + β·L_gan(C̄, E, X_r, X_w)

where α and β are training hyper-parameters, X_r and X_w denote the carrier image and the encoded image respectively, L_nce and L_gan denote the NCE loss and the hinge loss respectively, C denotes the weight parameters of the discriminator, C̄ the weight parameters of the fixed discriminator, E the weight parameters of the encoder, and Ē the weight parameters of the fixed encoder.
The SNGAN discriminator comprises several local feature discriminators and a global feature discriminator. After the local and global features are extracted, they are projected into a higher-dimensional reproducing kernel Hilbert space (RKHS), and linearly evaluated values capture the similarity between the global and local features, as shown in FIG. 1. These projected features then pass through a contrastive-learning sample-pairing stage that creates positive/negative sample pairs, as shown in FIG. 3: the local and global features of an M×M input image form a positive pair, while other images in the same batch and images in different batches form negative pairs. These positive and negative pairs are used to compute the NCE loss entering the loss function, and finally true or false is output and fed into the next round of encoder training.

The multi-discriminator structure formed by the several local feature discriminators and the global feature discriminator alleviates the catastrophic forgetting that easily occurs with a traditional single discriminator, and the contrastive-learning sample pairing together with the NCE loss prevents mode collapse of the encoder. These two measures address two common problems of the prior art that prevent normal training, optimize the architecture of the training model, and raise the upper limit of model performance.
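A compact sketch of the two loss terms entering L_C and L_E; the temperature, the flattening of the features, and restricting negatives to the current batch are simplifications of the RKHS projection and cross-batch pairing described above:

```python
import torch
import torch.nn.functional as F

def hinge_gan_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    # Hinge loss for the spectrally normalized discriminator.
    return (F.relu(1.0 - d_real) + F.relu(1.0 + d_fake)).mean()

def nce_loss(local_feats: torch.Tensor, global_feats: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    # Positive pairs: local/global features of the same image. Negatives: all
    # other images in the batch. Implemented as InfoNCE over a similarity matrix.
    z_l = F.normalize(local_feats.flatten(1), dim=1)   # (B, D)
    z_g = F.normalize(global_feats.flatten(1), dim=1)  # (B, D)
    logits = z_l @ z_g.t() / temperature               # (B, B)
    labels = torch.arange(z_l.size(0), device=z_l.device)
    return F.cross_entropy(logits, labels)
```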
Further, in step 3, performing distortion simulation on the encoded image comprises:

S31, randomly perturbing the four corners of the encoded image using a perspective transformation and then bilinearly resampling the encoded image to create a perspective-warped image;

S32, applying illumination-distortion and moiré-distortion simulation to the perspective-warped image using an illumination simulation function and a moiré simulation function;

S33, using Gaussian noise to simulate the remaining interference present in real scenes.
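A sketch of this distortion layer using OpenCV; the corner-shift range, the noise level, and the omission of the illumination and moiré simulation functions (whose exact forms are not reproduced here) are assumptions:

```python
import numpy as np
import cv2

def simulate_screen_shot(img: np.ndarray, max_shift: int = 8,
                         sigma: float = 0.02) -> np.ndarray:
    # S31: randomly perturb the four corners, then resample bilinearly.
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = src + np.random.uniform(-max_shift, max_shift, src.shape).astype(np.float32)
    warp = cv2.getPerspectiveTransform(src, dst)
    out = cv2.warpPerspective(img, warp, (w, h), flags=cv2.INTER_LINEAR)
    # S32: illumination and moire simulation would be applied here (omitted).
    # S33: Gaussian noise models the remaining real-scene interference.
    noisy = out.astype(np.float32) / 255.0 + np.random.normal(0.0, sigma, out.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255).astype(np.uint8)
```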
In step 4, the watermark information hidden in the distorted encoded image is extracted with a decoder. The decoder comprises, arranged in sequence, 3 single convolution blocks, 3 residual convolution blocks, 1 single convolution block, 6 residual convolution blocks, 1 single convolution block and 1 fully connected layer.

A single convolution block consists of a 3×3 convolution, batch normalization and an activation function arranged in sequence. A residual convolution block consists of a 3×3 convolution, batch normalization, an activation function, a 3×3 convolution, batch normalization and a skip connection arranged in sequence; its final output is the sum of the skip connection and the output of the convolution path.
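A structural sketch of this decoder; the channel width, the global average pooling before the fully connected layer, and the sigmoid output are assumptions:

```python
import torch
import torch.nn as nn

class SingleConv(nn.Module):
    # Single convolution block: 3x3 conv -> batch norm -> activation.
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True))

    def forward(self, x):
        return self.body(x)

class ResBlock(nn.Module):
    # Residual block: conv-BN-act-conv-BN plus a skip connection.
    def __init__(self, c: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.BatchNorm2d(c), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1), nn.BatchNorm2d(c))

    def forward(self, x):
        return x + self.body(x)

class Decoder(nn.Module):
    # 3 single blocks, 3 residual blocks, 1 single, 6 residual, 1 single, then FC.
    def __init__(self, n_bits: int = 30, c: int = 64):
        super().__init__()
        blocks = [SingleConv(3, c)] + [SingleConv(c, c) for _ in range(2)]
        blocks += [ResBlock(c) for _ in range(3)]
        blocks += [SingleConv(c, c)]
        blocks += [ResBlock(c) for _ in range(6)]
        blocks += [SingleConv(c, c)]
        self.features = nn.Sequential(*blocks)
        self.fc = nn.Linear(c, n_bits)

    def forward(self, x):
        f = self.features(x).mean(dim=(2, 3))  # global average before the FC layer
        return torch.sigmoid(self.fc(f))
```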
Further, extracting the watermark information hidden in the distorted encoded image comprises:

calculating the decoder loss function L_D from the decoded watermark information and the original watermark information:

L_D = ||M - D(γ_D, I_n)||^2

where M denotes the original watermark information, M_d = D(γ_D, I_n) denotes the decoded watermark information, γ_D denotes the decoder parameters, I_n denotes the distorted encoded image, and D(γ_D, I_n) denotes the decoder decoding the distorted encoded image.
Further, the joint loss function is composed of the encoder loss function, the discriminator loss function and the decoder loss function:

L = λ_1·L_E + λ_2·L_C + λ_3·L_D

where λ_1, λ_2 and λ_3 are the weight parameters of the corresponding loss functions. In this example, λ_1, λ_2 and λ_3 can be set to 0.5, 0.001 and 3 respectively.
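Expressed directly, with the correspondence λ_1 with L_E, λ_2 with L_C and λ_3 with L_D assumed from the order in which the losses are listed:

```python
def joint_loss(l_e, l_c, l_d,
               lam1: float = 0.5, lam2: float = 0.001, lam3: float = 3.0):
    # L = lam1 * L_E + lam2 * L_C + lam3 * L_D, with the example weights above.
    return lam1 * l_e + lam2 * l_c + lam3 * l_d
```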
In step 5, performing model training on the encoder, the discriminator and the decoder according to the predicted value of the discriminator and the joint loss function comprises:

optimizing the parameters of the encoder, the discriminator and the decoder iteratively with an Adam optimizer, setting a maximum number of iterations, and back-propagating with the joint loss function to obtain the final trained model.
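A condensed sketch of this training loop, reusing hinge_gan_loss and joint_loss from the sketches above; the learning rate, the two-optimizer split, the update order, and the placeholder terms encoder_loss and critic_loss (standing in for the full L_E and L_C, which combine the NCE and hinge terms) are all assumptions:

```python
import itertools
import torch
import torch.nn.functional as F

def train(encoder, decoder, discriminator, loader, distortion_layer,
          epochs: int = 100, lr: float = 1e-4, device: str = "cuda"):
    # Adam optimizers with back-propagation of the joint loss.
    opt_g = torch.optim.Adam(
        itertools.chain(encoder.parameters(), decoder.parameters()), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
    for _ in range(epochs):
        for cover, bits in loader:
            cover, bits = cover.to(device), bits.to(device)
            encoded = encoder(cover, bits)
            # Discriminator update with the hinge loss sketched earlier.
            l_disc = hinge_gan_loss(discriminator(cover),
                                    discriminator(encoded.detach()))
            opt_d.zero_grad(); l_disc.backward(); opt_d.step()
            # Encoder/decoder update via the joint loss L = λ1·L_E + λ2·L_C + λ3·L_D.
            decoded = decoder(distortion_layer(encoded))
            l_d = F.mse_loss(decoded, bits)        # decoder loss L_D
            l_e = encoder_loss(encoded, cover)     # hypothetical placeholder for L_E
            l_c = critic_loss(encoded, cover)      # hypothetical placeholder for L_C
            opt_g.zero_grad(); joint_loss(l_e, l_c, l_d).backward(); opt_g.step()
```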
The trained encoder and decoder form the screen-shot watermarking model. The carrier image and the watermark information are input into the encoder of the model to obtain the encoded image, which is displayed on a screen; after the screen is photographed with a mobile phone, the decoder of the model decodes the watermark information from the screen-shot image to obtain the decoded watermark information.
To verify the effectiveness of the method of the invention, the following tests were performed:
In the experiments, 15000 images were randomly selected from the COCO training set as the training set and 500 images were randomly selected from the COCO test set as the test set. PyTorch was chosen as the programming framework, an NVIDIA RTX 3070 GPU served as the training device, AOC LV273HUPR and CSO 1609 monitors served as the displays, and Realme RMX3366 and HUAWEI DBY-W09 devices served as the shooting equipment. The number of training epochs was set to 100 and the batch size to 16. In Tables 1 to 4 below, "the invention" denotes the screen-shot watermarking model proposed herein, and the other models compared are existing models, including the StegaStamp (StegaStamp: Invisible Hyperlinks in Physical Photographs) model, the RIHOOP (RIHOOP: Robust Invisible Hyperlinks in Offline and Online Photographs) model and the PIMoG (PIMoG: An Effective Screen-Shooting Noise-Layer Simulation for Deep-Learning-Based Watermarking Network) model. The average bit extraction accuracy is used as the evaluation metric in the robustness experiments.
Table 1 gives the results of the robustness experiments of the proposed model against moiré noise, compared with several other models. Since moiré noise intensity has no fixed evaluation metric, consistency of the moiré noise was ensured in this example by fixing the distance and angle of the device: shooting was performed at a fixed distance of 20 cm and an angle of 0°, i.e., directly facing the screen. Moiré images under strong, medium and weak noise interference are shown in FIG. 4.
TABLE 1 Moire noise test results at different intensities
From Table 1 it can be seen that the screen-shot watermarking model of the invention performs best, with an average extraction rate of up to 99.019% against moiré noise.

Table 2 shows the robustness experiments of the screen-shot watermarking model and the other models under shooting-angle interference of 40° left, 20° left, 0°, 20° right and 40° right; screen-shot images under the different angles are shown in FIG. 5. The data in Table 2 show that the proposed model performs best, with a higher extraction rate than the other existing models under all five angles.
Table 2 results of robustness experiments at different angles
Table 3 shows the robustness experiments of the screen-shot watermarking model and the other comparison models under different devices. The experimental results show that the proposed model outperforms existing schemes on devices of different brands and has higher robustness.
TABLE 3 results of experiments with different devices
Table 4 compares the image quality results of the proposed model with the other comparison models, including the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). The image quality of the invention is superior to the existing schemes, demonstrating the effectiveness of the proposed approach.
TABLE 4 results of image quality experiments
The comparison results show that the proposed watermarking method guarantees the quality of the encoded image while achieving a higher extraction rate than the other models, combining invisibility with robustness.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411448920.XA CN118967424B (en) | 2024-10-17 | 2024-10-17 | Screen shot image robust watermarking method based on attention mechanism and contrast learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118967424A CN118967424A (en) | 2024-11-15 |
CN118967424B true CN118967424B (en) | 2025-03-14 |
Family
ID=93393397
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411448920.XA Active CN118967424B (en) | 2024-10-17 | 2024-10-17 | Screen shot image robust watermarking method based on attention mechanism and contrast learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118967424B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN120374346B (en) * | 2025-06-26 | 2025-08-26 | 南京信息工程大学 | Anti-screen robust watermarking method based on generation of antagonism network and multiple tokens |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112200710A (en) * | 2020-10-08 | 2021-01-08 | 东南数字经济发展研究院 | Self-adaptive invisible watermark synchronous detection method based on deep learning |
CN116992407A (en) * | 2023-08-18 | 2023-11-03 | 湖南大学 | An anti-screen distortion watermarking method based on reversible bijection structure |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114648436A (en) * | 2022-03-16 | 2022-06-21 | 南京信息工程大学 | Screen shot resistant text image watermark embedding and extracting method based on deep learning |
CN118037518A (en) * | 2024-01-17 | 2024-05-14 | 武汉大学 | Traceable anti-watermark generation method and system for face image |
Also Published As
Publication number | Publication date |
---|---|
CN118967424A (en) | 2024-11-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||