
CN112364856B - Method and device for identifying flip image, computer equipment and storage medium - Google Patents

Method and device for identifying flip image, computer equipment and storage medium

Info

Publication number
CN112364856B
CN112364856B
Authority
CN
China
Prior art keywords
image
convolutional
information
module
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011271882.7A
Other languages
Chinese (zh)
Other versions
CN112364856A (en)
Inventor
石强
刘雨桐
熊娇
王国勋
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Resources Digital Technology Co Ltd
Original Assignee
China Resources Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Resources Digital Technology Co Ltd filed Critical China Resources Digital Technology Co Ltd
Priority to CN202011271882.7A
Publication of CN112364856A
Application granted
Publication of CN112364856B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/22Character recognition characterised by the type of writing
    • G06V30/224Character recognition characterised by the type of writing of printed characters having additional code marks or containing code marks
    • G06V30/2247Characters composed of bars, e.g. CMC-7
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2433Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/10Image enhancement or restoration using non-spatial domain filtering
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20056Discrete and fast Fourier transform, [DFT, FFT]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a device, computer equipment and a storage medium for identifying a flip image. The method comprises: acquiring sample images and generating first label information corresponding to each sample image; performing model training on a preset convolutional neural network model based on the sample images to obtain theoretical label information of the sample images; calculating a category loss and a spectrum loss with a preset loss function based on the first label information and the theoretical label information; optimizing the model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model; inputting a detection image into the convolutional neural network model and calculating its second label information; and judging whether the detection image is a flip image according to the category information of the second label information. The method guarantees the accuracy of flip image identification and improves the generalization of the model.

Description

Method and device for identifying flip image, computer equipment and storage medium
Technical Field
The present invention relates to the field of image identification, and in particular, to a method, an apparatus, a computer device, and a storage medium for identifying a flipped image.
Background
Current solutions for identifying flip images are generally based on image features such as frames, reflections or moiré patterns. For example, features such as LBP (local binary patterns), SIFT (scale-invariant feature transform), HOG (histogram of oriented gradients) and Haar features are extracted from the image, and the extracted feature values are then fed into classifiers such as an SVM (support vector machine) or XGBoost (extreme gradient boosting) for training. However, with the continuous development of photographing technology, the quality of flip images keeps improving and features such as moiré patterns and frames are no longer obvious, so the misidentification rate of these techniques rises and their limitations become increasingly apparent.
With the rapid development of artificial intelligence, binary classification networks based on neural networks have also begun to be applied to flip image identification. Current neural-network-based methods work well on certain specific flip images, such as those captured by a particular model of imaging device. However, the sources of flip images are diverse, and images recaptured by imaging devices of different models and manufacturers differ, so it is difficult to collect every type of flip image; the generalization of neural-network-based flip image identification methods is therefore poor.
Disclosure of Invention
The embodiments of the invention provide a method, a device, computer equipment and a storage medium for identifying a flip image, aiming to solve the prior-art problem that the accuracy and the generalization of flip image identification cannot be achieved at the same time.
In a first aspect, an embodiment of the present invention provides a method for identifying a flipped image, including:
Acquiring sample images and generating first label information corresponding to each sample image, wherein the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample images;
Based on the sample image, carrying out model training on a preset convolutional neural network model to obtain theoretical label information of the sample image, wherein the theoretical label information at least comprises theoretical category information and theoretical spectrum information of the sample image;
Calculating category loss and spectrum loss by using a preset loss function based on the first label information and the theoretical label information, and optimizing model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model;
Inputting a detection image to the convolutional neural network model, and calculating and acquiring second label information of the detection image;
judging whether the detected image is a flip image or not according to the category information of the second label information.
In a second aspect, an embodiment of the present invention provides a reproduction image recognition apparatus, including:
the acquisition unit is used for acquiring sample images and generating first label information corresponding to each sample image, wherein the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample images;
the first calculation unit is used for carrying out model training on a preset convolutional neural network model based on the sample image to obtain theoretical label information of the sample image, wherein the theoretical label information at least comprises theoretical category information and theoretical spectrum information of the sample image;
The adjustment unit is used for calculating category loss and spectrum loss by using a preset loss function based on the first label information and the theoretical label information, and optimizing model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model;
The second calculating unit is used for inputting a detection image to the convolutional neural network model and calculating second label information of the detection image;
and the judging unit is used for judging whether the detection image is a flip image or not according to the category information of the second label information.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the method for identifying a flipped image according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a computer program, where the computer program when executed by a processor causes the processor to perform the above-mentioned method for identifying a flipped image according to the first aspect.
The embodiments of the invention provide a method, a device, computer equipment and a storage medium for identifying a flip image. The method comprises: acquiring sample images and generating first label information corresponding to each sample image; performing model training on a preset convolutional neural network model based on the sample images to obtain theoretical label information of the sample images; calculating a category loss and a spectrum loss with a preset loss function based on the first label information and the theoretical label information; optimizing the model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model; inputting a detection image into the convolutional neural network model and calculating its second label information; and judging whether the detection image is a flip image according to the category information of the second label information. By adding spectrum information to the convolutional network for auxiliary supervision during training, the convolutional neural network also acquires the ability to learn spectrum information, which guarantees the accuracy of the model in identifying flip images while improving its generalization.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for identifying a flipped image according to an embodiment of the present invention;
Fig. 2 is a flowchart of step S10 of the method for identifying a flipped image according to the embodiment of the present invention;
Fig. 3 is a flowchart of step S12 of the method for identifying a flipped image according to the embodiment of the present invention;
fig. 4 is a flowchart of step S50 of the method for identifying a flipped image according to the embodiment of the present invention;
FIG. 5 is a schematic block diagram of a device for identifying a flipped image according to an embodiment of the present invention;
Fig. 6 is a schematic block diagram of an acquisition unit 110 of the apparatus for identifying a flip image according to the embodiment of the present invention;
FIG. 7 is a schematic block diagram of a transformation subunit 112 of a flipped image recognition device provided in an embodiment of the present invention;
fig. 8 is a schematic block diagram of a judging unit 150 of the apparatus for identifying a flipped image according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1, fig. 1 is a flowchart of a method for identifying a flipped image according to an embodiment of the present invention, where the flipped image identifying method is applied to a server, and the method is executed by application software installed in the server.
As shown in FIG. 1, the method includes steps S10 to S50.
Step S10, acquiring sample images and generating first label information corresponding to each sample image, wherein the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample images.
Image flipping (recapture) refers to reproducing a document or picture by photographing it, for example displaying an existing image on a screen or laser-printing it and then capturing it again with an imaging device; the result is a flip image. For example, in the quick-service restaurant industry, a manufacturer may schedule inspectors to visit stores and take live photos that are uploaded to the system as evidence. However, cheating often occurs: an inspector may recapture an image from the screen of another electronic device and upload the flip image to the system. Such flip cheating not only causes direct economic losses to enterprises but also misleads their marketing management. Identifying whether an image uploaded by a store is flipped is therefore an urgent need for quick-service businesses, and the same need exists in other industries.
In this embodiment, a sample image is obtained by shooting with a terminal camera, from a local album, or from a network-side server. The sample images comprise real images and flip images, and each sample image is processed into label information expressed as a multidimensional vector for training the convolutional neural network model. The label information comprises the real category information and the real spectrum information of the sample image, where the category information indicates whether the sample image is a real image or a flip image. Because real images and flip images differ to some extent in lighting, environment and the like, their frequency and energy distributions on the spectrogram differ considerably, so the intrinsic frequency-domain characteristics of an image can be extracted through its spectrum information.
In one embodiment, as shown in fig. 2, step S10 includes:
Step S11, acquiring a sample image, and carrying out graying treatment on the sample image to obtain a gray image of the sample image;
S12, carrying out Fourier transform on the gray level image according to a preset rule to obtain an optimal Fourier spectrogram;
S13, converting a frequency domain matrix corresponding to the optimal Fourier spectrogram into a multidimensional vector to obtain real spectrum information of the sample image;
Step S14, obtaining real category information of the sample image, and combining the real category information with the real spectrum information to obtain the first label information of the sample image.
In this embodiment, owing to the imaging differences of real and flip images under different environments and lighting, the frequency and energy distributions of the two types of images on the spectrogram differ considerably; the spectrum information of real and flip images is therefore extracted to produce the label information for training the convolutional neural network model.
First, an image contains three channels: red, green and blue. Each pixel in the image has three channel component values, the red (R), green (G) and blue (B) component values. Each pixel of a sample image is gray-scaled, and the gray value of each pixel is obtained by the weighted average formula, thereby obtaining the gray image of the image. The weighted average formula is Y = 0.299R + 0.587G + 0.114B, where Y is the gray component value of a pixel, R is its red component value, G its green component value and B its blue component value. After the gray image of each sample image is obtained, a Fourier transform is applied to the gray image according to a preset rule to obtain the optimal Fourier spectrogram of the image. The Fourier transform formula is as follows:
F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi \left(\frac{ux}{M} + \frac{vy}{N}\right)}

where f(x, y) represents an image matrix of size M×N, with x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1; F(u, v) represents the frequency domain matrix after the discrete Fourier transform, with u = 0, 1, 2, ..., M-1 and v = 0, 1, 2, ..., N-1; e is the base of the natural logarithm and j represents the imaginary unit.
The frequency domain matrix corresponding to the obtained optimal Fourier spectrogram is converted into a multidimensional vector, which serves as the spectrum information of the sample image. Finally, the category information of the sample image is represented by a vector of a preset dimension and combined with the multidimensional spectrum vector to obtain the label information of the sample image. For example, if the category information of a sample image is represented by a 2-dimensional vector and its spectrum information by a 100-dimensional vector, combining the two yields 102-dimensional label information representing the sample image.
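As a concrete illustration of steps S11 to S13, the grayscale conversion and the Fourier spectrum extraction can be sketched in Python as follows. This is a minimal sketch, not the claimed implementation; the frequency centring (fftshift) and log scaling are conventional stabilizing steps that the embodiment does not prescribe, and the normalization and resizing detailed below are applied afterwards.

import numpy as np

def to_gray(img_rgb):
    # Weighted-average grayscale: Y = 0.299R + 0.587G + 0.114B
    r, g, b = img_rgb[..., 0], img_rgb[..., 1], img_rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def fourier_spectrum(gray):
    # 2-D discrete Fourier transform F(u, v) of the gray image f(x, y)
    f = np.fft.fft2(gray)
    f = np.fft.fftshift(f)       # move the zero frequency to the centre
    return np.log1p(np.abs(f))   # log-magnitude spectrum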
In one embodiment, as shown in fig. 3, step S12 includes:
s121, carrying out Fourier transform on the gray level image to obtain an initial Fourier spectrogram;
And step S122, carrying out normalization processing on the initial Fourier spectrogram, and adjusting the initial Fourier spectrogram according to a preset size specification to obtain an optimal Fourier spectrogram.
In this embodiment, the range of the data in the initial Fourier spectrogram obtained from the gray image is relatively large, which is unfavorable for training the convolutional neural network model. The initial Fourier spectrogram therefore needs to be normalized to narrow the data range. The data of the initial Fourier spectrogram are mapped into the range 0 to 1 using max-min normalization, with the following formula:
X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}

where X_norm is the normalized data, X is an original value in the initial Fourier spectrogram, and X_max and X_min are respectively the maximum and minimum of all original values. After normalization, to facilitate the data representation of the initial Fourier spectrogram, its frequency domain matrix is adjusted to a preset size specification to obtain the optimal Fourier spectrogram. The preset size specification may be M×N, preferably 10×10.
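Putting the steps together, the max-min normalization, the resizing to the preferred 10×10 specification and the assembly of the 102-dimensional first label information could look as follows. This continues the sketch above; the use of cv2.resize, the small epsilon guarding against division by zero, and the one-hot class encoding (first position for the real image, matching the decision rule described later) are implementation choices, not requirements of the embodiment.

import numpy as np
import cv2  # used only for resizing; any resize routine would do

def first_label(gray, is_real, size=(10, 10)):
    spec = fourier_spectrum(gray)                    # from the sketch above
    # max-min normalization: X_norm = (X - X_min) / (X_max - X_min)
    spec = (spec - spec.min()) / (spec.max() - spec.min() + 1e-12)
    spec = cv2.resize(spec, size)                    # optimal 10x10 spectrogram
    # first position: real image; second position: flip image
    class_vec = np.array([1.0, 0.0]) if is_real else np.array([0.0, 1.0])
    return np.concatenate([class_vec, spec.flatten()])  # 102-dim first label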
Step S20, based on the sample image, carrying out model training on a preset convolutional neural network model to obtain theoretical label information of the sample image, wherein the theoretical label information at least comprises theoretical category information and theoretical spectrum information of the sample image;
And step S30, calculating category loss and spectrum loss by using a preset loss function based on the first label information and the theoretical label information, and optimizing model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain the optimal convolutional neural network model.
In this embodiment, to avoid excessive errors in the model parameters of the convolutional neural network model, the preset convolutional neural network model is trained with the sample images to obtain theoretical label information of the sample images; then, based on the real label information and the theoretical label information of the sample images, the category loss and the spectrum loss are calculated according to the loss function, and the model parameters are optimized accordingly. It should be noted that the convolutional neural network model outputs a multidimensional vector containing both theoretical category information and theoretical spectrum information. For a 102-dimensional vector, for example, the first 2 dimensions store the theoretical category information of the image and the remaining 100 dimensions store its theoretical spectrum information.
In one embodiment, the network structure of the convolutional neural network model is shown in Table 1.

TABLE 1

Module          Layers                      Kernel  Stride  Padding  Channels
Conv module 1   2 conv layers + 1 pooling   3×3     1       1        64
Conv module 2   2 conv layers + 1 pooling   3×3     2       1        128
Conv module 3   2 conv layers + 1 pooling   3×3     2       1        256
Conv module 4   2 conv layers + 1 pooling   3×3     2       1        512
Output module   2 fully-connected layers    -       -       -        -
The convolutional neural network model comprises 4 convolution modules and 1 output module: each convolution module comprises 2 convolution layers and 1 pooling layer, and the output module comprises 2 fully-connected layers. Each convolution layer uses a 3×3 convolution kernel, and each pooling layer has a 2×2 pooling size. The stride of the convolution layers in the first convolution module is 1, and the stride of the convolution layers in the second, third and fourth convolution modules is 2. The padding of each convolution layer is 1. The number of convolution-layer channels is 64 in the first convolution module, 128 in the second, 256 in the third and 512 in the fourth. It should be noted that the structure of the convolutional neural network model can be designed as needed.
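A possible realization of this architecture in PyTorch is sketched below. It assumes a 3-channel input and applies each module's stride to the first convolution of the module (the text is ambiguous as to whether both layers are strided); the hidden width of the first fully-connected layer (512) is likewise an assumption, since the embodiment does not specify it. Batch normalization and Dropout layers are placed as described in the following paragraphs.

import torch
import torch.nn as nn

def conv_module(in_ch, out_ch, stride):
    # 2 conv layers (3x3 kernel, padding 1) with batch norm and ReLU,
    # then a 2x2 pooling layer followed by a Dropout layer
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Dropout(0.25),
    )

class FlipNet(nn.Module):
    def __init__(self, num_outputs=102):
        super().__init__()
        self.features = nn.Sequential(
            conv_module(3, 64, stride=1),     # first convolution module
            conv_module(64, 128, stride=2),   # second convolution module
            conv_module(128, 256, stride=2),  # third convolution module
            conv_module(256, 512, stride=2),  # fourth convolution module
        )
        self.output = nn.Sequential(          # output module: 2 FC layers
            nn.Flatten(),
            nn.LazyLinear(512), nn.ReLU(inplace=True), nn.Dropout(0.25),
            nn.Linear(512, num_outputs),      # 2 class dims + 100 spectrum dims
        )

    def forward(self, x):
        return self.output(self.features(x))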
In this embodiment, to mitigate the shift in the distribution of intermediate-layer data during training of the convolutional neural network model, the output of each convolution layer is batch normalized; this speeds up training, improves model training accuracy and also acts as a regularizer. The batch normalization formulas are as follows:
\mu = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu)^2,

x'_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}, \qquad y_i = \gamma x'_i + \beta,

where x_i is the i-th input datum, m is the number of input data per batch, \mu is the mean, \sigma^2 is the variance, \epsilon is a very small number, x'_i is the normalized datum, \gamma is the scale factor, \beta is the offset (\gamma and \beta are parameters to be learned), and y_i is the batch-normalized output.
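For illustration, the batch normalization formulas above amount to the following NumPy sketch (a per-feature version over one batch; in the convolutional model this is handled by the framework's batch normalization layers):

import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    mu = x.mean(axis=0)                     # batch mean
    var = x.var(axis=0)                     # batch variance sigma^2
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalized x'_i
    return gamma * x_hat + beta             # y_i = gamma * x'_i + beta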
In an embodiment, since the output of the model comprises two parts, category information and spectrum information, the loss function likewise comprises two parts, a category loss and a spectrum loss; the category loss uses cross entropy and the spectrum loss uses a squared loss function. The initial network parameters are optimized through the loss function. During training, a sample image is input into the model to obtain theoretical label information; based on the theoretical category and spectrum information in the theoretical label information and the real category and spectrum information in the real label information of the sample image, the losses are calculated according to the loss function and the model parameters are then optimized. The loss function formula is as follows:
L = -\frac{1}{N} \sum_{i=1}^{N} \left[ c_i \log p_i + (1 - c_i) \log(1 - p_i) \right] + \frac{1}{N} \sum_{i=1}^{N} \frac{1}{M} \sum_{j=1}^{M} \left( y_{ij} - \hat{y}_{ij} \right)^2

where N is the number of sample images, M is the vector dimension of the real spectrum information, p_i is the probability predicted by the model that the sample image is a real image, c_i takes the value 1 if the sample image is a real image and 0 otherwise, y_{ij} is the real spectrum information corresponding to the sample image, and \hat{y}_{ij} is the theoretical spectrum information output by the model.
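A sketch of this two-part loss in PyTorch follows. It weights the two terms equally, which matches the formula above but is otherwise a design choice; the class encoding (index 0 for a real image) follows the label layout assumed in the earlier sketches.

import torch
import torch.nn.functional as F

def combined_loss(pred, target):
    # pred and target are (N, 102): dims 0-1 carry the class part,
    # dims 2-101 the spectrum part
    class_logits, spec_pred = pred[:, :2], pred[:, 2:]
    class_true = target[:, :2].argmax(dim=1)                 # 0 = real, 1 = flip
    class_loss = F.cross_entropy(class_logits, class_true)   # category loss
    spectrum_loss = F.mse_loss(spec_pred, target[:, 2:])     # spectrum loss
    return class_loss + spectrum_loss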
Further, to avoid overfitting of the convolutional neural network model, a Dropout layer (discarding layer) is added after each pooling layer and each fully-connected layer; the Dropout parameter is customized according to actual needs. For example, with the Dropout parameter set to 0.25, each neuron of that layer has a 25% probability of being discarded (deactivated) in each training iteration and does not participate in that iteration.
And S40, inputting a detection image into the convolutional neural network model, and calculating second label information of the detection image.
In this embodiment, after the optimal convolutional neural network model has been determined using the sample images, the convolutional neural network model and the loss function of the above embodiments, the label information of the detection image is calculated by the convolutional neural network model with the determined optimal model parameters.
And S50, judging whether the detected image is a flip image or not according to the category information of the second label information.
In this embodiment, the multidimensional vector representing the category information in the second label information output by the convolutional neural network model is identified, so as to determine whether the detection image is a real image or a flip image.
For example, suppose the convolutional neural network model outputs 102-dimensional label information whose first 2 dimensions represent the category information. The first position of these 2 dimensions represents the probability that the image is a real image and the second position the probability that it is a flip image. If the value at the first position is larger than that at the second, the image is judged to be a real image; otherwise it is judged to be a flip image.
In one embodiment, as shown in fig. 4, step S50 includes:
Step S51, identifying category information of the second label information, wherein the category information comprises a first probability that the image is a real image and a second probability that it is a flip image;
step S52, judging whether the first probability is larger than the second probability;
Step S531, if the first probability is larger than the second probability, judging that the detected image is a real image;
and S532, if the first probability is smaller than the second probability, judging that the detected image is a flip image.
In this embodiment, the category information representing the image category in the second label information of the detection image is identified; the category information comprises a real-image probability (the first probability) and a flip-image probability (the second probability). The detection image is then determined to be a real image or a flip image by comparing the two probabilities: if the real-image probability is larger than the flip-image probability, the detection image is judged to be a real image; if it is smaller, the detection image is judged to be a flip image.
For example, if in the category information of the current detection image the real-image probability is 1 and the flip-image probability is 0, the current detection image is a real image; if the real-image probability is 0 and the flip-image probability is 1, the current detection image is a flip image.
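The decision rule of steps S51 to S532 reduces to comparing the two class positions of the model output; a minimal sketch, assuming a trained FlipNet as in the architecture sketch above:

import torch

def is_flipped(model, image):
    # image: (3, H, W) tensor for one detection image
    model.eval()
    with torch.no_grad():
        out = model(image.unsqueeze(0))[0]  # second label information
    p_real, p_flip = out[0].item(), out[1].item()
    return p_flip > p_real                  # larger flip score => flip image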
The method ensures the accuracy of the convolutional neural network model in identifying the flip image and improves the generalization of the model.
The embodiment of the invention also provides a device for identifying the flipped image, which is used for executing any embodiment of the method for identifying the flipped image. Specifically, referring to fig. 5, fig. 5 is a schematic block diagram of a flipped image recognition apparatus according to an embodiment of the present invention. The apparatus 100 for recognizing a flip image may be configured in a server.
As shown in fig. 5, the apparatus 100 for recognizing a reproduction image includes an acquisition unit 110, a first calculation unit 120, an adjustment unit 130, a second calculation unit 140, and a judgment unit 150.
The acquiring unit 110 is configured to acquire sample images and generate first label information corresponding to each sample image, where the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample image.
The first computing unit 120 is configured to perform model training on a preset convolutional neural network model based on the sample image, and obtain theoretical label information of the sample image, where the theoretical label information at least includes theoretical category information and theoretical spectrum information of the sample image;
And the adjusting unit 130 is configured to calculate a class loss and a spectrum loss according to the first tag information and the theoretical tag information by using a preset loss function, and optimize model parameters of the convolutional neural network model according to the class loss and the spectrum loss, so as to obtain the optimal convolutional neural network model.
And a second calculating unit 140, configured to input a detection image into the convolutional neural network model and calculate and acquire second label information of the detection image.
And a judging unit 150, configured to judge whether the detected image is a flip image according to the category information of the second tag information.
In one embodiment, as shown in fig. 6, the acquisition unit 110 includes:
An obtaining subunit 111, configured to obtain a sample image, and perform graying processing on the sample image to obtain a gray image of the sample image;
a transformation subunit 112, configured to perform fourier transformation on the gray-scale image according to a preset rule, so as to obtain an optimal fourier spectrum diagram;
a conversion subunit 113, configured to convert the frequency domain matrix corresponding to the optimal fourier spectrum graph into a multidimensional vector, so as to obtain real spectrum information of the sample image;
And the combining subunit 114 is configured to obtain the real category information of the sample image and combine the real category information with the real spectrum information to obtain the first label information of the sample image.
In one embodiment, as shown in FIG. 7, the transform subunit 112 includes:
A fourier transform subunit 1121, configured to perform fourier transform on the gray-scale image to obtain an initial fourier spectrum diagram;
And the normalization subunit 1122 is configured to normalize the initial fourier spectrum, and adjust the initial fourier spectrum according to a preset size specification, so as to obtain an optimal fourier spectrum.
In one embodiment, as shown in fig. 8, the determining unit 150 includes:
An identifying subunit 151, configured to identify category information of the second label information, where the category information comprises a first probability that the image is a real image and a second probability that it is a flip image;
and a determining subunit 152 configured to determine whether the first probability is greater than a second probability, determine that the detected image is a real image if the first probability is greater than the second probability, and determine that the detected image is a flip image if the first probability is less than the second probability.
The specific content of the above apparatus embodiment for identifying a flip image corresponds to that of the method embodiment for identifying a flip image; for details, refer to the description of the method embodiment, which is not repeated here.
In another embodiment of the present invention, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for identifying a flipped image as described above when executing the computer program.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps: acquiring sample images and generating first label information corresponding to each sample image; performing model training on a preset convolutional neural network model based on the sample images to obtain theoretical label information of the sample images; calculating a category loss and a spectrum loss with a preset loss function based on the first label information and the theoretical label information; optimizing the model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model; inputting a detection image into the convolutional neural network model and calculating its second label information; and judging whether the detection image is a flip image according to the category information of the second label information.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus, device and unit described above may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein. Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the units is merely a logical function division, there may be another division manner in actual implementation, or units having the same function may be integrated into one unit, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present invention.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units may be stored in a storage medium if implemented in the form of software functional units and sold or used as stand-alone products. Based on this understanding, the technical solution of the invention, in essence or in the part contributing to the prior art, in whole or in part, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the invention. The storage medium includes any medium capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made and equivalents will be apparent to those skilled in the art without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (7)

1. A method for identifying a flip image, comprising:
Acquiring sample images and generating first label information corresponding to each sample image, wherein the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample images;
Based on the sample image, carrying out model training on a preset convolutional neural network model to obtain theoretical label information of the sample image, wherein the theoretical label information at least comprises theoretical category information and theoretical spectrum information of the sample image;
Calculating category loss and spectrum loss by using a preset loss function based on the first label information and the theoretical label information, and optimizing model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model; the convolutional neural network model comprises 4 convolutional modules and 1 output module, wherein each convolutional module comprises 2 convolutional layers and 1 pooling layer, the output module comprises 2 fully-connected layers, each convolutional layer uses a 3×3 convolution kernel, each pooling layer has a pooling size of 2×2, the 4 convolutional modules are respectively a first convolutional module, a second convolutional module, a third convolutional module and a fourth convolutional module, the stride of the convolutional layers in the first convolutional module is 1, the stride of the convolutional layers in the second convolutional module, the third convolutional module and the fourth convolutional module is 2, the padding of each convolutional layer is 1, the number of convolutional-layer channels of the first convolutional module is 64, the number of convolutional-layer channels of the second convolutional module is 128, the number of convolutional-layer channels of the third convolutional module is 256, the number of convolutional-layer channels of the fourth convolutional module is 512, and the output information of each convolutional layer is batch normalized;
Inputting a detection image to the convolutional neural network model, and calculating and acquiring second label information of the detection image;
judging whether the detection image is a flip image or not according to the category information of the second tag information;
The acquiring sample images and generating first label information corresponding to each sample image comprises the steps of acquiring a sample image, carrying out graying processing on the sample image to obtain a gray image of the sample image, carrying out Fourier transform on the gray image according to a preset rule to obtain an optimal Fourier spectrogram, converting a frequency domain matrix corresponding to the optimal Fourier spectrogram into a multidimensional vector to obtain real spectrum information of the sample image, acquiring real category information of the sample image, representing the category information of the sample image by a vector with a preset dimension, and combining the real category information with the real spectrum information to obtain the first label information of the sample image;
The method comprises the steps of carrying out Fourier transform on the gray level image according to a preset rule to obtain an optimal Fourier spectrum diagram, carrying out Fourier transform on the gray level image to obtain an initial Fourier spectrum diagram, carrying out normalization processing on the initial Fourier spectrum diagram, and adjusting the initial Fourier spectrum diagram according to a preset size specification to obtain the optimal Fourier spectrum diagram.
2. The method of claim 1, wherein the fourier transform is formulated as follows:
F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi \left(\frac{ux}{M} + \frac{vy}{N}\right)}

where f(x, y) represents an image matrix of size M×N, with x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1; F(u, v) represents the frequency domain matrix after the Fourier transform, with u = 0, 1, 2, ..., M-1 and v = 0, 1, 2, ..., N-1; e is the base of the natural logarithm and j represents the imaginary unit.
3. The method of claim 1, wherein the loss function formula is as follows:
L = -\frac{1}{N} \sum_{i=1}^{N} \left[ c_i \log p_i + (1 - c_i) \log(1 - p_i) \right] + \frac{1}{N} \sum_{i=1}^{N} \frac{1}{M} \sum_{j=1}^{M} \left( y_{ij} - \hat{y}_{ij} \right)^2

where N is the number of sample images, M is the vector dimension of the real spectrum information, p_i is the probability predicted by the model that the sample image is a real image, c_i takes the value 1 if the sample image is a real image and 0 otherwise, y_{ij} is the real spectrum information corresponding to the sample image, and \hat{y}_{ij} is the theoretical spectrum information output by the model.
4. The method of claim 1, wherein determining whether the detected image is a reproduction image according to the category information of the second tag information comprises:
Identifying category information of the second tag information, wherein the category information comprises a first probability that the detected image is a real image and a second probability that the detected image is a flip image;
Judging whether the first probability is larger than a second probability;
if the first probability is larger than the second probability, judging that the detection image is a real image;
And if the first probability is smaller than the second probability, judging that the detection image is a flip image.
5. A flip image recognition device, comprising:
the acquisition unit is used for acquiring sample images and generating first label information corresponding to each sample image, wherein the first label information is a multidimensional vector and at least comprises real category information and real spectrum information of the sample images;
the first calculation unit is used for carrying out model training on a preset convolutional neural network model based on the sample image to obtain theoretical label information of the sample image, wherein the theoretical label information at least comprises theoretical category information and theoretical spectrum information of the sample image;
The adjustment unit is used for calculating category loss and spectrum loss by using a preset loss function based on the first label information and the theoretical label information, and optimizing model parameters of the convolutional neural network model according to the category loss and the spectrum loss to obtain an optimal convolutional neural network model; the convolutional neural network model comprises 4 convolutional modules and 1 output module, wherein each convolutional module comprises 2 convolutional layers and 1 pooling layer, the output module comprises 2 fully-connected layers, each convolutional layer uses a 3×3 convolution kernel, each pooling layer has a pooling size of 2×2, the 4 convolutional modules are respectively a first convolutional module, a second convolutional module, a third convolutional module and a fourth convolutional module, the stride of the convolutional layers in the first convolutional module is 1, the stride of the convolutional layers in the second convolutional module, the third convolutional module and the fourth convolutional module is 2, the padding of each convolutional layer is 1, the number of convolutional-layer channels of the first convolutional module is 64, the number of convolutional-layer channels of the second convolutional module is 128, the number of convolutional-layer channels of the third convolutional module is 256, the number of convolutional-layer channels of the fourth convolutional module is 512, and the output information of each convolutional layer is batch normalized;
The second calculating unit is used for inputting a detection image to the convolutional neural network model and calculating second label information of the detection image;
the judging unit is used for judging whether the detection image is a flip image or not according to the category information of the second label information;
The acquisition unit is further configured to acquire a sample image and carry out graying processing on the sample image to obtain a gray image of the sample image, carry out Fourier transform on the gray image according to a preset rule to obtain an optimal Fourier spectrogram, convert a frequency domain matrix corresponding to the optimal Fourier spectrogram into a multidimensional vector to obtain real spectrum information of the sample image, acquire real category information of the sample image, represent the category information of the sample image by a vector with a preset dimension, and combine the real category information with the real spectrum information to obtain the first label information of the sample image;
The method comprises the steps of carrying out Fourier transform on the gray level image according to a preset rule to obtain an optimal Fourier spectrum diagram, carrying out Fourier transform on the gray level image to obtain an initial Fourier spectrum diagram, carrying out normalization processing on the initial Fourier spectrum diagram, and adjusting the initial Fourier spectrum diagram according to a preset size specification to obtain the optimal Fourier spectrum diagram.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of identifying a flipped image as claimed in any one of claims 1 to 4 when executing the computer program.
7. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform a method of identifying a flipped image as claimed in any one of claims 1 to 4.
CN202011271882.7A 2020-11-13 2020-11-13 Method and device for identifying flip image, computer equipment and storage medium Active CN112364856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011271882.7A CN112364856B (en) 2020-11-13 2020-11-13 Method and device for identifying flip image, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011271882.7A CN112364856B (en) 2020-11-13 2020-11-13 Method and device for identifying flip image, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112364856A CN112364856A (en) 2021-02-12
CN112364856B 2024-11-29

Family

ID=74515601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011271882.7A Active CN112364856B (en) 2020-11-13 2020-11-13 Method and device for identifying flip image, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112364856B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033324B (en) * 2021-03-03 2024-03-08 广东省地质环境监测总站 Geological disaster precursor factor identification method and device, electronic equipment and storage medium
CN114708630A (en) * 2022-03-23 2022-07-05 贝壳找房网(北京)信息技术有限公司 Method, apparatus, electronic device, medium, and program for recognizing image
CN114820347B (en) * 2022-04-01 2025-04-04 原力图新(重庆)科技有限公司 Image processing method, electronic device, storage medium and computer program product
CN115239649A (en) * 2022-07-08 2022-10-25 中国烟草总公司广西壮族自治区公司 Picture copying identification method, system and device and storage medium
CN116259113A (en) * 2022-12-21 2023-06-13 深圳市视美泰技术股份有限公司 Method for assisting supervision and training of living body detection model based on Fourier spectrogram

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886275A (en) * 2019-01-16 2019-06-14 深圳壹账通智能科技有限公司 Reproduction image recognition method, device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK2024899T3 (en) * 2005-09-05 2016-02-15 Alpvision S A Means of use of the material surface microstructure as a unique identifier
CN111428740A (en) * 2020-02-28 2020-07-17 深圳壹账通智能科技有限公司 Detection method and device for network-shot photo, computer equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886275A (en) * 2019-01-16 2019-06-14 深圳壹账通智能科技有限公司 Reproduction image recognition method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112364856A (en) 2021-02-12

Similar Documents

Publication Publication Date Title
CN112364856B (en) Method and device for identifying flip image, computer equipment and storage medium
US11625805B2 (en) Learning systems and methods
CN109993201B (en) Image processing method, device and readable storage medium
CN111444765B (en) Image re-identification method, training method of related model, related device and equipment
CN114139013B (en) Image searching method, device, electronic equipment and computer readable storage medium
JP2010134957A (en) Pattern recognition method
CN110298394B (en) Image recognition method and related device
CN114444565B (en) Image tampering detection method, terminal equipment and storage medium
CN116246174B (en) Sweet potato variety identification method based on image processing
AU2017443986B2 (en) Color adaptation using adversarial training networks
CN112633340B (en) Target detection model training and detection method, device and storage medium
Mohammed et al. Proposed approach for automatic underwater object classification
CN110610131B (en) Face movement unit detection method and device, electronic equipment and storage medium
CN115205155A (en) Distorted image correction method and device and terminal equipment
CN117576617B (en) Decoding system based on automatic adjustment of different environments
Biswas et al. A novel inspection of paddy leaf disease classification using advance image processing techniques
CN110781812A (en) Method for automatically identifying target object by security check instrument based on machine learning
Islam Full reference image quality assessment using siamese neural network
US20250274572A1 (en) Fast adaptation for cross-camera color constancy
US20250191118A1 (en) Semantic knowledge-based texture prediction for enhanced image restoration
CN118411568B (en) Training of target recognition model, target recognition method, system, equipment and medium
CN120355999A (en) Quick screening and label classification method and system for rehabilitation scene images
Prakash et al. Multi-modal Vit-Wavenet: A novel algorithm for precise identification and classification of image noise
Gopal et al. Blind Image De-Blurring with PNN and Random Forest Regression Model
CN119417769A (en) Intelligent monitoring method, system, device and storage medium for inkjet printer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 801, building 2, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen, Guangdong 518000

Applicant after: China Resources Digital Technology Co.,Ltd.

Address before: Room 801, building 2, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen, Guangdong 518000

Applicant before: Runlian software system (Shenzhen) Co.,Ltd.

Country or region before: China

GR01 Patent grant