Disclosure of Invention
The invention mainly aims to provide an image acquisition method, an image acquisition device, image acquisition equipment and a computer readable storage medium, so as to solve the problem that an existing card image acquisition mode lacks an auditing mechanism.
To solve the above technical problem, the embodiments of the invention adopt the following technical solutions:
An embodiment of the invention provides an image acquisition method, comprising: acquiring a target image; inputting the target image into a pre-trained target detection model and obtaining a detection result output by the target detection model, wherein the detection result comprises a card frame confidence and a feature frame confidence of a card image detected by the target detection model in the target image; and executing an image acquisition operation for the target image when the card frame confidence is greater than a first confidence threshold and the feature frame confidence is greater than a second confidence threshold.
The detection result further comprises opposite-corner coordinates of the card frame of the card image, and executing the image acquisition operation for the target image comprises: determining, according to the opposite-corner coordinates of the card frame, whether the card image is within a preset area range of the target image; if so, acquiring the target image; otherwise, discarding the target image and obtaining a next target image from the video stream to which the target image belongs, so that the next target image is input into the target detection model.
The method further comprises: when the video stream to which the target image belongs is shot in real time and the card image is not within the preset area range of the target image, displaying preset card position prompt information before the next target image is obtained from the video stream, so as to prompt the user to move the card so that the card image in the next target image falls within the preset area range of the next target image.
The method further comprises: when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, discarding the target image and obtaining a next target image from the video stream to which the target image belongs, so that the next target image is input into the target detection model.
The method further comprises: if the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, displaying preset card correction prompt information before the next target image is obtained from the video stream, so as to prompt the user to switch to the correct type of card, so that the card frame confidence of the card image in the next target image is greater than the first confidence threshold and the feature frame confidence is greater than the second confidence threshold.
The method further comprises: when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, determining an image acquisition duration from the moment the first target image was obtained from the video stream to the current moment; and when the image acquisition duration is greater than a preset duration threshold, updating the first confidence threshold and the second confidence threshold, wherein the updated first confidence threshold is smaller than the first confidence threshold before updating, and the updated second confidence threshold is smaller than the second confidence threshold before updating.
The method further comprises: if, before the target image is acquired, the first confidence threshold exceeded by the card frame confidence of the card image in the target image is an updated first confidence threshold, and the second confidence threshold exceeded by the feature frame confidence of the card image in the target image is an updated second confidence threshold, setting a corresponding confidence threshold degradation identifier for the target image when the target image is acquired.
The number of parameters of the target detection model is smaller than a preset parameter number threshold, and/or the number of floating-point operations (FLOPs) performed by the target detection model is smaller than a preset floating-point operation number threshold, and/or the frame rate of the target detection model is greater than a preset frame rate threshold.
An embodiment of the invention further provides an image acquisition device, comprising: an acquisition module configured to acquire a target image; a detection module configured to input the target image into a pre-trained target detection model and obtain a detection result output by the target detection model, wherein the detection result comprises a card frame confidence and a feature frame confidence of a card image detected by the target detection model in the target image; and an execution module configured to execute an image acquisition operation for the target image when the card frame confidence is greater than a first confidence threshold and the feature frame confidence is greater than a second confidence threshold.
The embodiment of the invention also provides image acquisition equipment which comprises a processor and a memory, wherein the processor is used for executing an image acquisition program stored in the memory so as to realize the image acquisition method of any one of the above.
Embodiments of the present invention also provide a computer-readable storage medium storing one or more programs executable by one or more processors to implement the image acquisition method described in any one of the above.
The embodiment of the invention has the following beneficial effects:
The embodiments of the invention audit the card image in the target image: the card frame confidence and the feature frame confidence of the card image are determined, and whether to execute the image acquisition operation is decided according to these two values. The image acquisition operation is executed only if the card frame confidence is greater than the first confidence threshold and the feature frame confidence is greater than the second confidence threshold. This avoids acquiring card images of the wrong type of card, reduces the computation and duration of subsequent image recognition, and improves image recognition efficiency.
Detailed Description
The present invention will be described in further detail with reference to the drawings and the embodiments, in order to make the objects, technical solutions and advantages of the present invention more apparent.
According to an embodiment of the present invention, there is provided an image acquisition method. Fig. 1 is a flowchart of an image acquisition method according to an embodiment of the invention.
Step S110, a target image is acquired.
The target image is the image to be audited. Further, the target image may include a card image of a target type card.
Specifically, in the received video stream, one of the video images is acquired, and the video image is taken as a target image. The video stream is a video stream photographed in real time or a video stream photographed in advance.
For example, a camera can be called to shoot a video stream in real time, the video stream returned by the camera is received, and a first frame video image of the video stream is obtained as a target image.
Step S120, inputting the target image into a pre-trained target detection model and obtaining a detection result output by the target detection model, wherein the detection result comprises a card frame confidence and a feature frame confidence of a card image detected by the target detection model in the target image.
The target detection model is used to detect whether the target image contains a card image of a suspected target type card. If it does, the model extracts the card frame and the feature frame of that card image and determines the card frame confidence corresponding to the card frame and the feature frame confidence corresponding to the feature frame.
The target type card refers to the type of card to be audited, such as a borrowing card or a bank card.
The card frame confidence is the degree of credibility that the image area enclosed by the card frame is a target type card.
The feature frame confidence is the degree of credibility that the image area enclosed by the feature frame is feature information of the target type card. The feature frame is the frame of the area where the feature information of the target type card is located.
A target type card may carry one or more pieces of feature information. For example, the feature information of a borrowing card includes the borrowing card number and the borrowing card photo. As another example, the feature information of a bank card includes the bank card number. The target detection model can determine a feature frame confidence for the frame of the area where each piece of feature information is located.
Step S130, executing an image acquisition operation for the target image when the card frame confidence is greater than a first confidence threshold and the feature frame confidence is greater than a second confidence threshold.
The first confidence threshold is used to decide whether the card frame of the card image is the card frame of a target type card. If the card frame confidence of the card image is greater than the first confidence threshold, the card frame is treated as the card frame of a target type card; otherwise it is not.
The second confidence threshold is used to decide whether the feature frame of the card image is the feature frame of a target type card. If the feature frame confidence of the card image is greater than the second confidence threshold, the feature frame is treated as the feature frame of a target type card; otherwise it is not.
In this embodiment, since the card frame is easier to extract and the feature frame is harder to extract, the first confidence threshold is greater than the second confidence threshold. For example, the first confidence threshold is set to 0.8 and the second confidence threshold is set to 0.6. If the card frame confidence is 0.9 (0.9 > 0.8) and the feature frame confidence is 0.7 (0.7 > 0.6), the image acquisition operation is executed. If the card frame confidence is 0.85 (0.85 > 0.8) but the feature frame confidence is 0.5 (0.5 < 0.6), the image acquisition operation is not executed.
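The threshold comparison of step S130 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `should_capture` is hypothetical, and the handling of several feature frames (all of them must pass) and of an empty list of feature frames (treated as a failure) is an assumption, since the text only states that each piece of feature information has its own confidence.

```python
from typing import Sequence


def should_capture(card_conf: float,
                   feature_confs: Sequence[float],
                   first_threshold: float = 0.8,
                   second_threshold: float = 0.6) -> bool:
    """Decide whether the image acquisition operation should be executed.

    card_conf: card frame confidence reported by the detection model.
    feature_confs: one confidence per detected feature frame.
    """
    if not feature_confs:
        # Assumption: a card without any detected feature frame fails the check.
        return False
    card_ok = card_conf > first_threshold
    # Assumption: every feature frame must exceed the second threshold.
    features_ok = all(conf > second_threshold for conf in feature_confs)
    return card_ok and features_ok


# The two cases from the example above.
print(should_capture(0.9, [0.7]))   # True
print(should_capture(0.85, [0.5]))  # False
```

Both comparisons are strict, matching the "greater than" wording of step S130, so a confidence exactly equal to its threshold does not pass.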
The target image may be directly acquired when an image acquisition operation for the target image is performed, or may be acquired when the target image meets a preset image acquisition condition. The image acquisition condition is, for example, that the card image in the target image is in the central region of the target image.
After the target image is acquired, the target image is transmitted to the back end so that the back end performs image recognition processing on the target image.
This embodiment audits the card image in the target image: the card frame confidence and the feature frame confidence of the card image are determined, and whether to execute the image acquisition operation is decided according to these two values. The image acquisition operation is executed only if the card frame confidence is greater than the first confidence threshold and the feature frame confidence is greater than the second confidence threshold. This avoids acquiring card images of the wrong type of card, reduces the computation and duration of subsequent image recognition, and improves image recognition efficiency.
The training process of the target detection model is described below.
In this embodiment, the target detection model is a lightweight model, so that the image acquisition method of this embodiment can be integrated into the front end, and occupies a smaller storage space, so that the front end can operate smoothly.
Further, the number of parameters of the target detection model is smaller than a preset parameter number threshold, and/or the number of floating-point operations (FLOPs) performed by the target detection model is smaller than a preset floating-point operation number threshold, and/or the frame rate of the target detection model is greater than a preset frame rate threshold. The parameter number threshold, the floating-point operation number threshold and the frame rate threshold are all empirical values.
Further, the target detection model may be constructed using the SSD (Single Shot MultiBox Detector) algorithm or the YOLO (You Only Look Once) algorithm. The SSD algorithm is, for example, the MobileNet-SSD algorithm.
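The lightweight criteria above can be checked with a few comparisons. The sketch below is illustrative only: the class and function names are hypothetical, the numeric limits in the usage example are made-up placeholders rather than values from the embodiment, and reading "and/or" as "at least one criterion holds" is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelProfile:
    num_params: int    # number of model parameters
    flops: float       # floating-point operations per forward pass
    frame_rate: float  # frames processed per second on the target device


@dataclass(frozen=True)
class LightweightLimits:
    max_params: int        # preset parameter number threshold
    max_flops: float       # preset floating-point operation number threshold
    min_frame_rate: float  # preset frame rate threshold


def lightweight_checks(profile: ModelProfile,
                       limits: LightweightLimits) -> Dict[str, bool]:
    """Evaluate each of the three criteria separately."""
    return {
        "params": profile.num_params < limits.max_params,
        "flops": profile.flops < limits.max_flops,
        "frame_rate": profile.frame_rate > limits.min_frame_rate,
    }


def is_lightweight(profile: ModelProfile, limits: LightweightLimits) -> bool:
    # Assumption: "and/or" means at least one criterion must hold.
    return any(lightweight_checks(profile, limits).values())


# Placeholder numbers, chosen only to exercise the checks.
limits = LightweightLimits(max_params=5_000_000, max_flops=1e9, min_frame_rate=25.0)
profile = ModelProfile(num_params=4_000_000, flops=2e9, frame_rate=20.0)
print(lightweight_checks(profile, limits))
print(is_lightweight(profile, limits))
```

A stricter deployment could replace `any` with `all` to require all three criteria at once.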
In sample labeling, the card frame of the target type card may be labeled card, and the feature frame of the target type card may be labeled number. When labeling the card frame, it must be ensured that the card frame contains the area of the entire target type card; when labeling the feature frame, it must be ensured that the feature frame contains the area of the entire piece of feature information. For example, Fig. 2 is a schematic diagram of sample labeling according to an embodiment of the invention, in which the bank card is the target type card, card is the card frame of the bank card, and number is the frame of the area where the bank card number is located.
The target detection model is trained by using the marked plurality of samples until the target detection model converges.
Detecting the card frame and the feature frame of the card image of the target type card with the trained target detection model offers high accuracy and strong generalization. Even if the card face pattern is designed to be complex in order to improve the design appeal of the card, or the target type card is placed against a complex background when it is photographed, the target detection model can still accurately detect the card frame and the feature frame of the card image and determine the card frame confidence and the feature frame confidence.
Further, since the samples used to train the target detection model are images of real target type cards, the target detection model can also discriminate the authenticity of a target type card. For example, in some scenarios a user's bank card image needs to be acquired, and an illegitimate user seeking to profit may use a stolen photocopy of another person's bank card for the acquisition. A photocopy inevitably differs noticeably from a real bank card image in color, brightness or texture, so the target detection model readily detects the difference. The card frame confidence and the feature frame confidence output by the target detection model are therefore low, and the image acquisition operation is not executed on the photocopy.
Another case of the embodiment of the present invention will be described below.
If there is only one target image, the image acquisition process of the embodiment may be ended and image acquisition failure prompt information may be displayed when the confidence level of the card frame is less than or equal to the first confidence level threshold and/or the confidence level of the feature frame is less than or equal to the second confidence level threshold.
If the target image is a video image obtained from a video stream, then when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, the target image is discarded and a next target image is obtained from the video stream to which the target image belongs, so that the next target image is input into the target detection model. That is, the next video image is obtained from the video stream and taken as the new target image. As long as no target image on which the image acquisition operation can be executed has been found, the image acquisition flow of this embodiment can be executed cyclically on successive video images in the video stream, so that a target image meeting the confidence requirement is found dynamically in the video stream.
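The cyclic flow over a video stream can be sketched as a simple loop. This is an illustration under stated assumptions: `first_acceptable_frame` and the `detect` callable are hypothetical names, the detector is assumed to return a single pair (card frame confidence, feature frame confidence) per frame, and frames are represented by plain strings in the usage example.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def first_acceptable_frame(frames: Iterable[Frame],
                           detect: Callable[[Frame], Tuple[float, float]],
                           first_threshold: float,
                           second_threshold: float) -> Optional[Frame]:
    """Return the first frame whose confidences both exceed their thresholds.

    Frames that fail either comparison are discarded and the next frame of
    the stream becomes the new target image. Returns None if the stream ends
    without an acceptable frame.
    """
    for frame in frames:
        card_conf, feature_conf = detect(frame)
        if card_conf > first_threshold and feature_conf > second_threshold:
            return frame
        # Otherwise: discard this frame and continue with the next one.
    return None


# Usage with a stand-in detector backed by a lookup table.
scores = {"f0": (0.5, 0.9), "f1": (0.9, 0.5), "f2": (0.9, 0.7)}
chosen = first_acceptable_frame(["f0", "f1", "f2"], scores.__getitem__, 0.8, 0.6)
print(chosen)  # f2
```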
Further, if the video stream to which the target image belongs is shot in real time and the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, preset card correction prompt information is displayed before the next target image is obtained from the video stream, so as to prompt the user to switch to the correct type of card, so that the card frame confidence of the card image in the next target image is greater than the first confidence threshold and the feature frame confidence is greater than the second confidence threshold. If the card frame confidence and the feature frame confidence of the card image are both below their corresponding thresholds, the wrong type of card may have been photographed. For example, a borrowing card image needs to be acquired but the user photographs a membership card: the card frames differ in size, and the frame of the area where the borrowing card number is located (the feature frame) cannot be extracted from the membership card image.
In one embodiment, to avoid a long period in which acquisition does not succeed, the first confidence threshold and the second confidence threshold may be set in multiple gradients. Each gradient corresponds to one pair of first and second confidence thresholds. The first confidence thresholds differ between gradients, as do the second confidence thresholds, and both follow the same trend: from the 1st gradient to the Nth gradient, the first confidence thresholds are ordered from large to small, and so are the second confidence thresholds. To acquire a clearer target image, the initial values of the first and second confidence thresholds may be set to the maximum values among the gradients. To avoid being unable to acquire a target image of the target type card at all, the first and second confidence thresholds may then be degraded.
Specifically, when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, the image acquisition duration from the moment the first target image was obtained from the video stream to the current moment is determined. When the image acquisition duration is greater than a preset duration threshold, the first confidence threshold and the second confidence threshold are updated, wherein the updated first confidence threshold is smaller than the first confidence threshold before updating, and the updated second confidence threshold is smaller than the second confidence threshold before updating.
The duration threshold may be an empirical value or a value obtained through experiments. Further, multiple duration thresholds may be set. When the image acquisition duration exceeds the smallest duration threshold, the first and second confidence thresholds are updated, in order from large to small, to those of the 2nd gradient; when it exceeds the next smallest duration threshold, they are updated to those of the 3rd gradient; and so on. If the first and second confidence thresholds of all gradients have been used up, the image acquisition flow of this embodiment ends and image acquisition failure prompt information is displayed.
If the thresholds compared against the card frame confidence and the feature frame confidence are the updated first and second confidence thresholds, a corresponding confidence threshold degradation identifier needs to be set for the target image, to indicate that the confidence thresholds applied to the target image were not the largest ones. In other words, if, before the target image is acquired, the first confidence threshold exceeded by the card frame confidence of the card image in the target image is an updated first confidence threshold, and the second confidence threshold exceeded by the feature frame confidence of the card image in the target image is an updated second confidence threshold, the corresponding confidence threshold degradation identifier is set for the target image when the target image is acquired.
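The gradient-based degradation can be sketched as a lookup from elapsed acquisition time to a threshold pair. Everything numeric here is a placeholder: the gradient values and duration thresholds are hypothetical, as is the function name `select_thresholds`. The sketch also assumes one extra duration threshold after the last gradient, past which all gradients count as used up; the text does not specify when that happens.

```python
from typing import List, Optional, Tuple

# Hypothetical gradients: (first threshold, second threshold), large to small.
GRADIENTS: List[Tuple[float, float]] = [(0.8, 0.6), (0.7, 0.5), (0.6, 0.4)]

# Hypothetical duration thresholds in seconds, small to large. Exceeding the
# k-th one moves to gradient k+1; exceeding the last one (an assumption) means
# every gradient has been used up.
DURATION_THRESHOLDS: List[float] = [5.0, 10.0, 15.0]


def select_thresholds(elapsed: float) -> Optional[Tuple[float, float, bool]]:
    """Map the image acquisition duration to (first, second, degraded).

    Returns None when all gradients are used up, in which case the caller
    would end the flow and display the failure prompt.
    """
    # Count how many duration thresholds have been exceeded.
    level = sum(1 for limit in DURATION_THRESHOLDS if elapsed > limit)
    if level >= len(GRADIENTS):
        return None
    first, second = GRADIENTS[level]
    # degraded corresponds to the confidence threshold degradation identifier:
    # it is set whenever the thresholds in use are not the initial, largest pair.
    return first, second, level > 0


print(select_thresholds(0.0))   # (0.8, 0.6, False)
print(select_thresholds(6.0))   # (0.7, 0.5, True)
print(select_thresholds(16.0))  # None
```

An image acquired while `degraded` is true would carry the degradation identifier, so that later processing knows it passed under relaxed thresholds.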
In another embodiment, to further increase the efficiency of image recognition, the integrity of the card may also be determined. The determination of the integrity of the card may be achieved by determining the location of the area in the target image where the card image is located.
Further, the target detection model is also used to determine the opposite-corner coordinates of the card frame of the card image, so the detection result of the target detection model also includes these coordinates. The opposite-corner coordinates of the card frame comprise the vertex coordinates of two diagonally opposite corners of the card image, for example the upper-left corner and the lower-right corner.
Fig. 3 is a flowchart of the steps of an image acquisition operation according to an embodiment of the invention.
Step S310, determining, according to the opposite-corner coordinates of the card frame of the card image, whether the card image is within the preset area range of the target image; if so, executing step S320; if not, executing step S330.
The preset region range may be a central region of the target image, and the central region does not overlap with an edge of the target image.
For example, let the vertex coordinates of the upper-left corner of the card image be (x1, y1) and the vertex coordinates of the lower-right corner be (x2, y2), and let the width and height of the target image be w and h. The boundary threshold t is the closest permitted distance from an edge of the card image to the corresponding edge of the target image, and thereby delimits the central area of the target image.
When the vertex coordinates (x1, y1) and (x2, y2) satisfy the boundary condition defined by t, that is, when no edge of the card image is closer than t to the corresponding edge of the target image, the card image is within the central area of the target image, is complete, and can be acquired.
Here, the upper-left corner of the target image is taken as the origin of the coordinate system, the vertical axis points downward along the image edge passing through the origin, and the horizontal axis points rightward along the image edge passing through the origin.
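One way to read the boundary condition is that each edge of the card frame must lie at least t pixels inside the target image. The sketch below encodes that reading; the exact inequalities, including whether they are strict, are an assumption, and the function name `card_in_center` is hypothetical. Coordinates follow the convention described above, with the origin at the upper-left corner, x increasing to the right and y increasing downward.

```python
def card_in_center(x1: float, y1: float, x2: float, y2: float,
                   w: float, h: float, t: float) -> bool:
    """Check whether the card frame lies within the central area.

    (x1, y1): upper-left corner of the card frame.
    (x2, y2): lower-right corner of the card frame.
    w, h: width and height of the target image.
    t: boundary threshold, the minimum margin to every image edge.
    """
    left_ok = x1 >= t          # margin to the left edge
    top_ok = y1 >= t           # margin to the top edge
    right_ok = (w - x2) >= t   # margin to the right edge
    bottom_ok = (h - y2) >= t  # margin to the bottom edge
    return left_ok and top_ok and right_ok and bottom_ok


# A 640x480 frame with a 20-pixel margin (placeholder numbers).
print(card_in_center(100, 80, 540, 400, 640, 480, 20))  # True
print(card_in_center(5, 80, 540, 400, 640, 480, 20))    # False
```

A card frame that touches or crosses an image edge fails the check, which matches the intent that a card in the central area is complete.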
Step S320, if the card image is within the preset area range of the target image, acquiring the target image.
Step S330, if the card image is not in the preset area range of the target image, discarding the target image, and acquiring a next target image in the video stream to which the target image belongs, so as to input the next target image into the target detection model.
If the video stream to which the target image belongs is shot in real time, preset card position prompt information is displayed before the next target image is obtained from the video stream, so as to prompt the user to move the card so that the card image in the next target image falls within the preset area range of the next target image.
If the card image is not within the preset area range of the target image, the card image may be incomplete. The card position prompt information then prompts the user to adjust the position of the card, so that the card lies within the preset area range the next time it appears in a target image.
The embodiment of the invention also provides an image acquisition device. Fig. 4 is a block diagram of an image capturing device according to an embodiment of the present invention.
The image acquisition device comprises an acquisition module 410, a detection module 420 and an execution module 430.
An acquisition module 410 is configured to acquire a target image.
The detection module 420 is configured to input the target image into a pre-trained target detection model, and obtain a detection result output by the target detection model, where the detection result includes a card frame confidence coefficient and a feature frame confidence coefficient of a card image detected by the target detection model in the target image.
An execution module 430 is configured to execute an image acquisition operation for the target image if the card border confidence is greater than a first confidence threshold and the feature border confidence is greater than a second confidence threshold.
The functions of the device in this embodiment have already been described in the foregoing method embodiments. For details not covered in the description of this embodiment, refer to the related descriptions in the foregoing embodiments, which are not repeated here.
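The division into acquisition module 410, detection module 420 and execution module 430 can be sketched as three small classes. This is a structural illustration only: the class and method names are hypothetical, images are represented by plain strings, the model is a stand-in callable returning (card frame confidence, feature frame confidence), and "acquiring" an image is modeled as appending it to a list.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Image = str  # stand-in for real image data
Model = Callable[[Image], Tuple[float, float]]


class AcquisitionModule:
    """Obtains target images one at a time from a frame source."""

    def __init__(self, frames: Iterable[Image]) -> None:
        self._frames: Iterator[Image] = iter(frames)

    def acquire(self) -> Optional[Image]:
        return next(self._frames, None)


class DetectionModule:
    """Feeds a target image to the detection model and returns its result."""

    def __init__(self, model: Model) -> None:
        self._model = model

    def detect(self, image: Image) -> Tuple[float, float]:
        return self._model(image)


class ExecutionModule:
    """Executes the image acquisition operation when both thresholds are exceeded."""

    def __init__(self, first_threshold: float, second_threshold: float) -> None:
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.captured: List[Image] = []

    def execute(self, image: Image, card_conf: float, feature_conf: float) -> bool:
        if card_conf > self.first_threshold and feature_conf > self.second_threshold:
            self.captured.append(image)
            return True
        return False


def run(acq: AcquisitionModule, det: DetectionModule,
        exe: ExecutionModule) -> Optional[Image]:
    """Wire the three modules together until one image is acquired."""
    while (image := acq.acquire()) is not None:
        card_conf, feature_conf = det.detect(image)
        if exe.execute(image, card_conf, feature_conf):
            return image
    return None


scores = {"a": (0.5, 0.5), "b": (0.9, 0.7)}
executor = ExecutionModule(0.8, 0.6)
print(run(AcquisitionModule(["a", "b"]), DetectionModule(scores.__getitem__), executor))  # b
```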
This embodiment further provides image acquisition equipment. Fig. 5 is a block diagram of image acquisition equipment according to an embodiment of the invention.
In this embodiment, the image acquisition equipment includes, but is not limited to, a processor 510 and a memory 520.
The processor 510 is configured to execute an image acquisition program stored in the memory 520 to implement the image acquisition method described above.
Specifically, the processor 510 is configured to execute an image acquisition program stored in the memory 520, so as to obtain a target image, input the target image into a pre-trained target detection model, and obtain a detection result output by the target detection model, where the detection result includes a card frame confidence level and a feature frame confidence level of a card image detected by the target detection model in the target image, and execute an image acquisition operation for the target image when the card frame confidence level is greater than a first confidence level threshold and the feature frame confidence level is greater than a second confidence level threshold.
The detection result further comprises opposite-corner coordinates of the card frame of the card image, and executing the image acquisition operation for the target image comprises: determining, according to the opposite-corner coordinates of the card frame, whether the card image is within a preset area range of the target image; if so, acquiring the target image; otherwise, discarding the target image and obtaining a next target image from the video stream to which the target image belongs, so that the next target image is input into the target detection model.
The method further comprises: when the video stream to which the target image belongs is shot in real time and the card image is not within the preset area range of the target image, displaying preset card position prompt information before the next target image is obtained from the video stream, so as to prompt the user to move the card so that the card image in the next target image falls within the preset area range of the next target image.
The method further comprises: when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, discarding the target image and obtaining a next target image from the video stream to which the target image belongs, so that the next target image is input into the target detection model.
The method further comprises: if the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, displaying preset card correction prompt information before the next target image is obtained from the video stream, so as to prompt the user to switch to the correct type of card, so that the card frame confidence of the card image in the next target image is greater than the first confidence threshold and the feature frame confidence is greater than the second confidence threshold.
The method further comprises: when the card frame confidence is less than or equal to the first confidence threshold and/or the feature frame confidence is less than or equal to the second confidence threshold, determining an image acquisition duration from the moment the first target image was obtained from the video stream to the current moment; and when the image acquisition duration is greater than a preset duration threshold, updating the first confidence threshold and the second confidence threshold, wherein the updated first confidence threshold is smaller than the first confidence threshold before updating, and the updated second confidence threshold is smaller than the second confidence threshold before updating.
The method further comprises: if, before the target image is acquired, the first confidence threshold exceeded by the card frame confidence of the card image in the target image is an updated first confidence threshold, and the second confidence threshold exceeded by the feature frame confidence of the card image in the target image is an updated second confidence threshold, setting a corresponding confidence threshold degradation identifier for the target image when the target image is acquired.
The number of parameters of the target detection model is smaller than a preset parameter number threshold, and/or the number of floating-point operations (FLOPs) performed by the target detection model is smaller than a preset floating-point operation number threshold, and/or the frame rate of the target detection model is greater than a preset frame rate threshold.
The embodiment of the invention also provides a computer readable storage medium. The computer-readable storage medium here stores one or more programs. The computer readable storage medium may include volatile memory, such as random access memory, or nonvolatile memory, such as read only memory, flash memory, hard disk, or solid state disk, or a combination of the foregoing.
The one or more programs in the computer-readable storage medium may be executed by the one or more processors to implement the image acquisition methods described above. Since the image acquisition method has been described in detail above, a detailed description thereof is omitted here.
The above is merely an embodiment of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and variations to it. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of the claims of the present invention.