CN111145124A

CN111145124A - Image tilt correction method and device

Info

Publication number: CN111145124A
Application number: CN201911387689.7A
Authority: CN
Inventors: 沈来信; 孙明东; 米坤; 李锴; 桂越
Original assignee: Beijing Thunisoft Information Technology Co ltd
Current assignee: Beijing Thunisoft Information Technology Co ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2020-05-12

Abstract

The invention provides a method and a device for correcting image tilt. The method comprises the following steps: forming a key corner point prediction network model; using the key corner point prediction network model to predict 16 key points of the document image to be corrected; fitting the key points to obtain 4 paper corner points; and taking the 4 paper corner points as source points, calculating 4 transformed target corner points, obtaining a perspective matrix from the two sets of points according to the perspective transformation principle, and obtaining the corrected image through perspective transformation. The invention applies different methods for finding the boundary contour lines and edge points of a document to different types of document images, so that it can effectively find the boundary contour and edge points of the document image, improves the result of the perspective transformation, and helps improve the distortion correction of document images.

Description

Image tilt correction method and device

Technical Field

The present invention relates to the field of image recognition technologies, and in particular, to a method and an apparatus for correcting an image tilt.

Background

Converting paper documents into electronic documents can effectively reduce document management costs and improve office efficiency. Paper documents are usually converted by scanning or photographing, and the document is often tilted during scanning or photographing. Tilt-correction preprocessing of the paper document can effectively improve the accuracy of optical character recognition and is an important step in document digitization. During image recognition, the characters in the image are often upside down or tilted at some other angle; if such an image is recognized directly, recognition is likely to fail or be inaccurate, so the image needs to be corrected first.

Chinese patent application publication No. CN104463126A, entitled "Method for automatically detecting the tilt angle of a scanned document image", uses region edge extraction and document writing-direction judgment to obtain a suitable region edge image, performs region growing, region feature extraction and line validity judgment on the region edges to obtain valid lines, and extracts the tilt of the scanned document from them.

That publication can handle complex cases such as mixed layouts of images and text and the coexistence of several writing directions, but it performs poorly on document images whose foreground color is close to the background color. In particular, when corners are occluded, the region edges are difficult to obtain and the tilt of the document is difficult to determine.

Chinese patent application publication No. CN103413271A, entitled "Local information-based document image correction method", determines the scale of a text line according to the average gradient value of a document image, then tracks text lines according to the self-similarity of blank lines to obtain the upper and lower boundaries of each text line, then obtains the quadrilateral boundary of the text, and performs skew deformation correction and local bilinear interpolation correction in sequence to complete the correction of the whole image.

That publication uses text line features to select the text region and then performs skew deformation correction; for non-text documents, such as table documents and graphic documents, it is difficult to find the text border.

In addition, for a document containing tables and graphics, it is difficult to find the paper corner points (upper left, upper right, lower right and lower left corner points) of the document. For methods that perform perspective correction using edge detection, when the foreground color of the document is close to the background color, and especially when the paper corner points are partially or completely occluded, the commonly used Hough transform line detection method also has difficulty finding the paper edges and paper corner points.

Disclosure of Invention

In view of the above, to solve the problems of the prior art, an object of the present invention is to provide a method and an apparatus for correcting image tilt. Taking into account that corners of a document image may be occluded, the invention designs a document corner point labeling method and a constraint-based corner point prediction network, obtains 4 edges by connecting and fitting the predicted points, derives 4 paper corner points from those edges, and completes the correction of the tilted image through perspective transformation.

The purpose of the invention is realized by the following technical scheme:

in a first aspect, the present invention provides a method for correcting an image tilt, comprising the steps of:

step S1, forming a key corner point prediction network model;

step S2, predicting the document image to be corrected by using the key corner point prediction network model to generate 16 key points of the document image to be corrected;

step S3, connecting every two of 6 key points on each edge, respectively fitting the connected curves to obtain 4 edges, and extending the 4 edges to obtain 4 paper corner points;

and step S4, taking the 4 paper corner points as source points, calculating the longest height and width of the corresponding paper, calculating new upper right, lower right and lower left corner points with the upper left corner point as the reference to obtain the 4 transformed target corner points, obtaining a perspective matrix according to the perspective transformation principle, and obtaining a corrected image by perspective transformation.

Further, the step of forming the key corner point prediction network model includes:

step S101, marking each sample document image to be trained by using a document corner point marking method to form a training set for marking the corner points of the document;

and S102, constructing a key corner point prediction network model based on constraint conditions on the training set labeled by the document corner points.

Further, the document corner point labeling method is as follows: each document image is labeled with 16 key points, comprising 8 corner point labels and 8 edge interior point labels.

Further, when a corner point is not occluded, the positions of its 2 corner point labels coincide.

Furthermore, the 8 edge interior points consist of 2 interior points taken on each of the 4 edges.

Further, the two interior points on each edge are located at one third and two thirds of the edge length, respectively.

Further, the constraint conditions are as follows: the second, third, fourth and fifth key points are on the same straight line; the sixth, seventh, eighth and ninth key points are on the same straight line; the tenth, eleventh, twelfth and thirteenth key points are on the same straight line; and the fourteenth, fifteenth, sixteenth and first key points are on the same straight line.
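The constraint conditions can be checked numerically. The following is a minimal sketch, not the disclosed implementation: the function names, the pixel tolerance `tol` and the choice of perpendicular distance as the error measure are assumptions made for illustration. Key points are passed as a list of 16 (x, y) pairs in label order.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# 1-based key point numbers of the four constrained groups listed above.
EDGE_GROUPS = [(2, 3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13), (14, 15, 16, 1)]


def collinearity_error(points: Sequence[Point]) -> float:
    """Largest perpendicular distance from any point to the line through
    the two points of the group that lie farthest apart."""
    a, b = max(combinations(points, 2), key=lambda pq: math.dist(pq[0], pq[1]))
    length = math.dist(a, b)
    if length == 0:
        return 0.0  # all points coincide, so there is no direction to violate
    return max(
        abs((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])) / length
        for p in points
    )


def satisfies_constraints(keypoints: List[Point], tol: float = 1.0) -> bool:
    """True if every constrained group is collinear to within tol pixels
    (tol is a hypothetical threshold, not a value from the disclosure)."""
    return all(
        collinearity_error([keypoints[i - 1] for i in group]) <= tol
        for group in EDGE_GROUPS
    )
```

A check of this kind could be used, for example, to validate labeled training samples before they enter the training set.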

In a second aspect, the invention provides a device for correcting image tilt, which comprises an acquisition module, a key corner point prediction module, a paper corner fitting module and a correction module; the input end of the acquisition module acquires a document image to be corrected; the output end of the acquisition module is connected with the input end of the key corner point prediction module, and the output end of the key corner point prediction module is connected with the input end of the paper corner fitting module; the output end of the paper corner fitting module is connected with the input end of the correction module, and the output end of the correction module outputs the corrected document image.

Further, in the image tilt correction apparatus:

the acquisition module is used for acquiring a document image to be corrected;

the key corner point prediction module stores a key corner point prediction model and is used for predicting 16 key points of the document image to be corrected;

the paper corner fitting module is used for fitting and generating 4 paper corner points of the document image according to the 16 key points of the document image;

the correction module is used for calculating 4 target corner points according to the generated 4 paper corner points, obtaining a perspective matrix according to the perspective transformation principle, and obtaining a corrected image of the original document image by perspective transformation.
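The chaining of the four modules can be sketched as follows. This is only an illustrative arrangement under stated assumptions: the class name `TiltCorrectionDevice`, the callable signatures, and the choice to pass the acquired image to the correction module alongside the fitted corners are all hypothetical and are not specified in the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class TiltCorrectionDevice:
    """Four modules connected output-to-input, in the order described above."""
    acquire: Callable[[str], Any]                      # acquisition module
    predict_keypoints: Callable[[Any], List[Point]]    # key corner point prediction module
    fit_corners: Callable[[List[Point]], List[Point]]  # paper corner fitting module
    correct: Callable[[Any, List[Point]], Any]         # correction module

    def run(self, source: str) -> Any:
        image = self.acquire(source)
        keypoints = self.predict_keypoints(image)
        if len(keypoints) != 16:
            raise ValueError("expected 16 key points")
        corners = self.fit_corners(keypoints)
        if len(corners) != 4:
            raise ValueError("expected 4 paper corner points")
        return self.correct(image, corners)
```

Keeping each module behind a plain callable makes it possible to swap the prediction model or the fitting routine without touching the rest of the chain.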

The invention has the beneficial effects that:

the invention applies different methods for finding content contour lines and edge points to different types of document images, so that it can effectively find the content contour and edge points of the document image, improves the result of the perspective transformation, and helps improve the distortion correction of document images.

Drawings

FIG. 1 is a flowchart illustrating a method for correcting image tilt according to the present invention;

FIG. 2 is a schematic illustration of document corner point labeling according to the present invention;

FIG. 3 is a schematic structural diagram of an image tilt correction apparatus according to the present invention.

Detailed Description of Embodiments

The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.

The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.

Example one

The invention first constructs a corner point prediction network based on a shallow convolutional neural network to implement a key point prediction algorithm in which the key points satisfy certain constraint conditions, namely that adjacent key points belong to the same edge. Because the four paper corner points of a document image may be partially or completely occluded, an occluded corner must be described with 2 points. Using 8 corner point labels therefore makes it possible to describe all the corners of a sheet of paper in every case (0, 1, 2, 3 or 4 corners occluded, where 0 means that no corner is occluded); when a corner is not occluded, its 2 labels share the same position. In addition, to describe the edge direction well when corners are occluded, 2 interior points are taken on each edge to assist in fitting the edge direction.

Using the corresponding document image corner point labeling algorithm, each document image is labeled with 16 key points, comprising 8 paper corner point labels and 8 edge interior point labels, so that each edge carries 6 labeled points. A shallow convolutional neural network is then designed to complete the constraint-based key corner point prediction network model.

By connecting and fitting the predicted points, the 4 edges of the document (upper, right, lower and left) are obtained; the 4 paper corner points are obtained from these 4 edges; and the perspective transformation of the document image is then carried out with the perspective matrix to obtain a rectified document image.

The embodiment provides a method for correcting image tilt, which specifically comprises the following steps:

step S1, forming a key corner point prediction network model; the method specifically comprises the following steps:

Step S101, labeling each sample document image to be trained by using a document corner point labeling method to form a training set labeled with document corner points.

The document corner point labeling method is as follows. Each document image is labeled with 16 key points. These comprise 8 corner point labels, 2 for each of the upper left, upper right, lower right and lower left corner points: when a corner is not occluded, its two numbered labels coincide (for example, 1 and 2, or 9 and 10, in FIG. 2 each mark the same position); when a corner is occluded, two sequentially numbered labels are used to describe it (for example, 5 and 6, or 13 and 14, in FIG. 2). They also comprise 8 edge interior point labels (2 interior points on each edge, at one third and two thirds of the edge length). Each document image is therefore labeled with 16 points, and each edge has 6 of them, as shown in FIG. 2.
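For the simplest case, in which no corner is occluded, the 16 labels follow directly from the 4 paper corners. The sketch below illustrates this; the helper names are hypothetical, and the clockwise numbering starting at the upper left corner is an assumption inferred from the constraint groups (2, 3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13), (14, 15, 16, 1), since FIG. 2 itself is not reproduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lerp(p: Point, q: Point, t: float) -> Point:
    """Point at fraction t of the way from p to q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def label_unoccluded(tl: Point, tr: Point, br: Point, bl: Point) -> List[Point]:
    """Return the 16 labels (index 0 holds label 1) for a page whose four
    corners are all visible.

    Assumed numbering, clockwise from the upper left corner: each corner
    carries two coinciding labels, followed by the two interior points of
    the edge that leads to the next corner.
    """
    corners = [tl, tr, br, bl]
    labels: List[Point] = []
    for i, corner in enumerate(corners):
        nxt = corners[(i + 1) % 4]
        labels += [corner, corner]  # two coinciding corner labels
        labels += [lerp(corner, nxt, 1 / 3), lerp(corner, nxt, 2 / 3)]
    return labels
```

For an occluded corner the two corner labels would instead be placed at two different positions, which this sketch does not attempt to model.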

S102, constructing a constraint-based key corner point prediction network model on the training set labeled with document corner points. The constraint conditions are that the key points (2, 3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13) and (14, 15, 16, 1) each lie on one straight line, that is, the direction between any two points of a group is the same (the slopes of the corresponding 6 straight lines are equal). A heatmap-based key point prediction network is used, whose structure comprises 6 pairs of convolution and pooling layers. The loss function combines the mean prediction error of the 16 points with the mean error of the 4 edges obtained by fitting 24 lines (the 4 key points on each edge form 6 pairwise lines, and a curve fitting algorithm fits one curve for each of the 4 edges). The key point prediction network model is obtained by training.
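The text does not give the loss in closed form, so the following is one possible reading, offered as a numeric illustration only and not as training code. It assumes that the point term is the mean Euclidean distance over the 16 points, that the edge term measures how much the 6 pairwise segments of each group deviate in direction from the segment joining the group's first and last points, and that the two terms are summed with a hypothetical weight `edge_weight`.

```python
import math
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, float]

# 1-based key point numbers of the four constrained groups.
EDGE_GROUPS = [(2, 3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13), (14, 15, 16, 1)]


def point_term(pred: List[Point], target: List[Point]) -> float:
    """Mean Euclidean distance between predicted and labeled key points."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(target)


def edge_term(pred: List[Point]) -> float:
    """Mean direction disagreement over 4 groups x 6 pairwise segments.

    Each segment contributes |sin(angle)| between itself and the segment
    from the first to the last point of its group (an assumed measure).
    """
    per_group = []
    for group in EDGE_GROUPS:
        pts = [pred[i - 1] for i in group]
        ref = (pts[-1][0] - pts[0][0], pts[-1][1] - pts[0][1])
        ref_len = math.hypot(*ref)
        errors = []
        for a, b in combinations(pts, 2):  # the 6 pairwise segments
            d = (b[0] - a[0], b[1] - a[1])
            d_len = math.hypot(*d)
            if ref_len == 0 or d_len == 0:
                continue  # coincident points carry no direction
            errors.append(abs(ref[0] * d[1] - ref[1] * d[0]) / (ref_len * d_len))
        per_group.append(sum(errors) / len(errors) if errors else 0.0)
    return sum(per_group) / len(per_group)


def total_loss(pred: List[Point], target: List[Point], edge_weight: float = 1.0) -> float:
    """Point term plus weighted edge term (the weighting is an assumption)."""
    return point_term(pred, target) + edge_weight * edge_term(pred)
```

In an actual training loop the same quantities would have to be expressed with differentiable tensor operations on coordinates decoded from the heatmaps.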

And step S2, predicting the unmarked document image to be corrected by using the key corner point prediction network model, and generating 16 key points corresponding to the document image.

And S3, connecting every two of the 6 key points on each edge, respectively fitting the connected curves to obtain 4 edges, and extending the 4 edges to obtain 4 paper corner points.

Among the 16 key points generated for the document image, the 6 key points on each edge can be identified, so a line through any 2 of them can be obtained, and a similar curve fitting algorithm then yields 1 fitted curve for each edge. The 4 edges are then extended to obtain 4 intersections, which are the 4 paper corner points.
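Step S3 can be illustrated with straight edges. The sketch below makes a simplifying substitution: instead of fitting over the pairwise connecting lines, it fits a single total-least-squares line through the key points of an edge and then intersects two such lines to recover a corner. The function names and the parallel-line threshold are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]  # (a point on the line, unit direction vector)


def fit_line(points: Sequence[Point]) -> Line:
    """Total-least-squares line: centroid plus principal axis of the scatter."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # orientation of the major axis
    return (cx, cy), (math.cos(theta), math.sin(theta))


def intersect(l1: Line, l2: Line) -> Point:
    """Intersection of two extended lines given in point-direction form."""
    (x1, y1), (dx1, dy1) = l1
    (x2, y2), (dx2, dy2) = l2
    denom = dx1 * dy2 - dy1 * dx2
    if abs(denom) < 1e-12:
        raise ValueError("edges are parallel; no unique corner")
    t = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / denom
    return (x1 + t * dx1, y1 + t * dy1)


# Upper left corner occluded: neither edge reaches (0, 0), yet extending
# the two fitted edges recovers it.
top = fit_line([(40, 0), (100, 0), (200, 0), (300, 0)])
left = fit_line([(0, 600), (0, 400), (0, 200), (0, 50)])
corner = intersect(top, left)
```

Because the fit uses the principal axis rather than a slope, it also handles vertical edges, where a slope-intercept fit would break down.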

And step S4, taking the 4 paper corner points as source points, calculating the longest height and width of the corresponding paper, calculating new upper right, lower right and lower left corner points with the upper left corner point as the reference to obtain the 4 transformed target corner points, obtaining a perspective matrix according to the perspective transformation principle, and using perspective transformation to obtain the corrected image of the tilted original document.
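Step S4 can be sketched as follows, assuming that the "longest" width is the larger of the upper and lower edge lengths and the "longest" height is the larger of the left and right edge lengths. The perspective matrix H (with its last entry fixed to 1) is obtained by solving the 8 linear equations given by the 4 point correspondences. Function names are hypothetical, and the solver is a plain Gauss-Jordan elimination written out to keep the example dependency-free.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def target_corners(tl: Point, tr: Point, br: Point, bl: Point) -> List[Point]:
    """Axis-aligned target rectangle anchored at the upper left corner."""
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    x0, y0 = tl
    return [(x0, y0), (x0 + width, y0), (x0 + width, y0 + height), (x0, y0 + height)]


def perspective_matrix(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """3x3 matrix H with H[2][2] = 1 that maps each src[i] to dst[i]."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, u])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, v])
    n = 8
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [value / pivot_value for value in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    h = [rows[i][8] for i in range(n)] + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_perspective(H: List[List[float]], p: Point) -> Point:
    """Map one point through H using homogeneous coordinates."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

In practice the matrix and the resampling of the image are usually delegated to a library; in OpenCV, for instance, `cv2.getPerspectiveTransform` and `cv2.warpPerspective` cover these two steps, although the disclosure does not name any particular library.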

Example two

The embodiment provides a device for correcting image inclination, which comprises an acquisition module, a key corner prediction module, a paper corner fitting module and a correction module;

an acquisition module: for acquiring a document image to be corrected.

The key corner point prediction module: a keypoint prediction model is stored for predicting 16 keypoints of the document image to be corrected.

A paper corner fitting module: for fitting and generating 4 paper corner points of the document image according to the 16 key points of the document image.

The above description is for the purpose of illustrating embodiments of the invention and is not intended to limit the invention, and it will be apparent to those skilled in the art that any modification, equivalent replacement, or improvement made without departing from the spirit and principle of the invention shall fall within the protection scope of the invention.

Claims

1. A method of correcting an image tilt, characterized by: the method comprises the following steps:

step S1, forming a key corner point prediction network model;

2. The method according to claim 1, wherein the image tilt correction method comprises: the step of forming the key corner point prediction network model comprises the following steps:

3. The method according to claim 2, wherein: the document corner point marking method comprises the following steps: each document image is labeled with 16 key points, including 8 corner point labels and 8 intra-edge point labels.

4. A method of correcting an image tilt according to claim 3, characterized in that: when a corner point is not occluded, the positions of its 2 corner point labels coincide.

5. A method of correcting an image tilt according to claim 3, characterized in that: the 8 edge interior points consist of 2 interior points taken on each of the 4 edges.

6. The method according to claim 5, wherein: the two interior points on each edge are located at one third and two thirds of the edge length, respectively.

7. The method according to claim 2, wherein: the constraint conditions are as follows: the second, third, fourth and fifth key points are on the same straight line; the sixth, seventh, eighth and ninth key points are on the same straight line; the tenth, eleventh, twelfth and thirteenth key points are on the same straight line; and the fourteenth, fifteenth, sixteenth and first key points are on the same straight line.

8. An image tilt correction apparatus characterized by: the device comprises an acquisition module, a key corner point prediction module, a paper corner fitting module and a correction module; the input end of the acquisition module acquires a document image to be corrected; the output end of the acquisition module is connected with the input end of the key corner point prediction module, and the output end of the key corner point prediction module is connected with the input end of the paper corner fitting module; the output end of the paper corner fitting module is connected with the input end of the correction module, and the output end of the correction module outputs the corrected document image.

9. The acquisition module of claim 8: an image tilt correction apparatus characterized by: