CN114387598A

CN114387598A - Document labeling method and device, electronic equipment and storage medium

Info

Publication number: CN114387598A
Application number: CN202111467402.9A
Authority: CN
Inventors: 蒋晓海; 谢春鸿
Original assignee: Beijing Testin Information Technology Co Ltd
Current assignee: Beijing Testin Information Technology Co Ltd
Priority date: 2021-12-02
Filing date: 2021-12-02
Publication date: 2022-04-22

Abstract

The application discloses a document labeling method and device, electronic equipment, and a storage medium. It belongs to the field of data processing and reduces the number of points that must be labeled when labeling content in a document picture. The method comprises the following steps: acquiring a content display area of a document picture, wherein the content display area is contained in a quadrilateral and comprises at least one text area to be labeled; acquiring a first circumscribed rectangle of the quadrilateral, and mapping the content display area into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle; labeling the diagonal endpoints of a second circumscribed rectangle of the text area in the first rectangular frame, and thereby determining the second circumscribed rectangle of the text area; and converting the four vertices of the second circumscribed rectangle into the document picture.

Description

Document labeling method and device, electronic equipment and storage medium

Technical Field

The application belongs to the field of data processing, and particularly relates to a document labeling method and device, electronic equipment and a storage medium.

Background

When a document is photographed, the shooting angle often causes the document to appear tilted and deformed in the resulting document picture. In this case, if the text regions on the document are to be labeled, each text region must be labeled as a quadrilateral, that is, with at least four points per text region, which results in a large labeling workload and a high labeling cost.

Disclosure of Invention

The embodiments of the application provide a document labeling method and device, electronic equipment, and a storage medium, which can solve the problem in the related art that labeling a tilted and deformed document picture requires too many labeling points and therefore a large workload.

In a first aspect, an embodiment of the present application provides a document labeling method, where the method includes: acquiring a content display area of a document picture, wherein the content display area is contained in a quadrilateral and comprises at least one text area to be labeled; acquiring a first circumscribed rectangle of the quadrilateral, and mapping the content display area into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle; labeling the diagonal endpoints of a second circumscribed rectangle of the text area in the first rectangular frame, and determining the second circumscribed rectangle of the text area; and converting the four vertices of the second circumscribed rectangle into the document picture.

In a second aspect, an embodiment of the present application provides a document labeling device, where the device includes: a mapping module, configured to acquire a content display area of a document picture, wherein the content display area is contained in a quadrilateral and comprises at least one text area to be labeled, to acquire a first circumscribed rectangle of the quadrilateral, and to map the content display area into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle; and a conversion module, configured to label the diagonal endpoints of a second circumscribed rectangle of the text area in the first rectangular frame, to determine the second circumscribed rectangle of the text area, and to convert the four vertices of the second circumscribed rectangle into the document picture.

In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.

In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.

In the embodiments of the application, a content display area of a document picture is acquired, wherein the content display area is contained in a quadrilateral and comprises at least one text area to be labeled; a first circumscribed rectangle of the quadrilateral is acquired, and the content display area is mapped into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle; the diagonal endpoints of a second circumscribed rectangle of the text area are labeled in the first rectangular frame, and the second circumscribed rectangle of the text area is determined; and the four vertices of the second circumscribed rectangle are converted into the document picture. Because the content display area in the document picture is mapped into a rectangular frame, the text in the document picture is corrected to a rectangular display, so only the diagonal endpoints of the second circumscribed rectangle of each text area need to be labeled. This reduces the number of labeling points, solves the problem in the related art of a large workload caused by too many labeling points when labeling a tilted and deformed document picture, and reduces the labeling workload and the labeling cost.

Drawings

FIG. 1 is a schematic illustration of a document picture provided herein;

FIG. 2 is a flowchart illustrating a method for labeling a document according to an embodiment of the present application;

FIG. 3 is a schematic diagram of another document picture provided by an embodiment of the present application;

FIG. 4 is a flowchart illustrating another document annotation method provided in an embodiment of the present application;

FIG. 5 is a flowchart illustrating a method for labeling a document according to an embodiment of the present application;

FIG. 6 is a flowchart illustrating a method for labeling a document according to an embodiment of the present application;

FIG. 7 is a schematic diagram of an apparatus for annotating a document according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an electronic device according to another embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar objects, and not necessarily to describe a particular order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are generally of one type, and the number of objects is not limited; for example, there may be one first object or more than one. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.

Specifically, when a document is photographed, the shooting angle may cause the document in the resulting document picture to be tilted and deformed. For example, as shown in fig. 1, the content display region N inside the captured document picture P may be obliquely deformed, which in turn causes the text regions contained in the content display region N (e.g., text region N1 and text region N2) to be obliquely deformed as well. In this case, labeling text regions in the related art requires labeling the four vertices of a quadrilateral containing each text region, which makes the labeling workload large when the content display region N in the document picture contains a large number of text regions.

In view of the above, the method acquires a manually labeled quadrilateral content display area that contains all text content in the document picture, determines a first circumscribed rectangle of the quadrilateral, and maps the content display area into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle. The obliquely deformed content display area in fig. 1 is thereby displayed as a rectangle, and the text areas inside it are displayed as rectangles within the first rectangular frame as well. The diagonal endpoints of the second circumscribed rectangle of a text region are then labeled, which determines the second circumscribed rectangle, and the four vertices of the second circumscribed rectangle are converted into the document picture. The labeling of an obliquely deformed text region in the document picture is therefore completed by labeling only two points on the text region in the first rectangular frame, which effectively reduces the number of labeling points compared with the four points needed for each text region in the related art.

A method, an apparatus, an electronic device, and a storage medium for document annotation provided in the embodiments of the present application are described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.

Fig. 2 illustrates a method for document annotation provided by an embodiment of the present invention, which may be performed by an electronic device, and the electronic device may include: a server and/or a terminal device, wherein the terminal device can be a mobile phone terminal or the like. In other words, the method may be performed by software or hardware installed in the electronic device, the method comprising the steps of:

Step 201: acquiring a content display area of the document picture.

Wherein the content display area is contained in a quadrilateral, which may be manually labeled. In addition, the content display area comprises at least one text area needing to be marked.

Specifically, the document picture may refer to a picture obtained by shooting a file with text, for example, as shown in a document picture P in fig. 1.

It is understood that the text-bearing document may include an identification card, a shopping receipt, a receipt, an invoice, a bank card, a train ticket, etc., and is not specifically limited thereto.

The content display area in the document picture can be the area occupied by the file taken by the user in the picture. As one example, the content display area may be area N in fig. 1, for example. Further, the text region refers to a region where characters are displayed, and as one example, as shown in fig. 1, the text region includes N1 and N2.

Specifically, in this embodiment, a quadrangle marking may be performed on the content display area first, and in this process, four points may be marked manually, so that the content display area is included in a quadrangle formed by connecting the four points. Of course, the quadrilateral includes all text regions that need to be labeled.

In addition, the quadrilateral corresponds to a rectangle or a near-rectangle in the real scene. In the process of labeling the quadrilateral, its four vertices may be labeled sequentially in a predetermined order, for example clockwise from the upper left, so that after the labeling of the quadrilateral is completed, the document picture can be rotated to the correct orientation according to the predetermined order. As an example, the four vertices labeled for region N in fig. 1 are connected to form a quadrilateral. Of course, the document photographed by the user, which corresponds to the content display area, may be rectangular in the real scene.

Step 202: acquiring a first circumscribed rectangle of the quadrilateral, and mapping the content display area into a first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle.

Specifically, in this step, a first circumscribed rectangle of the quadrilateral may be determined; optionally, the first circumscribed rectangle may be the minimum-area circumscribed rectangle of the quadrilateral. Then, the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle are obtained, and a one-to-one correspondence between the first vertex coordinates and the second vertex coordinates is determined.

For example, as shown in fig. 1, the four first vertex coordinates of the quadrilateral of the content display area N are denoted A, B, C, and D in clockwise order from the top-left vertex, and the first circumscribed rectangle of the quadrilateral is Y1. The second vertex coordinates of the first circumscribed rectangle Y1 nearest to A, B, C, and D may be denoted A1, B1, C1, and D1, respectively, so that A, B, C, and D correspond one-to-one to A1, B1, C1, and D1.
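The nearest-vertex pairing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function name `match_vertices`, the greedy pairing strategy, and the sample coordinates are all introduced here for illustration.

```python
from math import dist

def match_vertices(quad, rect):
    """Pair each quadrilateral vertex with the nearest still-unused rectangle vertex.

    Greedy pairing is an assumption made for this sketch; it works when the
    quadrilateral is only mildly deformed relative to its circumscribed rectangle.
    """
    remaining = list(rect)
    pairs = []
    for q in quad:
        nearest = min(remaining, key=lambda r: dist(q, r))
        remaining.remove(nearest)
        pairs.append((q, nearest))
    return pairs

# Hypothetical coordinates: quadrilateral vertices clockwise from the top-left,
# and the rectangle vertices given in arbitrary order.
quad = [(12, 8), (95, 3), (100, 60), (5, 55)]
rect = [(100, 60), (5, 3), (5, 60), (100, 3)]
pairs = match_vertices(quad, rect)
for q, r in pairs:
    print(q, "->", r)
```

Because the quadrilateral vertices are visited in their labeling order, the rectangle vertices come out in the same clockwise order, which is the form the later mapping step needs.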

Next, the content display area may be mapped into the first rectangular frame corresponding to the first circumscribed rectangle based on the one-to-one correspondence between the four first vertex coordinates of the quadrilateral and the four second vertex coordinates of the first circumscribed rectangle. For example, the content display area N in fig. 1 may be mapped into the first rectangular frame corresponding to the first circumscribed rectangle Y1 shown in fig. 3. Because the mapping is based on this one-to-one correspondence, the mapped content display area N is displayed as a rectangle, as shown in fig. 3, and the text areas N1 and N2 in the content display area N are displayed as rectangles as well. The content display area N and its text areas N1 and N2 are thus corrected to rectangles at the same time, which provides the basis for the subsequent rectangular labeling of the text areas.

Step 203: marking the diagonal endpoints of the second external rectangle of the text area in the first rectangular frame, and determining the second external rectangle of the text area.

In the first rectangular frame, the text region therein may be labeled through a labeling point, and since the text region has been corrected to be rectangular in step 202, the labeling manner for the text region in the first rectangular frame may be a rectangular labeling manner, that is, any two diagonal endpoints of the second external rectangle of the text region are labeled, and the second external rectangle corresponding to the text region may be determined through the diagonal endpoints of the second external rectangle.

For example, as shown in fig. 3, within the first rectangular frame corresponding to the first circumscribed rectangle Y1, the second circumscribed rectangle S2 of the text region N2 may be determined by labeling its diagonal endpoints A₁ and A₃; of course, the diagonal endpoints of the second circumscribed rectangle S1 of the text region N1 may also be labeled to determine the second circumscribed rectangle S1 of the text region N1. Thus, the second circumscribed rectangle of a text region can be determined by labeling only its diagonal endpoints, which greatly reduces the amount of labeling compared with labeling the text region with four points in the related art.
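To illustrate how two diagonal endpoints determine a rectangle, the following minimal sketch expands two labeled points into four vertices. It assumes the rectangle is axis-aligned inside the first rectangular frame and that image coordinates are used (y grows downward); the function name `rect_from_diagonal` and the sample points are hypothetical.

```python
def rect_from_diagonal(p, q):
    """Return the four vertices of the axis-aligned rectangle with diagonal p-q.

    Vertices are listed clockwise from the top-left corner, in image
    coordinates where y grows downward. Either diagonal may be given.
    """
    (x1, y1), (x2, y2) = p, q
    left, right = min(x1, x2), max(x1, x2)
    top, bottom = min(y1, y2), max(y1, y2)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]

# Two hypothetical labeled points, given in arbitrary order.
vertices = rect_from_diagonal((40, 30), (10, 20))
print(vertices)  # [(10, 20), (40, 20), (40, 30), (10, 30)]
```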

Step 204: converting the four vertices of the second circumscribed rectangle into the document picture.

In this step, the four vertices of the second circumscribed rectangle may be converted into a document picture.

Optionally, after the second circumscribed rectangle of the text region is determined, its four vertices may be acquired. Next, the four vertices of the second circumscribed rectangle may be converted into the content display area in the document picture, thereby completing the labeling of the text area in the content display area. In this way, the labeling of the obliquely deformed text region in the document picture is completed indirectly by labeling two points of the text region in the first rectangular frame.

Specifically, for example, if the obliquely displayed content display area in the document picture includes 500 text areas, the related art requires four points to be labeled on each text area so that the quadrilateral formed by connecting the labeled points encloses that text area, for a total of 2000 points. According to the solution of this embodiment, however, the content display area is corrected to a rectangle, the text areas inside it are corrected to rectangles as well, and each text area is labeled with the two diagonal endpoints of the second circumscribed rectangle of the corrected text area. The points to be labeled are therefore the four vertices of the content display area plus two points for each text area, for a total of 1004 points (i.e., 4 plus 500 times 2). Compared with the related art, the number of labeling points in this embodiment is reduced by 996.
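The point counts in this example can be reproduced with a short calculation. The function name `annotation_points` is hypothetical; the formulas are just the counts stated above, generalized to n text areas.

```python
def annotation_points(num_text_regions):
    """Return (related-art points, points with this method, points saved)."""
    four_point = 4 * num_text_regions        # one quadrilateral per text area
    two_point = 4 + 2 * num_text_regions     # one quadrilateral overall, then two points each
    return four_point, two_point, four_point - two_point

related_art, this_method, saved = annotation_points(500)
print(related_art, this_method, saved)  # 2000 1004 996
```

The saving is 2n - 4 points for n text areas, so the approach pays off once a picture contains more than two text areas.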

In this way, in this embodiment, the content display area is contained in the quadrilateral. After the first circumscribed rectangle of the quadrilateral is determined, the content display area is mapped into the first rectangular frame corresponding to the first circumscribed rectangle according to the one-to-one correspondence between the four vertices of the quadrilateral and the four vertices of the first circumscribed rectangle. The diagonal endpoints of the second circumscribed rectangle corresponding to the text area are labeled in the first rectangular frame to determine the second circumscribed rectangle of the text area, and the four vertices of the second circumscribed rectangle are then converted into the document picture. The obliquely deformed text area in the document picture is thus corrected to a rectangular text area in the first rectangular frame, where labeling only the two diagonal endpoints of its second circumscribed rectangle is enough to determine that rectangle and complete the labeling of the text area in the document picture. Each text region is therefore labeled with two points, which greatly saves labor compared with the four points required for each text region in the related art.

In a possible implementation manner, when the first circumscribed rectangle of the quadrangle is obtained, four first vertex coordinates of the quadrangle may be obtained, and based on the four first vertex coordinates, a circumscribed minimum area rectangle of the quadrangle is obtained by calculation, and the circumscribed minimum area rectangle is determined as the first circumscribed rectangle.

Specifically, when the first circumscribed rectangle of the quadrilateral is obtained, the four first vertex coordinates of the quadrilateral can be obtained, and the minimum-area circumscribed rectangle of the quadrilateral can then be calculated from the first vertex coordinates. The minimum-area circumscribed rectangle is the rectangle with the smallest area among all rectangles that enclose the quadrilateral, and it can be determined as the first circumscribed rectangle. Using it as the first circumscribed rectangle keeps small the part of the first circumscribed rectangle that does not belong to the content display area.
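The source does not say how the minimum-area circumscribed rectangle is calculated. One common approach, sketched below as an assumption, uses the known property that a minimum-area enclosing rectangle of a convex polygon has a side collinear with one of the polygon's edges, so it suffices to try each edge direction. The function name `min_area_rect` and the sample quadrilaterals are hypothetical, and the sketch assumes a convex quadrilateral with four distinct vertices.

```python
from math import hypot

def min_area_rect(quad):
    """Return (area, corners) of the smallest enclosing rectangle of a convex quad."""
    best = None
    n = len(quad)
    for i in range(n):
        (x0, y0), (x1, y1) = quad[i], quad[(i + 1) % n]
        length = hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length   # unit vector along the edge
        vx, vy = -uy, ux                                   # unit normal to the edge
        us = [x * ux + y * uy for x, y in quad]            # projections onto the edge
        vs = [x * vx + y * vy for x, y in quad]            # projections onto the normal
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if best is None or area < best[0]:
            # Convert the extreme projections back to picture coordinates.
            corners = [(a * ux + b * vx, a * uy + b * vy)
                       for a, b in ((min(us), min(vs)), (max(us), min(vs)),
                                    (max(us), max(vs)), (min(us), max(vs)))]
            best = (area, corners)
    return best

area, corners = min_area_rect([(0, 0), (4, 0), (4, 3), (0, 3)])
print(area)  # 12.0 for a 4-by-3 rectangle

sheared_area, _ = min_area_rect([(0, 0), (4, 0), (5, 3), (1, 3)])
print(sheared_area)  # 15.0: the 5-by-3 box aligned with the horizontal edges
```

Image libraries offer equivalent routines (for example OpenCV's `cv2.minAreaRect`), though the source does not name any library.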

In one implementation, before the mapping the content display area into the first rectangle frame corresponding to the first circumscribed rectangle based on a one-to-one correspondence between four first vertex coordinates of the quadrangle and four second vertex coordinates of the first circumscribed rectangle, the method further includes:

acquiring the length and the width of the first circumscribed rectangle; establishing the first external rectangle in a new coordinate system based on the length and the width of the first external rectangle, and obtaining four second vertex coordinates of the first external rectangle in the new coordinate system; and establishing a one-to-one corresponding relation between the four first vertex coordinates and the four second vertex coordinates.

Specifically, the length and width of the first circumscribed rectangle of the quadrilateral may be determined, and optionally, the first circumscribed rectangle may be reconstructed in a newly created coordinate system based on its length and width. For example, the length and width of the first circumscribed rectangle Y1 in the document picture in fig. 1 may be obtained, and the first circumscribed rectangle Y1 may be reconstructed in the new coordinate system based on that length and width. Next, the four second vertex coordinates of the first circumscribed rectangle may be determined in the new coordinate system, and a one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates in the new coordinate system is established.

Specifically, assume that the four vertices of the quadrilateral are X₁, X₂, X₃, and X₄, and that the four vertices of the first circumscribed rectangle of the quadrilateral are X₁′, X₂′, X₃′, and X₄′, where X₁, X₂, X₃, and X₄ correspond one-to-one to X₁′, X₂′, X₃′, and X₄′, respectively. If the vertices of the first circumscribed rectangle in the newly created coordinate system are X₁″, X₂″, X₃″, and X₄″, which correspond one-to-one to X₁′, X₂′, X₃′, and X₄′, respectively, then X₁, X₂, X₃, and X₄ correspond one-to-one to X₁″, X₂″, X₃″, and X₄″, respectively.
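The chain of correspondences can be sketched as follows: the circumscribed rectangle is rebuilt from its length and width in a new coordinate system, and the quadrilateral vertices are paired with the rebuilt vertices by position. Placing the rectangle's top-left vertex at the origin, the function name `rect_in_new_frame`, and the sample coordinates are assumptions made for this illustration.

```python
from math import dist

def rect_in_new_frame(rect_vertices):
    """Rebuild a rectangle at the origin of a new coordinate system.

    rect_vertices: the four vertices of the first circumscribed rectangle in
    the document picture, clockwise from the top-left vertex.
    """
    top_left, top_right, bottom_right, _ = rect_vertices
    length = dist(top_left, top_right)       # horizontal side
    width = dist(top_right, bottom_right)    # vertical side
    return [(0.0, 0.0), (length, 0.0), (length, width), (0.0, width)]

# Hypothetical coordinates, both lists clockwise from the top-left vertex.
quad = [(12, 8), (95, 3), (100, 60), (5, 55)]      # quadrilateral in the picture
rect = [(5, 3), (100, 3), (100, 60), (5, 60)]      # its circumscribed rectangle
new_rect = rect_in_new_frame(rect)                 # rectangle in the new coordinate system

# Because both lists share the same vertex order, pairing by index links each
# quadrilateral vertex directly to a vertex in the new coordinate system.
correspondence = list(zip(quad, new_rect))
for src, dst in correspondence:
    print(src, "->", dst)
```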

In one possible implementation, as shown in fig. 4, step 202 may specifically include the following steps:

Step 2021: calculating, through perspective transformation and based on the one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates, a transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates.

Specifically, the four first vertex coordinates may be mapped to four second vertex coordinates corresponding to the four first vertex coordinates one by one through perspective transformation, respectively, based on a one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, and a transformation matrix in which the four first vertex coordinates are mapped to the four second vertex coordinates is obtained through calculation. Naturally, through the perspective transformation, the content display area in the document picture can be mapped into the first rectangular frame.

Step 2022: mapping the content display area into the first rectangular box based on the transformation matrix.

Specifically, the four first vertex coordinates A, B, C, and D of the quadrilateral containing the content display area N in fig. 1 may be mapped to the four second vertex coordinates A1, B1, C1, and D1 of the first circumscribed rectangle Y1 in fig. 3, respectively, by the above transformation matrix. On this basis, the content display area N in the document picture in fig. 1 can be mapped into the first circumscribed rectangle Y1 in fig. 3 by the transformation matrix. The content display area is thus mapped to a rectangle, so that a text area in the content display area can be labeled as a rectangle by labeling only two diagonal endpoints. Compared with the related art, in which four points must be labeled for each text area in an obliquely deformed document picture, the number of labeling points is greatly reduced.
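A perspective transformation matrix can be computed from four point pairs by solving a linear system, as in the minimal sketch below. The source does not specify the solver, so the Gaussian elimination here, the function names `solve`, `perspective_matrix`, and `apply`, the normalization of the last matrix entry to 1, and the sample coordinates are all assumptions. The sketch also assumes that no three of the four source points are collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def perspective_matrix(src, dst):
    """3x3 matrix H mapping each src point to its dst point, with H[2][2] = 1."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply(h, point):
    """Map one point through H using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical coordinates: quadrilateral in the picture, rectangle in the new frame.
quad = [(12, 8), (95, 3), (100, 60), (5, 55)]
rect = [(0, 0), (95, 0), (95, 57), (0, 57)]
H = perspective_matrix(quad, rect)
mapped = [apply(H, p) for p in quad]   # each lands on its rectangle vertex
```

Mapping the whole content display area means applying the same matrix to every pixel position; image libraries such as OpenCV provide `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for this, although the source does not name a library.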

In a possible implementation manner, as shown in fig. 5, step 204 may specifically include the following steps:

Step 2041: calculating, through perspective transformation and based on the one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates, the inverse transformation matrix of the mapping from the four first vertex coordinates to the four second vertex coordinates.

Specifically, based on the one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates, the inverse of the transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates may be obtained through perspective transformation calculation, so that content in the first circumscribed rectangle can subsequently be mapped back into the document picture.

Step 2042: converting four vertices of the second circumscribed rectangle into the document picture based on the inverse conversion matrix.

In this way, based on the inverse transformation matrix, the four vertices of the second circumscribed rectangle can be converted into the document picture, which completes the labeling of the text region in the document picture. As an example, the labeling of the text region N2 in the document picture P may be completed by mapping the four vertices of the second circumscribed rectangle S2 corresponding to the text region N2 in fig. 3 back into the document picture P shown in fig. 1 based on the inverse transformation matrix.
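The back-mapping step can be sketched by inverting a 3x3 forward matrix and applying the inverse to the four rectangle vertices. The forward matrix `H` below is a made-up example rather than one derived from the figures, and the function names `invert3` and `apply` are hypothetical.

```python
def invert3(m):
    """Inverse of a 3x3 matrix via the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def apply(m, point):
    """Map one point through a 3x3 matrix using homogeneous coordinates."""
    x, y = point
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)

# Hypothetical forward matrix (document picture -> first rectangular frame).
H = [[1.0, 0.2, -10.0],
     [0.1, 1.0, -5.0],
     [0.001, 0.0005, 1.0]]
H_inv = invert3(H)

# Hypothetical vertices of a second circumscribed rectangle in the rectangular frame.
s2 = [(10, 20), (40, 20), (40, 30), (10, 30)]
s2_in_picture = [apply(H_inv, p) for p in s2]   # a general quadrilateral in the picture
```

Mapping each result forward through `H` again returns the original rectangle vertex, which is a quick consistency check on the inverse.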

Therefore, through the above steps, a text region in the document picture can be labeled by labeling only the diagonal endpoints of its second circumscribed rectangle. Compared with the related art, in which four points must be labeled for each text region when a large number of text regions are labeled, this greatly reduces the labeling amount and the labeling cost.

Optionally, an embodiment of the present application is described below with reference to fig. 6, where the embodiment includes the following steps:

Step 601: acquiring a content display area of the document picture, wherein the content display area is contained in a quadrilateral and comprises at least one text area to be labeled.

Specifically, a content display area to be marked in the document picture can be determined, the content display area includes all text areas to be marked, and the content display area can be included in a quadrangle formed by manual marking. Optionally, the document picture may include a bill, an invoice, and the like, and the quadrangle may be a diamond or an irregular quadrangle.

Step 602: acquiring the four first vertex coordinates of the quadrilateral, calculating a first circumscribed rectangle of the quadrilateral based on the four first vertex coordinates, and establishing a one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates of the first circumscribed rectangle.

The first circumscribed rectangle is the minimum-area circumscribed rectangle of the quadrilateral.

Specifically, as an example, as shown in fig. 1, the four first vertex coordinates of the above quadrilateral may be denoted A, B, C, and D in clockwise order from the upper-left vertex, and the minimum-area circumscribed rectangle of the quadrilateral, that is, the first circumscribed rectangle, may be calculated from these first vertex coordinates. The four second vertex coordinates A1, B1, C1, and D1 of the first circumscribed rectangle correspond one-to-one to the four first vertex coordinates A, B, C, and D, respectively.

Optionally, in the process of establishing a one-to-one correspondence between four first vertex coordinates and four second vertex coordinates of the first external rectangle, the length and the width of the first external rectangle may be obtained, the first external rectangle is established in a new coordinate system based on the length and the width of the first external rectangle, the four second vertex coordinates of the first external rectangle in the new coordinate system are obtained, and a one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates of the first external rectangle is established.

Specifically, the coordinates of the four second vertices of the first circumscribed rectangle in the newly created coordinate system may be A₁, A₂, A₃, and A₄, which are the points corresponding to the four second vertices A1, B1, C1, and D1 of the first circumscribed rectangle in the document picture in fig. 1. Since A1, B1, C1, and D1 correspond one-to-one to the four first vertices A, B, C, and D of the quadrilateral in fig. 1, the coordinates of the four first vertices A, B, C, and D of the quadrilateral correspond one-to-one to the coordinates of the four second vertices A₁, A₂, A₃, and A₄ of the first circumscribed rectangle.

Step 603: based on the one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, calculating to obtain a transformation matrix of the four first vertex coordinates mapped to the four second vertex coordinates through perspective transformation; mapping the content display area into the first rectangular box based on the transformation matrix.

Specifically, based on the one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, a conversion matrix in which the four first vertex coordinates are mapped to the four second vertex coordinates is calculated, and the content display area is mapped into a first rectangular frame corresponding to a first circumscribed rectangle through the conversion matrix.
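As an illustration of how such a conversion matrix might be obtained from four point correspondences, the following Python sketch sets up the standard eight linear equations of a perspective transformation, with the bottom-right matrix entry fixed to 1, and solves them by Gauss-Jordan elimination. The function names are assumptions for illustration; the application does not specify a solver. The sketch maps individual points only, whereas mapping the whole content display area would additionally require resampling the picture pixels.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate vertex configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """Return the 3x3 matrix mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_perspective(matrix, point):
    """Map one point with a 3x3 perspective matrix (homogeneous division)."""
    x, y = point
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return ((matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]) / w,
            (matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]) / w)
```

Passing the four first vertex coordinates as `src` and the four second vertex coordinates as `dst` yields a matrix that sends each first vertex to its corresponding second vertex.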

Step 604: marking the diagonal endpoints of the second external rectangle of the text area in the first rectangular frame, and determining the second external rectangle of the text area.

Specifically, in the first rectangular frame, the diagonal endpoints of the second circumscribed rectangle corresponding to the text area may be labeled, where the diagonal endpoints may be any two endpoints that form a diagonal of the second circumscribed rectangle. Based on the two labeled diagonal endpoints, the second circumscribed rectangle of the text area may be determined.
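The determination of the second circumscribed rectangle from two labeled diagonal endpoints can be sketched as follows. The sketch assumes that the second circumscribed rectangle is aligned with the axes of the first rectangular frame; the function name `rect_from_diagonal` is hypothetical.

```python
def rect_from_diagonal(p, q):
    """Return the four vertices of the axis-aligned rectangle with diagonal p-q.

    The vertices are listed clockwise from the upper-left corner, assuming
    picture coordinates in which y grows downward.
    """
    (x1, y1), (x2, y2) = p, q
    left, right = min(x1, x2), max(x1, x2)
    top, bottom = min(y1, y2), max(y1, y2)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]
```

Either diagonal of the same rectangle gives the same four vertices, which is why any two endpoints forming a diagonal are sufficient.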

Step 605: based on the one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, calculating, through perspective transformation, an inverse transformation matrix, that is, the inverse of the transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates; converting the four vertices of the second circumscribed rectangle into the document picture based on the inverse transformation matrix.

Based on the one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, the transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates may be calculated through perspective transformation; naturally, the inverse transformation matrix, which maps the four second vertex coordinates back to the four first vertex coordinates, may likewise be calculated through perspective transformation.

Based on the inverse transformation matrix, the four vertices of the second circumscribed rectangle of the text area can be converted into the document picture, thereby labeling the text area in the document picture.
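One possible way to obtain the matrix for the reverse direction is to invert the 3x3 forward matrix directly, as in the following Python sketch. This is an illustration under stated assumptions: the adjugate-based inversion and the function names are not taken from the application, and the forward matrix is assumed to be non-singular.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix using its adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular and cannot be inverted")
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adjugate]


def apply_perspective(matrix, point):
    """Map one point with a 3x3 perspective matrix (homogeneous division)."""
    x, y = point
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return ((matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]) / w,
            (matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]) / w)


def rectangle_to_picture(forward_matrix, rect_vertices):
    """Convert rectangle vertices from the first rectangular frame to the picture."""
    inverse = invert_3x3(forward_matrix)
    return [apply_perspective(inverse, vertex) for vertex in rect_vertices]
```

The four converted vertices generally form a general quadrilateral in the document picture rather than a rectangle, which matches the oblique deformation of the text area.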

In this way, for a document picture with oblique deformation, the content display area containing all the text areas is mapped into the first rectangular frame, so that the obliquely deformed text areas are corrected into rectangles. When a text area in the first rectangular frame is labeled as a rectangle, the second circumscribed rectangle can be determined merely by labeling the diagonal endpoints of the second circumscribed rectangle of the text area, which completes the labeling of the text area. On this basis, the four vertices of the second circumscribed rectangle are converted into the document picture, so that the obliquely deformed text area in the document picture is labeled. Compared with the prior art, in which four points need to be labeled when a text area in the document picture is labeled, the number of labeled points is effectively reduced.

It should be noted that the document labeling method provided in the embodiment of the present application may be executed by a document labeling apparatus, or by a control module in the document labeling apparatus that is used for executing the document labeling method. In the embodiment of the present application, the document labeling apparatus is described by taking the case in which the document labeling apparatus executes the document labeling method as an example.

FIG. 7 is a schematic structural diagram of a document labeling apparatus according to an embodiment of the present invention. As shown in fig. 7, an apparatus 700 for labeling a document includes: an obtaining module 710, a mapping module 720, a determining module 730, and a labeling module 740.

An obtaining module 710, configured to obtain a content display area of a document picture, where the content display area is included in a quadrilateral and includes at least one text area to be labeled; a mapping module 720, configured to obtain a first circumscribed rectangle of the quadrangle, and map the content display area into a first rectangle frame corresponding to the first circumscribed rectangle based on a one-to-one correspondence between four first vertex coordinates of the quadrangle and four second vertex coordinates of the first circumscribed rectangle; a determining module 730, configured to label, within the first rectangular frame, a diagonal end point of a second external rectangle of the text region, and determine the second external rectangle of the text region; and the labeling module 740 is used for converting the four vertexes of the second external rectangle into the document picture.

In an implementation manner, the obtaining module 710 is specifically configured to obtain four first vertex coordinates of the quadrangle, and calculate to obtain a circumscribed minimum area rectangle of the quadrangle based on the four first vertex coordinates; and determining the circumscribed minimum area rectangle as the first circumscribed rectangle.

In an implementation manner, the mapping module 720 is further configured to calculate, based on a one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates, through perspective transformation, an inverse transformation matrix, that is, the inverse of the transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates; and to convert the four vertices of the second circumscribed rectangle into the document picture based on the inverse transformation matrix.

In an implementation manner, the mapping module 720 is specifically configured to calculate, based on a one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates, a transformation matrix in which the four first vertex coordinates are mapped to the four second vertex coordinates through perspective transformation; mapping the content display area into the first rectangular box based on the transformation matrix.

In an implementation manner, the labeling module 740 is specifically configured to calculate, based on a one-to-one correspondence relationship between the four first vertex coordinates and the four second vertex coordinates, through perspective transformation, an inverse transformation matrix, that is, the inverse of the transformation matrix that maps the four first vertex coordinates to the four second vertex coordinates; and to convert the four vertices of the second circumscribed rectangle into the document picture based on the inverse transformation matrix.

The device for document marking in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.

The device for labeling the document in the embodiment of the application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present application are not specifically limited.

The device for document labeling provided in the embodiment of the present application can implement each process implemented in the method embodiments of fig. 2 and fig. 4 to fig. 6, and is not described herein again to avoid repetition.

Optionally, as shown in fig. 8, an electronic device 800 is further provided in this embodiment of the present application, and includes a processor 801, a memory 802, and a program or an instruction stored in the memory 802 and executable on the processor 801, where the program or the instruction is executed by the processor 801 to implement each process of the above-mentioned embodiment of the document labeling method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.

The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the above-mentioned embodiment of the document labeling method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.

The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above-mentioned document labeling method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.

It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved; for example, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for document labeling, characterized by comprising:

Obtaining the content display area of the document picture, wherein the content display area is included in a quadrilateral, and the content display area includes at least one text area to be marked;

Acquiring the first circumscribed rectangle of the quadrilateral, and, based on the one-to-one correspondence between the coordinates of the four first vertices of the quadrilateral and the coordinates of the four second vertices of the first circumscribed rectangle, mapping the content display area into the first rectangular frame corresponding to the first circumscribed rectangle;

In the first rectangular frame, the diagonal endpoints of the second circumscribed rectangle of the text area are marked to determine the second circumscribed rectangle of the text area;

Converting the four vertices of the second circumscribed rectangle into the document picture.

2. The method for document labeling according to claim 1, wherein the acquiring the first circumscribed rectangle of the quadrilateral comprises:

Acquiring the coordinates of the four first vertices of the quadrilateral, and calculating the circumscribed minimum area rectangle of the quadrilateral based on the coordinates of the four first vertices;

The circumscribed minimum area rectangle is determined as the first circumscribed rectangle.

3. The method for document labeling according to claim 1 or 2, wherein the mapping of the content display area into the first rectangular frame corresponding to the first circumscribed rectangle, based on the one-to-one correspondence between the coordinates of the four first vertices of the quadrilateral and the coordinates of the four second vertices of the first circumscribed rectangle, further comprises:

obtaining the length and width of the first circumscribed rectangle;

Based on the length and width of the first circumscribed rectangle, the first circumscribed rectangle is established in a new coordinate system, and the coordinates of four second vertices of the first circumscribed rectangle in the new coordinate system are obtained;

A one-to-one correspondence between the four first vertex coordinates and the four second vertex coordinates is established.

4. The method for document labeling according to claim 1, wherein the mapping of the content display area into the first rectangular frame corresponding to the first circumscribed rectangle, based on the one-to-one correspondence between the coordinates of the four first vertices of the quadrilateral and the coordinates of the four second vertices of the first circumscribed rectangle, comprises:

Based on the one-to-one correspondence between the coordinates of the four first vertices and the coordinates of the four second vertices, calculating, through perspective transformation, the transformation matrix that maps the coordinates of the four first vertices to the coordinates of the four second vertices;

Based on the transformation matrix, the content display area is mapped into the first rectangular frame.

5. The method for document labeling according to claim 1, wherein the converting the four vertices of the second circumscribed rectangle into the document picture comprises:

Based on the one-to-one correspondence between the coordinates of the four first vertices and the coordinates of the four second vertices, calculating, through perspective transformation, the inverse transformation matrix, that is, the inverse of the transformation matrix that maps the coordinates of the four first vertices to the coordinates of the four second vertices;

Based on the inverse transformation matrix, the four vertices of the second circumscribed rectangle are transformed into the document picture.

6. A device for document marking, comprising:

an acquisition module, configured to acquire a content display area of a document picture, wherein the content display area is included in a quadrilateral, and the content display area includes at least one text area to be marked;

a mapping module, configured to obtain the first circumscribed rectangle of the quadrilateral, and, based on the one-to-one correspondence between the coordinates of the four first vertices of the quadrilateral and the coordinates of the four second vertices of the first circumscribed rectangle, map the content display area into the first rectangular frame corresponding to the first circumscribed rectangle;

a determining module, configured to mark the diagonal endpoints of the second circumscribed rectangle of the text area in the first rectangular frame, and determine the second circumscribed rectangle of the text area;

An annotation module, configured to convert the four vertices of the second circumscribed rectangle into the document picture.

7. The device according to claim 6, wherein the acquisition module is specifically used for:

8. The apparatus according to claim 6 or 7, wherein the mapping module is further used for:

9. An electronic device, characterized in that it comprises a processor, a memory, and a program or instruction that is stored on the memory and can run on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the method for document labeling according to any one of claims 1-5.

10. A readable storage medium, wherein a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method for document labeling according to any one of claims 1-5 are implemented.