CN112001390B

CN112001390B - File scanning processing method, intelligent terminal and storage medium

Info

Publication number: CN112001390B
Application number: CN201910375483.6A
Authority: CN
Inventors: 丁骥
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-05-07
Filing date: 2019-05-07
Publication date: 2024-09-13
Anticipated expiration: 2039-05-07
Also published as: CN112001390A

Abstract

The embodiment of the invention provides a file scanning processing method, an intelligent terminal and a storage medium, wherein the method comprises the following steps: scanning a file to be processed to obtain target document scanning data, wherein the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed; determining a first type of gaps in the horizontal direction, and dividing a plurality of document content blocks into N document block sets according to the first type of gaps; performing gap detection on a first set in the N document block sets in the vertical direction to determine a second type of gap; dividing the first set into a plurality of document blocks according to the second class of gaps; and sequencing the document blocks in the second set and the plurality of document blocks obtained by dividing the first set in the N document block sets according to a preset sequence to obtain a sequencing result. The embodiment of the invention can effectively improve the efficiency of dividing and sequencing the document blocks and increase the application scenes of dividing and sequencing the document blocks.

Description

File scanning processing method, intelligent terminal and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a file scanning processing method, an intelligent terminal, and a storage medium.

Background

Optical character recognition (OCR) refers to the process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper, determines the character shapes by detecting patterns of dark and light, and then translates the shapes into computer text using a character recognition method. In other words, the characters in a paper document are optically converted into a black-and-white dot-matrix image file, and recognition software then converts the characters in the image into a text format that word processing software can further edit and process.

In existing OCR solutions, the document blocks in a file are typically ordered according to a predefined set of rules, or alternatively by computer vision and machine learning algorithms. However, both approaches are complex to implement and place high demands on the runtime environment, which makes them unsuitable for deployment on mobile terminals; ordering document blocks with them is also time-consuming and inefficient.

Disclosure of Invention

The embodiment of the invention provides a file scanning processing method, an intelligent terminal and a storage medium, which can effectively improve the efficiency of document block segmentation and sequencing and increase the application scenes of the document block segmentation and sequencing.

In one aspect, an embodiment of the present invention provides a method for processing file scanning, where the method includes:

scanning a file to be processed to obtain target document scanning data, wherein the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed;

determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks, and dividing the plurality of document content blocks into N document block sets according to the first type of gap, wherein N is a positive integer greater than 1;

performing gap detection, in the vertical direction of the file to be processed, on a first set comprising two or more document blocks in the N document block sets, and determining a second type of gap;

dividing the first set into a plurality of document blocks according to the second type of gap;

and sorting, according to a preset order, the document blocks in a second set, which is a set in the N document block sets comprising only one document block, together with the plurality of document blocks obtained by dividing the first set, to obtain a sorting result.

In another aspect, an embodiment of the present invention provides a document scanning processing apparatus, including:

The scanning unit is used for scanning the file to be processed to obtain target document scanning data, wherein the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed;

The processing unit is used for determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks, and dividing the document content blocks into N document block sets according to the first type of gap, wherein N is a positive integer greater than 1;

the processing unit is further configured to perform gap detection on a first set including two or more document blocks in the N document block sets in a vertical direction of the file to be processed, so as to determine a second type gap;

The processing unit is further used for dividing the first set into a plurality of document blocks according to the second class of gaps;

The sorting unit is used for sorting the document blocks in the second set which only comprises one document block in the N document block sets and the plurality of document blocks obtained by dividing the first set according to a preset sequence to obtain a sorting result.

In still another aspect, an embodiment of the present invention provides an intelligent terminal comprising a processor and a memory, wherein the memory stores executable program code and the processor is configured to call the executable program code to perform the above file scanning processing method.

Accordingly, an embodiment of the present invention further provides a storage medium having instructions stored therein, which when executed on a computer, cause the computer to perform the above-described file scanning processing method.

According to the embodiment of the invention, the plurality of document content blocks are divided into N document block sets according to the first type of gap in the horizontal direction; gap detection is performed in the vertical direction on the first set in the N document block sets to determine the second type of gap; the first set is divided into a plurality of document blocks according to the second type of gap; and the document blocks in the second set in the N document block sets and the plurality of document blocks obtained by dividing the first set are then sorted according to a preset order to obtain a sorting result. The document blocks in a file can thus be divided and sorted based on gaps. The scheme takes little time and is easy to implement, so it can effectively improve the efficiency of dividing and sorting document blocks and broaden the scenarios in which such dividing and sorting can be applied.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a document scanning method according to an embodiment of the present invention;

FIG. 2a is a schematic diagram of a page of a document to be processed according to a first embodiment of the present invention;

FIG. 2b is a schematic diagram of a tree constructed from the segmentation results shown in FIG. 2a;

FIG. 3a is a schematic diagram of a page of a document to be processed according to a second embodiment of the present invention;

FIG. 3b is a schematic diagram of the shrinking of the document content blocks shown in FIG. 3a;

FIG. 3c is a schematic diagram of a page of a document to be processed according to a third embodiment of the present invention;

FIG. 3d is a schematic diagram of the document content blocks shown in FIG. 3c after merging;

FIG. 3e is a schematic diagram of a page of a document to be processed according to a fourth embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a document scanning device according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

In the embodiment of the invention, the intelligent terminal may be a smartphone (e.g., an Android phone, an iOS phone, or a Windows Phone device), a tablet computer, a mobile internet device (Mobile Internet Device, MID), a computer, or a similar terminal. The intelligent terminal scans an image file corresponding to a file to be processed to obtain target document scanning data, wherein the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed. It then determines a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks, where a first-type gap is a gap that divides the plurality of document content blocks into at least two parts in the vertical direction of the file to be processed. After determining the first type of gap, the intelligent terminal divides the plurality of document content blocks into a plurality of document block sets according to the first type of gap. Further, the intelligent terminal performs gap detection, in the vertical direction of the file to be processed, on a first set in the plurality of document block sets and determines a second type of gap. The first set is a document block set that comprises two or more document blocks, and a second-type gap is a gap that divides the first set into at least two parts in the horizontal direction of the file to be processed. After determining the second type of gap, the intelligent terminal divides the first set into a plurality of document blocks according to the second type of gap. Finally, the intelligent terminal sorts, according to a preset order, the document blocks in the second set, which comprises only one document block, together with the plurality of document blocks obtained by dividing the first set, to obtain a sorting result. In this way, the document blocks in the file can be divided and sorted based on gaps in the horizontal and vertical directions. The scheme takes little time and is easy to implement, so it can effectively improve the efficiency of dividing and sorting document blocks and broaden the scenarios in which such dividing and sorting can be applied.

Referring to FIG. 1, FIG. 1 is a flowchart of a file scanning processing method according to an embodiment of the invention. The file scanning processing method described in the embodiment of the invention comprises the following steps:

S101, an intelligent terminal scans a file to be processed to obtain target document scanning data, wherein the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed.

In the embodiment of the invention, the target document scanning data comprises a plurality of document content blocks and position information of the document content blocks in the file to be processed. The document content blocks are obtained by primarily dividing the content in the file to be processed in the process of scanning the file to be processed by the intelligent terminal; the file to be processed can comprise text, pictures and other contents; one or more sub-document content blocks may be included in the document content block. The position information of the document content block is used for indicating the relative coordinates of the document content block in the page of the file to be processed, and particularly can be used for indicating the limit coordinate value of the document content block in the page of the file to be processed, wherein the limit coordinate value comprises one or more of an upper limit coordinate value, a lower limit coordinate value, a left limit coordinate value and a right limit coordinate value.
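As a minimal sketch of how such position information might be represented, the Python snippet below models one document content block by its four limit coordinate values. The class name `Block`, the field names, and the convention that the vertical coordinate grows downward (so the upper limit is numerically smaller than the lower limit) are assumptions made for illustration; the embodiment does not prescribe a data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One document content block, described by its limit coordinates.

    Hypothetical structure: the y axis is assumed to grow downward,
    so `top` (upper limit) is smaller than `bottom` (lower limit).
    """
    left: float    # left limit coordinate value
    top: float     # upper limit coordinate value
    right: float   # right limit coordinate value
    bottom: float  # lower limit coordinate value
    text: str = ""

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


title = Block(left=10, top=5, right=190, bottom=20, text="Title")
print(title.width, title.height)
```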

The file to be processed may be an image file converted from a paper file, or it may be the paper file itself. When the file to be processed is an image file converted from a paper file, the intelligent terminal scans the file to be processed directly to obtain the target document scanning data. In an embodiment, the intelligent terminal may scan the file to be processed using an OCR module configured on the intelligent terminal to obtain the target document scanning data. When the file to be processed is a paper file, the intelligent terminal first controls an image acquisition device to capture an image file corresponding to the file to be processed; it then obtains the image file output by the image acquisition device and scans that image file to obtain the target document scanning data. In an embodiment, the image acquisition device may be a scanner or a camera. The image acquisition device may be built into the intelligent terminal or may be independent of it. When the image acquisition device is independent of the intelligent terminal, the intelligent terminal communicates with the image acquisition device in a wired or wireless manner in order to control the image acquisition device to perform the capture task.

In an embodiment, before scanning the image file corresponding to the file to be processed, the intelligent terminal first performs image preprocessing on that image file to obtain a preprocessed image file. The intelligent terminal then scans the preprocessed image file to obtain the target document scanning data. The image preprocessing includes binarization of the image, tilt correction, and the like. Binarization converts a color or grayscale image into a binary image, and tilt correction compensates for the skew of the image so that a tilted image is straightened. Preprocessing the image file corresponding to the file to be processed can speed up the subsequent scanning to a certain extent and can improve the accuracy of the scanning result.
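To illustrate the binarization step only, here is a minimal sketch that applies a fixed global threshold to a grayscale image stored as nested lists. The function name `binarize` and the threshold of 128 are assumptions; the embodiment does not say which binarization method is used, and adaptive methods are common in practice. Tilt correction is not shown.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 0 (dark) or 1 (light).

    A fixed global threshold is an assumption made for illustration.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


# A tiny 2x3 "page": light background with a few dark ink pixels.
page = [
    [250, 30, 240],
    [20, 25, 245],
]
binary = binarize(page)
print(binary)
```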

In one embodiment, the intelligent terminal scans an image file corresponding to a file to be processed to obtain initial document scanning data, wherein the initial document scanning data comprises a plurality of initial document content blocks and position information of the initial document content blocks in a page of the file to be processed; and then judging whether the plurality of initial document content blocks meet preset merging conditions or not. And if the intelligent terminal judges that the plurality of initial document content blocks do not meet the preset merging condition, the initial document scanning data is used as target document scanning data. If the intelligent terminal judges that the plurality of initial document content blocks meet the preset merging conditions, merging the combinable document content blocks in the plurality of initial document content blocks to obtain a plurality of document content block sets; and then acquiring the document scanning data corresponding to each document content block set in the plurality of document content block sets, and taking the document scanning data corresponding to each document content block set as target document scanning data respectively.

Here, the plurality of initial document content blocks meeting the preset merging condition means that the number of initial document content blocks is greater than a preset number threshold, and/or that the initial document content blocks are distributed in the file to be processed in a target distribution mode. The target distribution mode may mean that a target-type gap exists in the file to be processed, where a target-type gap is a gap that divides the plurality of initial document content blocks into at least two parts in the horizontal direction of the file to be processed and runs through the page of the file to be processed in the vertical direction.

The combinable document content blocks are a document content block in the file to be processed whose position offset from an adjacent document content block is within a preset offset range, together with that adjacent document content block. Specifically, assume that the first document content block is any one of the plurality of initial document content blocks and that the second document content block is a document content block adjacent to the first document content block in the upward or downward direction. If the intelligent terminal detects that the difference between the left limit coordinate value of the first document content block and the left limit coordinate value of the second document content block is less than or equal to a first preset offset threshold, and that the difference between the right limit coordinate value of the first document content block and the right limit coordinate value of the second document content block is less than or equal to a second preset offset threshold, then the first document content block and the second document content block are determined to be combinable document content blocks among the plurality of initial document content blocks.
Assuming that the third document content block is any one of the plurality of initial document content blocks, the fourth document content block is a document content block adjacent to the third document content block in the left or right direction among the plurality of initial document content blocks; if the intelligent terminal detects that the difference value between the coordinate value of the upper limit position of the third document content block and the coordinate value of the upper limit position of the fourth document content block is smaller than or equal to a third preset offset threshold value, and detects that the difference value between the coordinate value of the right limit position of the third document content block and the coordinate value of the right limit position of the fourth document content block is smaller than or equal to a fourth preset offset threshold value; then the third and fourth blocks of document content are determined to be combinable ones of the plurality of initial blocks of document content.
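The merge test for vertically adjacent blocks described above can be sketched as follows. This is an illustration under assumptions: the names `Box`, `mergeable_vertically`, and `merge` are hypothetical, the "difference" is taken as an absolute difference, and the merged block is assumed to be the bounding rectangle of the two inputs. Only the up/down case is shown.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def mergeable_vertically(a, b, left_tol, right_tol):
    """True if two vertically adjacent blocks have nearly aligned edges.

    left_tol / right_tol stand in for the first and second preset
    offset thresholds; using abs() is an assumption.
    """
    return (abs(a.left - b.left) <= left_tol
            and abs(a.right - b.right) <= right_tol)


def merge(a, b):
    """Bounding rectangle of two blocks (assumed merge result)."""
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


para1 = Box(10, 0, 100, 20)
para2 = Box(11, 25, 99, 45)      # directly below, edges off by 1
sidebar = Box(120, 0, 190, 45)   # different column

ok = mergeable_vertically(para1, para2, left_tol=2, right_tol=2)
merged = merge(para1, para2)
not_ok = mergeable_vertically(para1, sidebar, left_tol=2, right_tol=2)
```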

S102, the intelligent terminal determines a first type of gaps in the horizontal direction of the file to be processed according to the position information of the document content blocks, and divides the document content blocks into N document block sets according to the first type of gaps.

In the embodiment of the invention, the direction of a first-type gap is parallel to the horizontal direction of the file to be processed. A first-type gap is a gap that divides the plurality of document content blocks included in the target document scanning data into at least two parts in the vertical direction of the file to be processed and runs through the page of the file to be processed in the horizontal direction. The gap value of a first-type gap is greater than or equal to a first gap reference value, where the gap value of a first-type gap is the size of the space between the two adjacent parts into which that gap divides the plurality of document content blocks.

In one embodiment, the first gap reference value is determined based on a gap value of all or part of the gaps across the page of the document to be processed in the horizontal direction of the document to be processed. The intelligent terminal obtains a gap value of all or part of gaps penetrating through the page of the file to be processed in the horizontal direction of the file to be processed; then counting the obtained gap values, and determining the gap quantity corresponding to each gap value; the gap value with the largest corresponding gap number is used as the first gap reference value. For example, the intelligent terminal uses the histogram to count the obtained gap values, and determines the number of gaps corresponding to each gap value; assuming that the obtained gap values comprise 10mm and 20mm, obtaining the gap number corresponding to the gap value of 10mm as 1 and the gap number corresponding to the gap value of 20mm as 4 according to the statistical result of the histogram; the intelligent terminal takes the gap value of 20mm as a first gap reference value. In another embodiment, the first gap reference value is determined according to a gap value of a gap that penetrates through a page of the document to be processed in a horizontal direction of the document to be processed, and the gap value is greater than or equal to a first preset value. Since there will also be gaps between text lines, the gap value between text lines is typically small; therefore, when the first gap reference value is determined, gaps among text lines can be filtered, so that the first gap reference value can be ensured to be larger than the gap value among the text lines, the text lines are prevented from being segmented, and the segmented document blocks are ensured to be text paragraphs to a certain extent.
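The histogram-based choice of the reference value can be sketched with a frequency count. The function name `gap_reference_value` and the `min_gap` parameter (standing in for the first preset value used to filter out text-line gaps) are assumptions; with real-valued measurements the gap values would likely need to be rounded or binned before counting, which is not shown here.

```python
from collections import Counter


def gap_reference_value(gap_values, min_gap=0.0):
    """Return the most frequent gap value among gaps >= min_gap.

    Returns None if no gap passes the filter. Ties go to the value
    that appears first in the input.
    """
    candidates = [g for g in gap_values if g >= min_gap]
    if not candidates:
        return None
    value, _count = Counter(candidates).most_common(1)[0]
    return value


# The example from the text: one 10 mm gap and four 20 mm gaps.
reference = gap_reference_value([10, 20, 20, 20, 20])

# With many small text-line gaps present, filtering keeps the
# reference value above the line spacing.
with_lines = [2, 2, 2, 2, 2, 2, 10, 20, 20, 20, 20]
unfiltered = gap_reference_value(with_lines)
filtered = gap_reference_value(with_lines, min_gap=5)
```

The last two calls show why the filter matters: without it the most frequent value is the text-line spacing, which would cause paragraphs to be split line by line.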

In one embodiment, the intelligent terminal sorts the plurality of document content blocks included in the target document scanning data in ascending order of the upper limit coordinate values indicated by their position information. While traversing the sorted document content blocks, it records the current maximum lower limit coordinate value and compares each block against it. If the comparison shows that the upper limit coordinate value of a document content block is greater than the current maximum lower limit coordinate value, and that the difference between that upper limit coordinate value and the current maximum lower limit coordinate value is greater than or equal to a preset threshold, or greater than or equal to a first preset interval threshold, the intelligent terminal determines that a gap exists between that document content block and the document content block corresponding to the current maximum lower limit coordinate value. The intelligent terminal then checks whether the gap runs through the page of the file to be processed in the horizontal direction and, if it does, determines that the gap is a first-type gap. In this manner, all first-type gaps between the plurality of document content blocks can be determined. In another embodiment, if the intelligent terminal detects, according to the position information of the sub-document content blocks in the file to be processed, that a gap exists inside a document content block, it further checks whether the gap runs through the page of the file to be processed in the horizontal direction and, if so, determines that the gap is a first-type gap. In this manner, the first-type gaps inside each document content block can be determined.
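The sort-and-sweep procedure above can be sketched as follows. The names `Box` and `horizontal_gaps` are hypothetical, `min_gap` stands in for the preset threshold, and the y axis is assumed to grow downward. Because the sweep tracks the maximum lower limit over all blocks seen so far, any band it reports is free of every input block, so it crosses the full width occupied by those blocks; a separate page-width check is therefore not modelled.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def horizontal_gaps(blocks, min_gap):
    """Return (gap_top, gap_bottom) bands that no block overlaps."""
    ordered = sorted(blocks, key=lambda b: b.top)  # ascending upper limit
    gaps = []
    if not ordered:
        return gaps
    max_bottom = ordered[0].bottom  # current maximum lower limit
    for b in ordered[1:]:
        if b.top > max_bottom and b.top - max_bottom >= min_gap:
            gaps.append((max_bottom, b.top))
        max_bottom = max(max_bottom, b.bottom)
    return gaps


page_blocks = [
    Box(0, 0, 200, 20),      # title
    Box(0, 30, 90, 100),     # left column
    Box(110, 30, 200, 60),   # right column, upper part
    Box(110, 65, 200, 100),  # right column, lower part
    Box(0, 115, 200, 130),   # footer
]
found = horizontal_gaps(page_blocks, min_gap=8)
```

In this layout the 5-unit gap inside the right column is not reported, both because it is smaller than `min_gap` and because the left column spans it.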

Further, after determining a first type of gap in the horizontal direction of the file to be processed, the intelligent terminal divides the plurality of document content blocks into N document block sets according to the first type of gap. N is a positive integer greater than 1, and each document block set in the N document block sets comprises one, two or more document blocks; one, two or more document blocks included in each document block set may belong to the same document content block or belong to different document content blocks; the document block may be a document content block or a sub-document content block in the document content block.

In one embodiment, the intelligent terminal firstly judges whether a first type of gap exists in the horizontal direction of the file to be processed according to the position information of the document content block; if the first type of gaps exist in the horizontal direction of the file to be processed, the plurality of document content blocks are divided into N document block sets according to the determined first type of gaps. If the first type of gaps do not exist in the horizontal direction of the file to be processed, the intelligent terminal firstly performs shrinkage processing on the plurality of document content blocks according to the position information of the document content blocks in a preset proportion. Specifically, the intelligent terminal respectively determines shrinkage datum points of all the document content blocks in the plurality of document content blocks according to the position information of the document content blocks, and respectively performs shrinkage processing of preset proportions on all the document content blocks according to the shrinkage datum points of all the document content blocks. The shrinkage reference point may be any one of a center position point, an upper left corner vertex, a lower left corner vertex, an upper right corner vertex, and a lower right corner vertex of the document content block. The preset ratio is, for example, 5%, 10%, etc.

Further, the intelligent terminal acquires the position information, in the file to be processed, of the document content blocks after the shrinkage processing, and determines a first type of gap in the horizontal direction of the file to be processed according to that position information; it then divides the plurality of document content blocks into N document block sets according to the determined first type of gap. In another embodiment, after acquiring the position information of the shrunk document content blocks, the intelligent terminal first judges, according to that position information, whether a first-type gap exists in the horizontal direction of the file to be processed. If a first-type gap exists, the plurality of document content blocks are divided into N document block sets according to the determined first type of gap. If a first-type gap still does not exist, the intelligent terminal performs the shrinkage processing again on the already shrunk document content blocks according to their position information. These steps are repeated until a first-type gap exists in the horizontal direction of the file to be processed.
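The repeated shrinkage can be sketched as a loop that shrinks every block about its center until a horizontal gap appears. All names here are hypothetical, the center is used as the shrinkage reference point (one of the options listed in the text), and the `max_rounds` cap is an added safeguard that the embodiment does not mention.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def shrink(box, ratio):
    """Shrink a block about its center by the given proportion."""
    cx, cy = (box.left + box.right) / 2, (box.top + box.bottom) / 2
    hw = (box.right - box.left) / 2 * (1 - ratio)
    hh = (box.bottom - box.top) / 2 * (1 - ratio)
    return Box(cx - hw, cy - hh, cx + hw, cy + hh)


def has_horizontal_gap(blocks):
    """True if some horizontal band separates the blocks."""
    if len(blocks) < 2:
        return False
    ordered = sorted(blocks, key=lambda b: b.top)
    max_bottom = ordered[0].bottom
    for b in ordered[1:]:
        if b.top > max_bottom:
            return True
        max_bottom = max(max_bottom, b.bottom)
    return False


def shrink_until_gap(blocks, ratio=0.1, max_rounds=10):
    """Shrink all blocks repeatedly until a horizontal gap exists.

    Returns the shrunk blocks and the number of rounds applied.
    max_rounds is an assumed safety cap, not part of the source.
    """
    current = list(blocks)
    rounds = 0
    while not has_horizontal_gap(current) and rounds < max_rounds:
        current = [shrink(b, ratio) for b in current]
        rounds += 1
    return current, rounds


# Two blocks whose detected boxes overlap by 2 units vertically.
touching = [Box(0, 0, 100, 52), Box(0, 50, 100, 100)]
shrunk, rounds = shrink_until_gap(touching, ratio=0.1)
```

The shrunk coordinates are only needed to locate the gap; the original blocks are what get assigned to the resulting document block sets.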

S103, the intelligent terminal performs gap detection, in the vertical direction of the file to be processed, on a first set comprising two or more document blocks in the N document block sets, and determines a second type of gap.

In the embodiment of the invention, the direction of a second-type gap is parallel to the vertical direction of the file to be processed. A second-type gap is a gap that divides the first set into at least two parts in the horizontal direction of the file to be processed and runs through the page area where the first set is located in the vertical direction. The first set is a document block set, among the N document block sets, that comprises two or more document blocks. The gap value of a second-type gap is greater than or equal to a second gap reference value, where the gap value of a second-type gap is the size of the space between the two adjacent parts into which that gap divides the first set. In an embodiment, the second gap reference value is determined according to the gap values of all or some of the gaps that run through the page area where the first set is located in the vertical direction of the file to be processed. The intelligent terminal obtains the gap values of all or some of those gaps, counts the obtained gap values to determine how many gaps correspond to each gap value, and takes the gap value with the largest number of corresponding gaps as the second gap reference value. In another embodiment, the second gap reference value is determined according to the gap values of those gaps that run through the page area where the first set is located in the vertical direction of the file to be processed and whose gap value is greater than or equal to a second preset value.
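Dividing a first set along second-type gaps can be sketched as grouping its blocks into column-like subsets. The names `Box` and `split_columns` are hypothetical, and `min_gap` (assumed positive) stands in for the second gap reference value. The sweep runs along the horizontal axis and tracks the maximum right limit seen so far.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def split_columns(blocks, min_gap):
    """Group blocks into subsets separated by vertical gaps >= min_gap."""
    if not blocks:
        return []
    ordered = sorted(blocks, key=lambda b: b.left)
    groups = [[ordered[0]]]
    max_right = ordered[0].right
    for b in ordered[1:]:
        if b.left - max_right >= min_gap:
            groups.append([b])       # a vertical gap starts a new subset
        else:
            groups[-1].append(b)     # still inside the current subset
        max_right = max(max_right, b.right)
    return groups


left_col = Box(0, 30, 90, 100)
right_top = Box(110, 30, 200, 60)
right_bottom = Box(110, 65, 200, 100)
subsets = split_columns([right_bottom, left_col, right_top], min_gap=10)
```

Here the left column forms a single-block subset, while the two right-hand blocks stay together and remain candidates for a further split in the horizontal direction.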

S104, the intelligent terminal divides the first set into a plurality of document blocks according to the second class of gaps.

In the embodiment of the invention, after the intelligent terminal determines the second class of gaps, the first set is divided into a plurality of subsets according to the second class of gaps; and then, in the horizontal direction of the file to be processed, detecting gaps on a first subset comprising two or more file blocks in the subsets, and judging whether a third type of gaps exist in the horizontal direction of the file to be processed. If the third type of gaps exist in the horizontal direction of the file to be processed, determining the third type of gaps, and dividing the first subset into a plurality of file blocks according to the determined third type of gaps. The third type of gaps are used for dividing the first subset into at least two parts in the vertical direction of the file to be processed, and penetrate through the page area where the first subset is located in the horizontal direction of the file to be processed. The void value corresponding to the third type of void is greater than or equal to the third void reference value, and the void value corresponding to the third type of void refers to: a third type of void divides the first subset into two adjacent portions of the size of the space between them. In one embodiment, the third gap reference value is determined according to a gap value of all or part of the gaps in the horizontal direction of the document to be processed, which extends through the page area of the first subset. The intelligent terminal obtains a gap value of all or part of gaps penetrating through the page area where the first subset is located in the horizontal direction of the file to be processed; then counting the obtained gap values, and determining the gap quantity corresponding to each gap value; the gap value with the largest number of corresponding gaps is used as the third gap reference value. 
In another embodiment, the third gap reference value is determined according to a gap value of a gap that penetrates through the page area where the first subset is located in the horizontal direction of the file to be processed, and the gap value is greater than or equal to a third preset value.
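The statistical choice of the third gap reference value described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names `reference_gap_value` and `select_gaps` and the sample gap values are hypothetical, and when several gap values are equally frequent the first one encountered is used.

```python
from collections import Counter


def reference_gap_value(gap_values):
    """Return the gap value shared by the largest number of gaps (the mode)."""
    if not gap_values:
        raise ValueError("at least one gap value is required")
    counts = Counter(gap_values)
    # most_common(1) returns [(value, count)] for the most frequent value.
    value, _count = counts.most_common(1)[0]
    return value


def select_gaps(gap_values, reference):
    """Keep only the gaps whose value is greater than or equal to the reference."""
    return [g for g in gap_values if g >= reference]


# Hypothetical gap values (for example in pixels) measured in one page area.
gaps = [12, 8, 12, 30, 12, 8]
ref = reference_gap_value(gaps)        # 12 occurs three times
third_type = select_gaps(gaps, ref)    # gaps that qualify as third type
```

With these sample values the reference value is 12, so the two gaps of size 8 are not treated as third type gaps.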

S105, the intelligent terminal sorts the document blocks in the second set which only comprises one document block in the N document block sets and the plurality of document blocks obtained by dividing the first set according to a preset sequence, and a sorting result is obtained.

In the embodiment of the present invention, the preset sequence refers to the sequence from top to bottom and from left to right. According to the distribution positions of the document blocks in the file to be processed, the intelligent terminal sorts, from top to bottom and from left to right, the document blocks in the second set (which includes only one document block) among the N document block sets and the plurality of document blocks obtained by dividing the first set, so as to obtain a sorting result. The document blocks obtained by dividing the first set include the document blocks in a second subset (which includes only one document block) among the plurality of subsets and the document blocks obtained by dividing the first subset. Because the segmented document blocks are sorted from top to bottom and from left to right, the sorting result conforms to the user's natural reading order.
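As a minimal sketch of the preset sequence, the blocks within one already segmented group can be ordered by their top coordinate and then by their left coordinate. The `Block` type, its field names and the coordinates are assumptions made for illustration; in the scheme itself the overall order also depends on the segmentation into sets and subsets, which a plain coordinate sort does not capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    left: float
    top: float
    right: float
    bottom: float


def sort_reading_order(blocks):
    """Sort blocks from top to bottom, then from left to right."""
    return sorted(blocks, key=lambda b: (b.top, b.left))


# Hypothetical layout: one full-width block above two side-by-side blocks.
blocks = [
    Block("c", 60, 50, 100, 90),
    Block("a", 0, 0, 100, 40),
    Block("b", 0, 50, 50, 90),
]
order = [b.name for b in sort_reading_order(blocks)]
```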

In an embodiment, in the process of dividing the plurality of document content blocks included in the target document scanning data according to the gaps in the horizontal direction and the gaps in the vertical direction, the intelligent terminal constructs a tree for the document block sets and the document blocks according to the document block sets obtained by the division and the distribution positions of the document blocks in the file to be processed. The leaf nodes of the constructed tree represent the document blocks, and the leaf nodes constitute the final sorting result. The intelligent terminal can therefore quickly obtain the sorting result of the document blocks from the constructed tree.
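The idea that the leaf nodes of the tree yield the sorting result can be sketched as a left-to-right, depth-first collection of leaves. The `Node` class and the tree shape below are hypothetical; the shape is simply derived from a segmentation into the sets {201}, {202}, {203, 204, 205} and {206, 207, 208, 209, 210} and is not taken from any drawing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: Optional[int] = None                 # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Collect leaf labels from left to right (depth-first traversal)."""
    if not node.children:
        return [node.label]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def leaf_group(*labels):
    """Helper: an inner node whose children are leaves with the given labels."""
    return Node(children=[Node(label=n) for n in labels])


# Inner nodes stand for document block sets or subsets, leaves for blocks.
tree = Node(children=[
    Node(label=201),
    Node(label=202),
    leaf_group(203, 204, 205),
    Node(children=[leaf_group(206, 207), leaf_group(208, 209, 210)]),
])
sorting_result = leaves(tree)
```

Because each split appends its parts in reading order, no separate sorting pass is needed once the tree exists.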

In an embodiment, after the intelligent terminal sorts the segmented document blocks according to the preset sequence to obtain the sorting result, it inputs the document blocks in the second set (which includes only one document block) among the N document block sets and the plurality of document blocks obtained by dividing the first set into a document block output queue in sequence according to the sorting result. The intelligent terminal then outputs the segmented document blocks in sequence according to the document block output queue, so that the user or the intelligent terminal can perform subsequent processing. In this way, output can be performed on the basis of document blocks and the order of the document blocks in the file to be processed can be effectively restored, where a document block can be understood as a document paragraph. This effectively solves the problem that, owing to the limitations of OCR technology, the paragraphs and order of a document cannot be restored when output is based only on text lines, and effectively improves the user's reading experience.
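The document block output queue can be pictured as an ordinary first-in, first-out queue. The sketch below assumes a hypothetical sorting result made of (block number, text) pairs; the function names are illustrative only.

```python
from collections import deque


def build_output_queue(sorted_blocks):
    """Enqueue document blocks in the order given by the sorting result."""
    queue = deque()
    for block in sorted_blocks:
        queue.append(block)
    return queue


def drain(queue):
    """Output blocks one at a time, first in, first out."""
    while queue:
        yield queue.popleft()


# Hypothetical sorting result: paragraph texts keyed by block number.
sorting_result = [(201, "Title"), (202, "Abstract"), (203, "Column 1")]
output = [text for _number, text in drain(build_output_queue(sorting_result))]
```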

To facilitate understanding of the file scanning processing method in the embodiment of the present invention, an example is described below. Referring to fig. 2a, fig. 2a is a schematic diagram of a page of a file to be processed according to a first embodiment of the present invention. After the intelligent terminal scans the file to be processed, target document scanning data is obtained. As shown in fig. 2a, the target document scanning data includes a total of 5 document content blocks 21, 22, 23, 24 and 25; the areas shown at 21, 22, 23, 24 and 25 represent the positions, in the page of the file to be processed, of the 5 document content blocks included in the target document scanning data. A document content block includes one or more document blocks; for example, document content block 21 includes document blocks 201 and 202, and document content block 23 includes document block 206.

The intelligent terminal performs gap detection in the horizontal direction to determine the first type of gaps. As shown in fig. 2a, the intelligent terminal can determine a total of 3 first type gaps 11, 12 and 13, and the areas shown at 11, 12 and 13 represent the positions of the first type of gaps in the file to be processed. Each of the first type gaps 11, 12 and 13 divides the 5 document content blocks into two parts in the vertical direction and penetrates the page of the file to be processed in the horizontal direction. For example, the first type gap 12 divides the document content blocks 21, 22, 23, 24 and 25 into two parts in the vertical direction, one part including the document content block 21 and the other part including the document content blocks 22, 23, 24 and 25. Further, the intelligent terminal can divide the 5 document content blocks into 4 document block sets {201}, {202}, {203, 204, 205} and {206, 207, 208, 209, 210} according to the first type gaps 11, 12 and 13.

Further, in the vertical direction, the intelligent terminal performs gap detection on each of the document block sets {203, 204, 205} and {206, 207, 208, 209, 210}, which are the sets among the 4 document block sets that include two or more document blocks, so as to determine the second type of gaps. As shown in fig. 2a, the intelligent terminal can determine a total of 3 second type gaps 14, 15 and 16, and the areas shown at 14, 15 and 16 represent the positions of the second type of gaps in the file to be processed. The second type gaps 14 and 15 divide the document block set {203, 204, 205} into three subsets {203}, {204} and {205} in the horizontal direction, and both penetrate the page area where the set {203, 204, 205} is located in the vertical direction. The second type gap 16 divides the document block set {206, 207, 208, 209, 210} into two parts in the horizontal direction, one part including the subset {206, 207} and the other part including the subset {208, 209, 210}, and penetrates the page area where the set {206, 207, 208, 209, 210} is located in the vertical direction.

Further, in the horizontal direction, the intelligent terminal performs gap detection on each of the subsets {206, 207} and {208, 209, 210}, which are the subsets obtained by the division according to the second type of gaps that include two or more document blocks, so as to determine the third type of gaps. As shown in fig. 2a, the intelligent terminal can determine a total of 3 third type gaps 17, 18 and 19, and the areas shown at 17, 18 and 19 represent the positions of the third type of gaps in the file to be processed. The third type gap 17 divides the subset {206, 207} into two parts, document block {206} and document block {207}, in the vertical direction, and penetrates the page area where the subset {206, 207} is located in the horizontal direction. The third type gaps 18 and 19 divide the subset {208, 209, 210} into three parts, document block {208}, document block {209} and document block {210}, in the vertical direction, and penetrate the page area where the subset {208, 209, 210} is located in the horizontal direction. At this point, the intelligent terminal has divided the plurality of document content blocks into 10 document blocks: {210}, {209}, {208}, {207}, {206}, {205}, {204}, {203}, {202} and {201}. Further, the intelligent terminal sorts the 10 document blocks from top to bottom and from left to right according to their positions in the file to be processed, and the sorting result is: {201} - {202} - {203} - {204} - {205} - {206} - {207} - {208} - {209} - {210}.
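The three rounds of segmentation in this example alternate between horizontal and vertical gaps, which can be sketched as a short recursive routine. Everything concrete below is an assumption made for illustration: the coordinates are invented to resemble the layout described for fig. 2a, a fixed `min_gap` stands in for the gap reference values, and the fallback to a plain coordinate sort when no gap is found is not part of the described scheme.

```python
def split(blocks, axis, min_gap=1.0):
    """Group blocks separated by gaps that cut fully across their region.

    axis="y" looks for horizontal gaps (groups stacked top to bottom);
    axis="x" looks for vertical gaps (groups side by side, left to right).
    Each block is (name, (left, top, right, bottom)).
    """
    lo, hi = (1, 3) if axis == "y" else (0, 2)
    ordered = sorted(blocks, key=lambda b: b[1][lo])
    groups, current, reach = [], [ordered[0]], ordered[0][1][hi]
    for blk in ordered[1:]:
        if blk[1][lo] - reach >= min_gap:   # empty band wide enough: a gap
            groups.append(current)
            current = [blk]
        else:
            current.append(blk)
        reach = max(reach, blk[1][hi])
    groups.append(current)
    return groups


def segment(blocks, axis="y"):
    """Return block names in reading order by alternating split directions."""
    if len(blocks) == 1:
        return [blocks[0][0]]
    other = "x" if axis == "y" else "y"
    groups = split(blocks, axis)
    if len(groups) == 1:                    # no gap this way: try the other
        groups = split(blocks, other)
        if len(groups) == 1:                # no gap either way: plain sort
            ranked = sorted(blocks, key=lambda b: (b[1][1], b[1][0]))
            return [name for name, _box in ranked]
        axis, other = other, axis
    order = []
    for group in groups:
        order.extend(segment(group, other))
    return order


# Hypothetical coordinates, deliberately listed out of reading order.
page = [
    (210, (55, 85, 100, 100)), (207, (0, 75, 45, 100)),
    (204, (35, 30, 65, 50)), (201, (0, 0, 100, 10)),
    (209, (55, 70, 100, 80)), (206, (0, 55, 45, 70)),
    (203, (0, 30, 30, 50)), (208, (55, 55, 100, 65)),
    (205, (70, 30, 100, 50)), (202, (0, 15, 100, 25)),
]
reading_order = segment(page)
```

For this layout the routine returns the blocks in the order 201 through 210, matching the sorting result given in the example.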

In the process of dividing the document content blocks, the document block sets and the subsets, the intelligent terminal can construct a tree for the document block sets and the document blocks according to the division result. Referring to fig. 2b, fig. 2b is a schematic diagram of a tree constructed according to the segmentation result shown in fig. 2a. As shown in fig. 2b, the leaf nodes of the constructed tree represent the segmented document blocks, with one leaf node for each of the segmented document blocks {210}, {209}, {208}, {207}, {206}, {205}, {204}, {203}, {202} and {201}. A leaf node is the last node of a branch in the tree. The position of a leaf node in the constructed tree represents the sorting order of the corresponding document block. For example, if the leaf node corresponding to document block {201} is the leftmost of all the leaf nodes in the constructed tree, document block {201} is first in the sorting order; if the leaf node corresponding to document block {205} is the fifth from the left of all the leaf nodes in the constructed tree, document block {205} is fifth in the sorting order. The intelligent terminal can obtain the sorting result of the 10 document blocks according to the leaf nodes in the constructed tree: {201} - {202} - {203} - {204} - {205} - {206} - {207} - {208} - {209} - {210}.

In another embodiment, referring to fig. 3a, fig. 3a is a schematic diagram of a page of a file to be processed according to a second embodiment of the present invention. After the intelligent terminal scans the file to be processed, target document scanning data is obtained. As shown in fig. 3a, the target document scanning data includes a total of 5 document content blocks 31, 32, 33, 34 and 35; the areas shown at 31, 32, 33, 34 and 35 represent the positions, in the file to be processed, of the 5 document content blocks included in the target document scanning data. The intelligent terminal performs gap detection in the horizontal direction to determine whether a first type of gap exists in the horizontal direction. For the layout of the document content blocks shown in fig. 3a, the intelligent terminal can determine that gaps exist among the document content blocks 31, 32, 33, 34 and 35, but none of these gaps penetrates the page of the file to be processed. In this case, no first type of gap can be determined in the horizontal direction.

When no first type of gap can be determined in the horizontal direction, the intelligent terminal shrinks each of the document content blocks 31, 32, 33, 34 and 35 by a proportion of 25% about its center position point. Referring to fig. 3b, fig. 3b is a schematic diagram of the document content blocks shown in fig. 3a after the shrinkage processing; it shows the document content blocks 31, 32, 33, 34 and 35 after the 25% shrinkage processing. The intelligent terminal then performs gap detection in the horizontal direction again and can determine two first type gaps 41 and 42, each of which divides the 5 shrunken document content blocks into two parts in the vertical direction and penetrates the page of the file to be processed in the horizontal direction. By shrinking the document content blocks to determine the first type of gaps in this way, the problem that no first type of gap can be determined in the horizontal direction of the file to be processed can be solved to a certain extent, which increases the application scenarios of document block segmentation and sorting.
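The effect of the shrinkage processing can be sketched with a few lines of arithmetic. The sketch assumes that a 25% shrink means each side of the box is reduced to 75% of its length about the box center, which is one possible reading of the text; the coordinates, the `min_gap` threshold and the function names are likewise hypothetical.

```python
def shrink(box, ratio=0.25):
    """Shrink a (left, top, right, bottom) box about its center point."""
    left, top, right, bottom = box
    dx = (right - left) * ratio / 2
    dy = (bottom - top) * ratio / 2
    return (left + dx, top + dy, right - dx, bottom - dy)


def horizontal_gaps(boxes, min_gap=1.0):
    """Return (start, end) vertical ranges that no box covers.

    Such a range runs across the full page width, so it corresponds to a
    gap that penetrates the page in the horizontal direction.
    """
    spans = sorted((top, bottom) for _left, top, _right, bottom in boxes)
    gaps, reach = [], spans[0][1]
    for top, bottom in spans[1:]:
        if top - reach >= min_gap:
            gaps.append((reach, top))
        reach = max(reach, bottom)
    return gaps


# Hypothetical boxes whose vertical extents slightly interleave, so that
# no horizontal gap crosses the whole page before shrinking.
content_blocks = [
    (0, 0, 100, 40),     # 31
    (0, 38, 45, 82),     # 32
    (55, 42, 100, 80),   # 33
    (0, 80, 45, 120),    # 34
    (55, 82, 100, 120),  # 35
]
before = horizontal_gaps(content_blocks)
after = horizontal_gaps([shrink(b) for b in content_blocks])
```

With these sample boxes no gap is found before shrinking and two gaps are found afterwards, mirroring the two first type gaps of the example.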

In another embodiment, referring to fig. 3c, fig. 3c is a schematic diagram of a page of a file to be processed according to a third embodiment of the present invention. After the intelligent terminal scans the file to be processed, initial document scanning data is obtained. As shown in fig. 3c, the initial document scanning data includes a total of 6 initial document content blocks A, B, C, D, E and F; the areas shown at A, B, C, D, E and F represent the positions, in the file to be processed, of the 6 initial document content blocks included in the initial document scanning data. Suppose the initial document content blocks A, B, C, D, E and F are directly segmented and sorted in the manner described above. A first type of gap can then be determined between the initial document content block A and the document block set {B, E}, and another between the document block sets {B, E} and {C, D, F}; further, a second type of gap can be determined between the document block set {C, D} and the initial document content block F. Segmenting and sorting the initial document content blocks A, B, C, D, E and F based on these first type and second type gaps gives the sorting result A-B-E-C-D-F. However, if the correct order is A-B-C-D-E-F, this sorting result is wrong.

Referring to fig. 3d, fig. 3d is a schematic diagram of the initial document content blocks shown in fig. 3c after merging. In the page shown in fig. 3c or fig. 3d, a target type gap 43 penetrates the whole page in the vertical direction, so the whole page is divided in the horizontal direction. In this case, performing the division in the vertical direction first, as described above, may lead to a wrong ordering; the division needs to be performed in the horizontal direction. The intelligent terminal first determines the target type gap 43 in the vertical direction. The target type gap 43 divides the initial document content blocks A, B, C, D, E and F into two parts in the horizontal direction, one part including the initial document content blocks B, C and D and the other part including the initial document content blocks A, E and F, and penetrates the whole page in the vertical direction. Further, the intelligent terminal merges the mergeable blocks among the initial document content blocks B, C and D. As shown in fig. 3d, the left boundary position offset between the initial document content block C and the adjacent document content blocks B and D is within a preset range, and the right boundary position offset between the initial document content block C and the adjacent document content blocks B and D is also within the preset range, so the three initial document content blocks B, C and D can be merged to obtain the document content block set {B, C, D}. The intelligent terminal likewise merges the mergeable blocks among the initial document content blocks A, E and F. As shown in fig. 3d, the left boundary position offset between the initial document content block E and the adjacent document content block F is within the preset range, and the right boundary position offset between the two is also within the preset range, so the initial document content blocks E and F can be merged to obtain the document content block set {E, F}. Because the left boundary position offset between the initial document content block A and the adjacent initial document content block E is large and is not within the preset range, the initial document content blocks A and E cannot be merged, and the initial document content block A can only be taken as a document content block set {A} by itself. Further, the intelligent terminal segments the document content block sets {A}, {B, C, D} and {E, F} in the manner of the present scheme and sorts the resulting document blocks, so that the correct order A-B-C-D-E-F can be obtained.
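The merging rule based on left and right boundary offsets can be sketched as follows. The coordinates are invented to resemble the layout described for fig. 3c and fig. 3d, and `max_offset` is a hypothetical stand-in for the preset range; the blocks of each column are assumed to be listed from top to bottom already.

```python
def mergeable(a, b, max_offset=5.0):
    """True when both the left edges and the right edges of two
    (left, top, right, bottom) boxes differ by at most max_offset."""
    return abs(a[0] - b[0]) <= max_offset and abs(a[2] - b[2]) <= max_offset


def merge_column(blocks, max_offset=5.0):
    """Merge vertically adjacent blocks of one column into sets.

    blocks is a list of (name, box) pairs sorted from top to bottom.
    """
    sets = [[blocks[0]]]
    for blk in blocks[1:]:
        if mergeable(sets[-1][-1][1], blk[1], max_offset):
            sets[-1].append(blk)
        else:
            sets.append([blk])
    return sets


def names(sets):
    """Helper: reduce merged sets to lists of block names."""
    return [[name for name, _box in group] for group in sets]


# Hypothetical columns on either side of a vertical gap through the page.
left_column = [
    ("B", (0, 25, 45, 45)),
    ("C", (2, 50, 46, 65)),
    ("D", (1, 70, 45, 100)),
]
right_column = [
    ("A", (70, 0, 100, 20)),    # much narrower than E: left edges differ
    ("E", (55, 25, 100, 45)),
    ("F", (56, 50, 100, 100)),
]
left_sets = names(merge_column(left_column))
right_sets = names(merge_column(right_column))
```

Here the left column collapses into the single set {B, C, D}, while the right column yields {A} and {E, F}, because the left edge of A is offset from that of E by more than the assumed threshold.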

In another embodiment, referring to fig. 3e, fig. 3e is a schematic diagram of a page of a file to be processed according to a fourth embodiment of the present invention. After the intelligent terminal scans the file to be processed, target document scanning data is obtained. As shown in fig. 3e, the target document scanning data includes a total of 5 document content blocks G, H, I, J and K; the areas shown at G, H, I, J and K represent the positions, in the file to be processed, of the 5 document content blocks included in the target document scanning data. It can be seen that document content block G overlaps document content blocks I and K, and document content block H overlaps document content blocks J and K; however, if the document content blocks G and H are removed, there is no overlap among the document content blocks I, J and K. In this case, the intelligent terminal first segments the document content blocks G and H respectively in the manner of the present embodiment, then segments the document content block set {I, J, K} in the same manner, and finally sorts the segmented document blocks. That is, when overlap exists among the plurality of document content blocks included in the target document scanning data, the intelligent terminal determines target document content blocks from among the document content blocks that overlap other document content blocks, and preferentially segments the target document content blocks in the manner of the present embodiment. The target document content blocks are such that, if they are filtered out of the plurality of document content blocks included in the target document scanning data, no overlap exists among the remaining document content blocks.
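One way to pick the target document content blocks is to search for a set of blocks whose removal leaves no overlap among the rest. The sketch below uses a brute-force search that prefers the smallest such set; that preference, the coordinates and the function names are assumptions for illustration, since the text only requires that the remaining blocks do not overlap.

```python
from itertools import combinations


def overlaps(a, b):
    """True when two (left, top, right, bottom) boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_target_blocks(blocks):
    """Return a smallest set of names whose removal leaves no overlaps.

    blocks maps a name to its box. The search is exhaustive, which is only
    practical for the handful of blocks found on a single page.
    """
    all_names = list(blocks)
    for size in range(len(all_names) + 1):
        for removed in combinations(all_names, size):
            rest = [n for n in all_names if n not in removed]
            clash = any(
                overlaps(blocks[p], blocks[q])
                for p, q in combinations(rest, 2)
            )
            if not clash:
                return set(removed)
    # Not reached: removing every block always leaves no overlap.


# Hypothetical layout: G straddles I and K, H straddles J and K.
layout = {
    "G": (30, 30, 50, 60),
    "H": (55, 30, 75, 60),
    "I": (0, 0, 40, 40),
    "J": (60, 0, 100, 40),
    "K": (0, 50, 100, 100),
}
targets = find_target_blocks(layout)
```

For this layout the search returns G and H, after which I, J and K can be segmented as an overlap-free set.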

It should be noted that if the document block obtained by the segmentation can be further segmented, the segmentable document block can be further segmented in the above manner or in a similar manner; specific implementations may refer to the foregoing descriptions, and are not repeated here.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a file scanning processing device according to an embodiment of the invention. The file scanning processing device described in the embodiment of the invention corresponds to the intelligent terminal described above, and includes:

A scanning unit 401, configured to scan a file to be processed to obtain target document scanning data, where the target document scanning data includes a plurality of document content blocks and position information of the document content blocks in the file to be processed;

A processing unit 402, configured to determine a first type of gap in a horizontal direction of the file to be processed according to position information of the document content blocks, and divide the plurality of document content blocks into N document block sets according to the first type of gap, where N is a positive integer greater than 1;

the processing unit 402 is further configured to perform gap detection on a first set including two or more document blocks in the N document block sets in a vertical direction of the file to be processed, to determine a second type of gap;

the processing unit 402 is further configured to divide the first set into a plurality of document blocks according to the second class of gaps;

A sorting unit 403, configured to sort, according to a preset sequence, the document blocks in the second set that includes only one document block among the N document block sets and the plurality of document blocks obtained by dividing the first set, so as to obtain a sorting result.

In an embodiment, the first type of gap refers to a gap that divides the plurality of document content blocks into at least two parts in a vertical direction of the to-be-processed file and penetrates through a page of the to-be-processed file in a horizontal direction of the to-be-processed file; and the gap value corresponding to the first type of gaps is larger than or equal to the first gap reference value.

In an embodiment, the second type of gap refers to a gap that divides the first set into at least two parts in a horizontal direction of the file to be processed, and penetrates through a page area where the first set is located in a vertical direction of the file to be processed; and the gap value corresponding to the second type of gaps is larger than or equal to a second gap reference value.

In one embodiment, when dividing the first set into a plurality of document blocks according to the second type of gaps, the processing unit 402 is specifically configured to: dividing the first set into a plurality of subsets according to the second type of gaps; in the horizontal direction of the file to be processed, performing gap detection on a first subset that includes two or more document blocks among the plurality of subsets, and determining a third type of gap; dividing the first subset into a plurality of document blocks according to the third type of gap; wherein the document blocks obtained by dividing the first set include the document blocks in a second subset that includes only one document block among the plurality of subsets and the document blocks obtained by dividing the first subset.

In an embodiment, the third type of gap refers to a gap that divides the first subset into at least two parts in a vertical direction of the document to be processed and penetrates through a page area where the first subset is located in a horizontal direction of the document to be processed; and the gap value corresponding to the third type of gap is larger than or equal to a third gap reference value.

In one embodiment, when determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks, the processing unit 402 is specifically configured to: judging whether a first type of gap exists in the horizontal direction of the file to be processed according to the position information of the document content blocks; if not, performing shrinkage processing of a preset proportion on the plurality of document content blocks according to the position information of the document content blocks; acquiring the position information, in the file to be processed, of the document content blocks after the shrinkage processing; determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks after the shrinkage processing; and dividing the plurality of document content blocks into N document block sets according to the first type of gap.

In one embodiment, when scanning a file to be processed to obtain target document scanning data, the processing unit 402 is specifically configured to: scanning the file to be processed to obtain initial document scanning data, wherein the initial document scanning data includes a plurality of initial document content blocks and position information of the initial document content blocks in the file to be processed; when the plurality of initial document content blocks meet a preset merging condition, merging the mergeable document content blocks among the plurality of initial document content blocks to obtain a plurality of document content block sets; and acquiring the document scanning data corresponding to each of the plurality of document content block sets, and taking the document scanning data corresponding to each document content block set as target document scanning data respectively.

In one embodiment, the plurality of initial document content blocks meeting the preset merging condition means that the number of the initial document content blocks is greater than a preset number threshold, and/or the distribution mode of the initial document content blocks in the file to be processed is a target distribution mode. The mergeable document content blocks are document content blocks whose position offset relative to an adjacent document content block in the file to be processed is within a preset offset range.

It may be understood that the functions of each functional unit of the file scanning processing apparatus according to the embodiment of the present invention may be specifically implemented according to the method in the embodiment of the method, and the specific implementation process may refer to the related description of the embodiment of the method, which is not repeated herein.

Referring to fig. 5, fig. 5 is a schematic structural diagram of an intelligent terminal according to an embodiment of the present invention. The intelligent terminal described in the embodiment of the invention includes a processor 501, a communication interface 502 and a memory 503. The processor 501, the communication interface 502 and the memory 503 may be connected by a bus or by other means; connection by a bus is taken as an example in the embodiment of the present invention.

The processor 501 (or CPU, Central Processing Unit) is the computing core and control core of the terminal, and can parse various instructions in the terminal and process various data of the terminal. For example, the CPU can parse a power-on or power-off instruction sent by a user to the terminal and control the terminal to perform the power-on or power-off operation; for another example, the CPU can transmit various kinds of interactive data between the internal structures of the terminal, and so on. The communication interface 502 may optionally include a standard wired interface and a wireless interface (such as Wi-Fi or a mobile communication interface), and is controlled by the processor 501 to transmit and receive data. The memory 503 (Memory) is a storage device in the terminal for storing programs and data. It will be appreciated that the memory 503 here may include both the internal memory of the terminal and the expansion memory supported by the terminal. The memory 503 provides storage space that stores the operating system of the terminal, which may include, but is not limited to, an Android system, an iOS system, a Windows Phone system and the like; the invention is not limited in this regard.

In an embodiment of the present invention, the processor 501 performs the following operations by executing executable program code in the memory 503:

In one embodiment, when dividing the first set into a plurality of document blocks according to the second type of gaps, the processor 501 is specifically configured to: dividing the first set into a plurality of subsets according to the second type of gaps; in the horizontal direction of the file to be processed, performing gap detection on a first subset that includes two or more document blocks among the plurality of subsets, and determining a third type of gap; dividing the first subset into a plurality of document blocks according to the third type of gap; wherein the document blocks obtained by dividing the first set include the document blocks in a second subset that includes only one document block among the plurality of subsets and the document blocks obtained by dividing the first subset.

In one embodiment, when determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks, the processor 501 is specifically configured to: judging whether a first type of gap exists in the horizontal direction of the file to be processed according to the position information of the document content blocks; if not, performing shrinkage processing of a preset proportion on the plurality of document content blocks according to the position information of the document content blocks; acquiring the position information, in the file to be processed, of the document content blocks after the shrinkage processing; determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks after the shrinkage processing; and dividing the plurality of document content blocks into N document block sets according to the first type of gap.

In one embodiment, when scanning a file to be processed to obtain target document scanning data, the processor 501 is specifically configured to: scanning the file to be processed to obtain initial document scanning data, wherein the initial document scanning data includes a plurality of initial document content blocks and position information of the initial document content blocks in the file to be processed; when the plurality of initial document content blocks meet a preset merging condition, merging the mergeable document content blocks among the plurality of initial document content blocks to obtain a plurality of document content block sets; and acquiring the document scanning data corresponding to each of the plurality of document content block sets, and taking the document scanning data corresponding to each document content block set as target document scanning data respectively.

In a specific implementation, the processor 501, the communication interface 502, and the memory 503 described in the embodiments of the present invention may execute an implementation manner of the intelligent terminal described in the method for processing file scanning provided in the embodiments of the present invention, or may execute an implementation manner described in the device for processing file scanning provided in the embodiments of the present invention, which is not described herein again.

The embodiment of the invention also provides a computer readable storage medium, wherein instructions are stored in the computer readable storage medium, and when the computer readable storage medium runs on a computer, the computer is caused to execute the file scanning processing method according to the embodiment of the invention.

The embodiments of the present invention also provide a computer program product containing instructions which, when run on a computer, cause the computer to perform the file scanning processing method according to the embodiments of the present invention.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present invention is not limited by the order of action described, as some steps may be performed in other order or simultaneously according to the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer readable storage medium, and the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or the like.

The above disclosure is illustrative only of some embodiments of the invention and is not intended to limit the scope of the invention, which is defined by the claims and their equivalents.

Claims

1. A document scanning processing method, the method comprising:

judging whether a first type of gap exists in the horizontal direction of the file to be processed according to the position information of the document content blocks;

if yes, dividing the plurality of document content blocks into N document block sets according to the determined first type of gap; if not, performing contraction processing of a preset proportion on the plurality of document content blocks according to the position information of the document content blocks, obtaining the position information, in the file to be processed, of the document content blocks after the contraction processing, determining a first type of gap in the horizontal direction of the file to be processed according to the position information of the document content blocks after the contraction processing, and dividing the plurality of document content blocks into N document block sets according to the first type of gap, N being a positive integer greater than 1;

in the vertical direction of the file to be processed, performing gap detection on a first set, among the N document block sets, that comprises a plurality of document blocks, and determining a second type of gap;
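The gap judgment and the contraction fallback described in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `(left, top, right, bottom)` box layout, the function names `horizontal_gaps`, `shrink` and `split_into_sets`, contraction about the block centre, and assignment of blocks to sets by their vertical centre are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation of a document content block's position:
# (left, top, right, bottom), with y increasing down the page.
Box = Tuple[float, float, float, float]


def horizontal_gaps(boxes: List[Box], min_gap: float) -> List[Tuple[float, float]]:
    """Return (start_y, end_y) bands that no box overlaps and that are
    at least min_gap tall (min_gap plays the role of a gap reference value)."""
    if not boxes:
        return []
    intervals = sorted((top, bottom) for _, top, _, bottom in boxes)
    gaps = []
    covered_until = intervals[0][1]
    for top, bottom in intervals[1:]:
        if top - covered_until >= min_gap:
            gaps.append((covered_until, top))
        covered_until = max(covered_until, bottom)
    return gaps


def shrink(box: Box, ratio: float) -> Box:
    """Contract a box about its centre by `ratio` of its width and height
    (contracting about the centre is an assumption)."""
    left, top, right, bottom = box
    dx = (right - left) * ratio / 2
    dy = (bottom - top) * ratio / 2
    return (left + dx, top + dy, right - dx, bottom - dy)


def split_into_sets(boxes: List[Box], min_gap: float, ratio: float = 0.1) -> List[List[Box]]:
    """Split boxes into sets at horizontal gaps, retrying on contracted boxes
    when no gap is found on the original positions."""
    gaps = horizontal_gaps(boxes, min_gap)
    if not gaps:
        # Fallback: look for gaps among the contracted boxes instead.
        gaps = horizontal_gaps([shrink(b, ratio) for b in boxes], min_gap)
    cuts = [(start + end) / 2 for start, end in gaps]
    sets: List[List[Box]] = [[] for _ in range(len(cuts) + 1)]
    for box in boxes:
        centre_y = (box[1] + box[3]) / 2
        sets[sum(centre_y > cut for cut in cuts)].append(box)
    return sets
```

If no gap is found even after contraction, this sketch returns a single set; the claim itself requires N greater than 1, so a real implementation would need to handle that case separately.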

2. The method of claim 1, wherein the first type of gap is a gap that divides the plurality of document content blocks into at least two parts in the vertical direction of the file to be processed and that extends through the page of the file to be processed in the horizontal direction of the file to be processed; and the gap value corresponding to the first type of gap is greater than or equal to a first gap reference value.

3. The method according to claim 1 or 2, wherein the second type of gap is a gap that divides the first set into at least two parts in the horizontal direction of the file to be processed and that extends through the page area in which the first set is located in the vertical direction of the file to be processed; and the gap value corresponding to the second type of gap is greater than or equal to a second gap reference value.

4. The method of claim 1, wherein dividing the first set into a plurality of document blocks according to the second type of gap comprises:

dividing the first set into a plurality of subsets according to the second type of gap;

in the horizontal direction of the file to be processed, performing gap detection on a first subset, among the plurality of subsets, that comprises a plurality of document blocks, and determining a third type of gap;

dividing the first subset into a plurality of document blocks according to the third type of gap;

wherein the document blocks obtained by dividing the first set comprise: the document block in a second subset, the second subset being a subset among the plurality of subsets that comprises only one document block; and the document blocks obtained by dividing the first subset.

5. The method of claim 4, wherein the third type of gap is a gap that divides the first subset into at least two parts in the vertical direction of the file to be processed and that extends through the page area in which the first subset is located in the horizontal direction of the file to be processed; and the gap value corresponding to the third type of gap is greater than or equal to a third gap reference value.
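The alternation between second-type (vertical) and third-type (horizontal) gaps in claims 3 to 5 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `(left, top, right, bottom)` box layout, the helper names `split_along` and `divide_set`, and the parameters `second_ref` and `third_ref` (standing in for the second and third gap reference values) are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Hypothetical position layout: (left, top, right, bottom).
Box = Tuple[float, float, float, float]


def split_along(boxes: List[Box], axis: int, min_gap: float) -> List[List[Box]]:
    """Split boxes wherever their projection onto one axis leaves a gap of at
    least min_gap. axis=0 projects onto x and finds vertical gaps (columns);
    axis=1 projects onto y and finds horizontal gaps (rows)."""
    if not boxes:
        return []
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    covered_until = ordered[0][axis + 2]
    for box in ordered[1:]:
        if box[axis] - covered_until >= min_gap:
            groups.append([])  # a qualifying gap starts a new group
        groups[-1].append(box)
        covered_until = max(covered_until, box[axis + 2])
    return groups


def divide_set(boxes: List[Box], second_ref: float, third_ref: float) -> List[List[Box]]:
    """Split a set at second-type gaps, then split each multi-block subset
    at third-type gaps; single-block subsets are kept as they are."""
    result: List[List[Box]] = []
    for subset in split_along(boxes, axis=0, min_gap=second_ref):
        if len(subset) == 1:
            result.append(subset)
        else:
            result.extend(split_along(subset, axis=1, min_gap=third_ref))
    return result
```

For a page area with two stacked blocks in a left column and one tall block in a right column, `divide_set` first separates the columns and then splits the left column into its two blocks, yielding three document blocks in left-to-right, top-to-bottom order.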

6. The method of claim 1, wherein scanning the file to be processed to obtain target document scanning data comprises:

scanning the file to be processed to obtain initial document scanning data, wherein the initial document scanning data comprises a plurality of initial document content blocks and position information of the initial document content blocks in the file to be processed;

when the plurality of initial document content blocks meet a preset merging condition, merging the combinable document content blocks among the plurality of initial document content blocks to obtain a plurality of document content block sets;

and acquiring the document scanning data corresponding to each document content block set in the plurality of document content block sets, and taking the document scanning data corresponding to each document content block set as target document scanning data respectively.

7. The method of claim 6, wherein the plurality of initial document content blocks meeting the preset merging condition means that: the number of the initial document content blocks is greater than a preset number threshold, and/or the distribution mode of the initial document content blocks in the file to be processed is a target distribution mode; and the combinable document content blocks are: document content blocks whose positional offset from an adjacent document content block in the file to be processed is within a preset offset range.
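One possible reading of the merging condition in claims 6 and 7 is sketched below. The claims do not define how the positional offset is measured, so this sketch assumes it is the difference in left edges together with the vertical distance to the previous block; the function name `merge_blocks` and the parameters `count_threshold` and `max_offset` are hypothetical, and the target-distribution-mode condition is not modelled.

```python
from typing import List, Tuple

# Hypothetical position layout: (left, top, right, bottom).
Box = Tuple[float, float, float, float]


def merge_blocks(boxes: List[Box], count_threshold: int, max_offset: float) -> List[List[Box]]:
    """Group adjacent blocks whose offset is within max_offset, but only when
    the number of blocks exceeds count_threshold (the assumed merge condition)."""
    if len(boxes) <= count_threshold:
        return [[box] for box in boxes]  # condition not met: nothing is merged
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))  # top-to-bottom, then left
    groups = [[ordered[0]]]
    for box in ordered[1:]:
        prev = groups[-1][-1]
        left_offset = abs(box[0] - prev[0])
        vertical_distance = box[1] - prev[3]
        if left_offset <= max_offset and vertical_distance <= max_offset:
            groups[-1].append(box)   # close enough to the previous block
        else:
            groups.append([box])     # too far away: start a new set
    return groups
```

Each returned group corresponds to one document content block set, which would then be processed as its own target document scanning data.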

8. An intelligent terminal, comprising: a processor and a memory, the memory storing executable program code, and the processor being configured to invoke the executable program code to perform the file scanning processing method of any one of claims 1 to 7.

9. A storage medium having stored therein instructions which, when executed on a computer, cause the computer to perform the file scanning processing method of any one of claims 1 to 7.

10. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the file scanning processing method of any one of claims 1 to 7.