CN102890784A - Method and device for identifying directions of characters in image blocks - Google Patents
- Publication number
- CN102890784A (application number CN2011102098335A, also listed as CN201110209833A)
- Authority
- CN
- China
- Prior art keywords
- sub
- image blocks
- correctness
- characters
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1463—Orientation detection or correction, e.g. rotation of multiples of 90 degrees
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/88—Image or video recognition using optical means, e.g. reference filters, holographic masks, frequency domain filters or spatial domain filters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Input (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a method and a device for identifying the direction of characters in an image block. The method includes: performing optical character recognition (OCR) on the image block with each of several different directions taken as a hypothesized text direction, to obtain, for each hypothesized text direction, the sub-image blocks, the recognized characters corresponding to the sub-image blocks, and their correctness measures; searching for minimum matching pairs, where a minimum matching pair consists of two sets of sub-image blocks that lie in hypothesized text directions 180° apart, correspond in position, have the same size, and contain the smallest possible number of sub-image blocks; when a minimum matching pair contains only two sub-image blocks and the recognized characters corresponding to those two sub-image blocks are the same rotation-invariant character or belong to the same rotation-invariant character pair, adjusting the correctness measures of the two sub-image blocks to the same value; calculating a cumulative correctness measure for each hypothesized text direction based on the adjusted sub-image blocks; and identifying the direction of the characters in the image block according to the cumulative correctness measures.
Description
Technical field
The present invention relates generally to document image processing. In particular, it relates to a method and a device for identifying the direction of characters in an image block.
Background
When a user scans a stack of documents with a device such as a scanner, the ideal input is for every page of every document to be placed upright. A document placed upright is easy to read, and the scanned document image can be read without the user having to adjust its orientation. In practice, however, the documents to be scanned are often stacked in a mix of orientations: upright (0°), upside down (180°), and sideways (90° and 270°). Checking and adjusting the orientation of the documents page by page during scanning is tedious and time-consuming. Scanners are therefore designed with an automatic document image orientation detection function. With this function, scanned document images can be rotated upright, which reduces the burden on the user and improves efficiency.
The conventional method for automatically determining document image orientation is as follows: find the text lines in the document image, perform optical character recognition (OCR) in each of the four possible directions to obtain the recognized characters and the corresponding confidences or recognition distances in those four directions, and calculate the average confidence or average recognition distance of each text line. The direction with the largest average confidence or the smallest average recognition distance is taken as the direction of the text line. The direction of the document image is then determined from the direction of the text lines. The direction of a text line means the upright direction of the text line, and the direction of a document image means the upright direction of the document image. Hereinafter, the direction of characters means the upright direction of the characters.
Summary of the invention
A brief summary of the invention is given below in order to provide a basic understanding of some aspects of the invention. This summary is not an exhaustive overview of the invention. It is not intended to identify key or critical parts of the invention, nor to delimit the scope of the invention. Its purpose is merely to present some concepts in simplified form as a prelude to the more detailed description given later.
As shown in Figure 1, the input is an image block of the text line "TIP AMOUNT". Let this direction be the 0° direction; rotating the text line image block by 180° gives the text line image block in the 180° direction. Since the 90° and 270° directions are handled in the same way as the 0° and 180° directions, only 0° and 180° are used as the example here. OCR is performed on the text line image blocks at 0° and at 180°, yielding the sub-image blocks, the recognized characters corresponding to the sub-image blocks, and their confidences in the two directions, as shown in Figure 1.
With the conventional method, the average confidence of the recognized characters in the 0° direction is (0.59+0.36+0.53+0.61+0.61+0.61+0.53+0.72)/8 = 0.57, and the average confidence of the recognized characters in the 180° direction is (0.62+0.58+0.65+0.67+0.60+0.46+0.50+0.58)/8 = 0.5825. Since 0.57 is less than 0.5825, the conventional method wrongly takes the 180° direction (the direction with the higher average confidence) as the direction of the characters in the text line image block.
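The arithmetic of the conventional method can be reproduced in a few lines. This is a minimal sketch, not the patent's implementation: the two lists copy the confidences from the sums quoted above, and the function names `mean_confidence` and `pick_direction_by_mean` are hypothetical names introduced for illustration.

```python
# Confidences copied from the two sums quoted in the text (Figure 1 values).
CONF_0 = [0.59, 0.36, 0.53, 0.61, 0.61, 0.61, 0.53, 0.72]
CONF_180 = [0.62, 0.58, 0.65, 0.67, 0.60, 0.46, 0.50, 0.58]


def mean_confidence(confidences):
    """Average confidence of the recognized characters in one direction."""
    return sum(confidences) / len(confidences)


def pick_direction_by_mean(conf_by_direction):
    """Conventional rule: choose the direction with the largest average."""
    return max(conf_by_direction,
               key=lambda d: mean_confidence(conf_by_direction[d]))


baseline_choice = pick_direction_by_mean({0: CONF_0, 180: CONF_180})
```

With these values the conventional rule selects 180°, which is the misjudgment the text describes.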
One reason for this error is that the 180° image is obtained by rotating the 0° image, and the recognized characters contain many rotation-invariant characters or rotation-invariant character pairs, such as N, O, p-d, and U-n. If the same image, recognized in the upright and the inverted direction, yields the same rotation-invariant character or two characters belonging to the same rotation-invariant character pair, the corresponding recognition confidences should in fact be equal, because the results in both directions are correct recognitions of the same shape. The conventional method, which judges the text line direction from the average recognition confidence, does not take this rotation invariance into account, and its performance suffers as a result.
Rotation-invariant characters are characters with 180° rotational self-symmetry, that is, characters that are unchanged after being rotated by 180°, for example "I", "O", "Z", "N", "$", "%", and so on.
A rotation-invariant character pair consists of two characters such that either one, after being rotated by 180°, is identical to the other or highly similar to it in shape, for example "W-M", "U-n", "P-d", and so on.
In view of the above problem, the object of the present invention is to provide a method and a device that can correctly identify the direction of characters in an image block. By taking rotation invariance into account and adjusting the correctness measures (confidences or recognition distances) of the recognized characters accordingly, the scheme can improve the accuracy of automatic document image orientation detection.
To achieve the above object, according to one aspect of the present invention, there is provided a method for identifying the direction of characters in an image block, including: performing optical character recognition on the image block with each of several different directions taken as a hypothesized text direction, to obtain, in each hypothesized text direction, the sub-image blocks, the recognized characters corresponding to the sub-image blocks, and their correctness measures; searching, among the sub-image blocks in hypothesized text directions that are 180° apart, for minimum matching pairs of sub-image blocks, where a minimum matching pair consists of two sets of sub-image blocks that lie in hypothesized text directions 180° apart, correspond in position, have the same size, and contain the smallest possible number of sub-image blocks; when a minimum matching pair has exactly one sub-image block in each of the two hypothesized text directions, and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair, adjusting the correctness measures of the two sub-image blocks to the same value; calculating a cumulative correctness measure for each hypothesized text direction based on the adjusted sub-image blocks; and identifying the direction of the characters in the image block according to the cumulative correctness measures.
According to a specific embodiment of the present invention, the rotation-invariant characters are characters with 180° rotational self-symmetry, that is, characters that are unchanged after being rotated by 180°; and a rotation-invariant character pair consists of two characters such that either one, after being rotated by 180°, is identical to the other or highly similar to it in shape.
According to a specific embodiment of the present invention, adjusting the correctness measures of the two sub-image blocks to the same value includes adjusting them to the average of the correctness measures of the two sub-image blocks.
According to a specific embodiment of the present invention, adjusting the correctness measures of the two sub-image blocks to the same value includes adjusting them to one of the correctness measures of the two sub-image blocks.
According to a specific embodiment of the present invention, the correctness measure includes confidence and recognition distance; and the different directions include the two horizontal directions and the two vertical directions of the image block.
According to a specific embodiment of the present invention, calculating the cumulative correctness measure for each hypothesized text direction based on the adjusted sub-image blocks includes: taking, as the cumulative correctness measure in a hypothesized text direction, the sum of the correctness measures of the adjusted sub-image blocks in that direction divided by the number of minimum matching pairs in that direction.
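The cumulative measure described in this embodiment can be written compactly. The symbols below are assumptions introduced for illustration and do not appear in the source: $d$ is a hypothesized text direction, $B_d$ the set of sub-image blocks in that direction, $m'_b$ the adjusted correctness measure of block $b$, and $K_d$ the number of minimum matching pairs in that direction.

```latex
C_d = \frac{1}{K_d} \sum_{b \in B_d} m'_b
```

When every minimum matching pair holds exactly one sub-image block per direction, $K_d$ equals the number of blocks, and $C_d$ reduces to the ordinary average of the adjusted measures.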
According to another aspect of the present invention, there is provided a device for identifying the direction of characters in an image block, including: an optical character recognition processing unit configured to perform optical character recognition on the image block with each of several different directions taken as a hypothesized text direction, to obtain, in each hypothesized text direction, the sub-image blocks, the recognized characters corresponding to the sub-image blocks, and their correctness measures; a minimum matching pair search unit configured to search, among the sub-image blocks in hypothesized text directions that are 180° apart, for minimum matching pairs of sub-image blocks, where a minimum matching pair consists of two sets of sub-image blocks that lie in hypothesized text directions 180° apart, correspond in position, have the same size, and contain the smallest possible number of sub-image blocks; a sub-image block adjustment unit configured to adjust the correctness measures of two sub-image blocks to the same value when a minimum matching pair has exactly one sub-image block in each of the two hypothesized text directions and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair; a cumulative correctness measure calculation unit configured to calculate a cumulative correctness measure for each hypothesized text direction based on the adjusted sub-image blocks; and a text direction identification unit configured to identify the direction of the characters in the image block according to the cumulative correctness measures.
According to a specific embodiment of the present invention, the sub-image block adjustment unit is configured to adjust the correctness measures of the two sub-image blocks to the average of their correctness measures when a minimum matching pair has exactly one sub-image block in each of the two hypothesized text directions and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair.
According to a specific embodiment of the present invention, the sub-image block adjustment unit is configured to adjust the correctness measures of the two sub-image blocks to one of their correctness measures when a minimum matching pair has exactly one sub-image block in each of the two hypothesized text directions and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair.
According to a specific embodiment of the present invention, the cumulative correctness measure calculation unit is configured to take, as the cumulative correctness measure in a hypothesized text direction, the sum of the correctness measures of the adjusted sub-image blocks in that direction divided by the number of minimum matching pairs in that direction.
In addition, according to another aspect of the present invention, a storage medium is provided. The storage medium includes machine-readable program code which, when executed on an information processing device, causes the information processing device to perform the above method according to the present invention.
Furthermore, according to yet another aspect of the present invention, a program product is provided. The program product includes machine-executable instructions which, when executed on an information processing device, cause the information processing device to perform the above method according to the present invention.
Brief description of the drawings
The above and other objects, features, and advantages of the present invention will be more easily understood from the following description of embodiments of the invention in conjunction with the accompanying drawings. The components in the drawings serve only to illustrate the principles of the invention. In the drawings, the same or similar technical features or components are denoted by the same or similar reference numerals. In the drawings:
Figure 1 shows the sub-image blocks, recognized characters, and confidences in the 0° and 180° directions obtained after OCR processing of a text line image block;
Figure 2 shows the sub-image blocks, recognized characters, and recognition distances in the 0° and 180° directions obtained after OCR processing of a text line image block;
Figure 3 shows a flowchart of a method for identifying the direction of characters in an image block according to a first embodiment of the present invention;
Figure 4 shows a flowchart of a method for identifying the direction of characters in an image block according to a second embodiment of the present invention;
Figure 5 shows a structural block diagram of a device for identifying the direction of characters in an image block according to an embodiment of the present invention; and
Figure 6 shows a schematic block diagram of a computer that can be used to implement the method and device according to embodiments of the present invention.
Detailed description
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings. For clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that in developing any such actual embodiment, many implementation-specific decisions must be made in order to achieve the developer's specific goals, such as complying with system-related and business-related constraints, and these constraints may vary from one implementation to another. It should also be understood that, although such development work may be complex and time-consuming, it is merely a routine task for those skilled in the art having the benefit of this disclosure.
It should also be noted that, to avoid obscuring the present invention with unnecessary detail, the drawings show only the device structures and/or processing steps closely related to the scheme according to the invention, and other details of little relevance to the invention are omitted. In addition, elements and features described in one drawing or one embodiment of the invention may be combined with elements and features shown in one or more other drawings or embodiments.
The flow of the method for identifying the direction of characters in an image block according to embodiments of the present invention is described below with reference to Figures 3 and 4.
The following is assumed here: the text lines have already been found in the document image, and the image blocks containing the text lines have been segmented from the document image. The invention is not concerned with how to locate text lines in a document image, but with how to correctly identify the direction of the characters in an image block containing a text line.
In general, four principal directions are considered as hypothesized text directions: the direction of the image block itself (the 0° direction), the direction obtained by rotating the image block by 180°, the direction obtained by rotating it by 90°, and the direction obtained by rotating it by 270°. These can also be referred to as the two horizontal directions and the two vertical directions of the image block. The 90° and 270° directions apply mainly to languages such as Chinese and Japanese, in which text may be written vertically. Since the 90° and 270° directions are handled in the same way as the 0° and 180° directions, the 0° and 180° directions are used as the example below.
The flow of the method for identifying the direction of characters in an image block according to the first embodiment of the present invention is described below with reference to Figure 3.
First, OCR is performed on the image block with 0° and 180° as the hypothesized text directions, to obtain the sub-image blocks in the 0° and 180° directions, the recognized characters corresponding to the sub-image blocks, and their confidences (step S301). Figure 1 shows an example of the sub-image blocks, recognized characters, and confidences in the 0° and 180° directions, with sequence numbers assigned to the sub-image blocks. An OCR result generally includes the segmented sub-image blocks, the recognized characters corresponding to the sub-image blocks, and a correctness measure for each recognized character. The correctness measure reflects how reliable the recognized character is, and is usually a confidence or a recognition distance. The larger the confidence, the more likely the recognized character is correct; the smaller the recognition distance, the more likely the recognized character is correct. The first embodiment is described using the case in which the recognition result includes confidences. The case in which the recognition result includes recognition distances is described in the second embodiment.
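The structure of an OCR result and the opposite sense of the two correctness measures can be sketched as follows. This is an illustrative sketch only: the `SubBlock` record, the `best_direction` function, and the `kind` parameter are hypothetical names, not part of the source or of any OCR library.

```python
from dataclasses import dataclass


@dataclass
class SubBlock:
    """One segmented sub-image block of an OCR result (hypothetical record)."""
    index: int      # sequence number within its hypothesized direction
    char: str       # recognized character
    measure: float  # correctness measure: confidence or recognition distance


def best_direction(cumulative, kind="confidence"):
    """Pick a direction from {direction: cumulative measure}.

    A larger confidence is better; a smaller recognition distance is better.
    """
    if kind == "confidence":
        return max(cumulative, key=cumulative.get)
    if kind == "distance":
        return min(cumulative, key=cumulative.get)
    raise ValueError(f"unknown measure kind: {kind!r}")
```

For example, `best_direction({0: 1.2, 180: 0.8}, kind="distance")` selects 180, because the smaller distance indicates the more reliable recognition.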
Next, the minimum matching pairs of sub-image blocks are searched for among the sub-image blocks in the 0° and 180° directions (step S302). A minimum matching pair consists of two sets of sub-image blocks that lie in hypothesized text directions 180° apart, correspond in position, have the same size, and contain the smallest possible number of sub-image blocks. In other words, a minimum matching pair includes two sets of sub-image blocks; the sub-image blocks in the two sets lie in two hypothesized text directions that are 180° apart, and the two sets correspond in position and have the same size, meaning that either set, once rotated by 180° together with its text line, coincides with the other set of the same minimum matching pair. When the number of sub-image blocks contained in two such sets is as small as possible, the two sets are said to form a minimum matching pair. For example, in Figure 1, P1 and N8 form a minimum matching pair. Similarly, P2 and N7, P3 and N6, P4 and N5, P5 and N4, P6 and N3, P7 and N2, and P8 and N1 each form a minimum matching pair. There are many ways to search for minimum matching pairs. For example, following the definition, the pairs can be sought one after another starting from the corresponding sides of the two directions. Specifically, as shown in Figure 1, the first sub-image blocks P1 and N8 are found at the leftmost side of the 0° direction and the rightmost side of the 180° direction, respectively; the two sub-image blocks are found to have the same size, so P1 and N8 are determined to be a minimum matching pair. The search then continues along the two directions to the next sub-image blocks P2 and N7; the two sub-image blocks are found to have the same size, so P2 and N7 are determined to be a minimum matching pair. The process continues in this way until all minimum matching pairs in the two hypothesized text directions 180° apart have been found.
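One way to realize the search from corresponding sides is a two-pointer sweep over block boundaries. The sketch below is an assumption-laden illustration, not the patent's procedure: it represents each sub-image block as a `(start, end)` horizontal extent in its own image, assumes both segmentations cover the same extent of a line of known `width`, and grows a set on whichever side lags until the right edges coincide. The function name and the `tol` parameter are hypothetical.

```python
def find_min_matching_pairs(blocks_0, blocks_180, width, tol=0):
    """Group blocks of two directions 180 degrees apart into matching pairs.

    blocks_0, blocks_180: lists of (start, end) extents, left to right,
    each in the coordinates of its own image. Returns a list of
    (indices_in_0, indices_in_180) tuples, using 0-based indices.
    """
    # Map each 180-degree block back into 0-degree coordinates: rotating
    # the line by 180 degrees sends x to width - x.
    mapped = sorted((width - e, width - s, j)
                    for j, (s, e) in enumerate(blocks_180))
    pairs = []
    i = j = 0
    while i < len(blocks_0) and j < len(mapped):
        set_0, set_180 = [i], [mapped[j][2]]
        end_0, end_180 = blocks_0[i][1], mapped[j][1]
        i += 1
        j += 1
        # Extend the side whose right edge lags until both edges coincide.
        while abs(end_0 - end_180) > tol:
            if end_0 < end_180:
                if i == len(blocks_0):
                    break  # ran out of blocks; edges stay mismatched
                set_0.append(i)
                end_0 = blocks_0[i][1]
                i += 1
            else:
                if j == len(mapped):
                    break
                set_180.append(mapped[j][2])
                end_180 = mapped[j][1]
                j += 1
        pairs.append((set_0, set_180))
    return pairs


# Eight equal-width blocks per direction, as in the Figure 1 example.
equal = [(k * 10, (k + 1) * 10) for k in range(8)]
figure1_pairs = find_min_matching_pairs(equal, equal, width=80)
```

With equal-width blocks the sweep pairs the first 0° block with the last 180° block, the second with the second-to-last, and so on, which matches the P1-N8, P2-N7, ... pairing in the text. When the two segmentations differ, a pair may hold several blocks on one side, and such pairs are not candidates for the later adjustment.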
As described above, one reason for the error is that the rotation invariance of characters is not taken into account, so that recognition results in the two directions that are the same rotation-invariant character, or that belong to the same rotation-invariant character pair, are given different confidences. The minimum matching pairs found in step S302 can be regarded as the result of further subdividing the image block of the text line.
Therefore, in step S303, it is determined whether a minimum matching pair has exactly one sub-image block in each of the two hypothesized text directions, and whether the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair. A dictionary of rotation-invariant characters, recording the known rotation-invariant characters and rotation-invariant character pairs, can be defined in advance. The determination in step S303 can be made using this dictionary. If the result of step S303 is negative, no adjustment is needed and the flow proceeds directly to step S305 for subsequent processing. If the result of step S303 is affirmative, the flow proceeds to step S304, where the confidences of the sub-image blocks in the minimum matching pair are adjusted.
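The dictionary lookup behind step S303 might look like the following sketch. The two sets hold only the example characters and pairs named in the summary; a practical dictionary would presumably be larger and depend on the font and character set. All names here (`ROTATION_INVARIANT_CHARS`, `is_rotation_invariant_match`, `should_adjust`) are hypothetical.

```python
# Example entries only, taken from the examples given in the text.
ROTATION_INVARIANT_CHARS = {"I", "O", "Z", "N", "$", "%"}
ROTATION_INVARIANT_PAIRS = {
    frozenset(pair) for pair in (("W", "M"), ("U", "n"), ("P", "d"))
}


def is_rotation_invariant_match(char_0, char_180):
    """True if the two recognized characters are the same rotation-invariant
    character or belong to the same rotation-invariant character pair."""
    if char_0 == char_180:
        return char_0 in ROTATION_INVARIANT_CHARS
    return frozenset((char_0, char_180)) in ROTATION_INVARIANT_PAIRS


def should_adjust(chars_0, chars_180):
    """Step S303 check for one minimum matching pair.

    chars_0, chars_180: recognized characters of the sub-image blocks that
    the pair contains in each of the two directions.
    """
    return (len(chars_0) == 1 and len(chars_180) == 1
            and is_rotation_invariant_match(chars_0[0], chars_180[0]))
```

Storing each pair as a `frozenset` makes the lookup symmetric, so "W-M" and "M-W" are treated alike regardless of which direction produced which character.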
The confidences of the sub-image blocks in a minimum matching pair are adjusted mainly to account for the rotation invariance of rotation-invariant characters and rotation-invariant character pairs. Specifically, in step S304, the confidences corresponding to the two sub-image blocks in the minimum matching pair are adjusted to the same value. There are several choices for this common value; some exemplary modes follow.
Mode 1: the confidences corresponding to the two sub-image blocks in the minimum matching pair are adjusted to the average of the two confidences.
As shown in Figure 1, the rotation-invariant characters or rotation-invariant character pairs are P1-N8, P2-N7, P5-N4, and P7-N2. Therefore, the confidences of P1 and N8 can be adjusted to (0.59+0.58)/2=0.585, those of P2 and N7 to (0.36+0.50)/2=0.43, those of P5 and N4 to (0.61+0.67)/2=0.64, and those of P7 and N2 to (0.53+0.58)/2=0.555.
Mode 2: the confidences corresponding to the two sub-image blocks in the minimum matching pair are adjusted to one of the two confidences.
For example, the confidences of P1 and N8 can be adjusted to 0.59, those of P2 and N7 to 0.36, those of P5 and N4 to 0.61, and those of P7 and N2 to 0.53.
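The two adjustment modes can be written as one small helper. This is a sketch, not the patented implementation: the function name `adjust_pair` and the mode labels `"mean"` and `"first"` are hypothetical, and Mode 2 is shown taking the 0° value because that is the choice made in the example above.

```python
def adjust_pair(value_0, value_180, mode="mean"):
    """Return the equalized correctness measures for one minimum matching pair.

    mode="mean"  -- Mode 1: both blocks receive the average of the two values
    mode="first" -- Mode 2: both blocks receive one of the two values
                    (here the 0° value, matching the worked example)
    """
    if mode == "mean":
        shared = (value_0 + value_180) / 2
    elif mode == "first":
        shared = value_0
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return shared, shared


# P1 (0.59) and N8 (0.58) from the example above.
p1_mean, n8_mean = adjust_pair(0.59, 0.58, mode="mean")
p1_first, n8_first = adjust_pair(0.59, 0.58, mode="first")
```

Either way, the two directions end up with identical values for the pair, so a rotation-invariant character no longer favors one direction over the other.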
Step S304 adjusts the confidences of the two sub-image blocks in every minimum matching pair that has one sub-image block in each of the two hypothetical text directions and whose recognized characters are the same rotation-invariant character or belong to the same rotation-invariant character pair. With the adjusted confidences obtained, processing proceeds to step S305: the cumulative confidence in each hypothetical direction is calculated from the adjusted confidences, and the direction of the text in the text-line image block is identified from the cumulative confidences.
The cumulative confidence is a correctness measure that characterizes the recognition result of the text-line image block as a whole in one direction. It is usually calculated in one of two ways: as the sum of the confidences of all sub-image blocks in a hypothetical text direction, or as the arithmetic mean of those confidences. The direction with the higher cumulative confidence is more likely to be the correct recognition result.
In step S304 above, the purpose of both Mode 1 and Mode 2 is to give more reasonable confidences to recognized characters that are rotation invariant by adjusting the confidences within the minimum matching pair; in both cases the result is a more reasonable confidence for the minimum matching pair as a whole. In step S305, the cumulative confidence can be calculated in several ways. For example, the sum of all confidences in a hypothetical text direction can be used as the cumulative confidence, or the average confidence in a hypothetical text direction can be used. When calculating the average confidence, the number of minimum matching pairs in the hypothetical text direction is preferably taken as the denominator, and the sum of all confidences in that direction as the numerator. The rationale is that the sets of sub-image blocks in the minimum matching pairs are treated as the basic units into which the text-line image block is segmented, and it is the overall confidence of the two sets in each pair that has been adjusted, so the number of minimum matching pairs is the natural denominator. Alternatively, the sum of all confidences in a hypothetical text direction can be taken as the numerator and the number of sub-image blocks in that direction as the denominator. In that case, if the number of sub-image blocks differs between the hypothetical text directions, the number of sub-image blocks in one and the same hypothetical text direction is preferably used as the common denominator when calculating the average confidence in every direction.
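The choices of cumulative measure discussed above (a plain sum, or a mean with a selectable denominator) can be sketched as follows. The function name `cumulative_measure` and its parameters are hypothetical; passing `denominator` is how this sketch models using the number of minimum matching pairs, or a block count shared by all directions, instead of the direction's own block count.

```python
def cumulative_measure(values, method="sum", denominator=None):
    """Combine per-block correctness measures of one hypothetical text direction.

    values      -- adjusted confidences (or recognition distances) of the blocks
    method      -- "sum" for the total, "mean" for the average
    denominator -- optional divisor for "mean", e.g. the number of minimum
                   matching pairs or a common block count; defaults to len(values)
    """
    total = sum(values)
    if method == "sum":
        return total
    if method == "mean":
        count = len(values) if denominator is None else denominator
        if count <= 0:
            raise ValueError("denominator must be positive")
        return total / count
    raise ValueError(f"unknown method: {method!r}")


# Three blocks that form two minimum matching pairs: dividing by the pair
# count (2) weights the direction differently from dividing by the block count (3).
by_blocks = cumulative_measure([1.0, 2.0, 3.0], method="mean")
by_pairs = cumulative_measure([1.0, 2.0, 3.0], method="mean", denominator=2)
```

Using the same denominator for every direction keeps the averages comparable when the directions were segmented into different numbers of blocks.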
Clearly, among the hypothetical text directions, the one with the highest cumulative confidence should be determined to be the direction of the correct recognition result.
Taking the average of all confidences in a hypothetical text direction as the cumulative confidence, the cumulative confidences calculated using Mode 1 and Mode 2 of step S304 are, respectively:
Mode 1:
Cumulative confidence in the 0° direction = (0.585+0.43+0.53+0.61+0.64+0.61+0.555+0.72)/8 = 0.585
Cumulative confidence in the 180° direction = (0.62+0.555+0.65+0.64+0.60+0.46+0.43+0.585)/8 = 0.5675
Mode 2:
Cumulative confidence in the 0° direction = (0.59+0.36+0.53+0.61+0.61+0.61+0.53+0.72)/8 = 0.57
Cumulative confidence in the 180° direction = (0.62+0.53+0.65+0.61+0.60+0.46+0.36+0.59)/8 = 0.5525
As can be seen, after the confidences are adjusted in either mode, the cumulative confidence in the 0° direction is greater than that in the 180° direction, so a more accurate judgment is obtained.
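The figures above can be reproduced with a short script. The per-block confidences below are back-derived from the arithmetic in the text (Figure 1 itself is not reproduced here), and the helper name `adjusted_means` and the mode labels are hypothetical.

```python
# Confidences of P1..P8 (0° direction) and N1..N8 (180° direction),
# reconstructed from the sums and pair averages quoted in the text.
P = [0.59, 0.36, 0.53, 0.61, 0.61, 0.61, 0.53, 0.72]
N = [0.62, 0.58, 0.65, 0.67, 0.60, 0.46, 0.50, 0.58]

# Zero-based (P index, N index) of the rotation-invariant pairs
# P1-N8, P2-N7, P5-N4, P7-N2.
INVARIANT_PAIRS = [(0, 7), (1, 6), (4, 3), (6, 1)]


def adjusted_means(p, n, pairs, mode):
    """Equalize the paired values, then return (mean of 0°, mean of 180°)."""
    p, n = list(p), list(n)  # work on copies
    for i, j in pairs:
        shared = (p[i] + n[j]) / 2 if mode == "mean" else p[i]
        p[i] = n[j] = shared
    return sum(p) / len(p), sum(n) / len(n)


unadjusted = (sum(P) / len(P), sum(N) / len(N))
mode_1 = adjusted_means(P, N, INVARIANT_PAIRS, "mean")   # Mode 1: average
mode_2 = adjusted_means(P, N, INVARIANT_PAIRS, "first")  # Mode 2: take the 0° value

for label, (conf_0, conf_180) in [("unadjusted", unadjusted),
                                  ("mode 1", mode_1),
                                  ("mode 2", mode_2)]:
    winner = "0°" if conf_0 > conf_180 else "180°"
    print(f"{label}: 0°={conf_0:.4f} 180°={conf_180:.4f} -> {winner}")
```

With these reconstructed values, the unadjusted averages favor the 180° direction, while both adjustment modes favor 0°.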
The flow of a method for recognizing the direction of text in an image block according to a second embodiment of the present invention is described below with reference to Figure 4.
As described above, an OCR recognition result generally includes the segmented sub-image blocks, the recognized character corresponding to each sub-image block, and a correctness measure for each recognized character. The correctness measure reflects how reliable a recognized character is and is usually a confidence or a recognition distance. The first embodiment above was described for the case in which the recognition result includes confidences. The second embodiment covers the case in which the recognition result includes recognition distances. Figure 2 shows an example of the sub-image blocks, recognized characters, and recognition distances in the 0° and 180° directions, with sequence numbers assigned to the sub-image blocks.
In Figure 2, using the conventional method, the average recognition distance of the recognized characters in the 0° direction is (828+1279+934+774+778+789+940+595)/8 = 864.625, and the average recognition distance in the 180° direction is (759+840+704+669+802+1087+1005+790)/8 = 832. Since 832 is less than 864.625, the conventional method would wrongly judge the 180° direction (the direction with the smaller average recognition distance) to be the direction of the text in the text-line image block. The cause of this error is that the rotation invariance of characters is not taken into account: recognition results in the two directions that are the same rotation-invariant character, or that belong to the same rotation-invariant character pair, are given different correctness measures (here, recognition distances).
Because the problem arises from ignoring the rotation invariance of characters, and the method of the present invention adjusts the correctness measures of a rotation-invariant character or rotation-invariant character pair to the same value, the idea described in the first embodiment applies equally when the recognition result includes recognition distances rather than confidences.
The flow of the second embodiment, shown in Figure 4, is similar to that of the first embodiment.
First, OCR processing is performed on the image block with 0° and 180° as the hypothetical text directions, to obtain the sub-image blocks in the 0° and 180° directions, the recognized characters corresponding to the sub-image blocks, and their recognition distances (step S401).
Next, the minimum matching pairs of sub-image blocks are searched for among the sub-image blocks in the 0° and 180° directions (step S402). For example, in Figure 1, P1 and N8 form a minimum matching pair. Similarly, P2 and N7, P3 and N6, P4 and N5, P5 and N4, P6 and N3, P7 and N2, and P8 and N1 each form a minimum matching pair.
In step S403, it is judged whether the minimum matching pair has one sub-image block in each of the two hypothetical text directions, and whether the recognized characters corresponding to the two sub-image blocks of the pair are the same rotation-invariant character or belong to the same rotation-invariant character pair. A dictionary of rotation-invariant characters can be defined in advance, recording the known rotation-invariant characters and rotation-invariant character pairs; the judgment in step S403 can be made by consulting this dictionary. If the result is no, no adjustment is needed and processing proceeds directly to step S405. If the result is yes, processing proceeds to step S404, where the recognition distances corresponding to the sub-image blocks in the minimum matching pair are adjusted.
The recognition distances of the sub-image blocks in a minimum matching pair are adjusted mainly to account for the rotation invariance of rotation-invariant characters and rotation-invariant character pairs. Specifically, the recognition distances corresponding to the two sub-image blocks in the minimum matching pair are adjusted to the same value. There are several choices for this common value; some exemplary modes follow.
Mode 1: the recognition distances corresponding to the two sub-image blocks in the minimum matching pair are adjusted to the average of the two recognition distances.
As shown in Figure 1, the rotation-invariant characters or rotation-invariant character pairs are P1-N8, P2-N7, P5-N4, and P7-N2. Therefore, the recognition distances of P1 and N8 can be adjusted to (828+790)/2=809, those of P2 and N7 to (1279+1005)/2=1142, those of P5 and N4 to (778+669)/2=723.5, and those of P7 and N2 to (940+840)/2=890.
Mode 2: the recognition distances corresponding to the two sub-image blocks in the minimum matching pair are adjusted to one of the two recognition distances.
For example, the recognition distances of P1 and N8 can be adjusted to 828, those of P2 and N7 to 1279, those of P5 and N4 to 778, and those of P7 and N2 to 940.
Step S404 adjusts the recognition distances of the two sub-image blocks in every minimum matching pair that has one sub-image block in each of the two hypothetical text directions and whose recognized characters are the same rotation-invariant character or belong to the same rotation-invariant character pair. With the adjusted recognition distances obtained, processing proceeds to step S405: the cumulative recognition distance in each hypothetical direction is calculated from the adjusted recognition distances, and the direction of the text in the text-line image block is identified from the cumulative recognition distances.
The cumulative recognition distance is a correctness measure that characterizes the recognition result of the text-line image block as a whole in one direction. It is usually calculated in one of two ways: as the sum of the recognition distances of all sub-image blocks in a hypothetical text direction, or as the arithmetic mean of those recognition distances. The direction with the smaller cumulative recognition distance is more likely to be the correct recognition result.
In step S404 above, the purpose of both Mode 1 and Mode 2 is to give more reasonable recognition distances to recognized characters that are rotation invariant by adjusting the recognition distances within the minimum matching pair; in both cases the result is a more reasonable recognition distance for the minimum matching pair as a whole. In step S405, the cumulative recognition distance can be calculated in several ways. For example, the sum of all recognition distances in a hypothetical text direction can be used as the cumulative recognition distance, or the average recognition distance in a hypothetical text direction can be used. When calculating the average recognition distance, the number of minimum matching pairs in the hypothetical text direction is preferably taken as the denominator, and the sum of all recognition distances in that direction as the numerator. The rationale is that the sets of sub-image blocks in the minimum matching pairs are treated as the basic units into which the text-line image block is segmented, and it is the overall recognition distance of the two sets in each pair that has been adjusted, so the number of minimum matching pairs is the natural denominator.
Clearly, among the hypothetical text directions, the one with the smallest cumulative recognition distance should be determined to be the direction of the correct recognition result.
Taking the average recognition distance in a hypothetical text direction as the cumulative recognition distance, the cumulative recognition distances calculated using Mode 1 and Mode 2 of step S404 are, respectively:
Mode 1:
Cumulative recognition distance in the 0° direction = (809+1142+934+774+723.5+789+890+595)/8 = 832.0625
Cumulative recognition distance in the 180° direction = (759+890+704+723.5+802+1087+1142+809)/8 = 864.5625
Mode 2:
Cumulative recognition distance in the 0° direction = (828+1279+934+774+778+789+940+595)/8 = 864.625
Cumulative recognition distance in the 180° direction = (759+940+704+778+802+1087+1279+828)/8 = 897.125
As can be seen, after the recognition distances are adjusted in either mode, the cumulative recognition distance in the 0° direction is smaller than that in the 180° direction, so a more accurate judgment is obtained.
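The recognition-distance example can be checked the same way as the confidence example, with the one difference that the smaller cumulative value wins. The distances are the ones quoted in the text for Figure 2; the helper name `adjusted_means` and the mode labels are hypothetical.

```python
# Recognition distances of P1..P8 (0° direction) and N1..N8 (180° direction).
P_DIST = [828, 1279, 934, 774, 778, 789, 940, 595]
N_DIST = [759, 840, 704, 669, 802, 1087, 1005, 790]

# Zero-based (P index, N index) of the rotation-invariant pairs
# P1-N8, P2-N7, P5-N4, P7-N2.
INVARIANT_PAIRS = [(0, 7), (1, 6), (4, 3), (6, 1)]


def adjusted_means(p, n, pairs, mode):
    """Equalize the paired distances, then return (mean of 0°, mean of 180°)."""
    p, n = list(p), list(n)  # work on copies
    for i, j in pairs:
        shared = (p[i] + n[j]) / 2 if mode == "mean" else p[i]
        p[i] = n[j] = shared
    return sum(p) / len(p), sum(n) / len(n)


def pick_direction(dist_0, dist_180):
    """For distances, the direction with the smaller cumulative value is chosen."""
    return "0°" if dist_0 < dist_180 else "180°"


conventional = (sum(P_DIST) / 8, sum(N_DIST) / 8)
mode_1 = adjusted_means(P_DIST, N_DIST, INVARIANT_PAIRS, "mean")   # Mode 1
mode_2 = adjusted_means(P_DIST, N_DIST, INVARIANT_PAIRS, "first")  # Mode 2

print("conventional:", conventional, pick_direction(*conventional))
print("mode 1:", mode_1, pick_direction(*mode_1))
print("mode 2:", mode_2, pick_direction(*mode_2))
```

A single implementation can serve both embodiments if the comparison is parameterized: take the maximum when the correctness measure is a confidence and the minimum when it is a recognition distance.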
The structure of a recognition device for recognizing the direction of text in an image block according to an embodiment of the present invention is described below with reference to Figure 5. As shown in Figure 5, the recognition device 500 according to this embodiment includes: an optical character recognition processing unit 501, configured to perform optical character recognition processing on the image block with different directions as hypothetical text directions, to obtain the sub-image blocks in each hypothetical text direction, the recognized characters corresponding to the sub-image blocks, and their correctness measures; a minimum matching pair search unit 502, configured to search for minimum matching pairs of sub-image blocks among the sub-image blocks in hypothetical text directions that are 180° apart, a minimum matching pair being two sets of sub-image blocks, one in each of two hypothetical text directions that are 180° apart, that correspond in position, have the same size, and contain the smallest number of sub-image blocks; a sub-image block adjustment unit 503, configured to adjust the correctness measures corresponding to two sub-image blocks to the same value when the minimum matching pair has one sub-image block in each of the two hypothetical text directions and the recognized characters corresponding to the two sub-image blocks of the pair are the same rotation-invariant character or belong to the same rotation-invariant character pair; a cumulative correctness measure calculation unit 504, configured to calculate the cumulative correctness measure in each hypothetical text direction based on the adjusted sub-image blocks; and a text direction recognition unit 505, configured to identify the direction of the text in the image block according to the cumulative correctness measures.
The processing in the optical character recognition processing unit 501, the minimum matching pair search unit 502, the sub-image block adjustment unit 503, the cumulative correctness measure calculation unit 504, and the text direction recognition unit 505 of the recognition device 500 according to the present invention is similar to the processing in steps S301-S305 and S401-S405, respectively, of the method described above for recognizing the direction of text in an image block. A detailed description of these units is therefore omitted here for brevity.
Furthermore, it should be noted that the component modules and units of the above device may be configured by software, firmware, hardware, or a combination thereof. The specific means or manners available for such configuration are well known to those skilled in the art and are not described here. When implemented by software or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure (for example, the general-purpose computer 600 shown in Figure 6), which, with the various programs installed, can perform various functions.
In Figure 6, a central processing unit (CPU) 601 performs various processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores, as needed, the data required when the CPU 601 performs the various processing. The CPU 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output interface 605 is also connected to the bus 604.
The following components are connected to the input/output interface 605: an input section 606 (including a keyboard, a mouse, and the like), an output section 607 (including a display, such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker), a storage section 608 (including a hard disk and the like), and a communication section 609 (including a network interface card such as a LAN card, a modem, and the like). The communication section 609 performs communication processing via a network such as the Internet. A drive 610 may also be connected to the input/output interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, may be mounted on the drive 610 as needed, so that a computer program read from it is installed into the storage section 608 as needed.
When the above series of processing is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 611.
Those skilled in the art should understand that such a storage medium is not limited to the removable medium 611 shown in Figure 6, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 611 include magnetic disks (including floppy disks (registered trademark)), optical discs (including compact disc read-only memory (CD-ROM) and digital versatile disc (DVD)), magneto-optical discs (including MiniDisc (MD) (registered trademark)), and semiconductor memories. Alternatively, the storage medium may be the ROM 602, a hard disk included in the storage section 608, or the like, in which the program is stored and which is distributed to the user together with the device that contains it.
The present invention also provides a program product storing machine-readable instruction code. When the instruction code is read and executed by a machine, the above method according to the embodiments of the present invention can be performed.
Accordingly, a storage medium carrying the above program product storing machine-readable instruction code is also included in the disclosure of the present invention. The storage medium includes, but is not limited to, a floppy disk, an optical disc, a magneto-optical disc, a memory card, a memory stick, and the like.
The recognition device and recognition method for recognizing the direction of text in an image block disclosed in the embodiments of the present invention, and the corresponding program product, can be used in image scanning devices such as scanners to identify the direction of the text in a scanned document.
In the above description of specific embodiments of the present invention, features described and/or illustrated for one embodiment may be used in the same or a similar manner in one or more other embodiments, combined with features of other embodiments, or substituted for features of other embodiments.
It should be emphasized that the term "comprising/including", as used herein, refers to the presence of a feature, element, step, or component, but does not exclude the presence or addition of one or more other features, elements, steps, or components.
In addition, the method of the present invention is not limited to being performed in the chronological order described in the specification; it may also be performed in another chronological order, in parallel, or independently. The order of execution of the methods described in this specification therefore does not limit the technical scope of the present invention.
In accordance with the embodiments above, the present invention also includes the following supplementary notes:
Supplementary Note 1. A method for recognizing the direction of text in an image block, comprising:
performing optical character recognition processing on the image block with different directions as hypothetical text directions, to obtain the sub-image blocks in each hypothetical text direction, the recognized characters corresponding to the sub-image blocks, and their correctness measures;
searching for minimum matching pairs of sub-image blocks among the sub-image blocks in hypothetical text directions that are 180° apart, a minimum matching pair being two sets of sub-image blocks, one in each of two hypothetical text directions that are 180° apart, that correspond in position, have the same size, and contain the smallest number of sub-image blocks;
when the minimum matching pair has one sub-image block in each of the two hypothetical text directions, and the recognized characters corresponding to the two sub-image blocks of the pair are the same rotation-invariant character or belong to the same rotation-invariant character pair, adjusting the correctness measures corresponding to the two sub-image blocks to the same value;
calculating the cumulative correctness measure in each hypothetical text direction based on the adjusted sub-image blocks; and
identifying the direction of the text in the image block according to the cumulative correctness measures.
Supplementary Note 2. The method according to Supplementary Note 1, wherein
the rotation-invariant characters include characters with 180° rotational self-symmetry, that is, characters that are themselves after being rotated by 180°; and
a rotation-invariant character pair includes two characters, either of which, after being rotated by 180°, is identical to the other or highly similar to it in shape.
Supplementary Note 3. The method according to Supplementary Note 1, wherein adjusting the correctness measures corresponding to the two sub-image blocks to the same value includes adjusting them to the average of the correctness measures corresponding to the two sub-image blocks.
Supplementary Note 4. The method according to Supplementary Note 1, wherein adjusting the correctness measures corresponding to the two sub-image blocks to the same value includes adjusting them to one of the correctness measures corresponding to the two sub-image blocks.
Supplementary Note 5. The method according to any one of Supplementary Notes 1-4, wherein
the correctness measure includes confidence and recognition distance; and
the different directions include the two directions along the horizontal axis of the image block and the two directions along its vertical axis.
Supplementary Note 6. The method according to any one of Supplementary Notes 1-4, wherein calculating the cumulative correctness measure in each hypothetical text direction based on the adjusted sub-image blocks includes: dividing the sum of the correctness measures of the adjusted sub-image blocks in each hypothetical text direction by the number of minimum matching pairs in that direction, and taking the result as the cumulative correctness measure in that direction.
Supplementary Note 7. A device for identifying the direction of characters in an image block, comprising:
an optical character recognition processing unit configured to perform optical character recognition processing on the image block with each of several different directions taken as an assumed text direction, so as to obtain, in each assumed text direction, sub-image blocks, the recognized characters corresponding to the sub-image blocks, and their correctness measures;
a minimum matching pair search unit configured to search, among the sub-image blocks in assumed text directions that are 180° apart, for minimum matching pairs of sub-image blocks, a minimum matching pair being two sets of sub-image blocks, one in each of the two assumed text directions that are 180° apart, which correspond in position, are equal in size, and contain the minimum number of sub-image blocks;
a sub-image block adjustment unit configured to adjust the correctness measures corresponding to two sub-image blocks to the same value when each of the two assumed text directions in a minimum matching pair contains exactly one sub-image block and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair;
a cumulative correctness measure calculation unit configured to calculate a cumulative correctness measure in each assumed text direction based on the adjusted sub-image blocks; and
a text direction identification unit configured to identify the direction of the characters in the image block according to the cumulative correctness measures.
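The units listed above can be read as a processing pipeline. The following minimal sketch covers the adjustment, accumulation, and identification stages only; it assumes the OCR results and the minimum matching pairs have already been computed. All names (`SubBlock`, `MinPair`, `identify_direction`), the character tables, the use of confidence as the correctness measure, averaging as the adjustment, and "highest score wins" as the decision rule are assumptions for illustration, not details confirmed by the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SubBlock:
    char: str          # recognized character
    confidence: float  # correctness measure (assumed: higher is better)


# One minimum matching pair: the sub-blocks covering the same region
# in a direction and in the direction rotated by 180 degrees.
MinPair = Tuple[List[SubBlock], List[SubBlock]]

# Hypothetical tables; the source does not enumerate these characters.
INVARIANT_CHARS = {"0", "8", "O", "X", "o", "x"}
INVARIANT_PAIRS = {frozenset(("6", "9")), frozenset(("n", "u"))}


def is_ambiguous(a: str, b: str) -> bool:
    if a == b:
        return a in INVARIANT_CHARS
    return frozenset((a, b)) in INVARIANT_PAIRS


def adjust_pair(pair: MinPair) -> None:
    """Equalize confidences when each side has one block and the characters are ambiguous."""
    side_a, side_b = pair
    if len(side_a) == 1 and len(side_b) == 1 and is_ambiguous(side_a[0].char, side_b[0].char):
        mean = (side_a[0].confidence + side_b[0].confidence) / 2
        side_a[0].confidence = mean
        side_b[0].confidence = mean


def identify_direction(pairs_by_axis: Dict[Tuple[str, str], List[MinPair]]) -> str:
    """Return the assumed text direction with the highest cumulative confidence."""
    scores: Dict[str, float] = {}
    for (dir_a, dir_b), pairs in pairs_by_axis.items():
        if not pairs:
            continue
        for pair in pairs:
            adjust_pair(pair)
        # Sum over all sub-blocks, normalized by the number of minimum matching pairs.
        scores[dir_a] = sum(b.confidence for side, _ in pairs for b in side) / len(pairs)
        scores[dir_b] = sum(b.confidence for _, side in pairs for b in side) / len(pairs)
    if not scores:
        raise ValueError("no minimum matching pairs supplied")
    return max(scores, key=lambda d: scores[d])
```

If recognition distance were used as the correctness measure instead of confidence, the decision rule would presumably pick the smallest cumulative value rather than the largest.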
Supplementary Note 8. The device according to Supplementary Note 7, wherein the sub-image block adjustment unit is configured to adjust the correctness measures corresponding to the two sub-image blocks to the average of those two correctness measures when each of the two assumed text directions in a minimum matching pair contains exactly one sub-image block and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair.
Supplementary Note 9. The device according to Supplementary Note 7, wherein the sub-image block adjustment unit is configured to adjust the correctness measures corresponding to the two sub-image blocks to one of those two correctness measures when each of the two assumed text directions in a minimum matching pair contains exactly one sub-image block and the recognized characters corresponding to the two sub-image blocks of that minimum matching pair are the same rotation-invariant character or belong to the same rotation-invariant character pair.
Supplementary Note 10. The device according to Supplementary Note 7, wherein the cumulative correctness measure calculation unit is configured to divide the sum of the correctness measures of the adjusted sub-image blocks in each assumed text direction by the number of minimum matching pairs in that assumed text direction, and to take the result as the cumulative correctness measure in that assumed text direction.
Supplementary Note 11. A scanner comprising the device for identifying the direction of characters in an image block according to any one of Supplementary Notes 7-10.
Although the present invention has been disclosed above through the description of specific embodiments, it should be understood that all of the above embodiments and examples are illustrative rather than restrictive. Those skilled in the art may devise various modifications, improvements, or equivalents of the present invention within the spirit and scope of the appended claims. Such modifications, improvements, or equivalents should also be considered to fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201110209833.5A CN102890784B (en) | 2011-07-20 | 2011-07-20 | Method and device for identifying directions of characters in image blocks |
| US13/525,736 US8787674B2 (en) | 2011-07-20 | 2012-06-18 | Method of and device for identifying direction of characters in image block |
| JP2012150259A JP5910365B2 (en) | 2011-07-20 | 2012-07-04 | Method and apparatus for recognizing the direction of characters in an image block |
| KR1020120073938A KR101345925B1 (en) | 2011-07-20 | 2012-07-06 | Method of and device for identifying direction of characters in image block |
| EP12176593.7A EP2549407B1 (en) | 2011-07-20 | 2012-07-16 | Method of and device for identifying direction of characters in image block |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201110209833.5A CN102890784B (en) | 2011-07-20 | 2011-07-20 | Method and device for identifying directions of characters in image blocks |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN102890784A (en) | 2013-01-23 |
| CN102890784B (en) | 2016-03-30 |
Family
ID=46679100
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201110209833.5A Expired - Fee Related CN102890784B (en) | 2011-07-20 | 2011-07-20 | Method and device for identifying directions of characters in image blocks |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US8787674B2 (en) |
| EP (1) | EP2549407B1 (en) |
| JP (1) | JP5910365B2 (en) |
| KR (1) | KR101345925B1 (en) |
| CN (1) | CN102890784B (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017088282A1 (en) * | 2015-11-25 | 2017-06-01 | 中兴通讯股份有限公司 | Display method and apparatus for adjusting picture character |
| CN107220640A (en) * | 2017-05-23 | 2017-09-29 | 广州绿怡信息科技有限公司 | Character identifying method, device, computer equipment and computer-readable recording medium |
| CN108345827A (en) * | 2017-01-24 | 2018-07-31 | 富士通株式会社 | Identify method, system and the neural network in document direction |
| CN114241184A (en) * | 2020-09-09 | 2022-03-25 | 顺丰科技有限公司 | Text character detection method, device and storage medium |
| CN114842464A (en) * | 2022-05-13 | 2022-08-02 | 北京百度网讯科技有限公司 | Image direction recognition method, device, equipment, storage medium and program product |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10210384B2 (en) * | 2016-07-25 | 2019-02-19 | Intuit Inc. | Optical character recognition (OCR) accuracy by combining results across video frames |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1242560A (en) * | 1998-06-01 | 2000-01-26 | 佳能株式会社 | Image processing method, device and storage medium therefor |
| US6993205B1 (en) * | 2000-04-12 | 2006-01-31 | International Business Machines Corporation | Automatic method of detection of incorrectly oriented text blocks using results from character recognition |
| CN101833648A (en) * | 2009-03-13 | 2010-09-15 | 汉王科技股份有限公司 | Method for correcting text image |
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5031225A (en) * | 1987-12-09 | 1991-07-09 | Ricoh Company, Ltd. | Character recognition method for recognizing character in an arbitrary rotation position |
| JPH04195485A (en) | 1990-11-28 | 1992-07-15 | Hitachi Ltd | Image information input device |
| JP3251959B2 (en) * | 1991-10-17 | 2002-01-28 | 株式会社リコー | Image forming device |
| JP3727971B2 (en) * | 1995-02-01 | 2005-12-21 | キヤノン株式会社 | Document processing apparatus and document processing method |
| JPH08293000A (en) * | 1995-04-21 | 1996-11-05 | Canon Inc | Image processing apparatus and method |
| JPH09282413A (en) | 1996-04-16 | 1997-10-31 | Canon Inc | Document direction acquisition method and apparatus and character recognition method and apparatus |
| JP3728040B2 (en) * | 1996-12-27 | 2005-12-21 | キヤノン株式会社 | Image forming apparatus and method |
| JPH11213089A (en) * | 1998-01-23 | 1999-08-06 | Canon Inc | Image processing apparatus and method |
| US6151423A (en) | 1998-03-04 | 2000-11-21 | Canon Kabushiki Kaisha | Character recognition with document orientation determination |
| US6804414B1 (en) * | 1998-05-01 | 2004-10-12 | Fujitsu Limited | Image status detecting apparatus and document image correcting apparatus |
| JPH11338974A (en) * | 1998-05-28 | 1999-12-10 | Canon Inc | Document processing method and apparatus, storage medium |
| JP2002125114A (en) | 2000-10-13 | 2002-04-26 | Ricoh Co Ltd | Image reading device |
| JP2004013704A (en) * | 2002-06-10 | 2004-01-15 | Sumitomo Denko Systems Kk | Original direction distinguishing method for character recognition processing |
| JP2004272798A (en) * | 2003-03-11 | 2004-09-30 | Pfu Ltd | Image reading device |
| US8200043B2 (en) | 2008-05-01 | 2012-06-12 | Xerox Corporation | Page orientation detection based on selective character recognition |
| US8023770B2 (en) * | 2008-05-23 | 2011-09-20 | Sharp Laboratories Of America, Inc. | Methods and systems for identifying the orientation of a digital image |
| JP4927122B2 (en) * | 2009-06-15 | 2012-05-09 | シャープ株式会社 | Image processing method, image processing apparatus, image forming apparatus, program, and recording medium |
2011
- 2011-07-20: CN application CN201110209833.5A, patent CN102890784B, not active (Expired - Fee Related)
2012
- 2012-06-18: US application US13/525,736, patent US8787674B2, Active
- 2012-07-04: JP application JP2012150259A, patent JP5910365B2, not active (Expired - Fee Related)
- 2012-07-06: KR application KR1020120073938A, patent KR101345925B1, not active (Expired - Fee Related)
- 2012-07-16: EP application EP12176593.7A, patent EP2549407B1, Active
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017088282A1 (en) * | 2015-11-25 | 2017-06-01 | 中兴通讯股份有限公司 | Display method and apparatus for adjusting picture character |
| CN108345827A (en) * | 2017-01-24 | 2018-07-31 | 富士通株式会社 | Identify method, system and the neural network in document direction |
| CN108345827B (en) * | 2017-01-24 | 2021-11-30 | 富士通株式会社 | Method, system and neural network for identifying document direction |
| CN107220640A (en) * | 2017-05-23 | 2017-09-29 | 广州绿怡信息科技有限公司 | Character identifying method, device, computer equipment and computer-readable recording medium |
| CN107220640B (en) * | 2017-05-23 | 2020-07-17 | 广州绿怡信息科技有限公司 | Character recognition method, character recognition device, computer equipment and computer-readable storage medium |
| CN114241184A (en) * | 2020-09-09 | 2022-03-25 | 顺丰科技有限公司 | Text character detection method, device and storage medium |
| CN114842464A (en) * | 2022-05-13 | 2022-08-02 | 北京百度网讯科技有限公司 | Image direction recognition method, device, equipment, storage medium and program product |
Also Published As
| Publication number | Publication date |
|---|---|
| KR101345925B1 (en) | 2013-12-27 |
| US8787674B2 (en) | 2014-07-22 |
| JP2013025800A (en) | 2013-02-04 |
| EP2549407A3 (en) | 2014-06-04 |
| US20130022271A1 (en) | 2013-01-24 |
| EP2549407B1 (en) | 2020-06-10 |
| KR20130011921A (en) | 2013-01-30 |
| EP2549407A2 (en) | 2013-01-23 |
| CN102890784B (en) | 2016-03-30 |
| JP5910365B2 (en) | 2016-04-27 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN102890783B (en) | | The method and apparatus in the direction of word in recognition image block |
| CN112699775B (en) | | Certificate identification method, device, equipment and storage medium based on deep learning |
| US10049096B2 (en) | | System and method of template creation for a data extraction tool |
| US7583841B2 (en) | | Table detection in ink notes |
| US7664325B2 (en) | | Framework for detecting a structured handwritten object |
| CN102982330B (en) | | Character identifying method and identification device in character image |
| CN102890784B (en) | | The method and apparatus in the direction of word in recognition image block |
| US11321558B2 (en) | | Information processing apparatus and non-transitory computer readable medium |
| CN103455806A (en) | | Document processing device, document processing method and scanner |
| CN110837796B (en) | | Image processing method and device |
| CN102855477B (en) | | Method and device for recognizing direction of characters in image block |
| CN117727056A (en) | | A check box identification method, device, equipment and medium |
| CN117765544A (en) | | Document key element identification method, device, equipment and medium |
| CN102968610B (en) | | Receipt image processing method and equipment |
| CN105844207A (en) | | Text line extraction method and text line extraction equipment |
| CN114581927A (en) | | Bank bill identification method, equipment and medium |
| JP2020087112A (en) | | Document processing apparatus and document processing method |
| CN104112135B (en) | | Text image extraction element and method |
| CN115393860A (en) | | Information identification method and device and terminal equipment |
| EP4026055A2 (en) | | Method and system for keypoint extraction from images of documents |
| CN121121761A (en) | | Bill key field identification method and device |
| CN102750514A (en) | | Method and device for determining categories of lists in input images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160330; Termination date: 20210720 |