CN105844275B

CN105844275B - Method for locating text lines in a text image

Info

Publication number: CN105844275B
Application number: CN201610178271.5A
Authority: CN
Inventors: 刘辉; 石胜坤; 陈李江
Original assignee: Beijing Yun Jiang Science And Technology Ltd
Current assignee: Beijing Yun Jiang Science And Technology Ltd
Priority date: 2016-03-25
Filing date: 2016-03-25
Publication date: 2019-08-23
Anticipated expiration: 2036-03-25
Also published as: CN105844275A

Abstract

The invention discloses a method for locating text lines in a text image. The method comprises: computing a local contrast matrix of the text image from the image grayscale matrix; binarizing the local contrast matrix by maximum entropy segmentation to obtain a binary image; extracting, merging and deleting connected components of the binary image to obtain character connected components; sorting the extracted character connected components by the horizontal coordinate of their left edge; locating text lines from the sorted character connected components to obtain the start row and end row information of all text lines; and deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines. The invention achieves effective text line localization for text images under difficult conditions such as uneven illumination and low contrast.

Description

Method for locating text lines in a text image

Technical field

The present invention relates to the field of optical character recognition within image processing, and in particular to a method for locating text lines in a text image.

Background art

OCR (Optical Character Recognition) of text images is an important branch of image processing and has a wide range of applications. The basic principle of OCR is to analyze the morphological features of text with pattern recognition algorithms, determine the standard code of each Chinese character, and store the result in a text file. Extracting and recognizing the text in a text image is of great significance for analyzing and understanding the image content. Within this process, locating the text lines in the text image is an important step.

In view of this, the present invention is proposed.

Summary of the invention

The present invention provides a method for locating text lines in a text image, in order to solve the technical problem of how to locate the text lines in a text image effectively.

To achieve the above object, the following technical solution is provided:

A method for locating text lines in a text image, characterized in that the method comprises:

computing a local contrast matrix of the text image from the image grayscale matrix;

binarizing the local contrast matrix by maximum entropy segmentation to obtain a binary image;

extracting, merging and deleting connected components of the binary image to obtain character connected components;

sorting the extracted character connected components by the horizontal coordinate of their left edge;

locating text lines from the sorted character connected components to obtain the start row and end row information of all text lines;

deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines.

By adopting the above technical solution, which combines binarization of the text image with connected component analysis, the present invention achieves effective text line localization for text images under difficult conditions such as uneven illumination and low contrast. The method of the invention is a fast, parameter-free algorithm that can be used for batch processing in large-scale text image OCR as well as in online real-time OCR processing systems.

Brief description of the drawings

Fig. 1 is a flow diagram of the method for locating text lines in a text image according to an embodiment of the present invention;

Fig. 2a is a schematic diagram of text line localization based on character connected components according to one embodiment of the present invention;

Fig. 2b is a schematic diagram of text line localization based on character connected components according to another embodiment of the present invention;

Fig. 2c is a schematic diagram of text line localization based on character connected components according to yet another embodiment of the present invention;

Fig. 2d is a schematic diagram of text line localization based on character connected components according to a further embodiment of the present invention;

Fig. 2e is a schematic diagram of text line localization based on character connected components according to a still further embodiment of the present invention.

Detailed description of embodiments

The method for locating text lines in a text image provided by the embodiments of the present invention is described in detail below with reference to the accompanying drawings.

Fig. 1 is a flow chart of the method for locating text lines in a text image provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises steps S101 to S106.

Step S101: compute the local contrast matrix of the text image from the image grayscale matrix.

Step S102: binarize the local contrast matrix by maximum entropy segmentation to obtain a binary image.

Step S103: extract, merge and delete connected components of the binary image to obtain character connected components.

Step S104: sort the extracted character connected components by the horizontal coordinate of their left edge.

Step S105: locate text lines from the sorted character connected components to obtain the start row and end row information of all text lines.

Step S106: delete and sort the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines.

The embodiment of the present invention first binarizes the text image and then applies connected component analysis, extracting, merging and deleting connected components of the binary image to obtain character connected components. The start row and end row information of all text lines is then obtained from the character connected components. Finally, the text lines are deleted and sorted according to their start row and end row information and the number of characters they contain, thereby locating the text lines. The embodiment of the present invention thus achieves the technical effect of effectively locating the text lines in a text image.

Preferably, when computing the local contrast matrix of the text image, the local contrast value is calculated according to the following formula:

Con(i, j) = α·C(i, j) + (1 - α)·(I_max(i, j) - I_min(i, j))

where I_max(i, j) and I_min(i, j) denote the maximum and minimum gray values, respectively, in the local neighborhood centered on (i, j); α ∈ (0, 1) is an adjustable parameter; ε is a very small quantity used to prevent the denominator from being 0; and Con(i, j) denotes the local contrast value at (i, j).

In an actual implementation, the width of the local filter window must be chosen; for example, the local filter window width can be set to 3.

In the local contrast formula, α is calculated as:

where pow(x, y) is the power function, in which x (i.e. …) is the base and y (i.e. gamma) is the exponent.
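The local contrast computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not reproduce the definition of C(i, j) or the formula for α, so the sketch assumes the normalized form C(i, j) = (I_max - I_min) / (I_max + I_min + ε), which is consistent with ε guarding a denominator, and takes α as a plain argument. The function name `local_contrast`, its default values, and the border handling (clipping the window at the image edge) are likewise hypothetical.

```python
def local_contrast(gray, alpha=0.5, window=3, eps=1e-6):
    """Return the matrix Con(i, j) for a grayscale image given as a list of rows.

    alpha is passed in directly because the source's formula for it is missing.
    """
    rows, cols = len(gray), len(gray[0])
    half = window // 2
    con = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Gather the neighborhood centered on (i, j), clipped at the border.
            patch = [
                gray[r][c]
                for r in range(max(0, i - half), min(rows, i + half + 1))
                for c in range(max(0, j - half), min(cols, j + half + 1))
            ]
            i_max, i_min = max(patch), min(patch)
            # ASSUMPTION: normalized contrast term; eps keeps the denominator non-zero.
            c_ij = (i_max - i_min) / (i_max + i_min + eps)
            con[i][j] = alpha * c_ij + (1 - alpha) * (i_max - i_min)
    return con
```

With a window width of 3, a uniform region yields a contrast of 0 everywhere, while any pixel within one step of a stroke edge receives a large value.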

In the above embodiment, when the local contrast matrix is binarized by maximum entropy segmentation, a threshold k is first chosen and the local contrast values are divided into two parts according to that threshold. The probability density function p(x | ω_i), i = 1, 2, of each part is computed by counting the probability distribution histograms p(x | ω₁) and p(x | ω₂) of the two parts, and the sum of the entropies of the two parts (defined as the objective function) is then obtained, that is:

where i takes the values 1 and 2.

According to the maximum entropy principle, the threshold k at which H(p | ω_i) is maximal is taken as the optimal threshold; that is, the threshold is selected as:

In the above binary matrix, 1 denotes a character point and 0 denotes a background point.
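The threshold search described above can be sketched as follows. The objective function and the selection rule are not reproduced in the text, so this sketch assumes the standard discrete form of maximum entropy thresholding: for each candidate k, the histogram is split at k, each part is normalized to a probability distribution, and the two Shannon entropies are summed. It further assumes that the contrast values have already been quantized to integers in [0, levels) and that values above the threshold are character points. The names `max_entropy_threshold` and `binarize` are hypothetical.

```python
import math


def max_entropy_threshold(values, levels=256):
    """Return the threshold k that maximizes the summed entropy of both parts.

    values: iterable of integers in [0, levels).
    Part 1 holds values <= k, part 2 holds values > k.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    def entropy(counts):
        n = sum(counts)
        # Entropy of the normalized histogram; empty bins contribute nothing.
        return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

    best_k, best_h = None, -math.inf
    for k in range(levels - 1):
        low, high = hist[: k + 1], hist[k + 1:]
        if sum(low) == 0 or sum(high) == 0:
            continue  # both parts must be non-empty
        h = entropy(low) + entropy(high)
        if h > best_h:
            best_k, best_h = k, h
    return best_k


def binarize(matrix, k):
    """ASSUMPTION: contrast above the threshold marks a character point (1)."""
    return [[1 if v > k else 0 for v in row] for row in matrix]
```

When several thresholds give the same entropy sum, this sketch keeps the smallest one; the source does not say how ties are resolved.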

Extracting, merging and deleting connected components of the binary image to obtain character connected components may specifically include: performing connected component analysis on the binary image to extract the connected components of the text image; deleting connected components that do not meet a predetermined size requirement; and merging two connected components whose overlapping region accounts for a large proportion into a single connected component.

Specifically, connected component analysis is performed on the character points of the binary matrix, for example using the 8-neighborhood of a pixel to define connectivity. For each connected component in the binary matrix, the information [cc_right_col, cc_left_col, cc_up_row, cc_down_row, cc_pixel_num] is recorded, where cc_left_col and cc_right_col denote the abscissas of the left and right boundaries of the component's minimum bounding rectangle, cc_up_row and cc_down_row denote the ordinates of its upper and lower boundaries, and cc_pixel_num denotes the number of character points contained in the component. Connected components that do not meet the size requirement are deleted, which removes the influence of noise points and non-character components. Two connected components whose overlapping region accounts for a large proportion are merged into one. The connected components that remain after this filtering are regarded as character connected components.
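The extraction step can be sketched with a breadth-first flood fill over the 8-neighborhood. This is an illustrative sketch only: the function name `connected_components` and the use of a dictionary per component are assumptions, while the field names follow the record described above. Merging of overlapping components is not shown, since the source gives no overlap ratio.

```python
from collections import deque


def connected_components(binary):
    """Return one record per 8-connected region of 1-pixels, in scan order."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    comps = []
    for r0 in range(rows):
        for c0 in range(cols):
            if binary[r0][c0] != 1 or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            cc = {"cc_right_col": c0, "cc_left_col": c0,
                  "cc_up_row": r0, "cc_down_row": r0, "cc_pixel_num": 0}
            while queue:
                r, c = queue.popleft()
                # Grow the minimum bounding rectangle and the pixel count.
                cc["cc_pixel_num"] += 1
                cc["cc_left_col"] = min(cc["cc_left_col"], c)
                cc["cc_right_col"] = max(cc["cc_right_col"], c)
                cc["cc_up_row"] = min(cc["cc_up_row"], r)
                cc["cc_down_row"] = max(cc["cc_down_row"], r)
                # Visit all 8 neighbors (the offset (0, 0) is already seen).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            comps.append(cc)
    return comps
```

Because diagonal neighbors count as connected, two pixels touching only at a corner end up in the same component, which suits thin diagonal strokes.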

After the character connected components are obtained, they are traversed, and at least one of the following operations is performed for each character connected component to determine the start row and end row of the text lines:

If the region determined by text_up_row and text_down_row of the character in the character connected component does not overlap any currently existing row, a new row is created and its information is set to: line_up_row = text_up_row, line_down_row = text_down_row;

If the region determined by text_up_row and text_down_row of the character overlaps a currently existing row, the character is considered to belong to that row, and the information of the row is updated: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);

If the region determined by text_up_row and text_down_row of the character is contained in the region of a row, the character belongs to that row, and the coordinate information of the row is not updated;

If the region determined by text_up_row and text_down_row of the character overlaps two rows, the character is assigned to the row with the larger overlapping region, and the information of that row is updated;

where line_up_row and line_down_row denote the start row and end row of a text line, respectively, and text_up_row and text_down_row denote the start row coordinate and end row coordinate, respectively, of the minimum bounding rectangle of the connected component.

Each character connected component carries the information [text_up_row, text_down_row, text_left_col, text_right_col], where text_up_row and text_down_row denote the start row and end row coordinates of the component's minimum bounding rectangle, and text_left_col and text_right_col denote its start column and end column coordinates. The embodiment of the present invention therefore locates text lines from the information of the character connected components.

Connected components are filtered according to the following criteria:

(a) cc_down_row - cc_up_row < 5 and cc_right_col - cc_left_col < 5;

(b) cc_down_row - cc_up_row > 50 or cc_right_col - cc_left_col > 50;

(c) cc_pixel_num < 10.

If a connected component satisfies any one of conditions (a), (b) and (c), it is deleted. The remaining connected components are taken as character connected components, each carrying the information [text_up_row, text_down_row, text_left_col, text_right_col]. The character connected components are then sorted in ascending order of the abscissa of their left boundary (left_col).
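The filter criteria and the subsequent sort can be sketched as follows. The thresholds 5, 50 and 10 are the example values given above; the function name `filter_and_sort` and the dictionary representation of a component are assumptions made for illustration.

```python
def filter_and_sort(components):
    """Drop components matching (a), (b) or (c); sort the rest by left boundary."""
    chars = []
    for cc in components:
        height = cc["cc_down_row"] - cc["cc_up_row"]
        width = cc["cc_right_col"] - cc["cc_left_col"]
        too_small = height < 5 and width < 5    # criterion (a): likely noise
        too_large = height > 50 or width > 50   # criterion (b): likely non-character
        too_sparse = cc["cc_pixel_num"] < 10    # criterion (c): too few character points
        if too_small or too_large or too_sparse:
            continue
        chars.append({
            "text_up_row": cc["cc_up_row"],
            "text_down_row": cc["cc_down_row"],
            "text_left_col": cc["cc_left_col"],
            "text_right_col": cc["cc_right_col"],
        })
    # Ascending order of the left boundary abscissa.
    chars.sort(key=lambda ch: ch["text_left_col"])
    return chars
```

Note that criterion (a) requires both dimensions to be small, whereas criterion (b) rejects a component when either dimension is large.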

Text lines are then located from the character connected components, specifically as follows:

The character connected components are traversed, and the ordinate range [text_up_row, text_down_row] occupied by each character connected component is compared with the ordinate range [line_up_row, line_down_row] occupied by each row. If there is no overlap (as shown in Fig. 2a and Fig. 2e), a new row is created and set to [line_up_row = text_up_row, line_down_row = text_down_row];

If the ordinate range occupied by the character connected component overlaps the ordinate range occupied by a row (as shown in Fig. 2b and Fig. 2d), the current character is assigned to that row and the information of the row is updated as follows:

[line_up_row = min(text_up_row, line_up_row),

line_down_row = max(text_down_row, line_down_row)];

If the ordinate range occupied by the character connected component not only overlaps the ordinate range occupied by a row but is also contained in it (as shown in Fig. 2c), the current character is assigned to that row and the row information is not updated;

When the traversal of the character connected components is complete, the preliminary localization of the text lines is complete.
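The traversal above can be sketched as follows. This is a sketch under assumptions rather than the patented code: vertical ranges are treated as closed intervals, the overlap is measured in rows, and a character that overlaps more than one row goes to the row with the largest overlap, as stated in the list of operations earlier. The names `overlap`, `locate_lines` and the `char_count` field are hypothetical; `char_count` is kept because the later filtering step needs the number of characters per row.

```python
def overlap(a_top, a_bottom, b_top, b_bottom):
    """Number of rows shared by two closed vertical ranges (0 if disjoint)."""
    return max(0, min(a_bottom, b_bottom) - max(a_top, b_top) + 1)


def locate_lines(chars):
    """Assign each character to a row; chars are assumed sorted by left boundary."""
    lines = []
    for ch in chars:
        top, bottom = ch["text_up_row"], ch["text_down_row"]
        best, best_ov = None, 0
        for line in lines:
            ov = overlap(top, bottom, line["line_up_row"], line["line_down_row"])
            if ov > best_ov:
                best, best_ov = line, ov
        if best is None:
            # No overlap with any existing row: create a new row.
            lines.append({"line_up_row": top, "line_down_row": bottom,
                          "char_count": 1})
        else:
            # min/max leave the row unchanged when the character is fully
            # contained in it, so the containment case needs no special branch.
            best["line_up_row"] = min(top, best["line_up_row"])
            best["line_down_row"] = max(bottom, best["line_down_row"])
            best["char_count"] += 1
    return lines
```

Folding the containment case into the min/max update is a design choice of this sketch; it produces the same row boundaries as skipping the update.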

In a preferred embodiment, deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, to obtain the final text line information, comprises: if the width of a text line is less than a width threshold, or the number of characters contained in the text line is less than a count threshold, deleting that row; and sorting the text lines that remain after deletion by their start row number.

Specifically, the text lines can be screened and sorted as follows.

The filter criteria for a text line are: the width of the text line (measured as line_down_row - line_up_row) is less than a width threshold, or the number of characters contained in the text line is less than a count threshold. For example:

(a) line_down_row - line_up_row < 10

(b) the number of character connected components contained is less than 5

If a text line satisfies either of conditions (a) and (b), it is deleted; the remaining text lines are the finally selected text lines.

The screened text lines are sorted in ascending order of line_up_row and output, which completes the localization of the text lines in the text image.
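The screening and sorting of text lines can be sketched as follows, using the example thresholds 10 and 5 from criteria (a) and (b). The function name `select_lines`, its parameter names, and the `char_count` field are hypothetical.

```python
def select_lines(lines, min_width=10, min_chars=5):
    """Delete rows matching (a) or (b), then sort the rest by start row."""
    kept = [
        ln for ln in lines
        # Keep a row only if it fails both deletion criteria.
        if ln["line_down_row"] - ln["line_up_row"] >= min_width
        and ln["char_count"] >= min_chars
    ]
    return sorted(kept, key=lambda ln: ln["line_up_row"])
```

A row is kept only when it is both tall enough and populated enough, which is the logical complement of deleting on (a) or (b).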

The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims

1. A method for locating text lines in a text image, characterized in that the method comprises:

computing a local contrast matrix of the text image from the image grayscale matrix;

locating text lines from the sorted character connected components to obtain the start row and end row information of all text lines;

deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines;

wherein obtaining the start row and end row information of all text lines comprises:

traversing the character connected components and performing at least one of the following operations for each character connected component to determine the start row and end row of the text lines:

if the region determined by text_up_row and text_down_row of the character in the character connected component does not overlap any row in the binary image, creating a new row and setting its information to: line_up_row = text_up_row, line_down_row = text_down_row;

if the region determined by text_up_row and text_down_row of the character overlaps a row in the binary image, considering the character to belong to that row and updating the information of the row: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);

if the region determined by text_up_row and text_down_row of the character is contained in the region of a row, the character belongs to that row and the coordinate information of the row is not updated;

if the region determined by text_up_row and text_down_row of the character overlaps two rows, assigning the character to the row with the larger overlapping region and updating the information of that row;

wherein line_up_row and line_down_row denote the start row and end row of the text line, respectively, and text_up_row and text_down_row denote the start row coordinate and end row coordinate, respectively, of the minimum bounding rectangle of the connected component.

2. The method according to claim 1, characterized in that the method further comprises:

calculating the local contrast value of the text image according to the following formula:

Con(i, j) = α·C(i, j) + (1 - α)·(I_max(i, j) - I_min(i, j))

wherein I_max(i, j) and I_min(i, j) denote the maximum and minimum gray values, respectively, in the local neighborhood centered on (i, j); α ∈ (0, 1) is an adjustable parameter; ε is a very small quantity used to prevent the denominator from being 0; and Con(i, j) denotes the local contrast value at (i, j).

3. The method according to claim 2, characterized in that the method further comprises:

selecting a threshold;

dividing the local contrast values into two parts according to the threshold;

counting the probability distribution histogram of each of the two parts;

calculating the sum of the entropies of the two parts from their probability distribution histograms;

determining the optimal threshold based on the sum of the entropies of the two parts, using the maximum entropy principle.

4. The method according to claim 1, characterized in that extracting, merging and deleting connected components of the binary image to obtain character connected components specifically comprises:

performing connected component analysis on the binary image to extract the connected components of the text image;

deleting connected components that do not meet a predetermined size requirement;

merging two connected components whose overlapping region accounts for a large proportion into a single connected component.

5. The method according to claim 1, characterized in that deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, to obtain the final text line information, specifically comprises:

if the width of a text line is less than a width threshold, or the number of characters contained in the text line is less than a count threshold, deleting that row;

sorting the text lines that remain after deletion by their start row number.