CN105844275B - Method for locating text lines in a text image - Google Patents
Method for locating text lines in a text image
- Publication number
- CN105844275B
- Authority
- CN
- China
- Prior art keywords
- text
- row
- line
- connected domain
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Abstract
The invention discloses a method for locating text lines in a text image. The method comprises: computing a local contrast matrix of the text image from the image grayscale matrix; binarizing the local contrast matrix by maximum-entropy segmentation to obtain a binary image; extracting, merging and deleting connected domains in the binary image to obtain character connected domains; sorting the extracted character connected domains by their left horizontal coordinate; locating text lines from the sorted character connected domains to obtain the starting row and ending row of every text line; and deleting and sorting the text lines according to their starting row, ending row and the number of characters they contain, thereby locating the text lines. The invention locates text lines effectively in text images under difficult conditions such as uneven illumination and low contrast.
Description
Technical field
The present invention relates to the field of optical character recognition in image processing, and in particular to a method for locating text lines in a text image.
Background
OCR (Optical Character Recognition) of text images is an important branch of image processing and has a wide range of applications. The basic principle of OCR is to analyze the morphological features of text with pattern recognition algorithms, determine the standard code of each Chinese character, and store it in a text file. Extracting and recognizing the text in a text image is of great significance for analyzing and understanding the image content. Within this process, locating the text lines in the text image is an important step.
In view of this, the present invention is proposed.
Summary of the invention
The present invention provides a method for locating text lines in a text image, to solve the technical problem of how to locate the text lines in a text image efficiently.
To achieve the above object, the following technical solution is provided:
A method for locating text lines in a text image, characterized in that the method comprises:
Computing a local contrast matrix of the text image from the image grayscale matrix;
Binarizing the local contrast matrix by maximum-entropy segmentation to obtain a binary image;
Extracting, merging and deleting connected domains in the binary image to obtain character connected domains;
Sorting the extracted character connected domains by their left horizontal coordinate;
Locating text lines from the sorted character connected domains to obtain the starting row and ending row information of all text lines;
Deleting and sorting the text lines according to their starting row and ending row information and the number of characters they contain, thereby locating the text lines.
By adopting the above technical solution, which combines text-image binarization with connected-domain analysis, the present invention locates text lines effectively in text images under difficult conditions such as uneven illumination and low contrast. The method of the invention is a fast, parameter-free algorithm that can be used for batch processing in large-scale text-image OCR as well as in online real-time OCR processing systems.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for locating text lines in a text image according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of locating a text line from character connected domains according to one embodiment of the present invention;
Fig. 2b is a schematic diagram of locating a text line from character connected domains according to another embodiment of the present invention;
Fig. 2c is a schematic diagram of locating a text line from character connected domains according to yet another embodiment of the present invention;
Fig. 2d is a schematic diagram of locating a text line from character connected domains according to a further embodiment of the present invention;
Fig. 2e is a schematic diagram of locating a text line from character connected domains according to a still further embodiment of the present invention.
Detailed description of the embodiments
The method for locating text lines in a text image provided by the embodiments of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the method for locating text lines in a text image provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises steps S101 to S106.
Step S101: compute the local contrast matrix of the text image from the image grayscale matrix.
Step S102: binarize the local contrast matrix by maximum-entropy segmentation to obtain a binary image.
Step S103: extract, merge and delete connected domains in the binary image to obtain character connected domains.
Step S104: sort the extracted character connected domains by their left horizontal coordinate.
Step S105: locate text lines from the sorted character connected domains to obtain the starting row and ending row information of all text lines.
Step S106: delete and sort the text lines according to their starting row and ending row information and the number of characters they contain, thereby locating the text lines.
The embodiment of the present invention binarizes the text image and then applies connected-domain analysis, extracting, merging and deleting connected domains in the binary image to obtain character connected domains. The starting row and ending row information of every text line is then obtained from the character connected domains. Finally, the text lines are deleted and sorted according to their starting row and ending row information and the number of characters they contain, which completes the localization. The embodiment thus achieves the technical effect of efficiently locating the text lines in a text image.
Preferably, when computing the local contrast matrix of the text image, the local contrast value is computed by the following formula:
Con(i, j) = α·C(i, j) + (1 − α)·(Imax(i, j) − Imin(i, j))
where Imax(i, j) and Imin(i, j) denote the maximum and minimum gray values, respectively, in the local neighborhood centered on (i, j); α ∈ (0, 1) is an adjustable parameter; ε is a very small quantity used to prevent the denominator from being 0; and Con(i, j) denotes the local contrast value at (i, j).
In practice, the width of the local filter window must be chosen; for example, the window width can be set to 3.
In the local-contrast formula, α is computed with pow(x, y), the power function in which x is the base and y (i.e. gamma) is the exponent.
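The local-contrast step can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation. The text does not spell out C(i, j), so the normalized form (Imax − Imin) / (Imax + Imin + ε) used here is an assumption suggested by the remark that ε guards a denominator. The function name `local_contrast`, the border clipping, and passing α in directly as a parameter are also choices made for this example.

```python
def local_contrast(gray, alpha, window=3, eps=1e-8):
    """Return Con(i, j) for every pixel of a grayscale image (list of rows)."""
    rows, cols = len(gray), len(gray[0])
    r = window // 2
    con = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Local neighborhood centered on (i, j), clipped at the image border.
            patch = [gray[y][x]
                     for y in range(max(0, i - r), min(rows, i + r + 1))
                     for x in range(max(0, j - r), min(cols, j + r + 1))]
            i_max, i_min = max(patch), min(patch)
            # ASSUMPTION: C(i, j) is the normalized contrast; eps keeps the
            # denominator away from 0. This form is not given in the text.
            c = (i_max - i_min) / (i_max + i_min + eps)
            # Con(i, j) = alpha * C(i, j) + (1 - alpha) * (Imax - Imin)
            con[i][j] = alpha * c + (1 - alpha) * (i_max - i_min)
    return con
```

A uniform region yields a contrast of 0 everywhere, while pixels whose window straddles a dark-to-bright edge receive a large value, which is what lets the later thresholding step pick out stroke boundaries.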
In the above embodiment, when the local contrast matrix is binarized by the maximum entropy method, a threshold k is first chosen and the local contrast values are divided into two parts according to it. The probability density function p(x | ωi), i = 1, 2, of each part is computed, i.e. the probability distribution histograms p(x | ω1) and p(x | ω2) of the two parts are counted separately, and the sum of the entropies of the two parts (defined as the objective function) is then obtained:
H(k) = Σi H(p | ωi), with H(p | ωi) = −Σx p(x | ωi)·log p(x | ωi)
where i takes the values 1 and 2.
According to the maximum entropy principle, the threshold k at which the sum of H(p | ωi) is largest is taken as the optimal threshold, i.e. the threshold is selected as:
k* = argmax_k Σi H(p | ωi)
In the resulting binary matrix, 1 denotes a character point and 0 denotes a background point.
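The threshold search described above can be sketched as an exhaustive scan over candidate values of k. The sketch assumes the local contrast values have already been quantized to integers in the range 0 to levels − 1; the text does not describe that quantization. The names `max_entropy_threshold` and `binarize` are hypothetical, and mapping values above k to 1 (character point) is likewise an assumption.

```python
import math


def _entropy(counts, n):
    """Shannon entropy of a histogram normalized by its own total n."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def max_entropy_threshold(values, levels=256):
    """Return the k that maximizes the summed entropy of the two parts."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    best_k, best_h = 0, float("-inf")
    for k in range(levels - 1):
        low, high = hist[:k + 1], hist[k + 1:]
        n_low, n_high = sum(low), sum(high)
        if n_low == 0 or n_high == 0:
            continue  # one part is empty, so this k is not a valid split
        h = _entropy(low, n_low) + _entropy(high, n_high)
        if h > best_h:
            best_k, best_h = k, h
    return best_k


def binarize(values, k):
    """ASSUMPTION: values above the threshold are character points (1)."""
    return [1 if v > k else 0 for v in values]
```

When several thresholds tie, this scan keeps the smallest one; the text does not say how ties should be broken.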
Extracting, merging and deleting connected domains in the binary image to obtain character connected domains may specifically comprise: performing connected-domain analysis on the binary image to extract the connected domains of the text image; deleting connected domains that do not meet a predetermined size requirement; and merging two connected domains whose overlapping region accounts for a large proportion into one connected domain.
Specifically, connected-domain analysis is performed on the character points in the binary matrix, for example with connectivity defined by the 8-neighborhood of a pixel. For each connected domain in the binary matrix the information [cc_right_col, cc_left_col, cc_up_row, cc_down_row, cc_pixel_num] is recorded, where cc_left_col and cc_right_col are the abscissas of the left and right boundaries of the connected domain's minimum bounding rectangle, cc_up_row and cc_down_row are the ordinates of its upper and lower boundaries, and cc_pixel_num is the number of character points contained in the connected domain. Connected domains that do not meet the size requirement are deleted, which eliminates the influence of noise points and non-character connected domains. Two connected domains whose overlapping region accounts for a large proportion are merged into one connected domain. The connected domains that remain after this filtering are regarded as character connected domains.
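The extraction of connected domains with 8-neighborhood connectivity can be illustrated with a breadth-first flood fill. This is a sketch under stated assumptions: the function name `connected_components` and the use of a dictionary per connected domain are choices made for the example, while the field names follow the text. Merging of overlapping domains is not shown, because the text gives no overlap threshold.

```python
from collections import deque


def connected_components(binary):
    """Label 8-connected groups of 1-pixels and record their bounding boxes."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r0 in range(rows):
        for c0 in range(cols):
            if binary[r0][c0] != 1 or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            left = right = c0
            up = down = r0
            count = 0
            while queue:
                r, c = queue.popleft()
                count += 1
                left, right = min(left, c), max(right, c)
                up, down = min(up, r), max(down, r)
                # Visit all 8 neighbors (the center offset is already seen).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            components.append({
                "cc_right_col": right, "cc_left_col": left,
                "cc_up_row": up, "cc_down_row": down,
                "cc_pixel_num": count,
            })
    return components
```

Because diagonal neighbors count as connected, two pixels that touch only at a corner end up in the same connected domain.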
After the character connected domains are obtained, they are traversed, and at least one of the following operations is performed for each character connected domain to determine the starting row and ending row of the text line:
If the region determined by text_up_row and text_down_row of the character in the character connected domain does not overlap any currently existing line, a new line is created and its information is set to: line_up_row = text_up_row, line_down_row = text_down_row;
If the region determined by text_up_row and text_down_row of the character overlaps some currently existing line, the character is considered to belong to that line, and the line's information is updated: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);
If the region determined by text_up_row and text_down_row of the character is contained in the region of some line, the character belongs to that line, and the line's coordinate information is not updated;
If the region determined by text_up_row and text_down_row of the character overlaps two lines, the character is assigned to the line with the larger overlapping region, and that line's information is updated;
where line_up_row and line_down_row denote the starting row and ending row of the text line, respectively, and text_up_row and text_down_row denote the starting row coordinate and ending row coordinate of the connected domain's minimum bounding rectangle, respectively.
Each character connected domain carries the information [text_up_row, text_down_row, text_left_col, text_right_col], where text_up_row and text_down_row denote the starting row coordinate and ending row coordinate of the connected domain's minimum bounding rectangle, and text_left_col and text_right_col denote its starting column coordinate and ending column coordinate. The embodiment of the present invention therefore locates text lines from the information of the character connected domains.
Connected domains are filtered according to the following criteria:
(a) cc_down_row - cc_up_row < 5 and cc_right_col - cc_left_col < 5;
(b) cc_down_row - cc_up_row > 50 or cc_right_col - cc_left_col > 50;
(c) cc_pixel_num < 10.
A connected domain that satisfies any one of conditions (a), (b) and (c) is deleted. The remaining connected domains are the character connected domains, each carrying the information [text_up_row, text_down_row, text_left_col, text_right_col]. The character connected domains are then sorted in ascending order of the abscissa of their left boundary (left_col).
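Criteria (a) to (c) and the subsequent sort translate directly into a short filter. The sketch below assumes each connected domain is a dictionary with the cc_* fields named in the text; the function name `filter_and_sort` and the keyword parameters exposing the example thresholds 5, 50 and 10 are hypothetical.

```python
def filter_and_sort(components, min_size=5, max_size=50, min_pixels=10):
    """Drop non-character connected domains, then sort by left boundary."""
    kept = []
    for cc in components:
        height = cc["cc_down_row"] - cc["cc_up_row"]
        width = cc["cc_right_col"] - cc["cc_left_col"]
        too_small = height < min_size and width < min_size    # criterion (a)
        too_large = height > max_size or width > max_size     # criterion (b)
        too_sparse = cc["cc_pixel_num"] < min_pixels          # criterion (c)
        if too_small or too_large or too_sparse:
            continue
        kept.append({
            "text_up_row": cc["cc_up_row"],
            "text_down_row": cc["cc_down_row"],
            "text_left_col": cc["cc_left_col"],
            "text_right_col": cc["cc_right_col"],
        })
    # Ascending order of the left-boundary abscissa.
    kept.sort(key=lambda t: t["text_left_col"])
    return kept
```

Sorting by the left boundary means that characters are visited from left to right across the page when lines are assembled afterwards.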
Text lines are then located from the character connected domains as follows:
The character connected domains are traversed, and the ordinate range [text_up_row, text_down_row] occupied by each character connected domain is compared with the ordinate range [line_up_row, line_down_row] occupied by each existing line. If there is no overlap (as shown in Fig. 2a and Fig. 2e), a new line is created with [line_up_row = text_up_row, line_down_row = text_down_row];
If the ordinate range occupied by the character connected domain overlaps that of some line (as shown in Fig. 2b and Fig. 2d), the current character is assigned to that line and the line's information is updated as:
[line_up_row = min(text_up_row, line_up_row),
line_down_row = max(text_down_row, line_down_row)];
If the ordinate range occupied by the character connected domain not only overlaps that of some line but is entirely contained in it (as shown in Fig. 2c), the current character is assigned to that line and the line's information is not updated;
When the traversal of the character connected domains is complete, the preliminary localization of the text lines is complete.
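The traversal can be sketched as a single pass that keeps a growing list of lines. This is an illustration under assumptions, not the patented implementation: row ranges are treated as inclusive, so two ranges sharing one row count as overlapping, and a character that overlaps several lines goes to the one with the largest overlap, with ties resolved in favor of the earlier line. The name `locate_lines` and the `char_count` field are hypothetical.

```python
def locate_lines(chars):
    """Group character boxes into lines by the overlap of their row ranges."""
    lines = []
    for ch in chars:
        up, down = ch["text_up_row"], ch["text_down_row"]
        best, best_overlap = None, 0
        for line in lines:
            # Number of rows shared by the character and the line
            # (zero or negative when the two ranges are disjoint).
            overlap = (min(down, line["line_down_row"])
                       - max(up, line["line_up_row"]) + 1)
            if overlap > best_overlap:
                best, best_overlap = line, overlap
        if best is None:
            # No overlap with any existing line: create a new one.
            lines.append({"line_up_row": up, "line_down_row": down,
                          "char_count": 1})
        else:
            # min/max leave the line unchanged when the character is
            # fully contained, which covers the containment case as well.
            best["line_up_row"] = min(up, best["line_up_row"])
            best["line_down_row"] = max(down, best["line_down_row"])
            best["char_count"] += 1
    return lines
```

Note that the containment rule needs no separate branch here, since taking the minimum and maximum of a contained range and its enclosing range returns the enclosing range.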
In a preferred embodiment, deleting and sorting the text lines according to their starting row and ending row information and the number of characters they contain, to obtain the final text line information, comprises: if the width of a text line is less than a width threshold, or the number of characters it contains is less than a count threshold, deleting the line; and sorting the text lines that remain after deletion by their starting row number.
Specifically, the text lines can be screened and sorted as follows.
The filter criteria for a text line are: its width is less than the width threshold, or the number of characters it contains is less than the count threshold. For example:
(a) line_down_row - line_up_row < 10;
(b) the number of character connected domains contained is less than 5.
A text line that satisfies either condition (a) or (b) is deleted; the remaining text lines are the ones finally selected.
The screened text lines are sorted in ascending order of line_up_row and output, which completes the localization of text lines in the text image.
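The final screening and ordering can be sketched in a few lines. The sketch assumes each line is a dictionary with line_up_row, line_down_row and a character counter; the counter name `char_count`, the function name `finalize_lines` and the keyword parameters carrying the example thresholds 10 and 5 are hypothetical.

```python
def finalize_lines(lines, min_extent=10, min_chars=5):
    """Delete lines matching criterion (a) or (b), then sort by starting row."""
    kept = [
        ln for ln in lines
        # Criterion (a): row extent below threshold -> delete.
        if ln["line_down_row"] - ln["line_up_row"] >= min_extent
        # Criterion (b): too few character connected domains -> delete.
        and ln["char_count"] >= min_chars
    ]
    return sorted(kept, key=lambda ln: ln["line_up_row"])
```

For instance, a line spanning rows 10 to 15 is dropped under criterion (a) even if it contains many characters, and a line with only 3 characters is dropped under criterion (b) regardless of its extent.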
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.
Claims (5)
1. A method for locating text lines in a text image, characterized in that the method comprises:
Computing a local contrast matrix of the text image from the image grayscale matrix;
Binarizing the local contrast matrix by maximum-entropy segmentation to obtain a binary image;
Extracting, merging and deleting connected domains in the binary image to obtain character connected domains;
Sorting the extracted character connected domains by their left horizontal coordinate;
Locating text lines from the sorted character connected domains to obtain the starting row and ending row information of all text lines;
Deleting and sorting the text lines according to their starting row and ending row information and the number of characters they contain, thereby locating the text lines;
Wherein obtaining the starting row and ending row information of all text lines comprises:
Traversing the character connected domains and, for each character connected domain, performing at least one of the following operations to determine the starting row and ending row of the text line:
If the region determined by text_up_row and text_down_row of the character in the character connected domain does not overlap any line in the binary image, creating a new line and setting its information to: line_up_row = text_up_row, line_down_row = text_down_row;
If the region determined by text_up_row and text_down_row of the character overlaps some line in the binary image, considering the character to belong to that line and updating the line's information: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);
If the region determined by text_up_row and text_down_row of the character is contained in the region of some line, the character belongs to that line, and the line's coordinate information is not updated;
If the region determined by text_up_row and text_down_row of the character overlaps two lines, assigning the character to the line with the larger overlapping region and updating that line's information;
Wherein line_up_row and line_down_row denote the starting row and ending row of the text line, respectively, and text_up_row and text_down_row denote the starting row coordinate and ending row coordinate of the connected domain's minimum bounding rectangle, respectively.
2. The method according to claim 1, characterized in that the method further comprises:
Computing the local contrast value of the text image by the following formula:
Con(i, j) = α·C(i, j) + (1 − α)·(Imax(i, j) − Imin(i, j))
Wherein Imax(i, j) and Imin(i, j) denote the maximum and minimum gray values, respectively, in the local neighborhood centered on (i, j); α ∈ (0, 1) is an adjustable parameter; ε is a very small quantity used to prevent the denominator from being 0; and Con(i, j) denotes the local contrast value at (i, j).
3. The method according to claim 2, characterized in that the method further comprises:
Selecting a threshold;
Dividing the local contrast values into two parts according to the threshold;
Counting the probability distribution histogram of each of the two parts;
Computing the sum of the entropies of the two parts from their probability distribution histograms;
Determining the optimal threshold from the sum of the entropies of the two parts, using the maximum entropy principle.
4. The method according to claim 1, characterized in that extracting, merging and deleting connected domains in the binary image to obtain character connected domains specifically comprises:
Performing connected-domain analysis on the binary image to extract the connected domains of the text image;
Deleting connected domains that do not meet a predetermined size requirement;
Merging two connected domains whose overlapping region accounts for a large proportion into one connected domain.
5. The method according to claim 1, characterized in that deleting and sorting the text lines according to their starting row and ending row information and the number of characters they contain, to obtain the final text line information, specifically comprises:
If the width of a text line is less than a width threshold, or the number of characters it contains is less than a count threshold, deleting the line;
Sorting the text lines that remain after deletion by their starting row number.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610178271.5A CN105844275B (en) | 2016-03-25 | 2016-03-25 | Method for locating text lines in a text image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105844275A CN105844275A (en) | 2016-08-10 |
CN105844275B true CN105844275B (en) | 2019-08-23 |
Family
ID=56583525
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610178271.5A Expired - Fee Related CN105844275B (en) | Method for locating text lines in a text image | 2016-03-25 | 2016-03-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105844275B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107845068B (en) * | 2016-09-18 | 2021-05-11 | 富士通株式会社 | Image viewing angle conversion device and method |
CN106503732B (en) * | 2016-10-13 | 2019-07-19 | 北京云江科技有限公司 | The classification method and categorizing system of text image and non-textual image |
CN107977593A (en) * | 2016-10-21 | 2018-05-01 | 富士通株式会社 | Image processing apparatus and image processing method |
CN108133208B (en) * | 2016-12-01 | 2021-04-09 | 北京新唐思创教育科技有限公司 | A method and device for character segmentation in layout analysis |
CN107798321B (en) * | 2017-12-04 | 2021-03-02 | 海南云江科技有限公司 | Test paper analysis method and computing device |
CN111783780B (en) * | 2019-11-18 | 2024-03-05 | 北京沃东天骏信息技术有限公司 | Image processing method, device and computer readable storage medium |
CN113936137B (en) * | 2020-07-10 | 2025-04-08 | 中国人寿资产管理有限公司 | A method, system and storage medium for removing overlapping of image-type text line detection areas |
CN113642550B (en) * | 2021-07-20 | 2024-03-12 | 南京红松信息技术有限公司 | Entropy maximization card-coating identification method based on pixel probability distribution statistics |
CN116824594B (en) * | 2023-07-10 | 2024-04-26 | 广东西克智能科技有限公司 | Text ordering method for positioning keywords in image |
CN117727059B (en) * | 2024-02-18 | 2024-05-03 | 蓝色火焰科技成都有限公司 | Method and device for checking automobile financial invoice information, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103077388A (en) * | 2012-10-31 | 2013-05-01 | 浙江大学 | Rapid text scanning method oriented to portable computing equipment |
CN105184292A (en) * | 2015-08-26 | 2015-12-23 | 北京云江科技有限公司 | Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8831381B2 (en) * | 2012-01-26 | 2014-09-09 | Qualcomm Incorporated | Detecting and correcting skew in regions of text in natural images |
Also Published As
Publication number | Publication date |
---|---|
CN105844275A (en) | 2016-08-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190823 |