CN105844275B - The localization method of line of text in text image - Google Patents

The localization method of line of text in text image

Info

Publication number
CN105844275B
Authority
CN
China
Prior art keywords
text
row
line
connected domain
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610178271.5A
Other languages
Chinese (zh)
Other versions
CN105844275A (en)
Inventor
刘辉
石胜坤
陈李江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yun Jiang Science And Technology Ltd
Original Assignee
Beijing Yun Jiang Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yun Jiang Science And Technology Ltd
Priority to CN201610178271.5A
Publication of CN105844275A
Application granted
Publication of CN105844275B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The invention discloses a method for locating text lines in a text image. The method comprises: computing a local contrast matrix of the text image from the image grayscale matrix; binarizing the local contrast matrix by maximum entropy segmentation to obtain a binary image; extracting, merging and deleting connected components (connected domains) in the binary image to obtain character connected components; sorting the extracted character connected components by the horizontal coordinate of their left edge; locating text lines from the sorted character connected components to obtain the start row and end row of every text line; and deleting and sorting the text lines according to their start and end rows and the number of characters they contain, thereby locating the text lines. The invention locates text lines effectively in text images captured under difficult conditions such as uneven illumination and low contrast.

Description

The localization method of line of text in text image
Technical field
The present invention relates to optical character recognition within the field of image processing, and in particular to a method for locating text lines in a text image.
Background
OCR (Optical Character Recognition) of text images is an important branch of image processing with a wide range of applications. The basic principle of OCR is to analyze the morphological features of text with pattern recognition algorithms, determine the standard code of each Chinese character, and store the result in a text file. Extracting and recognizing the text in a text image is of great significance for analyzing and understanding the image content, and locating the text lines in the text image is an important part of this task.
In view of this, the present invention is proposed.
Summary of the invention
The present invention provides a method for locating text lines in a text image, to solve the technical problem of how to locate the text lines in a text image effectively.
To achieve the above object, the following technical solution is provided:
A method for locating text lines in a text image, characterized in that the method comprises:
computing a local contrast matrix of the text image from the image grayscale matrix;
binarizing the local contrast matrix by maximum entropy segmentation to obtain a binary image;
extracting, merging and deleting connected components in the binary image to obtain character connected components;
sorting the extracted character connected components by the horizontal coordinate of their left edge;
locating text lines from the sorted character connected components to obtain the start row and end row information of all text lines;
deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines.
By adopting the above technical solution, which combines text image binarization with connected component analysis, the present invention locates text lines effectively in text images under difficult conditions such as uneven illumination and low contrast. The method of the invention is a fast, parameter-free algorithm that can be used for batch OCR of large collections of text images as well as in online real-time OCR systems.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for locating text lines in a text image according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of locating a text line based on a character connected component according to one embodiment of the present invention;
Fig. 2b is a schematic diagram of locating a text line based on a character connected component according to another embodiment of the present invention;
Fig. 2c is a schematic diagram of locating a text line based on a character connected component according to yet another embodiment of the present invention;
Fig. 2d is a schematic diagram of locating a text line based on a character connected component according to a further embodiment of the present invention;
Fig. 2e is a schematic diagram of locating a text line based on a character connected component according to a further embodiment of the present invention.
Detailed description of the embodiments
The method for locating text lines in a text image provided by the embodiments of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the method for locating text lines in a text image provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises steps S101 to S106.
Step S101: compute the local contrast matrix of the text image from the image grayscale matrix.
Step S102: binarize the local contrast matrix by maximum entropy segmentation to obtain a binary image.
Step S103: extract, merge and delete connected components in the binary image to obtain character connected components.
Step S104: sort the extracted character connected components by the horizontal coordinate of their left edge.
Step S105: locate text lines from the sorted character connected components to obtain the start row and end row information of all text lines.
Step S106: delete and sort the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines.
The embodiment of the present invention binarizes the text image and then uses connected component analysis to extract, merge and delete connected components in the binary image, obtaining character connected components. The start row and end row information of all text lines is then obtained from the character connected components. Finally, the text lines are deleted and sorted according to their start row and end row information and the number of characters they contain, thereby locating the text lines. The embodiment thus achieves the technical effect of effectively locating the text lines in a text image.
Preferably, when computing the local contrast matrix of the text image, the local contrast value is calculated by the following formula:
Con(i, j) = α C(i, j) + (1 - α)(Imax(i, j) - Imin(i, j))
where Imax(i, j) and Imin(i, j) denote the maximum and minimum gray values in the local neighborhood centered on (i, j), α ∈ (0, 1) is an adjustable parameter, ε is a small positive quantity that prevents a denominator from being 0 (the formula in which it appears is not reproduced in this text), and Con(i, j) denotes the local contrast value at (i, j).
In practice, the width of the local filter window must be chosen; for example, it can be set to 3.
In the local contrast formula, α is computed with a power function:
pow(x, y) is the power function, where x is the base (its expression is not reproduced in this text) and y (that is, γ) is the exponent.
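The local contrast computation can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the definition of C(i, j) is not reproduced in the text above, so the sketch assumes the normalized form C = (Imax - Imin) / (Imax + Imin + ε), and it takes α as a plain argument instead of deriving it from the power function. The function name `local_contrast` and its default values are hypothetical.

```python
def local_contrast(gray, alpha=0.5, window=3, eps=1e-6):
    """Con = alpha * C + (1 - alpha) * (Imax - Imin) for every pixel.

    gray is a list of rows of gray values. ASSUMPTION: C(i, j) is taken as
    (Imax - Imin) / (Imax + Imin + eps); the patent text does not show it.
    """
    height, width = len(gray), len(gray[0])
    radius = window // 2
    con = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Neighborhood centered on (i, j), clipped at the image border.
            values = [gray[y][x]
                      for y in range(max(0, i - radius), min(height, i + radius + 1))
                      for x in range(max(0, j - radius), min(width, j + radius + 1))]
            i_max, i_min = max(values), min(values)
            c = (i_max - i_min) / (i_max + i_min + eps)  # assumed form of C(i, j)
            con[i][j] = alpha * c + (1 - alpha) * (i_max - i_min)
    return con
```

A uniform region yields a contrast of 0 everywhere, while a single bright pixel on a dark background raises the contrast of every pixel whose window contains it.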
In the above embodiment, to binarize the local contrast matrix by maximum entropy, a threshold k is first chosen and the local contrast values are split into two parts by that threshold. The probability density function p(x | ωi), i = 1, 2, of each part is estimated by building the probability distribution histograms p(x | ω1) and p(x | ω2) of the two parts, and the sum of the entropies of the two parts is then computed and used as the objective function,
that is, the sum of H(p | ωi) over i = 1 and 2.
According to the maximum entropy principle, the threshold k at which this sum of H(p | ωi) is largest is taken as the optimal threshold.
In the resulting binary matrix, 1 denotes a character point and 0 denotes a background point.
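The threshold search described above can be sketched with a histogram, as below. This is a hedged illustration, not the patent's exact procedure: the number of bins, the use of the natural logarithm, and the rule that values at or above the threshold become character points (1) are all assumptions, and the names `max_entropy_threshold` and `binarize` are hypothetical.

```python
import math

def max_entropy_threshold(values, bins=256):
    """Return the threshold that maximizes H(low part) + H(high part)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int((v - lo) / span * bins))] += 1

    def entropy(counts):
        # Shannon entropy of the histogram normalized within its own part.
        total = sum(counts)
        if total == 0:
            return 0.0
        return -sum(c / total * math.log(c / total) for c in counts if c)

    best_k, best_h = 1, float("-inf")
    for k in range(1, bins):
        h = entropy(hist[:k]) + entropy(hist[k:])
        if h > best_h:
            best_k, best_h = k, h
    return lo + best_k * span / bins

def binarize(matrix, threshold):
    # ASSUMPTION: high local contrast marks character points (1).
    return [[1 if v >= threshold else 0 for v in row] for row in matrix]
```

The search is quadratic in the number of bins, which is acceptable for a few hundred bins; a production version would likely use cumulative sums instead.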
Extracting, merging and deleting connected components in the binary image to obtain character connected components may specifically include: performing connected component analysis on the binary image to extract the connected components of the text image; deleting the connected components that do not meet a predetermined size requirement; and merging two connected components whose overlapping region is proportionally large into one connected component.
Specifically, connected component analysis is performed on the character points of the binary matrix, for example using the 8-neighborhood of each pixel to define connectivity. For each connected component in the binary matrix, the information [cc_right_col, cc_left_col, cc_up_row, cc_down_row, cc_pixel_num] is recorded, where cc_left_col and cc_right_col are the abscissas of the left and right boundaries of the component's minimum bounding rectangle, cc_up_row and cc_down_row are the ordinates of its upper and lower boundaries, and cc_pixel_num is the number of character points in the component. Components that do not meet the size requirement are deleted, which removes the influence of noise points and non-character components; two components whose overlapping region is proportionally large are merged into one. The components that remain after this filtering are regarded as character connected components.
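A minimal sketch of the extraction step, assuming a breadth-first flood fill over the 8-neighborhood. The field names follow the text; the function name `connected_components` and the dictionary representation are hypothetical, and the merging of overlapping components is not covered here.

```python
from collections import deque

def connected_components(binary):
    """Label 8-connected groups of 1-pixels and record each bounding box."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for row in range(height):
        for col in range(width):
            if binary[row][col] != 1 or seen[row][col]:
                continue
            seen[row][col] = True
            queue = deque([(row, col)])
            cc = {"cc_right_col": col, "cc_left_col": col,
                  "cc_up_row": row, "cc_down_row": row, "cc_pixel_num": 0}
            while queue:
                y, x = queue.popleft()
                cc["cc_pixel_num"] += 1
                cc["cc_left_col"] = min(cc["cc_left_col"], x)
                cc["cc_right_col"] = max(cc["cc_right_col"], x)
                cc["cc_up_row"] = min(cc["cc_up_row"], y)
                cc["cc_down_row"] = max(cc["cc_down_row"], y)
                # Visit all 8 neighbors; the (0, 0) offset is already marked seen.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(cc)
    return components
```

Because diagonal neighbors count as connected, two pixels that touch only at a corner end up in the same component.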
After the character connected components are obtained, they are traversed, and for each character connected component at least one of the following operations is performed to determine the start row and end row of each text line:
If the region determined by text_up_row and text_down_row of the character in the character connected component does not overlap any existing line, a new line is created and its information is set to: line_up_row = text_up_row, line_down_row = text_down_row;
If the region determined by text_up_row and text_down_row of the character overlaps an existing line, the character is considered to belong to that line, and the line's information is updated: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);
If the region determined by text_up_row and text_down_row of the character is contained in the region of a line, the character belongs to that line and the line's coordinate information is not updated;
If the region determined by text_up_row and text_down_row of the character overlaps two lines, the character is assigned to the line with the larger overlap, and that line's information is updated;
where line_up_row and line_down_row denote the start row and end row of a text line, and text_up_row and text_down_row denote the start row coordinate and end row coordinate of the connected component's minimum bounding rectangle.
Each character connected component carries the information [text_up_row, text_down_row, text_left_col, text_right_col], where text_up_row and text_down_row are the start and end row coordinates of the component's minimum bounding rectangle, and text_left_col and text_right_col are its start and end column coordinates. The embodiment of the present invention therefore locates text lines from the information of the character connected components.
The connected components are filtered according to the following criteria:
(a) cc_down_row - cc_up_row < 5 and cc_right_col - cc_left_col < 5;
(b) cc_down_row - cc_up_row > 50 or cc_right_col - cc_left_col > 50;
(c) cc_pixel_num < 10.
A connected component that satisfies any one of conditions (a), (b) or (c) is deleted; the remaining components are the character connected components, each carrying the information [text_up_row, text_down_row, text_left_col, text_right_col]. The character connected components are then sorted in ascending order of the abscissa of their left boundary (left_col).
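Criteria (a) to (c) and the left-to-right ordering can be sketched as below. The thresholds 5, 50 and 10 come from the text; the function name `filter_and_sort` and the dictionary layout are hypothetical.

```python
def filter_and_sort(components):
    """Drop components matching (a), (b) or (c), then sort by left boundary."""
    kept = []
    for cc in components:
        height = cc["cc_down_row"] - cc["cc_up_row"]
        width = cc["cc_right_col"] - cc["cc_left_col"]
        too_small = height < 5 and width < 5    # criterion (a)
        too_large = height > 50 or width > 50   # criterion (b)
        too_sparse = cc["cc_pixel_num"] < 10    # criterion (c)
        if too_small or too_large or too_sparse:
            continue
        kept.append({"text_up_row": cc["cc_up_row"],
                     "text_down_row": cc["cc_down_row"],
                     "text_left_col": cc["cc_left_col"],
                     "text_right_col": cc["cc_right_col"]})
    # Ascending order of the left boundary abscissa.
    kept.sort(key=lambda t: t["text_left_col"])
    return kept
```

Note that (a) requires both dimensions to be small, so a thin but long stroke survives it, whereas (b) rejects a component as soon as either dimension is too large.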
Then the text lines are located from the character connected components, as follows:
The character connected components are traversed, and the ordinate range occupied by each one, [text_up_row, text_down_row], is compared with the ordinate range occupied by each line, [line_up_row, line_down_row]. If they do not overlap (as shown in Fig. 2a and Fig. 2e), a new line is created with [line_up_row = text_up_row, line_down_row = text_down_row];
If the ordinate range occupied by the character connected component overlaps the ordinate range occupied by some line (as shown in Fig. 2b and Fig. 2d), the current character is assigned to that line and the line's information is updated as follows:
[line_up_row = min(text_up_row, line_up_row),
line_down_row = max(text_down_row, line_down_row)];
If the ordinate range occupied by the character connected component not only overlaps the ordinate range occupied by some line but is also contained in it (as shown in Fig. 2c), the current character is assigned to that line and the line information is not updated;
When the traversal of the character connected components is complete, the preliminary location of the text lines is complete.
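The traversal above can be sketched as follows. It is a minimal illustration under assumptions: row ranges are treated as inclusive, overlap is measured in rows, and when a character overlaps more than one line it goes to the line with the largest overlap, as the description states for the two-line case. The containment case needs no special branch because min/max leave the line unchanged. The function name `assign_to_lines` and the `chars` list stored on each line are hypothetical.

```python
def assign_to_lines(chars):
    """Group character boxes (already sorted left to right) into text lines."""
    lines = []
    for ch in chars:
        up, down = ch["text_up_row"], ch["text_down_row"]
        best, best_overlap = None, 0
        for line in lines:
            # Number of rows shared by the character and the line (<= 0: none).
            overlap = (min(down, line["line_down_row"])
                       - max(up, line["line_up_row"]) + 1)
            if overlap > best_overlap:
                best, best_overlap = line, overlap
        if best is None:
            # No overlap with any existing line: create a new line.
            lines.append({"line_up_row": up, "line_down_row": down, "chars": [ch]})
        else:
            # Overlap (or containment): extend the line if the character sticks out.
            best["line_up_row"] = min(up, best["line_up_row"])
            best["line_down_row"] = max(down, best["line_down_row"])
            best["chars"].append(ch)
    return lines
```

Extending a line can make it overlap a neighboring line; the sketch does not merge lines afterwards, since the text does not describe such a step.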
In a preferred embodiment, deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, to obtain the final text line information, includes: deleting a text line if its width (the extent line_down_row - line_up_row) is less than a width threshold or the number of characters it contains is less than a count threshold; and sorting the text lines that remain after deletion by their start row number.
Specifically, the text lines can be screened and sorted in the following way.
The filter criterion for text lines is: the width of the text line is less than the width threshold, or the number of characters it contains is less than the count threshold. For example:
(a) line_down_row - line_up_row < 10
(b) the number of character connected components it contains is less than 5
A text line that satisfies either condition (a) or (b) is deleted; the remaining text lines are the finally selected text lines;
The text lines that remain after screening are sorted in ascending order of line_up_row and output, which completes the location of the text lines in the text image.
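The final screening and ordering can be sketched in a few lines. The example thresholds 10 and 5 come from the text; the function name `finalize_lines` and the `char_count` field are hypothetical.

```python
def finalize_lines(lines, min_extent=10, min_chars=5):
    """Delete lines matching (a) or (b), then sort by start row."""
    kept = [line for line in lines
            if line["line_down_row"] - line["line_up_row"] >= min_extent  # not (a)
            and line["char_count"] >= min_chars]                          # not (b)
    return sorted(kept, key=lambda line: line["line_up_row"])
```

For example, a line spanning rows 10 to 15 is dropped by (a) however many characters it holds, and a line with only 3 characters is dropped by (b) however tall it is.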
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.

Claims (5)

1. A method for locating text lines in a text image, characterized in that the method comprises:
computing a local contrast matrix of the text image from the image grayscale matrix;
binarizing the local contrast matrix by maximum entropy segmentation to obtain a binary image;
extracting, merging and deleting connected components in the binary image to obtain character connected components;
sorting the extracted character connected components by the horizontal coordinate of their left edge;
locating text lines from the sorted character connected components to obtain the start row and end row information of all text lines;
deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, thereby locating the text lines;
wherein obtaining the start row and end row information of all text lines comprises:
traversing the character connected components and performing, for each character connected component, at least one of the following operations to determine the start row and end row of the text line:
if the region determined by text_up_row and text_down_row of the character in the character connected component does not overlap any line in the binary image, creating a new line and setting its information to: line_up_row = text_up_row, line_down_row = text_down_row;
if the region determined by text_up_row and text_down_row of the character overlaps a line in the binary image, considering the character to belong to that line and updating the line's information: line_up_row = min(text_up_row, line_up_row), line_down_row = max(text_down_row, line_down_row);
if the region determined by text_up_row and text_down_row of the character is contained in the region of a line, the character belongs to that line and the line's coordinate information is not updated;
if the region determined by text_up_row and text_down_row of the character overlaps two lines, assigning the character to the line with the larger overlap and updating that line's information;
wherein line_up_row and line_down_row denote the start row and end row of the text line, and text_up_row and text_down_row denote the start row coordinate and end row coordinate of the connected component's minimum bounding rectangle.
2. The method according to claim 1, characterized in that the method further comprises:
calculating the local contrast value of the text image by the following formula:
Con(i, j) = α C(i, j) + (1 - α)(Imax(i, j) - Imin(i, j))
wherein Imax(i, j) and Imin(i, j) denote the maximum and minimum gray values in the local neighborhood centered on (i, j), α ∈ (0, 1) is an adjustable parameter, ε is a small positive quantity that prevents a denominator from being 0, and Con(i, j) denotes the local contrast value at (i, j).
3. The method according to claim 2, characterized in that the method further comprises:
selecting a threshold;
splitting the local contrast values into two parts according to the threshold;
building the probability distribution histogram of each of the two parts;
calculating the sum of the entropies of the two parts from their probability distribution histograms;
determining the optimal threshold from the sum of the entropies of the two parts using the maximum entropy principle.
4. The method according to claim 1, characterized in that extracting, merging and deleting connected components in the binary image to obtain character connected components specifically comprises:
performing connected component analysis on the binary image to extract the connected components of the text image;
deleting the connected components that do not meet a predetermined size requirement;
merging two connected components whose overlapping region is proportionally large into one connected component.
5. The method according to claim 1, characterized in that deleting and sorting the text lines according to their start row and end row information and the number of characters they contain, to obtain the final text line information, specifically comprises:
deleting a text line if its width is less than a width threshold or the number of characters it contains is less than a count threshold;
sorting the text lines that remain after deletion by their start row number.
CN201610178271.5A 2016-03-25 2016-03-25 The localization method of line of text in text image Expired - Fee Related CN105844275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610178271.5A CN105844275B (en) 2016-03-25 2016-03-25 The localization method of line of text in text image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610178271.5A CN105844275B (en) 2016-03-25 2016-03-25 The localization method of line of text in text image

Publications (2)

Publication Number Publication Date
CN105844275A CN105844275A (en) 2016-08-10
CN105844275B true CN105844275B (en) 2019-08-23

Family

ID=56583525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610178271.5A Expired - Fee Related CN105844275B (en) 2016-03-25 2016-03-25 The localization method of line of text in text image

Country Status (1)

Country Link
CN (1) CN105844275B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107845068B (en) * 2016-09-18 2021-05-11 富士通株式会社 Image viewing angle conversion device and method
CN106503732B (en) * 2016-10-13 2019-07-19 北京云江科技有限公司 The classification method and categorizing system of text image and non-textual image
CN107977593A (en) * 2016-10-21 2018-05-01 富士通株式会社 Image processing apparatus and image processing method
CN108133208B (en) * 2016-12-01 2021-04-09 北京新唐思创教育科技有限公司 A method and device for character segmentation in layout analysis
CN107798321B (en) * 2017-12-04 2021-03-02 海南云江科技有限公司 Test paper analysis method and computing device
CN111783780B (en) * 2019-11-18 2024-03-05 北京沃东天骏信息技术有限公司 Image processing method, device and computer readable storage medium
CN113936137B (en) * 2020-07-10 2025-04-08 中国人寿资产管理有限公司 A method, system and storage medium for removing overlapping of image-type text line detection areas
CN113642550B (en) * 2021-07-20 2024-03-12 南京红松信息技术有限公司 Entropy maximization card-coating identification method based on pixel probability distribution statistics
CN116824594B (en) * 2023-07-10 2024-04-26 广东西克智能科技有限公司 Text ordering method for positioning keywords in image
CN117727059B (en) * 2024-02-18 2024-05-03 蓝色火焰科技成都有限公司 Method and device for checking automobile financial invoice information, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077388A (en) * 2012-10-31 2013-05-01 浙江大学 Rapid text scanning method oriented to portable computing equipment
CN105184292A (en) * 2015-08-26 2015-12-23 北京云江科技有限公司 Method for analyzing and recognizing structure of handwritten mathematical formula in natural scene image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8831381B2 (en) * 2012-01-26 2014-09-09 Qualcomm Incorporated Detecting and correcting skew in regions of text in natural images

Also Published As

Publication number Publication date
CN105844275A (en) 2016-08-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190823