CN1093965C - On-line character recognition method and apparatus thereof - Google Patents
- Publication number
- CN1093965C CN96121613A
- Authority
- CN
- China
- Prior art keywords
- mentioned
- stroke
- strokes
- font
- characters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/018—Input/output arrangements for oriental characters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04883—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Character Discrimination (AREA)
Abstract
The present invention comprises: an input unit for inputting writing information of a character; a character-shape recognition unit that selects, from the standard characters stored in a shape dictionary, a plurality of candidate characters with high similarity according to the overall shape features of the written character; a stroke-order correspondence recognition unit that performs, on the candidate characters selected by the character-shape recognition unit, stroke correspondence that takes stroke order into account, based on the features of each stroke of the written character; a stroke-count correspondence recognition unit that performs, on the candidate characters selected by the character-shape recognition unit, stroke correspondence that takes stroke count into account, based on the features of each stroke of the written character; and a display unit that displays the recognition result of the stroke-order correspondence recognition unit or of the stroke-count correspondence recognition unit.
Description
The present invention relates to an online character recognition method and apparatus that recognize handwritten characters with high accuracy.
Conventional online character recognition devices include devices that recognize characters while absorbing variation in stroke order and devices that recognize characters while absorbing variation in stroke count. Each type is described below.
Prior art example 1:
Among existing online character recognition devices that absorb stroke-order variation, one approach computes the distance for every pairing between all strokes of a standard pattern in the dictionary and all strokes of the input pattern, and then associates each stroke of the standard pattern with the input stroke at the smallest distance, thereby handling variation in the stroke order of the input pattern (Japanese Patent Application Sho 54-061146). Another approach stores information such as stroke-order change methods in the standard pattern in advance and uses the stored content to restrict the correspondence between input-pattern strokes and standard-pattern strokes, reducing the processing load of the correspondence (Japanese Patent Publication Sho 57-178579).
For example, FIG. 26 is a block diagram showing the configuration of the conventional online character recognition device that absorbs stroke-order variation, as disclosed in Japanese Patent Publication Sho 57-178579.
In FIG. 26, 1 is a pattern matching device that calculates the distance between the strokes of the input pattern and the strokes of a standard pattern; 2 is a standard pattern storage device that stores in advance the stroke order of the standard pattern of each character; 3 is a stroke-order combination storage device that stores stroke-order combination methods and the like; 4 is a standard pattern re-forming device that changes the stroke order of the standard patterns stored in the standard pattern storage device 2 according to the information in the stroke-order combination storage device 3; and 5 is a recognition device that determines the recognition result from the distances output by the pattern matching device 1.
The operation of the conventional online character recognition device of FIG. 26 is described below.
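First, input pattern information X is input to the pattern matching device 1. For a character of n strokes, X consists of n strokes; if the strokes in writing order are X1, X2, ..., Xn, then X is expressed in a multidimensional vector space as
X = (X1, X2, ..., Xn).
Likewise, let Sk be an n-stroke standard pattern stored in the standard pattern storage device 2 (k denotes the category of the standard pattern), and let its strokes in stroke order be Sk1, Sk2, ..., Skn. The standard pattern Sk is then expressed in the multidimensional vector space as
Sk = (Sk1, Sk2, ..., Skn).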
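Next, the pattern matching device 1 calculates the distance between the standard pattern Sk and the input pattern X. However, the input pattern X is not necessarily written in the correct stroke order, so simply pairing the elements of X with the elements of the standard pattern in sequence cannot correctly recognize a character written in a stroke order different from that of the standard pattern.
To handle this, the standard pattern re-forming device 4 permutes the stroke order according to the stroke-order combination information stored in the stroke-order combination storage device 3 to form standard patterns Skm (m = 1, 2, ..., p), and sends them in sequence to the pattern matching device 1. Here, p is the total number of stroke-order variations of the standard pattern.
Next, for each standard pattern sent in sequence from the standard pattern re-forming device 4, the pattern matching device 1 obtains its distance Pxkm from the input pattern X using the following formula.
Pxkm = |X1 - Skm1| + |X2 - Skm2| + ... + |Xn - Skmn|
The recognition device 5 then finds the smallest of the p distances sent from the pattern matching device 1 and takes it as the distance Pxk between the input pattern X and the standard pattern Sk. This processing is performed for all n-stroke standard patterns among the categories stored in the standard pattern storage device 2, and the category with the smallest distance is finally output as the recognition result.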
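The procedure of prior art example 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: each stroke is assumed to be a small tuple of numeric features, |Xi - Skmi| is assumed to be a sum of absolute component differences, and when no stored list of stroke orders is supplied the sketch simply tries every permutation. All function and variable names are hypothetical.

```python
from itertools import permutations

def stroke_distance(a, b):
    # |Xi - Skmi|, assumed here to be an L1 distance between feature tuples.
    return sum(abs(p - q) for p, q in zip(a, b))

def pattern_distance(x, s):
    # Pxkm: sum of stroke distances with strokes paired in sequence.
    return sum(stroke_distance(xi, si) for xi, si in zip(x, s))

def distance_over_orders(x, sk, orders=None):
    # Pxk: smallest Pxkm over the reordered standard patterns Skm.
    # orders=None means "try all permutations" (an assumption of this sketch).
    if orders is None:
        orders = permutations(range(len(sk)))
    return min(pattern_distance(x, [sk[i] for i in order]) for order in orders)

def recognize(x, dictionary):
    # Compare only against standard patterns with the same stroke count n.
    same_n = {k: sk for k, sk in dictionary.items() if len(sk) == len(x)}
    return min(same_n, key=lambda k: distance_over_orders(x, same_n[k]))

# Toy dictionary with two 2-stroke categories (illustrative data only).
dictionary = {
    "A": [(0.0, 1.0), (1.0, 0.0)],
    "B": [(1.0, 1.0), (-1.0, 1.0)],
}
x = [(1.0, 0.1), (0.0, 0.9)]  # category "A" written in reversed stroke order
result = recognize(x, dictionary)
```

With all permutations the cost grows as n! per standard pattern, which is the computational burden the text goes on to criticize.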
Prior art example 2:
Among existing online character recognition devices that absorb stroke-count variation, one approach performs DP matching between multiple feature points of the strokes of the input pattern and the stroke feature points of the standard pattern, associating feature points so that the distance is minimized (Patent Publication Sho 62-62394). Other approaches describe stroke connection methods in the standard pattern in order to improve recognition accuracy for patterns whose stroke count varies (Patent Publication Sho 60-79485, Patent Publication Sho 60-136783).
For example, FIG. 28 is a block diagram of the conventional online character recognition device that absorbs stroke-count variation, described in Patent Publication Sho 62-62394. FIG. 27 illustrates the recognition procedure of that device.
FIG. 27(a) shows the standard pattern and FIG. 27(b) shows the input pattern. FIGS. 27(c) and 27(d) illustrate the correspondence between the strokes of the standard pattern and the strokes of the input pattern.
In FIG. 27, 30 to 39 denote the strokes of the input pattern and 10 to 21 denote the strokes of the standard pattern.
In FIG. 28, 40 is a stroke-count detection unit, 41 is a dictionary address generation unit, 42 is a dictionary, 43 is a feature extraction unit, 44 is a DP matching unit, 46 is a determination unit, and 47 is a range specification unit.
The operation of the conventional online character recognition device that absorbs stroke-count variation is described below with reference to FIGS. 27 and 28.
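First, the input pattern is supplied to the stroke-count detection unit 40 and the feature extraction unit 43, which extract the stroke count and the feature points of each stroke. For an input pattern of n strokes, the stroke-count detection unit 40 outputs to the dictionary 42 the range of stroke counts to be compared, (n + α) to (n - β). For the character 「系」 shown in FIG. 27(b), n = 5, so with α = 2 and β = 1 the range is 7 strokes to 4 strokes.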
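The stroke-count window can be illustrated with a short sketch. It assumes a hypothetical dictionary that maps each character to its standard stroke count; the function names and the sample entries are for illustration only.

```python
def stroke_count_range(n, alpha, beta):
    # Stroke counts (n - beta) .. (n + alpha), inclusive, never below 1.
    return range(max(1, n - beta), n + alpha + 1)

def candidates_by_stroke_count(stroke_counts, n, alpha=2, beta=1):
    # Keep only standard characters whose stroke count falls in the window.
    wanted = stroke_count_range(n, alpha, beta)
    return [ch for ch, count in stroke_counts.items() if count in wanted]

# Illustrative entries: character -> standard stroke count.
stroke_counts = {"一": 1, "小": 3, "木": 4, "糸": 6, "系": 7, "紙": 10}
window = list(stroke_count_range(5, 2, 1))
shortlist = candidates_by_stroke_count(stroke_counts, n=5)
```

For a 5-stroke input with α = 2 and β = 1, the window covers 4 to 7 strokes, so the 7-stroke standard pattern of 「系」 remains a candidate even though the input was written with fewer strokes.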
The input pattern of FIG. 27(b) has 5 strokes, giving 10 feature points, whereas the standard pattern has 6 strokes, giving 12 feature points. If the feature points are simply paired in order from the beginning, some feature points are left unpaired, as in FIG. 27(c); because the strokes are not correctly associated, the resulting distance becomes large.
This prior art example therefore uses DP (dynamic programming) matching to associate patterns with different stroke counts so that the distance is minimized. The DP matching is carried out as follows.
The feature points of the input pattern are written as
A = a1, a2, ..., ai, ..., aI
and the feature points of a standard pattern in the dictionary are written as
B = b1, b2, ..., bj, ..., bJ
The distance d(i, j) between ai and bj is defined as d(i, j) = |ai - bj|, and g(i, j) is then obtained as follows.
(1) When i = 1 and j = 1:
g(1, 1) = d(1, 1)
(2) When i = 1 and j ≠ 1:
g(1, j) = g(1, j-1) + d(1, j)
(3) When i ≠ 1 and j = 1:
g(i, 1) = g(i-1, 1) + d(i, 1)
(4) In all other cases: [formula missing from this text]
With (1) as the initial condition, g(I, J) is obtained from the recurrence above, and the distance S(A, B) between the feature point sequence of A and the feature point sequence of B is obtained from the following formula. The standard pattern with the smallest distance is taken as the recognition result.
S(A, B) = g(I, J) / (I + J)
The DP matching unit 44 reads the standard patterns in the above stroke-count range one by one from the dictionary 42, obtains the distance between the feature points of the input pattern and those of each standard pattern by the DP matching method above, and sends it to the determination unit 46. The determination unit 46 outputs the character with the smallest of these distances as the recognition result.
In this example, even for an input pattern such as that of FIG. 27(b), the feature points are correctly associated as in FIG. 27(d), giving a smaller distance than the correspondence of FIG. 27(c).
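The DP matching of prior art example 2 can be sketched as below. Two points are assumptions of this sketch rather than statements from the text: feature points are taken to be (x, y) pairs with |ai - bj| computed as Euclidean distance, and the general-case recurrence, which the text does not reproduce, is taken to be the usual three-way minimum used in DP matching. Both sequences are assumed to be non-empty.

```python
import math

def point_distance(a, b):
    # d(i, j) = |ai - bj|, assumed Euclidean for 2-D feature points.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def dp_distance(A, B):
    """Return S(A, B) = g(I, J) / (I + J) for two non-empty point sequences."""
    I, J = len(A), len(B)
    g = [[0.0] * J for _ in range(I)]
    for i in range(I):
        for j in range(J):
            d = point_distance(A[i], B[j])
            if i == 0 and j == 0:
                g[i][j] = d                      # case (1)
            elif i == 0:
                g[i][j] = g[i][j - 1] + d        # case (2)
            elif j == 0:
                g[i][j] = g[i - 1][j] + d        # case (3)
            else:
                # Case (4): ASSUMED standard recurrence, not given in the text.
                g[i][j] = d + min(g[i - 1][j], g[i - 1][j - 1], g[i][j - 1])
    return g[I - 1][J - 1] / (I + J)

# Three input feature points matched against two standard feature points.
A = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
B = [(0.0, 0.0), (2.0, 0.0)]
s_ab = dp_distance(A, B)
s_aa = dp_distance(A, A)
```

Each call fills an I × J table, which is the n × m cost per standard pattern that the following discussion counts up.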
As described above, an online character recognition device such as prior art example 1, which absorbs stroke-order variation by calculating distances for all combinations of the standard pattern and the input pattern, must compute a distance for every combination; the amount of computation is large, and real-time recognition is difficult.
In addition, an online character recognition device such as prior art example 1, which describes stroke-order correspondence methods in the standard pattern, can handle only the stroke-order variations prepared in advance in the standard pattern. Many stroke orders must be registered to improve correspondence accuracy, which enlarges the dictionary.
Furthermore, the device of prior art example 1 can handle stroke-order variation but not stroke-count variation, so recognition accuracy drops for characters written with connected strokes.
Similarly, in an online character recognition device such as prior art example 2, which uses DP matching to associate patterns with different stroke counts, obtaining the distance between one n-stroke standard pattern and an m-stroke input pattern requires n × m distance calculations, and this must be done for every standard pattern in the stroke-count range (m + α) to (m - β). Consider an 11-stroke input character, so m = 11; with, for example, α = 2 and β = 1, distances must be calculated for standard patterns of 10 to 13 strokes. Even when limited to JIS level 1 characters, the 289 characters with 10 strokes require 289 × 10 × 11 = 31,790 distance calculations, the 300 characters with 11 strokes require 300 × 11 × 11 = 36,300, the 297 characters with 12 strokes require 297 × 12 × 11 = 39,204, and the 242 characters with 13 strokes require 242 × 13 × 11 = 34,606. In total, 141,900 distance calculations are needed; the amount of computation is very large, and real-time recognition is difficult.
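The total can be checked with a few lines of arithmetic. The sketch below uses only the character counts quoted in the text for 10 to 13 strokes and an 11-stroke input; the variable names are illustrative.

```python
# Number of JIS level 1 characters per stroke count, as quoted in the text.
chars_per_stroke_count = {10: 289, 11: 300, 12: 297, 13: 242}
m = 11  # stroke count of the input character

# One DP comparison of an n-stroke standard pattern with an m-stroke input
# costs n * m distance calculations.
per_count = {n: chars * n * m for n, chars in chars_per_stroke_count.items()}
total = sum(per_count.values())
```

The per-stroke-count products come to 31,790, 36,300, 39,204 and 34,606, which sum to the 141,900 calculations stated above.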
Moreover, an online character recognition device such as prior art example 2, which holds information such as connection methods in the dictionary, needs connection-method information for every stroke of every character, which makes the dictionary difficult to build and inevitably large. And because it cannot handle connected strokes that were not prepared in advance, recognition accuracy drops.
In addition, the character recognition device of prior art example 2 can handle stroke-count variation but not stroke-order variation, so recognition accuracy drops for characters written in a different stroke order.
Furthermore, the character recognition devices of prior art examples 1 and 2 rely heavily on stroke information for recognition, so characters written with connected strokes or in a changed stroke order are difficult to read with high accuracy even when their overall shape is well formed.
Because recognition processing is performed regardless of the quality of the writing, even carefully written characters require the same processing time as messily written ones; and since complex processing is applied to every character in order to raise recognition accuracy, recognition takes a long time.
The object of the present invention is to solve these problems and to provide an online character recognition method and an online character recognition apparatus that achieve both high-accuracy and high-speed recognition.
Specifically, the first object is to perform high-speed, high-accuracy recognition by reducing the amount of computation in stroke-order correspondence recognition.
The second object is to perform high-speed, high-accuracy recognition by reducing the amount of computation in stroke-count correspondence recognition.
The third object is to handle both stroke-order variation and stroke-count variation, in addition to recognizing at high speed, and thereby perform high-accuracy recognition.
The fourth object is to perform accurate recognition even when the stroke order of the written character differs from that of the standard character.
The fifth object is to perform recognition at high speed and with high accuracy by allowing flexible stroke-order variation for characters that are likely to be the correct answer and avoiding excessive stroke-order variation for characters that are unlikely to be the correct answer.
The sixth object is to output the recognition result at high speed when the result of character-shape recognition is good.
The seventh object is to prevent the recognition rate from deteriorating when strokes cannot be associated in actual handwriting.
To achieve the above objects, the online character recognition method of the first invention comprises:
a writing information input step of inputting writing information as a character is written;
a character-shape recognition step of extracting, from the writing information, a feature representing the overall shape of the written character and selecting from a dictionary, on the basis of that feature, a plurality of standard characters similar to the written character;
a stroke-order correspondence recognition step of extracting, from the writing information, a first stroke feature representing features of the strokes constituting the written character, associating the strokes of the written character with the strokes of each standard character selected in the character-shape recognition step on the basis of the first stroke feature while taking the stroke order of the written character into account, and calculating the similarity of each selected standard character to the written character; and
a first output step of outputting a standard character selected in the character-shape recognition step according to the similarity calculated in the stroke-order correspondence recognition step.
Here, the writing information input step corresponds to the step of writing a character on a tablet or the like in the embodiments described later (S1 in FIG. 2). The first stroke feature corresponds to, for example, the direction from the start point to the end point of a stroke and the width and height of the stroke's bounding rectangle in those embodiments. The similarity of the selected standard characters to the written character corresponds to, for example, the distance between a standard character and the written character in those embodiments.
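The two-stage structure of the first invention (coarse selection by overall shape, then stroke-by-stroke comparison on the shortlist only) can be sketched as follows. The details are assumptions made for illustration: the overall-shape feature is a grid occupancy histogram, the first stroke feature is a unit start-to-end direction plus bounding-box width and height, strokes are compared strictly in writing order, and distances are L1. None of the names come from the patent.

```python
import math

def l1(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

def shape_feature(strokes, grid=4):
    # Assumed overall-shape feature: point occupancy on a grid x grid mesh.
    pts = [p for stroke in strokes for p in stroke]
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x0, y0 = min(xs), min(ys)
    w = (max(xs) - x0) or 1.0
    h = (max(ys) - y0) or 1.0
    hist = [0.0] * (grid * grid)
    for x, y in pts:
        cx = min(int((x - x0) / w * grid), grid - 1)
        cy = min(int((y - y0) / h * grid), grid - 1)
        hist[cy * grid + cx] += 1.0 / len(pts)
    return hist

def stroke_feature(stroke):
    # Start-to-end direction (unit vector) and bounding-box width and height.
    (sx, sy), (ex, ey) = stroke[0], stroke[-1]
    length = math.hypot(ex - sx, ey - sy) or 1.0
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return ((ex - sx) / length, (ey - sy) / length,
            max(xs) - min(xs), max(ys) - min(ys))

def recognize(strokes, dictionary, top_k=2):
    # Stage 1: character-shape recognition keeps the top_k closest candidates.
    f = shape_feature(strokes)
    shortlist = sorted(dictionary,
                       key=lambda c: l1(f, shape_feature(dictionary[c])))[:top_k]

    # Stage 2: in-order stroke correspondence, run on the shortlist only.
    def stroke_order_distance(c):
        std = dictionary[c]
        if len(std) != len(strokes):
            return math.inf
        return sum(l1(stroke_feature(a), stroke_feature(b))
                   for a, b in zip(strokes, std))

    return min(shortlist, key=stroke_order_distance)

# Illustrative standard patterns: each stroke is a list of (x, y) points.
dictionary = {
    "十": [[(0, 1), (2, 1)], [(1, 0), (1, 2)]],
    "二": [[(0.5, 0), (1.5, 0)], [(0, 2), (2, 2)]],
}
written = [[(0, 1.1), (2, 0.9)], [(1.1, 0), (0.9, 2)]]
result = recognize(written, dictionary)
```

Because the expensive stroke comparison runs only on the shortlist, its cost scales with top_k rather than with the size of the dictionary.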
The online character recognition method of the second invention comprises:
a writing information input step of inputting writing information as a character is written;
a character-shape recognition step of extracting, from the writing information, a feature representing the overall shape of the written character and selecting from a dictionary, on the basis of that feature, a plurality of standard characters similar to the written character;
a stroke-count correspondence recognition step of extracting, from the writing information, a second stroke feature representing features of the strokes constituting the written character, associating the strokes of the written character with the strokes of each standard character selected in the character-shape recognition step on the basis of the second stroke feature while taking the stroke count of the written character into account, and calculating the similarity of each selected standard character to the written character; and
a second output step of outputting a standard character selected in the character-shape recognition step according to the similarity calculated in the stroke-count correspondence recognition step.
Here, the second stroke feature corresponds to, for example, the width and height of a stroke in the embodiments described later.
The online character recognition method of the third invention comprises:
a writing information input step of inputting writing information as a character is written;
a character-shape recognition step of extracting, from the writing information, a feature representing the overall shape of the written character and selecting from a dictionary, on the basis of that feature, a plurality of standard characters similar to the written character;
a stroke-order correspondence recognition step of extracting, from the writing information, a first stroke feature representing features of the strokes constituting the written character, associating the strokes of the written character with the strokes of each standard character selected in the character-shape recognition step on the basis of the first stroke feature while taking the stroke order of the written character into account, and calculating the similarity of each selected standard character to the written character;
a first determination step of determining whether to output the standard character by comparing the similarity calculated in the stroke-order correspondence recognition step with a predetermined value;
a first output step of outputting, when the first determination step determines that output is to be made, a standard character selected in the character-shape recognition step according to the similarity calculated in the stroke-order correspondence recognition step;
a stroke-count correspondence recognition step of, when the first determination step determines that output is not to be made, extracting from the writing information a second stroke feature representing features of the strokes constituting the written character, associating the strokes of the written character with the strokes of each standard character selected in the character-shape recognition step on the basis of the second stroke feature while taking the stroke count of the written character into account, and calculating the similarity of each selected standard character to the written character; and
a second output step of outputting a standard character selected in the character-shape recognition step according to the similarity calculated in the stroke-count correspondence recognition step.
Here, the first determination step corresponds to the output/no-output decision made by the stroke-order priority output means 55 in the embodiments described later (S4 in FIG. 2).
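The control flow of the third invention is a cascade: the stroke-order result is output if it is good enough, and only otherwise is stroke-count correspondence run. A minimal sketch follows, assuming that similarity is expressed as a distance (smaller is better) and that "good enough" means a distance at or below a threshold; the two distance functions are passed in as hypothetical callables.

```python
def cascade_recognize(written, candidates, order_distance, count_distance,
                      threshold):
    """Return (character, stage) for a written character.

    order_distance / count_distance are caller-supplied functions
    (written, candidate) -> distance; both are placeholders in this sketch.
    """
    # Stroke-order correspondence recognition on the shape-selected candidates.
    best = min(candidates, key=lambda c: order_distance(written, c))
    # First determination step: compare with a predetermined value.
    if order_distance(written, best) <= threshold:
        return best, "stroke-order"      # first output step
    # Otherwise fall back to stroke-count correspondence recognition.
    best = min(candidates, key=lambda c: count_distance(written, c))
    return best, "stroke-count"          # second output step

# Toy distances standing in for the two recognizers (illustrative only).
order_table = {"a": 5.0, "b": 3.0}
count_table = {"a": 1.0, "b": 2.0}
by_order = lambda written, c: order_table[c]
by_count = lambda written, c: count_table[c]

early = cascade_recognize("x", ["a", "b"], by_order, by_count, threshold=4.0)
late = cascade_recognize("x", ["a", "b"], by_order, by_count, threshold=2.0)
```

The more expensive second stage runs only for inputs that the first stage could not settle, which is how the cascade trades little accuracy for speed.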
The online character recognition method of the fourth invention is the method of the first or third invention in which, in the stroke-order correspondence recognition step, the strokes of the written character are associated with the strokes of the standard character selected in the character-shape recognition step in the stroke order of the written character on the basis of the second stroke feature, and, when this correspondence cannot be established, the order of the strokes to be associated is changed and the correspondence is attempted again.
The online character recognition method of the fifth invention is the method of the fourth invention in which the character-shape recognition step calculates the similarity of each selected standard character to the written character, and the stroke-order correspondence recognition step performs the correspondence with a changed stroke order multiple times according to the similarity of the standard character calculated in the character-shape recognition step.
The online character recognition method of the sixth invention is the method of the third invention further comprising, after the character-shape recognition step: a second determination step of determining whether to output the standard character by comparing the similarity of the standard character to the written character calculated in the character-shape recognition step with a predetermined value; and a third output step of outputting, when the second determination step determines that output is to be made, a standard character selected in the character-shape recognition step according to the similarity calculated in the character-shape recognition step,
wherein the stroke-order correspondence recognition step is executed when the second determination step determines that output is not to be made.
Here, the second determination step corresponds to the output/no-output decision made by the character-shape priority output means 110 in the embodiments described later (S3 in FIG. 18).
The online character recognition method of the seventh invention is the method of the second invention in which
the stroke-count correspondence recognition step consists of a DP matching step and a similarity calculation step,
the DP matching step extracts, from the writing information, a second stroke feature representing features of the strokes constituting the written character and uses DP matching based on the second stroke feature to associate the strokes of the written character with the strokes of the standard character selected in the character-shape recognition step,
the similarity calculation step extracts, from the strokes associated in the DP matching step, stable strokes for which a stroke of the written character corresponds one-to-one to a stroke of the standard character, and calculates from the stable strokes the similarity of the written character to the standard character selected in the character-shape recognition step, and
the second output step outputs a standard character selected in the character-shape recognition step according to the similarity calculated in the similarity calculation step.
Here, the similarity calculation step corresponds to the detection by the stable stroke detection means 132 in the embodiments described later (S51 in FIG. 22).
The online character recognition apparatus of the eighth invention comprises:
a writing information input unit that receives writing information as a character is written;
a character-shape recognition unit that extracts, from the writing information, features representing the overall shape of the written character and, based on those overall-shape features, selects from a dictionary a plurality of standard characters similar to the written character;
a stroke-order correspondence recognition unit that extracts, from the writing information, first stroke features representing the features of the strokes that make up the written character, matches the strokes of the written character with the strokes of the standard characters selected by the character-shape recognition unit, taking the stroke order of the written character into account on the basis of the first stroke features, and calculates the similarity of each selected standard character to the written character; and
a first output unit that outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the stroke-order correspondence recognition unit.
Here, the writing information input unit corresponds to the input unit 50 of the embodiments described later, and the first output unit corresponds to the display unit 58.
The online character recognition apparatus of the ninth invention comprises:
a writing information input unit that receives writing information as a character is written;
a character-shape recognition unit that extracts, from the writing information, features representing the overall shape of the written character and, based on those overall-shape features, selects from a dictionary a plurality of standard characters similar to the written character;
a stroke-count correspondence recognition unit that extracts, from the writing information, second stroke features representing the features of the strokes that make up the written character, matches the strokes of the written character with the strokes of the standard characters selected by the character-shape recognition unit, taking the stroke count of the written character into account on the basis of the second stroke features, and calculates the similarity of each selected standard character to the written character; and
a second output unit that outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the stroke-count correspondence recognition unit.
Here, the second output unit corresponds to the display unit 58 of the embodiments described later.
The online character recognition apparatus of the tenth invention comprises:
a writing information input unit that receives writing information as a character is written;
a character-shape recognition unit that extracts, from the writing information, features representing the overall shape of the written character and, based on those overall-shape features, selects from a dictionary a plurality of standard characters similar to the written character;
a stroke-order correspondence recognition unit that extracts, from the writing information, first stroke features representing the features of the strokes that make up the written character, matches the strokes of the written character with the strokes of the standard characters selected by the character-shape recognition unit, taking the stroke order of the written character into account on the basis of the first stroke features, and calculates the similarity of each selected standard character to the written character;
a first judging unit that judges whether to output the standard characters by comparing the similarity calculated by the stroke-order correspondence recognition unit with a predetermined value;
a first output unit that, when the first judging unit judges that output is to be made, outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the stroke-order correspondence recognition unit;
a stroke-count correspondence recognition unit that, when the first judging unit judges that no output is to be made, extracts from the writing information second stroke features representing the features of the strokes that make up the written character, matches the strokes of the written character with the strokes of the standard characters selected by the character-shape recognition unit, taking the stroke count of the written character into account on the basis of the second stroke features, and calculates the similarity of each selected standard character to the written character; and
a second output unit that outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the stroke-count correspondence recognition unit.
Here, the first judging unit corresponds to the stroke-order priority output unit 55 of the embodiments described later.
The online character recognition apparatus of the eleventh invention is the online character recognition apparatus of the ninth invention, wherein
the stroke-count correspondence recognition unit consists of a DP matching unit and a similarity calculation unit;
the DP matching unit extracts, from the writing information, second stroke features representing the features of the strokes that make up the written character and, based on the second stroke features, uses DP matching to put the strokes of the written character into correspondence with the strokes of the standard characters selected by the character-shape recognition unit;
the similarity calculation unit extracts, from the strokes matched by the DP matching unit, the stable strokes, that is, the strokes of the written character that correspond one-to-one with strokes of the standard character, and calculates from those stable strokes the similarity of each standard character selected by the character-shape recognition unit to the written character; and
the second output unit outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the similarity calculation unit.
Here, the similarity calculation unit corresponds to the stable stroke detection unit 132 of the embodiments described later.
The online character recognition apparatus of the twelfth invention is the online character recognition apparatus of the tenth invention, further comprising:
a second judging unit that judges whether to output the standard characters by comparing, with a predetermined value, the similarity of the standard characters to the written character calculated by the character-shape recognition unit; and
a third output unit that, when the second judging unit judges that output is to be made, outputs the standard characters selected by the character-shape recognition unit according to the similarity calculated by the character-shape recognition unit,
wherein the stroke-order correspondence recognition unit performs its processing when the second judging unit judges that no output is to be made.
Embodiments of the invention are described below with reference to the drawings, in which:
FIG. 1 is a block diagram of the online character recognition apparatus of Embodiment 1;
FIG. 2 is a flowchart of the operation of the control unit 59 of Embodiment 1;
FIG. 3 shows an input glyph written on the input unit 50 of Embodiment 1;
FIG. 4 is a flowchart of the operation of the character-shape recognition unit 52 of Embodiment 1;
FIG. 5 shows the kinds of direction codes assigned by the character-shape recognition unit 52 of Embodiment 1;
FIG. 6 shows the input glyph of Embodiment 1 after direction codes have been assigned;
FIG. 7 shows the direction codes of Embodiment 1 divided into regions, with the codes of each direction counted per region;
FIG. 8 is a flowchart of the operation of the stroke-order correspondence recognition unit 54 of Embodiment 1;
FIG. 9 shows the directions used when the stroke-order correspondence recognition unit 54 of Embodiment 1 assigns stroke directions and virtual-stroke directions;
FIG. 10 shows the stroke features of the first stroke of the input glyph as extracted by the stroke-order correspondence recognition unit 54 of Embodiment 1;
FIG. 11 shows the rules by which the stroke-order correspondence recognition unit 54 of Embodiment 1 matches strokes;
FIG. 12 shows the stroke arrangement of the standard glyph of the example character in the stroke feature dictionary of Embodiment 1;
FIG. 13 shows an input glyph of the example character of Embodiment 1 written in a different stroke order;
FIG. 14 shows the process by which the stroke-order correspondence recognition unit 54 of Embodiment 1 changes the stroke order of the input glyph;
FIG. 15 shows an input glyph written with joined strokes, used to explain the operation of the stroke-count correspondence recognition unit 57 of Embodiment 1;
FIG. 16 shows the result of matching the standard glyph with the input glyph by the stroke-count correspondence recognition unit 57 of Embodiment 1;
FIG. 17 is a block diagram of the online character recognition apparatus of Embodiment 2;
FIG. 18 is a flowchart of the operation of the control unit 59 of Embodiment 2;
FIG. 19 is a block diagram of the online character recognition apparatus of Embodiment 3;
FIG. 20 is a flowchart of the operation of the control unit 59 of Embodiment 3;
FIG. 21 is a block diagram of the online character recognition apparatus of Embodiment 4;
FIG. 22 is a flowchart of the operation of the control unit 59 of Embodiment 4;
FIG. 23 shows the path that minimizes the distance in the DP matching of the input glyph with the standard glyph by the DP matching unit 130 of Embodiment 4;
FIG. 24 shows the path in a case where the DP matching unit 130 of Embodiment 4 cannot match the input glyph with the standard glyph in the way the glyph was actually written;
FIG. 25 shows the matching of the stable strokes detected by the stable stroke detection unit 132 of Embodiment 4;
FIG. 26 is a block diagram of the online character recognition apparatus of prior art example 1;
FIG. 27 shows the input glyph, the standard glyph, and their matching result in prior art example 2; and
FIG. 28 is a block diagram of the online character recognition apparatus of prior art example 2.
Embodiment 1
The online character recognition apparatus of this embodiment extracts candidate characters by character-shape recognition, performs stroke-order correspondence recognition on those candidates, and then performs stroke-count correspondence recognition. It is described below with reference to FIGS. 1 to 16.
FIG. 1 is a block diagram of the online character recognition apparatus of this embodiment.
In FIG. 1, 50 is an input unit, such as a tablet, through which writing information is entered, and 51 is a shape dictionary that stores in advance, for each character, overall-shape features that do not depend on the character's stroke count or stroke order.
52 is a character-shape recognition unit that extracts, from the writing information obtained from the input unit 50, overall-shape features that do not depend on stroke count or stroke order, and performs recognition by comparing them with the shape dictionary 51. These overall-shape features are described later.
53 is a stroke feature dictionary that stores in advance, for each character, its stroke count, its stroke order, and the stroke features of each of its strokes.
In this specification, a stroke means each of the individual lines that make up a character.
54 is a stroke-order correspondence recognition unit that extracts the stroke features of the input from the writing information obtained from the input unit 50 and performs recognition by matching them, taking stroke order into account, against the features of each character stored in the stroke feature dictionary 53.
55 is a stroke-order priority output unit that decides, from the recognition result information obtained by the stroke-order correspondence recognition unit 54, whether to output that recognition result as the final recognition result.
57 is a stroke-count correspondence recognition unit that performs recognition by matching the strokes of the input glyph against the strokes of the standard glyphs in the stroke feature dictionary 53, taking stroke count into account.
58 is a display unit that displays the recognition result of the stroke-order correspondence recognition unit 54 or of the stroke-count correspondence recognition unit 57.
59 is a control unit that controls the processing of the units 50 to 57 in the character recognition apparatus and is connected to each of them.
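The operation of the online character recognition apparatus of this embodiment is described below with reference to FIG. 2.
FIG. 2 is a flowchart of the processing performed by the control unit 59 of the online character recognition apparatus of this embodiment.
To illustrate this embodiment, the recognition procedure is described for the case where the example character is input.
First, in step S1, the control unit 59 instructs the input unit 50 to acquire the input glyph. The input glyph includes information on the input process, that is, on the writing motion made on the input unit 50. This information is the writing information.
FIG. 3 shows the input glyph written on the input unit 50, where 60 to 66 are the strokes in the order in which they were entered. The order 60 to 66 is therefore the input order of the written strokes, that is, the stroke order. In this example the character is written in seven strokes, so it consists of seven strokes.
Accordingly, in S1 the seven strokes 60 to 66 are obtained as the writing information.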
Next, in step S2, the control unit 59 issues an instruction to send the stroke sequence of the input glyph to the character-shape recognition unit 52 for character-shape recognition.
In this example, strokes 60 to 66 of the input glyph of FIG. 3 are sent to the character-shape recognition unit 52, which extracts from them overall-shape features of the character that are independent of stroke count and stroke order and performs recognition.
The recognition processing of step S2 by the character-shape recognition unit 52 is described in detail with reference to FIG. 4.
FIG. 4 is a flowchart of the processing performed by the character-shape recognition unit 52.
First, in step S11 of FIG. 4, the character-shape recognition unit 52 assigns one of four direction codes to each of the points sampled along each stroke of the input glyph.
The four direction codes are those shown in FIG. 5: vertical (V) 70, horizontal (H) 71, diagonal upper right (R) 72, and diagonal upper left (L) 73.
For each code, the range centered on the arrow and extending to the dotted lines is treated as the same direction.
Each stroke is sampled at multiple points, and each point is assigned one of the four direction codes.
However, because of variations in writing speed and the sampling rate, the raw sampling points are generally not evenly spaced, and they are not necessarily densely packed.
If direction codes were assigned to the raw points, the number of direction codes would therefore vary with writing speed even for characters of the same shape. The number of points also varies with character size. For this reason, the size is normalized and points are interpolated between samples so that the number of points is normalized as well.
For each normalized point, the direction from the current point to the next point is obtained, and a direction code is assigned to the point by determining which of the directions in FIG. 5 it falls in. Because the end point of a stroke has no next point, it is given the same direction as the preceding point. FIG. 6 shows the result of normalizing the size of strokes 60 to 66 of the input glyph, interpolating, and assigning a direction code to each point.
FIG. 6(a) shows the points assigned the horizontal direction code (H), and FIG. 6(b) shows the points assigned the vertical direction code (V). The glyph in this example contains no diagonal upper-right (R) or diagonal upper-left (L) directions, so those are omitted.
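Step S11 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the choice of 16 points per stroke, a y-up coordinate system, and sector boundaries at 22.5 degrees either side of each arrow are all hypothetical, since the text defines the sectors only by reference to FIG. 5.

```python
import math

def resample(points, n):
    """Interpolate a polyline to n (>= 2) points evenly spaced along its length."""
    seg = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total = sum(seg)
    if total == 0:  # a dot: nothing to interpolate
        return [points[0]] * n
    out, i, walked = [], 0, 0.0
    for k in range(n):
        target = total * k / (n - 1)
        # advance to the segment that contains the target arc length
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = (target - walked) / seg[i] if seg[i] else 0.0
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def direction_code(dx, dy):
    """Map a displacement to H, V, R or L (assumed 45-degree sectors, y pointing up)."""
    angle = math.degrees(math.atan2(dy, dx)) % 180.0  # direction sign is ignored
    if angle < 22.5 or angle >= 157.5:
        return "H"
    if angle < 67.5:
        return "R"
    if angle < 112.5:
        return "V"
    return "L"

def codes_for_stroke(points, n=16):
    """Resample one stroke and assign a direction code to every point."""
    pts = resample(points, n)
    codes = [direction_code(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:])]
    codes.append(codes[-1])  # the end point copies the code of the preceding point
    return pts, codes
```

Resampling by arc length is one way to make the number of codes independent of writing speed; the text does not say which interpolation the apparatus actually uses.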
Next, in step S12 of FIG. 4, the whole character is divided into 4×4 regions, and the direction codes of each kind are counted in each region.
FIG. 7 illustrates dividing the whole character into 4×4 regions and counting the direction codes in each region.
FIG. 7(c) is the count for the horizontal direction codes of FIG. 6(a), and FIG. 7(d) is the count for the vertical direction codes of FIG. 6(b).
The direction code feature F extracted from the input glyph therefore consists of four directional components, each expressed as a 16-dimensional (4×4) vector.
F = (FH, FV, FR, FL)
FH = (FH1, FH2, ..., FH16)
FV = (FV1, FV2, ..., FV16)
FR = (FR1, FR2, ..., FR16)
FL = (FL1, FL2, ..., FL16)
FH1, FH2, ..., FH16 are the counts of the horizontal direction code in each of the 4×4 regions. FV, FR, and FL are defined in the same way.
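The counting of step S12 might look like the sketch below. It assumes that point coordinates have already been normalized to the range 0 to 100 and that regions are numbered row by row; the text fixes neither, so both are illustrative choices, as is the function name.

```python
CODES = ("H", "V", "R", "L")

def zone_histogram(coded_points, size=100.0, grid=4):
    """Count direction codes per region.

    coded_points: iterable of ((x, y), code) with x and y in [0, size].
    Returns a dict mapping each code to a list of grid*grid counts.
    """
    feat = {c: [0] * (grid * grid) for c in CODES}
    for (x, y), code in coded_points:
        # clamp so that points exactly on the far edge fall in the last region
        col = min(int(x * grid / size), grid - 1)
        row = min(int(y * grid / size), grid - 1)
        feat[code][row * grid + col] += 1
    return feat
```

With the default 4×4 grid, `feat["H"]` plays the role of FH, `feat["V"]` of FV, and so on.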
Next, in S13 of FIG. 4, the distance D between the features of the input glyph and the direction code distribution features of the standard glyph of each character stored in advance in the shape dictionary is obtained.
The direction code distribution feature S of a standard glyph can be written as follows:
S = (SH, SV, SR, SL)
SH = (SH1, SH2, ..., SH16)
SV = (SV1, SV2, ..., SV16)
SR = (SR1, SR2, ..., SR16)
SL = (SL1, SL2, ..., SL16)
The distance between the input glyph and the standard glyph can then be expressed as follows.
D = (DH, DV, DR, DL)
DH = |FH1-SH1| + |FH2-SH2| + ... + |FH16-SH16|
DV = |FV1-SV1| + |FV2-SV2| + ... + |FV16-SV16|
DR = |FR1-SR1| + |FR2-SR2| + ... + |FR16-SR16|
DL = |FL1-SL1| + |FL2-SL2| + ... + |FL16-SL16|
Using these expressions, the distance between each standard glyph in the shape dictionary and the input glyph is obtained, and the 50 characters with the smallest distances are output, in ascending order of distance, as candidate recognition results.
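The distance of step S13 and the candidate selection can be sketched as below. The text gives D as four components but does not state how they are combined for ranking; this sketch simply sums them, which is an assumption, and the function and parameter names are hypothetical.

```python
def shape_distance(f, s):
    """Per-direction city-block distance (DH, DV, DR, DL) between two features."""
    return {c: sum(abs(a - b) for a, b in zip(f[c], s[c]))
            for c in ("H", "V", "R", "L")}

def top_candidates(f, dictionary, k=50):
    """Return up to k characters ordered by ascending total distance.

    dictionary maps each character to its standard feature S.
    Summing the four components is an assumed way of ranking.
    """
    scored = [(sum(shape_distance(f, s).values()), ch)
              for ch, s in dictionary.items()]
    scored.sort()
    return [ch for _, ch in scored[:k]]
```

Because neither stroke count nor stroke order enters this distance, the returned candidates can have any number of strokes, which is why the later stages compare stroke counts themselves.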
The processing of steps S11 to S13 of FIG. 4 described above is recognition based on overall-shape features of the character that do not depend on stroke count or stroke order.
After the 50 candidate characters have been selected, processing moves to step S3 of FIG. 2.
In S3 of FIG. 2, the control unit 59 issues an instruction to send the 50 candidate characters obtained by the character-shape recognition unit 52 to the stroke-order correspondence recognition unit 54 for stroke-order correspondence recognition.
The operation of the stroke-order correspondence recognition unit 54 is described in detail with reference to FIG. 8.
FIG. 8 is a flowchart of the operation of the stroke-order correspondence recognition unit 54.
First, in step S20 of FIG. 8, the stroke-order correspondence recognition unit 54 extracts the stroke features of each stroke from the input glyph.
Seven stroke features are extracted for each stroke: the stroke's shape feature, the direction from its start point to its end point, the width of its bounding rectangle, the height of its bounding rectangle, the direction of the virtual stroke, the width of the virtual stroke's bounding rectangle, and the height of the virtual stroke's bounding rectangle.
The bounding rectangle of a stroke is the rectangle that circumscribes the stroke.
A virtual stroke is the straight line connecting the end point of the current stroke to the start point of the next stroke. For the last stroke, the next stroke is taken to be the first stroke. The bounding rectangle of a virtual stroke is the rectangle that has the virtual stroke as its diagonal.
The direction of each stroke is quantized into the eight directions shown in FIG. 9. Widths and heights are normalized so that the width and height of the bounding rectangle of the whole character are each 100. FIG. 10 shows an example of the seven stroke features extracted for the first stroke 66 of the input glyph.
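Six of the seven features of step S20 can be computed as in the sketch below. The stroke shape feature is left out because the text does not say how it is derived, and the numbering of the eight directions (0 for the positive x direction, increasing counter-clockwise) is an assumption standing in for FIG. 9. All names are hypothetical.

```python
import math

def quantize8(dx, dy):
    """Quantize a displacement into one of eight direction codes (assumed numbering)."""
    return int(round(math.atan2(dy, dx) / (math.pi / 4))) % 8

def stroke_features(strokes):
    """strokes: list of strokes, each a list of (x, y) points in writing order."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    w = (max(xs) - min(xs)) or 1.0  # width of the whole character
    h = (max(ys) - min(ys)) or 1.0  # height of the whole character
    feats = []
    for i, s in enumerate(strokes):
        nxt = strokes[(i + 1) % len(strokes)]  # the last stroke wraps to the first
        (sx, sy), (ex, ey) = s[0], s[-1]
        nx, ny = nxt[0]
        sxs = [x for x, _ in s]
        sys_ = [y for _, y in s]
        feats.append({
            "dir": quantize8(ex - sx, ey - sy),
            "w": 100 * (max(sxs) - min(sxs)) / w,
            "h": 100 * (max(sys_) - min(sys_)) / h,
            "vdir": quantize8(nx - ex, ny - ey),  # virtual stroke direction
            "vw": 100 * abs(nx - ex) / w,         # virtual stroke width
            "vh": 100 * abs(ny - ey) / h,         # virtual stroke height
        })
    return feats
```

Because the virtual-stroke features depend on which stroke comes next, they have to be recomputed whenever the stroke order is rearranged.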
Next, the strokes of the input glyph are compared with the strokes of the standard glyphs, in the stroke feature dictionary, of the 50 candidate characters obtained by the character-shape recognition unit 52. Because the stroke-order correspondence recognition unit matches strokes only with respect to stroke order, no match is possible if the stroke count of the input glyph differs from that of the standard glyph.
Since the character-shape recognition unit 52 does not use stroke-count information, the candidates include characters with a variety of stroke counts. Therefore, the stroke counts of the input glyph and the standard glyph are first compared in S21. If they differ, S21 yields "No", the standard glyph of the current candidate is skipped, and matching moves on to the next character. If they agree, S21 yields "Yes" and processing proceeds to S22.
In S22, the stroke features of the standard glyph in the stroke feature dictionary 53 are matched against the stroke features of the input glyph. The standard glyph holds the stroke features for the correct stroke order, or for the stroke order most people use, as in FIG. 12. Whether two strokes correspond is tested according to the matching rules described below.
FIG. 11 shows an example of the matching rules. It lists, for each stroke shape, the features to be tested, and the input glyph is tested against the standard glyph on those features. For example, when a horizontal stroke is tested against a horizontal stroke, only the stroke width, which is the important feature of a horizontal stroke, is tested.
In the direction test, two strokes are judged not to correspond when their directions differ by 2 or more.
With the width and height of the bounding rectangle of the whole character each taken as 100, the width and height tests judge two strokes not to correspond when their widths or heights differ by 70 or more. The criterion is not uniform, however: the direction of short strokes is less stable than that of long strokes, so the tolerance is widened when short strokes are matched with each other.
The direction, width, and height of the virtual stroke are measures of the positional relationship between strokes, so they are not used in the per-shape tests. Matching continues for as long as it succeeds; when a match fails or all strokes have been matched, processing proceeds to S23.
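The thresholds quoted above can be turned into a simple correspondence test. The sketch below applies all three tests to every stroke, whereas FIG. 11 selects the tests per stroke shape; it also treats the eight directions as circular and omits the wider tolerance for short strokes. These simplifications, and the names, are assumptions.

```python
def dir_diff(a, b):
    """Circular difference between two of the eight direction codes."""
    d = abs(a - b) % 8
    return min(d, 8 - d)

def strokes_match(inp, std, dir_tol=2, size_tol=70):
    """Return True if an input stroke may correspond to a standard stroke.

    inp and std are dicts with "dir" (0 to 7) and "w", "h" (0 to 100).
    """
    if dir_diff(inp["dir"], std["dir"]) >= dir_tol:
        return False  # directions differ by 2 or more
    if abs(inp["w"] - std["w"]) >= size_tol:
        return False  # widths differ by 70 or more
    if abs(inp["h"] - std["h"]) >= size_tol:
        return False  # heights differ by 70 or more
    return True
```

A per-shape rule table in the spirit of FIG. 11 could be layered on top by choosing which of the three checks to run for each stroke shape.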
In S23, it is checked whether matching succeeded for all strokes of the input glyph. If all strokes were matched, processing proceeds to S26; if matching failed partway, processing proceeds to S24. In this example the input glyph was written in the correct stroke order, so all strokes are matched and processing proceeds to S26.
In S26, the distance between the stroke features of the input glyph and those of the standard glyph is calculated over all strokes. Each feature distance is normalized and then weighted to give the final distance to the current standard glyph. The weights may be set in advance so as to raise the recognition rate, or varied according to the stroke count, for example.
The stroke shape, however, is not used in the distance calculation.
The distance D to the standard glyph is obtained from the normalized stroke-direction distance Dd, the normalized stroke-width distance Dw, the normalized stroke-height distance Dh, the normalized virtual-stroke-width distance Dvw, the normalized virtual-stroke-height distance Dvh, and the normalized virtual-stroke-direction distance Dvd.
D = Wd×Dd + Ww×Dw + Wh×Dh + Wvw×Dvw + Wvh×Dvh + Wvd×Dvd
Here, Wd is the stroke-direction weight, Ww the stroke-width weight, Wh the stroke-height weight, Wvw the virtual-stroke-width weight, Wvh the virtual-stroke-height weight, and Wvd the virtual-stroke-direction weight. The weights satisfy the following relation.
Wd + Ww + Wh + Wvw + Wvh + Wvd = 1
These weights are set in advance so as to raise the recognition rate. They may also be changed according to the stroke count of the character to be recognized.
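The weighted sum of S26 is small enough to show directly. The sketch below takes the six normalized distances and the six weights as dictionaries; the key names and any concrete weight values are hypothetical, since the text does not give them.

```python
import math

KEYS = ("d", "w", "h", "vw", "vh", "vd")  # direction, width, height, and virtual counterparts

def weighted_distance(dist, weights):
    """Combine six normalized feature distances with weights that sum to 1."""
    if not math.isclose(sum(weights[k] for k in KEYS), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[k] * dist[k] for k in KEYS)
```

Switching to a different weight dictionary for each stroke count would be one way to realize the option, mentioned above, of changing the weights according to the number of strokes.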
The operation when matching fails in S23 is described next. Suppose the example character is written in a stroke order different from that of the standard glyph, as in FIG. 13.
In this case the second stroke 91 of the input glyph and the second stroke 81 of the standard glyph are both vertical strokes, but the stroke-height test judges that they do not correspond. Processing therefore proceeds to S24 once matching has stopped at the second stroke.
In S24, the stroke order of the input glyph is judged to be different, and the input stroke that failed to match is swapped, in turn, with input strokes not yet used in the matching, thereby rearranging the stroke order. (When the K-th stroke of the current standard glyph is being matched, the match is attempted among the input strokes other than those already used for strokes 1 to K-1 of the standard glyph.)
However, when the last stroke fails to match, or when no unused input stroke remains for the standard-glyph stroke currently being matched, the matching made so far is judged to contain an error. In that case the match of the preceding stroke is likewise treated as failed, and matching is retried with the unused input strokes.
In this example, as in FIG. 14(a), the stroke 91 that failed to match is swapped with the next stroke 92.
In S25, the number of retries so far (the number of stroke-order swaps) is compared with a predetermined upper limit. If the limit has not been exceeded, the result is "No" and processing returns to S22, where matching resumes from the standard-glyph stroke that failed to match.
In S22, the input stroke 92 now in second position does not correspond to the second stroke 81 of the standard glyph either, so, as before, matching of the second stroke of the standard glyph fails and processing proceeds to S24.
In S24, the next unused input stroke 93 is placed in second position, and the input strokes 91 and 92, which failed to match the second stroke of the standard glyph, are returned to their original order in the input glyph.
FIG. 14(b) shows the arrangement of the input strokes after this move.
The above processing is repeated; FIGS. 14(c) to (f) show the swaps made until all strokes are finally matched. In this example matching succeeds after six retries, and processing proceeds to S26, as it does for a character written in the correct stroke order, to calculate the distance to the standard glyph. Whenever the stroke order is changed in this way, the direction, width, and height of the virtual strokes, which represent the positional relationship between strokes, are recalculated for the reordered input strokes.
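The loop through S22 to S25 behaves like a backtracking search with a retry cap. The sketch below is written in that spirit rather than as a step-by-step copy of the flowchart: it tries unused input strokes in their original order for each standard stroke, undoes the previous assignment when nothing fits, and gives up once a hypothetical `max_retries` limit is passed. The `match` predicate is supplied by the caller.

```python
def reorder_to_match(inputs, standards, match, max_retries=100):
    """Find an order such that inputs[order[k]] matches standards[k].

    Returns the list of input indices, or None if the stroke counts differ,
    no consistent order exists, or the retry limit is exceeded.
    """
    if len(inputs) != len(standards):
        return None  # S21: stroke counts must agree
    order = []
    used = [False] * len(inputs)
    retries = 0

    def search(k):
        nonlocal retries
        if k == len(standards):
            return True  # every standard stroke has a partner
        for i in range(len(inputs)):
            if used[i]:
                continue
            if match(inputs[i], standards[k]):
                used[i] = True
                order.append(i)
                if search(k + 1):
                    return True
                order.pop()  # a later stroke failed: undo this assignment
                used[i] = False
            retries += 1  # one failed attempt (S25 counter)
            if retries > max_retries:
                return False
        return False

    return order if search(0) else None
```

A glyph written in the standard order is matched without any retries, so the reordering cost is paid only when the stroke order actually differs. Recomputing the virtual-stroke features for the new order is not shown.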
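Once this correspondence processing has been completed for all the recognition candidate characters obtained by the glyph recognition unit 52, processing proceeds to S4 in Fig. 2.
At S4, the control unit 59 sends to the stroke-order priority output unit 55 the distance of the character with the smallest distance among those for which the stroke-order correspondence recognition unit 54 succeeded in establishing a correspondence.
The stroke-order priority output unit 55 compares that distance with a predetermined threshold. If the distance is smaller than the threshold, it decides to output the result; if the distance is larger, it decides not to.
When the control unit 59 determines that the stroke-order priority output unit 55 has decided to output, it sends a fixed number of the 50 characters selected by the glyph recognition unit 52 to the display unit 58, in ascending order of the distance calculated by the stroke-order correspondence recognition unit 54, and the display unit 58 displays them.
For both the input glyph of Fig. 3 and the input glyph of Fig. 13, the stroke-order correspondence succeeds and the resulting distance is far smaller than the threshold, so the stroke-order priority output unit 55 decides to output, and the recognition result of the stroke-order correspondence recognition unit 54 is output as the final result.
If it is determined that the stroke-order priority output unit 55 has decided not to output, processing proceeds to S5 in Fig. 2.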
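The decision made at S4 can be illustrated with a short sketch. It assumes the successful candidates are available as a mapping from character to distance; the function name, the threshold value, and the sample distances are hypothetical.

```python
def stroke_order_priority_output(distances, threshold, top_n):
    """Return the best `top_n` characters if the smallest distance is below
    `threshold`; return None to signal that processing should continue at S5.

    `distances` maps each candidate whose stroke-order correspondence
    succeeded to its distance (an assumed data layout).
    """
    if not distances:
        return None  # no candidate could be matched at all
    ranked = sorted(distances.items(), key=lambda item: item[1])
    best_distance = ranked[0][1]
    if best_distance < threshold:
        return [char for char, _ in ranked[:top_n]]
    return None


print(stroke_order_priority_output({"A": 0.8, "B": 0.2, "C": 0.5}, threshold=0.3, top_n=2))
```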
Next, to explain the operation of S5 in Fig. 2, consider the case shown in Fig. 15, in which the character is written as a glyph with six strokes.
For the six-stroke input glyph of Fig. 15, the stroke feature dictionary 53 contains only a seven-stroke standard glyph for that character, so the stroke-order correspondence recognition unit 54 cannot establish a correspondence with the input glyph of Fig. 15.
Consequently, even if correspondence with some other six-stroke character succeeds, it is a correspondence with a different character, so the distance from the input glyph is large. The stroke-order priority output unit 55 judges that the value is not smaller than the threshold, nothing is output at S4 in Fig. 2, and processing proceeds to S5.
At S5, the stroke-count correspondence recognition unit 57 takes the 50 characters selected by the glyph recognition unit 52 and, using the same DP matching as in prior-art example 2, establishes correspondences between glyphs with different stroke counts so that the distance is minimized.
Let the strokes of the input glyph be
A = a1, a2, ..., ai, ..., aI (I is the number of strokes in the input glyph)
and the strokes of the standard glyph in the dictionary be
B = b1, b2, ..., bj, ..., bJ (J is the number of strokes in the standard glyph).
The distance d(i,j) between ai and bj is then calculated as
d(i,j) = |W(ai) - W(bj)| + |H(ai) - H(bj)|
where W is the width of a stroke and H is its height.
Writing this as d(i,j) = |ai - bj|, g(i,j) is defined as follows.
(1) When i = 1 and j = 1:
g(1,1) = d(1,1)
(2) When i = 1 and j ≠ 1:
g(1,j) = g(1,j-1) + d(1,j)
(3) When i ≠ 1 and j = 1:
g(i,1) = g(i-1,1) + d(i,1)
(4) In all other cases
g(i,j) is obtained from the recurrence above (with its initial conditions), and the distance S(A,B) between the stroke sequence of A and the stroke sequence of B is then obtained from
S(A,B) = I/(I+J) * g(I,J)
This DP matching is performed for the 50 candidate characters in the recognition result of the glyph recognition unit 52, and the candidates are output to the display unit 58 as recognition results in ascending order of S(A,B).
The display unit 58 displays the characters in order, starting from the one with the smallest distance calculated by DP matching.
In this example, DP matching associates the stroke sequence of the input glyph of Fig. 15 with that of the standard glyph of Fig. 12 as shown in Fig. 16, so the character is recognized correctly.
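The DP matching described above can be sketched as follows. Two points are assumptions, not taken from the text: the general-case recurrence is not given there, so the sketch uses the common form that adds d(i,j) to the minimum of the three neighbouring cells; and the normalization factor I/(I+J) is used literally as printed. Strokes are represented as hypothetical (width, height) pairs.

```python
def stroke_distance(a, b):
    """d(i,j): difference in width plus difference in height."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def accumulated_cost(A, B):
    """g(I,J) for stroke sequences A (input) and B (standard)."""
    I, J = len(A), len(B)
    g = [[0.0] * J for _ in range(I)]
    for i in range(I):
        for j in range(J):
            d = stroke_distance(A[i], B[j])
            if i == 0 and j == 0:
                g[i][j] = d                    # case (1)
            elif i == 0:
                g[i][j] = g[i][j - 1] + d      # case (2)
            elif j == 0:
                g[i][j] = g[i - 1][j] + d      # case (3)
            else:
                # case (4): ASSUMED standard recurrence, not given in the text
                g[i][j] = min(g[i - 1][j], g[i - 1][j - 1], g[i][j - 1]) + d
    return g[I - 1][J - 1]


def dp_distance(A, B):
    """S(A,B) with the factor I/(I+J) exactly as printed in the text."""
    I, J = len(A), len(B)
    return I / (I + J) * accumulated_cost(A, B)


# One input stroke matched against two standard strokes (made-up sizes).
print(accumulated_cost([(1, 1)], [(1, 1), (2, 2)]))
```

Because the path may advance along only one axis, the two sequences need not have the same number of strokes, which is what lets a joined stroke be matched against several standard strokes.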
In this embodiment the glyph recognition unit 52 uses direction-code features, but it may instead use other features that reflect the overall shape of the character, such as peripheral distribution features, which take as the feature the area between the character's bounding rectangle and the first character stroke encountered, or the line density of the strokes.
Also, in this embodiment the direction-code features are computed over a division into 4×4 regions, but the number of regions may be changed according to the dictionary capacity and the required recognition performance.
Also, in this embodiment the number of candidate characters output from the glyph recognition unit 52 is fixed at 50, but this value may be changed according to the recognition performance of the glyph recognition unit 52.
In this embodiment the glyph recognition unit 52 outputs the same number of candidate characters to the stroke-order correspondence recognition unit 54 and to the stroke-count correspondence recognition unit 57, but the number of candidates may be changed according to the recognition performance of each unit.
The effects of the online character recognition device of this embodiment are described below.
According to this embodiment, stroke-order correspondence recognition and stroke-count correspondence recognition are performed only on the candidate characters obtained by recognition based on the overall features of the character, and no correspondence is attempted against characters whose glyphs are dissimilar. Recognition is therefore both fast and accurate.
Specifically, handwritten character recognition that covers part of the JIS Level 2 character set must recognize about 4,000 character types at high speed. Performing stroke-order or stroke-count correspondence on all of roughly 4,000 characters from the outset is impractical in terms of processing time. In this embodiment, the glyph recognition unit 52 performs a first stage of recognition that uses the overall features of the character to narrow down quickly the characters that undergo stroke-order or stroke-count correspondence recognition in the second stage.
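The two-stage idea, a cheap coarse ranking over the whole dictionary followed by an expensive comparison on a short list, can be sketched as below. The function and parameter names are hypothetical, and the toy distance functions merely stand in for the glyph features and stroke features of the text.

```python
def recognize_two_stage(x, templates, coarse_distance, fine_distance, shortlist_size=50):
    """Rank `templates` against input `x` in two stages.

    Stage 1 keeps the `shortlist_size` templates with the smallest coarse
    distance; stage 2 re-ranks only that short list with the fine distance.
    """
    shortlist = sorted(templates, key=lambda t: coarse_distance(x, t))[:shortlist_size]
    return sorted(shortlist, key=lambda t: fine_distance(x, t))


# Toy example: integers stand in for characters.
coarse = lambda x, t: abs(x - t)
fine = lambda x, t: abs(x - t - 1)
result = recognize_two_stage(10, range(4000), coarse, fine, shortlist_size=5)
print(result[0], len(result))
```

With 4,000 templates and a short list of 50, the fine comparison runs 50 times per input character instead of 4,000.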
Moreover, because characters with similar glyphs are output as candidates, the user's impression when misrecognition occurs is improved, and the proportion of cases in which the correct character is included among the recognition candidates also increases.
In addition, because the stroke-order correspondence recognition unit 54 matches strokes strictly using multiple features, characters written with the standard stroke count are recognized with high accuracy. Input glyphs whose stroke count differs from that of any standard glyph, such as cursive (joined-stroke) writing, are correctly passed on to the stroke-count correspondence recognition unit 57, so cursive writing is also recognized with high accuracy.
In this embodiment, characters are displayed either from the recognition result of the stroke-order priority output unit 55 or from the recognition result of the stroke-count correspondence recognition unit 57.
Output to the display unit 58 therefore occurs in two stages. When the stroke-order priority output unit 55 decides to output, the result can be displayed at the earlier stage, which shortens the recognition time.
In this embodiment glyph recognition is performed first, stroke-order correspondence recognition second, and stroke-count correspondence recognition third. However, the order of the second and later stages may be changed according to circumstances, for example depending on whether handling stroke-order variation matters more than handling joined strokes, or the reverse.
The following recognition orders are therefore also conceivable.
A: glyph recognition → stroke-count correspondence recognition → stroke-order correspondence recognition
B: glyph recognition → stroke-order correspondence recognition
C: glyph recognition → stroke-count correspondence recognition
Embodiment 2
The online character recognition device of this embodiment decides, on the basis of glyph recognition, whether to output the candidate characters extracted by glyph recognition; when it decides not to output them, it performs stroke-order correspondence recognition and stroke-count correspondence recognition. This is described below with reference to Figs. 17 and 18.
In Fig. 17, 50 to 59 are the same components as in Embodiment 1, and their description is omitted.
110 is a glyph priority output unit that decides, from information on the recognition candidate characters obtained by the glyph recognition unit 52, whether to output those candidates.
The operation of the online character recognition device of this embodiment is described below with reference to Fig. 18.
Fig. 18 is a flowchart showing the processing performed by the control unit 59 of the online character recognition device of this embodiment.
Processing up to S2 is the same as in Embodiment 1, and its description is omitted.
After S1 and S2 have been completed, the control unit 59 sends the character with the smallest distance among the 50 recognition candidates obtained by the glyph recognition unit 52 to the glyph priority output unit 110.
Processing then proceeds to S30 in Fig. 18, where the glyph priority output unit 110 decides whether to output the recognition result of the glyph recognition unit 52. For the smallest-distance character sent from the glyph recognition unit 52, it computes the similarity (for example, a normalized distance) between the input glyph and the standard glyph. If the similarity is greater than a threshold prepared in advance for each character, it decides to output the result; if it is smaller, it decides not to.
Let f be the input glyph vector and g the standard glyph vector. The similarity is then calculated as
S(f,g) = (f,g) / (‖f‖·‖g‖)
where (f,g) denotes the inner product of f and g.
The threshold for a character A is set as follows. Similarities are computed, using direction-code features, between the standard glyph of A in the glyph dictionary 51 and the handwritten glyphs of every character other than A, taken from a set of handwritten glyphs covering all characters to be recognized. The threshold is set to the largest of these similarities, that is, the similarity of the non-A glyph closest to the standard glyph of A. As a result, a character A that has no similar-looking characters gets a small threshold, and the proportion of cases in which the result of the glyph recognition unit 52 is output as is increases.
Conversely, a character that does have similar-looking characters gets a large threshold, so it is output directly less often and is instead recognized using the detailed stroke features.
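As a small illustration of the similarity measure and the per-character threshold, the following sketch treats feature vectors as plain lists of numbers. The function names and the sample vectors are assumptions for the example; real direction-code feature vectors would be much longer.

```python
import math


def similarity(f, g):
    """S(f,g): inner product of f and g divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(f, g))
    norm_f = math.sqrt(sum(x * x for x in f))
    norm_g = math.sqrt(sum(y * y for y in g))
    return dot / (norm_f * norm_g)


def threshold_for(standard_glyph, other_character_samples):
    """Threshold for one character: the largest similarity between its standard
    glyph and handwritten samples of all *other* characters."""
    return max(similarity(sample, standard_glyph) for sample in other_character_samples)


def should_output(input_glyph, standard_glyph, threshold):
    """Decision at S30: output only when the similarity exceeds the threshold."""
    return similarity(input_glyph, standard_glyph) > threshold


standard_a = [1.0, 0.0]
others = [[0.0, 1.0], [1.0, 1.0]]          # samples of characters other than A
t = threshold_for(standard_a, others)
print(round(t, 4), should_output([1.0, 0.1], standard_a, t))
```

A character whose nearest look-alike is far away ends up with a low threshold, so its glyph recognition result is accepted more readily.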
If the glyph priority output unit 110 decides at S30 to output the result, the 50 recognition candidates obtained by the glyph recognition unit 52 are displayed in ascending order of the distance calculated at S2.
If the glyph priority output unit 110 decides at S30 not to output the result, the processing from S3 onward is performed on the 50 recognition candidates obtained by the glyph recognition unit 52, as in Embodiment 1.
In this embodiment the output decision of the glyph priority output unit 110 uses a normalized distance as the similarity, but the Euclidean distance, the Mahalanobis distance, or the like may be used instead.
The effects of the online character recognition device of this embodiment are described below.
With the online character recognition device of this embodiment, the proportion of cases in which the recognition result of the glyph recognition unit 52 is output increases for characters that have no similar-looking characters, so high-accuracy recognition can be obtained that is unaffected by stroke-order variation, stroke-count variation, or local deformation of the character. In addition, recognition is very fast when the result of the glyph recognition unit 52 is output.
Furthermore, according to this embodiment, very carefully written characters are recognized with high accuracy by the glyph recognition unit and displayed quickly, without stroke-order correspondence recognition. Carefully written characters are thus recognized quickly, much as they are when read by a person, which improves the user's impression.
Also, in the online character recognition device of this embodiment, when the result of the first-stage glyph recognition is highly reliable, the result of that stage is displayed, so relatively carefully written characters can be recognized at high speed.
Embodiment 3
In the online character recognition device of this embodiment, the number of retries allowed in stroke-order correspondence recognition is set higher for those candidate characters extracted by glyph recognition that are more likely to be the correct character. This is described below with reference to Figs. 19 and 20.
In Fig. 19, 50 to 59 and 110 are the same components as in the preceding embodiments, and their description is omitted. 120 is a weighted stroke-order correspondence recognition unit that applies weighting when matching the strokes of the input glyph to those of the standard glyph.
The operation of the online character recognition device of this embodiment is described with reference to Fig. 20.
Fig. 20 is a flowchart showing the processing performed by the control unit 59 of the online character recognition device of this embodiment.
In Fig. 20, processing up to S30 is the same as in Embodiment 2, and its description is omitted.
After the processing of S1, S2, and S3, processing proceeds to S40. At S40, the control unit 59 sends the 50 candidate characters obtained by the glyph recognition unit 52, as in Embodiment 1, to the weighted stroke-order correspondence recognition unit 120, which performs correspondence recognition between the strokes of the input glyph and those of the standard glyph.
The detailed stroke correspondence operation is the same as that of Embodiment 1 shown in Fig. 8, except for the limit on the number of stroke-order rearrangements (the number of retries) at S25. In Embodiment 1 this limit is identical for all candidates obtained by the glyph recognition unit 52, whereas in this embodiment it varies with the rank of the candidate (ranks being assigned in ascending order of distance).
Specifically, among the candidate characters in the recognition result of the glyph recognition unit 52, a higher-ranked character is more likely to be the correct one and is given a larger retry limit, while a lower-ranked character is less likely to be correct and is given a smaller retry limit.
According to this embodiment, fewer stroke-order rearrangements are attempted for characters that are unlikely to be correct, so unnecessary correspondence attempts are avoided and recognition is fast. Misrecognition caused by unnecessary stroke-order swaps against characters that are unlikely to be correct is also reduced.
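The text does not say how the retry limit depends on rank, only that it decreases as the rank gets worse. One possible schedule, a linear ramp with made-up end values, might look like this; the function name and all default values are hypothetical.

```python
def retry_limit(rank, num_candidates=50, max_limit=10, min_limit=0):
    """Retry limit for the candidate at `rank` (0 = smallest glyph distance).

    ASSUMPTION: a linear decrease from `max_limit` to `min_limit`; the source
    only states that better-ranked candidates get a larger limit.
    """
    span = max_limit - min_limit
    steps = max(num_candidates - 1, 1)  # avoid division by zero for one candidate
    return max_limit - (span * rank) // steps


print([retry_limit(r) for r in (0, 25, 49)])
```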
Embodiment 4
In stroke-count correspondence recognition, the online character recognition device of this embodiment extracts stable points, at which a stroke of the input glyph corresponds one-to-one with a stroke of the standard glyph, and verifies the correspondence of each stroke at those stable points. This is described below with reference to Figs. 21 to 25.
In Fig. 21, 50 to 59 are the same components as in Embodiment 1, 110 is the same as in Embodiment 2, and 120 is the same as in Embodiment 3; their description is omitted.
130 is a DP matching unit, 131 is a stroke correspondence verification unit, and 132 is a stable stroke verification unit.
The operation of the online character recognition device of this embodiment is described with reference to Fig. 22.
Fig. 22 is a flowchart showing the processing performed by the control unit 59 of the online character recognition device of this embodiment.
In Fig. 22, the operation up to S4 is the same as in Embodiments 1 to 3 and its description is omitted; only the operation from S50 onward is described.
First, when it is determined at S4 that the stroke-order priority output unit 55 produces no output, the control unit 59 sends all 50 candidate characters in the recognition result of the glyph recognition unit 52 to the DP matching unit 130.
Then, at S50, the control unit 59 instructs the DP matching unit 130 to establish the correspondence between the input glyph and the standard glyph for each of the 50 candidate characters in the recognition result of the glyph recognition unit 52.
Here, the stroke correspondence verification unit 131 checks the correspondence path of the DP matching unit 130 and applies a penalty to correspondences that do not occur in actual writing, so that such paths are not selected.
The correspondence path of the DP matching unit 130 is explained using the correspondence between the input glyph of Fig. 15 and the standard glyph of Fig. 12 as an example. These two glyphs are matched as shown in Fig. 16, and the resulting correspondence path is shown in two dimensions in Fig. 23.
In Fig. 23, the horizontal axis is the stroke sequence of the standard glyph and the vertical axis is the stroke sequence of the input glyph. The thick line is the correspondence path found by DP matching.
A correct correspondence such as that of Fig. 23 poses no problem, of course, but correspondences such as that of Fig. 24 can also arise.
Fig. 24 shows a path on which the DP matching unit 130 has made a correspondence between the input glyph and the standard glyph that does not occur in actual writing.
In the correspondence of Fig. 24, the third stroke of the input glyph corresponds to the third, fourth, and fifth strokes of the standard glyph. This kind of correspondence is called stroke joining; it indicates that strokes that would normally be written separately (as two strokes) have been written as one continuous stroke.
In addition, the fifth stroke of the standard glyph corresponds to both the third and fourth strokes of the input glyph. This is called stroke splitting; it indicates that a stroke normally written as one stroke has been written as two separate strokes.
In other words, joining and splitting occur simultaneously at the fifth stroke of the standard glyph. Because this does not happen in ordinary handwriting, the stroke correspondence verification unit 131 prevents such improper correspondences.
That is, while the DP matching unit 130 performs DP matching, the stroke correspondence verification unit 131 adds a very large penalty to the distance value whenever a path that does not arise in normal writing, such as one that turns from horizontal to vertical or from vertical to horizontal, would be selected, thereby preventing that path from being chosen.
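The kind of path that gets penalized can be made concrete with a short check. Note that this sketch only detects a horizontal-to-vertical (or vertical-to-horizontal) turn in a finished path; the text instead adds a large penalty to the distance while DP matching is running. The path representation as (input stroke, standard stroke) index pairs and the function names are assumptions.

```python
def step_kind(p, q):
    """Classify one path step between consecutive points (input_idx, standard_idx)."""
    step = (q[0] - p[0], q[1] - p[1])
    if step == (1, 1):
        return "diagonal"    # both sequences advance: one-to-one
    if step == (0, 1):
        return "horizontal"  # only the standard glyph advances: joined stroke
    if step == (1, 0):
        return "vertical"    # only the input glyph advances: split stroke
    raise ValueError(f"not a valid DP step: {p} -> {q}")


def has_forbidden_turn(path):
    """True if a horizontal step is directly followed by a vertical one, or vice versa."""
    kinds = [step_kind(p, q) for p, q in zip(path, path[1:])]
    return any({a, b} == {"horizontal", "vertical"} for a, b in zip(kinds, kinds[1:]))


# Hypothetical path in which input stroke 3 covers standard strokes 3 to 5
# and standard stroke 5 also covers input stroke 4.
bad_path = [(1, 1), (2, 2), (3, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]
print(has_forbidden_turn(bad_path))
```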
Processing then proceeds to S51, where the control unit 59 sends the path information obtained by the DP matching unit 130 to the stable stroke verification unit 132. Path information is the information, drawn as lines in Figs. 23 to 25, that expresses which strokes correspond to which.
The stable stroke verification unit 132 extracts a point on the path as a stable point when the path segments immediately before and after it are both diagonal. The start point of the path is a stable point when the segment immediately after it is diagonal, and the end point is a stable point when the segment immediately before it is diagonal.
Fig. 25 shows the stable points extracted by the stable stroke verification unit 132.
In Fig. 25, the stable points extracted by the stable stroke verification unit 132 on the path resulting from the correspondence between the input glyph of Fig. 15 and the standard glyph of Fig. 12 are marked with circles (O). Five stable points are extracted in this example.
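The stable-point rule can be written down directly. In this sketch a path is a list of (input stroke, standard stroke) index pairs, and a segment is diagonal when both indices increase by one; the representation, the function name, and the sample path (a six-stroke input against a seven-stroke standard glyph with one joined stroke) are illustrative assumptions, not the actual path of Fig. 25.

```python
def stable_points(path):
    """Return the points of `path` whose neighbouring segments are all diagonal.

    For the start point only the following segment is checked, and for the
    end point only the preceding one, as described in the text.
    """
    def diagonal(p, q):
        return q[0] - p[0] == 1 and q[1] - p[1] == 1

    result = []
    for k, point in enumerate(path):
        before_ok = k == 0 or diagonal(path[k - 1], point)
        after_ok = k == len(path) - 1 or diagonal(point, path[k + 1])
        if before_ok and after_ok:
            result.append(point)
    return result


# Input stroke 3 is matched to standard strokes 3 and 4 (a joined stroke).
path = [(1, 1), (2, 2), (3, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
print(stable_points(path))
```

The two points that belong to the joined stroke are excluded, leaving only the strokes that pair off one-to-one for the feature-based check.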
Next, for the extracted stable points, the stable stroke verification unit 132 treats the strokes of the input glyph and the standard glyph as corresponding one-to-one and verifies each stroke correspondence using the seven features used by the stroke-order correspondence recognition unit 54.
If this verification finds that not one of the strokes at the stable points corresponds, the correspondence by DP matching is judged to have failed, and processing moves on to the next candidate character from the glyph recognition unit 52.
In the example of Fig. 25, the correspondence with the strokes of the standard glyph is verified for the input strokes other than the joined stroke 102; because they correspond correctly, the correspondence is judged to have succeeded.
This processing is performed for all 50 candidate characters from the glyph recognition unit 52, and a fixed number of the characters for which correspondence succeeded are output as the final recognition result, in ascending order of distance.
The effects of the online character recognition device of this embodiment are described below.
According to this embodiment, the path is constrained during DP matching so that correspondences that do not occur in actual writing are not produced, which makes high-accuracy recognition possible.
Also, according to this embodiment, the strokes that correspond one-to-one in the DP matching result are verified using multiple features, so erroneous correspondences are prevented and high-accuracy recognition is possible.
Configured as described above, the present invention has the following effects.
In the first invention, stroke-order correspondence recognition is performed on the standard characters selected in the glyph recognition step. This reduces the number of standard characters subjected to stroke-order correspondence recognition and enables fast, accurate recognition.
In the second invention, stroke-count correspondence recognition is performed on the standard characters selected in the glyph recognition step. This reduces the number of standard characters subjected to stroke-count correspondence recognition and enables fast, accurate recognition.
The third invention has a first decision step that decides whether to output a standard character by comparing the similarity calculated in the stroke-order correspondence recognition step with a predetermined value, and a first output step that, when the first decision step decides to output, outputs the standard characters selected in the glyph recognition step according to the similarity calculated in the stroke-order correspondence recognition step. The recognition result can therefore be output quickly when the stroke-order correspondence recognition step yields a good result.
In the fourth invention, the strokes of the written character are matched to the strokes of the standard character in the order in which they were written, and when that matching fails, the order of the strokes is changed and matching is attempted again. Recognition is therefore accurate even when the stroke order of the written character differs from that of the standard character.
In the fifth invention, the number of times the stroke order may be changed in the stroke-order correspondence recognition step depends on the similarity calculated in the glyph recognition step. Characters likely to be correct are therefore matched flexibly against stroke-order variation, while characters unlikely to be correct undergo little reordering, enabling fast, accurate recognition.
The sixth invention has, after the glyph recognition step, a second decision step that decides whether to output the standard character by comparing with a predetermined value the similarity, to the written character, of the standard character selected in the glyph recognition step, and a third output step that, when the second decision step decides to output, outputs the standard character according to the similarity calculated in the glyph recognition step. The recognition result can therefore be output quickly when the glyph recognition step yields a good result.
The seventh invention has a stable stroke detection step that extracts, from the strokes matched by DP matching, the stable strokes at which a stroke of the written character corresponds one-to-one with a stroke of the standard character, and calculates the similarity of the standard character to the written character on the basis of those stable strokes. This prevents the recognition rate from deteriorating because of correspondences that do not occur in actual writing.
In the eighth invention, stroke-order correspondence recognition is performed on the standard characters selected by the glyph recognition unit. This reduces the number of standard characters subjected to stroke-order correspondence recognition and enables fast, accurate recognition.
In the ninth invention, stroke-count correspondence recognition is performed on the standard characters selected by the glyph recognition unit. This reduces the number of standard characters subjected to stroke-count correspondence recognition and enables fast, accurate recognition.
The tenth invention has a first decision unit that decides whether to output a standard character by comparing the similarity calculated by the stroke-order correspondence recognition unit with a predetermined value, and a first output unit that, when the first decision unit decides to output, outputs the standard characters selected by the glyph recognition unit according to the similarity calculated by the stroke-order correspondence recognition unit. The recognition result can therefore be output quickly when the stroke-order correspondence recognition unit yields a good result.
The eleventh invention has a stable stroke detection unit that extracts, from the strokes matched by the DP matching unit, the stable strokes at which a stroke of the written character corresponds one-to-one with a stroke of the standard character, and calculates the similarity of the standard character to the written character on the basis of those stable strokes. This prevents the recognition rate from deteriorating because of correspondences that do not occur in actual writing.
The twelfth invention has a second decision unit that decides whether to output the standard character by comparing with a predetermined value the similarity, to the written character, of the standard character selected by the glyph recognition unit, and a third output unit that, when the second decision unit decides to output, outputs the standard character according to the similarity calculated by the glyph recognition unit. The recognition result can therefore be output quickly when the glyph recognition unit yields a good result.
Claims (12)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP4760/96 | 1996-01-16 | ||
| JP00476096A JP3360513B2 (en) | 1996-01-16 | 1996-01-16 | Online character recognition method and online character recognition device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1155129A (en) | 1997-07-23 |
| CN1093965C (en) | 2002-11-06 |
Family
ID=11592855
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN96121613A Expired - Fee Related CN1093965C (en) | 1996-01-16 | 1996-11-13 | On-line character recognition method and apparatus thereof |
Country Status (4)
| Country | Link |
|---|---|
| JP (1) | JP3360513B2 (en) |
| KR (1) | KR100236247B1 (en) |
| CN (1) | CN1093965C (en) |
| TW (1) | TW315446B (en) |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP0887753B1 (en) | 1996-11-15 | 2007-10-24 | Toho Business Management Center | Business management system |
| KR20010002230A (en) * | 1999-06-12 | 2001-01-15 | 김항준 | Construction of Hidden Markov Model for On-line Korean Character Recognition |
| JP4817297B2 (en) * | 2006-02-10 | 2011-11-16 | 富士通株式会社 | Character search device |
| CN100382098C (en) * | 2006-09-08 | 2008-04-16 | 华南理工大学 | On-line extraction method of the first and last strokes of handwritten Chinese characters |
| KR101457456B1 (en) * | 2008-01-28 | 2014-11-04 | 삼성전자 주식회사 | Apparatus and Method of personal font generation |
| KR101679744B1 (en) | 2009-11-10 | 2016-12-06 | 삼성전자주식회사 | Apparatus and method for processing data in terminal having touch screen |
| CN102375994B (en) * | 2010-08-10 | 2013-05-29 | 广东因豪信息科技有限公司 | Method and device for detecting and reducing correctness of order of strokes of written Chinese character |
| CN115061587B (en) * | 2022-06-30 | 2025-10-14 | 深圳市沃特沃德信息有限公司 | Character recognition method, device, equipment and medium based on displacement sensor |
| TWM651837U (en) * | 2023-08-23 | 2024-02-21 | 泓宇星私人有限責任公司 | Information system based on automatic branching of handwritten documents |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS57178579A (en) * | 1981-04-27 | 1982-11-02 | Nippon Telegr & Teleph Corp <Ntt> | Online recognition system for handwritten character |
| JPS61198385A (en) * | 1985-02-27 | 1986-09-02 | Oki Electric Ind Co Ltd | Character recognizing system |
-
1996
- 1996-01-16 JP JP00476096A patent/JP3360513B2/en not_active Expired - Fee Related
- 1996-09-23 TW TW085111604A patent/TW315446B/zh active
- 1996-11-12 KR KR1019960053419A patent/KR100236247B1/en not_active Expired - Fee Related
- 1996-11-13 CN CN96121613A patent/CN1093965C/en not_active Expired - Fee Related
Also Published As
| Publication number | Publication date |
|---|---|
| CN1155129A (en) | 1997-07-23 |
| JPH09198466A (en) | 1997-07-31 |
| JP3360513B2 (en) | 2002-12-24 |
| TW315446B (en) | 1997-09-11 |
| KR100236247B1 (en) | 1999-12-15 |
| KR970059977A (en) | 1997-08-12 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C06 | Publication | ||
| PB01 | Publication | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant | ||
| C19 | Lapse of patent right due to non-payment of the annual fee | ||
| CF01 | Termination of patent right due to non-payment of annual fee |