CN118314588A - Method and system for recognizing handwriting fonts based on AI technology - Google Patents

Method and system for recognizing handwriting fonts based on AI technology

Info

Publication number
CN118314588A
Authority
CN
China
Prior art keywords
handwriting
image
vector
font
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410302264.6A
Other languages
Chinese (zh)
Inventor
Hua Min (华敏)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Leyi Smart Technology Co ltd
Original Assignee
Jiangsu Leyi Smart Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Leyi Smart Technology Co ltd filed Critical Jiangsu Leyi Smart Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/22 Character recognition characterised by the type of writing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V 30/19173 Classification techniques


Abstract

The invention relates to the technical field of image recognition and discloses a method for recognizing handwriting fonts based on AI technology, comprising the following steps: collecting an original handwriting image and converting it to grayscale, then simplifying the grayscale image with a binarization technique that highlights the contrast between characters and background to obtain a simplified handwriting image; cropping and scaling the simplified handwriting image to a uniform size, generalizing the resized images through rotation and translation operations to obtain expanded handwriting images, which together with the simplified images form a standard handwriting image set; extracting high-dimensional features from the images in the standard handwriting image set to obtain characterization vectors, and integrating the characterization vectors extracted from handwriting images of the same handwriting type into a characterization vector sequence; constructing a continuous dependency analysis model to obtain a recognized-font semantic vector sequence; and decoding and mapping the output semantic vector sequence to obtain the recognized font content.

Description

A method and system for recognizing handwritten fonts based on AI technology

Technical Field

The present invention relates to the technical field of image recognition, and in particular to a method and system for recognizing handwritten fonts based on AI technology.

Background Art

With the deepening of digital transformation, higher requirements are being placed on the efficiency and accuracy of information processing. As a bridge between traditional writing and digital information processing, handwriting recognition technology has important research value and broad application prospects. Handwriting recognition can be applied to automatic form filling, digitization of historical documents, personal identity verification and other fields, and can also advance accessible communication technology, helping visually impaired people obtain information more easily. However, in connected or cursive writing the boundaries between characters are not distinct, and traditional segmentation-based recognition methods struggle to handle this problem accurately. In view of this, this patent proposes a method for recognizing handwriting based on AI technology that improves the accuracy and real-time performance of handwriting recognition through intelligent segmentation, which can greatly enrich human-computer interaction and enhance the user experience.

Summary of the Invention

In view of this, the present invention provides a method and system for recognizing handwritten fonts based on AI technology, with the following purposes: 1) to propose a method for recognizing handwritten fonts based on AI technology that effectively improves the accuracy and robustness of handwritten font recognition through feature extraction, model design and a correction strategy; 2) to perform high-dimensional feature extraction on images in a standard handwritten image set and integrate the features of handwritten images of the same handwriting type into a characterization vector sequence, effectively capturing the key features of the handwritten images and laying the foundation for subsequent continuous dependency analysis; 3) to adopt a long short-term memory network (LSTM) fused with an attention mechanism as the continuous dependency analysis model, which processes sequence data more effectively, especially when capturing long-distance dependencies; the attention mechanism lets the model dynamically focus on the key information in the sequence when processing the input at each time step, improving the accuracy and efficiency of the model; 4) in the decoding and mapping stage, to perform recognition correction with a language model, which not only considers the font recognition results but also uses the language model to score and correct them, effectively reducing recognition errors and significantly improving recognition accuracy, especially for complex or unclear handwritten images.

To achieve the above object, the present invention provides a method for recognizing handwritten fonts based on AI technology, comprising the following steps (a high-level sketch of the whole pipeline follows this list):

S1: Collect the original handwritten image and convert it to grayscale, then simplify the grayscale handwritten image using a binarization technique that highlights the contrast between text and background, obtaining a simplified handwritten image;

S2: Crop and scale the simplified handwritten image to obtain handwritten images of uniform size, generalize the resized images through rotation and translation operations to obtain expanded handwritten images, and combine these with the simplified handwritten images to form a standard handwritten image set;

S3: Perform high-dimensional feature extraction on the images in the standard handwritten image set to obtain characterization vectors, and integrate the characterization vectors extracted from handwritten images of the same handwriting type into a characterization vector sequence;

S4: Construct a continuous dependency analysis model that takes the characterization vector sequence as input and outputs a recognized-font semantic vector sequence, where an LSTM fused with an attention mechanism is the main implementation of the continuous dependency analysis model;

S5: Decode and map the output semantic vector sequence to obtain the recognized font content, where recognition correction combined with a language model is the main implementation of the decoding and mapping.
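Before the individual improvements are detailed, the pipeline of S1 to S5 can be summarized in code. The following Python sketch is only a high-level illustration of how the five stages could be chained; every function body is a deliberately simplified stand-in (NumPy only), and all names, shapes and thresholds are assumptions made for the example rather than the patent's implementation.

```python
# Minimal end-to-end sketch of steps S1-S5, assuming only NumPy is available.
# All names, shapes and values are illustrative placeholders.
import numpy as np

def s1_preprocess(rgb_img, threshold=128):
    """Grayscale conversion followed by binarization (S1)."""
    gray = 0.299 * rgb_img[..., 0] + 0.587 * rgb_img[..., 1] + 0.114 * rgb_img[..., 2]
    return (gray > threshold).astype(np.float32)

def s2_augment(binary_img):
    """Very rough generalization (S2): the original plus a 1-pixel translation."""
    shifted = np.roll(binary_img, shift=1, axis=1)
    return [binary_img, shifted]

def s3_features(images):
    """Stand-in for high-dimensional feature extraction (S3): flatten each image."""
    return np.stack([img.reshape(-1) for img in images])

def s4_sequence_model(feature_seq):
    """Stand-in for the attention-LSTM (S4): average the features over time."""
    return feature_seq.mean(axis=0, keepdims=True)

def s5_decode(semantic_vecs, num_classes=10):
    """Stand-in for decoding and mapping (S5): random linear layer + argmax."""
    rng = np.random.default_rng(0)
    W = rng.normal(size=(semantic_vecs.shape[1], num_classes))
    logits = semantic_vecs @ W
    return logits.argmax(axis=1)

if __name__ == "__main__":
    fake_scan = np.random.randint(0, 256, size=(32, 32, 3)).astype(np.float32)
    images = s2_augment(s1_preprocess(fake_scan))
    print(s5_decode(s4_sequence_model(s3_features(images))))
```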

As further improvements of the present invention:

Optionally, collecting the original handwritten image and performing grayscale processing in step S1 includes:

S21: Scan the handwritten text to obtain the original handwritten image, where 300 dpi is the resolution of the scanned image;

S22: Convert each pixel of the original handwritten image to grayscale, with the calculation formula:

M = 0.299×R + 0.587×G + 0.114×B

where R, G and B represent the intensity values of the red, green and blue channels of each pixel in the original color image, and M represents the calculated grayscale value.

Optionally, performing high-dimensional feature extraction on the images in the standard handwritten image set to obtain characterization vectors in step S3 includes:

using an improved convolutional neural network for high-dimensional feature extraction, comprising:

S31: Extract features from the input image through a convolutional layer. In the convolutional layer, a convolution kernel slides over the input image, takes the element-wise product with the local region it covers and sums the result to generate a feature map, with the calculation formula:

F_ij = Σ_m Σ_n I_(i+m)(j+n) × K_mn + b

where:

F_ij represents the value of the feature map at position (i, j);

I_(i+m)(j+n) represents the pixel value of the input image at position (i+m, j+n);

K_mn represents the weight of the convolution kernel at position (m, n);

b represents the bias term;

m and n are index variables that traverse all positions of the convolution kernel;

S32: Reduce the spatial dimension of the feature map through a pooling layer, reducing the number of parameters and the amount of computation while keeping the features unchanged, with the calculation formula:

P_ij = (1 / (M×N)) Σ_m Σ_n F_(i+m)(j+n)

where:

P_ij represents the value of the pooled feature map at position (i, j);

F_(i+m)(j+n) represents the value of the feature map output by the convolutional layer at position (i+m, j+n);

M×N is the size of the pooling window;

S33: Concatenate the extracted feature map matrices into a global feature vector and obtain the characterization vector through a residual connection.

Obtaining the characterization vector through a residual connection in step S33 includes:

learning the residual mapping between the global feature vector and the characterization vector by introducing a shortcut connection, with the calculation formula:

F(x) = H(x) + x

where:

x represents the input of the residual connection;

H(x) represents the result of processing the input x by the stack of convolutional layers;

F(x) represents the output of the residual module.

Optionally, constructing the continuous dependency analysis model in step S4 includes:

S51: Extract context information from the characterization vector sequence to obtain the context vector c_(t-1), and feed the extracted context vector together with the characterization vector x_t input at the current time step into the LSTM to obtain the LSTM output at the current time step;

S52: Use the LSTM output h_t at the current time step and the context vector c_t to generate the semantic feature vector y_t of the current time step through a fully connected layer.

Generating the semantic feature vector y_t of the current time step through a fully connected layer using the LSTM output h_t and the context vector c_t in step S52 includes:

S61: Compute the attention weights from the characterization vector sequence, with the calculation formulas:

e_t,s = a(h_(t-1), h_s)

α_t,s = exp(e_t,s) / Σ_s' exp(e_t,s')

where e_t,s is the attention score; α_t,s represents the normalized attention weight; t and s are time-step indices; a() is the self-attention scoring function;

S62: Compute the context vector from the attention weights, with the calculation formula:

c_t = Σ_s α_t,s × h_s

where c_t represents the weighted-average context vector and h_s represents the LSTM output at time step s.

Optionally, decoding and mapping the output semantic vector sequence in step S5 to obtain the recognized font content includes:

S71: For the semantic vector of each time step, obtain the output of that step through a fully connected mapping, with the calculation formula:

z_t = W·y_t + b

where z_t represents the output of the fully connected layer, W represents the weight matrix of the fully connected layer, and b represents the bias term;

S72: Convert the output of the fully connected layer into a probability distribution through Softmax. For the fully connected layer output z_t at each time step t, the probability distribution is computed as:

P(y_t = k | z_t) = exp(z_t,k) / Σ_(j=1..K) exp(z_t,j)

where:

P(y_t = k | z_t) represents the probability of predicting class k at time step t given the input z_t;

z_t,k represents the component of the vector z_t corresponding to class k;

K is the total number of classes, i.e. the number of different fonts to be recognized.

The recognition correction combined with a language model in step S5 includes:

S81: Construct a language scoring model, where an N-gram model is the main implementation of the scoring model, with the calculation formula:

P(S) = Π_(i=1..N) P(s_i | s_1, s_2, ..., s_(i-1))

where S = (s_1, s_2, ..., s_N) represents the font sequence over the first N time steps, and P(s_i | s_1, s_2, ..., s_(i-1)) represents the conditional probability of the i-th font appearing given the previous i-1 fonts;

S82: Fuse the probability distribution of font recognition with the score of the language model, compute a combined score for each possible sequence, and select the sequence with the highest score as the final recognition result.

Fusing the probability distribution of font recognition with the score of the language model and computing the combined score for each possible sequence in step S82 includes:

computing the scores of all possible sequences with the formula:

Score(S) = α·P_font(S) + β·P_lang(S)

where:

P_font(S) represents the probability of the sequence S computed by the decoding mapping;

P_lang(S) represents the probability computed by the language scoring model for the sequence S;

α and β represent weight parameters used to balance the font recognition probability and the language model probability.

To solve the above problems, the present invention provides a system for recognizing handwritten fonts based on AI technology, the system comprising:

a data acquisition module, configured to acquire original handwritten images and perform grayscale processing to obtain simplified handwritten images that form a standard handwritten image set, perform high-dimensional feature extraction on the images in the standard handwritten image set to obtain characterization vectors, and integrate the characterization vectors extracted from handwritten images of the same handwriting type into a characterization vector sequence;

a continuous dependency analysis module, configured to construct a continuous dependency analysis model that takes the characterization vector sequence as input and produces a font semantic vector sequence;

a font recognition module, configured to decode and map the output semantic vector sequence to obtain the recognized font content.

To solve the above problems, the present invention further provides an electronic device, comprising:

a memory storing at least one instruction;

a communication interface enabling the electronic device to communicate; and

a processor executing the instructions stored in the memory to implement the above method for recognizing handwritten fonts based on AI technology.

To solve the above problems, the present invention further provides a computer-readable storage medium in which at least one instruction is stored, the at least one instruction being executed by a processor in an electronic device to implement the above method for recognizing handwritten fonts based on AI technology.

Compared with the prior art, the present invention proposes a method for recognizing handwritten fonts based on AI technology, which has the following advantages:

First, the scheme proposes a method for handwriting recognition based on AI technology that effectively improves the accuracy and robustness of handwriting recognition through feature extraction, model design and a correction strategy.

At the same time, the scheme performs high-dimensional feature extraction on the images in the standard handwritten image set and integrates the features of handwritten images of the same handwriting type into a characterization vector sequence, effectively capturing the key features of the handwritten images and laying the foundation for the subsequent continuous dependency analysis. A long short-term memory network (LSTM) fused with an attention mechanism is adopted as the continuous dependency analysis model, which processes sequence data more effectively, especially when capturing long-distance dependencies; the attention mechanism lets the model dynamically focus on the key information in the sequence when processing the input at each time step, improving the accuracy and efficiency of the model.

In addition, in the decoding and mapping stage, the scheme performs recognition correction with a language model: it not only considers the font recognition results but also uses the language model to score and correct them, which effectively reduces recognition errors and significantly improves recognition accuracy, especially for complex or unclear handwritten images.

Brief Description of the Drawings

FIG. 1 is a schematic flow chart of a method for recognizing handwritten fonts based on AI technology provided by an embodiment of the present invention;

FIG. 2 is a functional module diagram of a system for recognizing handwritten fonts based on AI technology provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an electronic device implementing the method for recognizing handwritten fonts based on AI technology provided by an embodiment of the present invention.

The realization of the objects, functional features and advantages of the present invention will be further explained with reference to the embodiments and the accompanying drawings.

Detailed Description of the Embodiments

It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.

The embodiments of the present application provide a method for recognizing handwritten fonts based on AI technology. The execution subject of the method includes, but is not limited to, at least one of the electronic devices, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the method for recognizing handwritten fonts based on AI technology can be executed by software or hardware installed on a terminal device or a server device, and the software can be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server or a cloud server cluster.

Embodiment 1:

S1: Collect the original handwritten image and convert it to grayscale, then simplify the grayscale handwritten image using a binarization technique that highlights the contrast between text and background, obtaining a simplified handwritten image.

Collecting the original handwritten image and performing grayscale processing in step S1 includes:

S21: Scan the handwritten text to obtain the original handwritten image, where 300 dpi is the resolution of the scanned image;

S22: Convert each pixel of the original handwritten image to grayscale, with the calculation formula:

M = 0.299×R + 0.587×G + 0.114×B

where R, G and B represent the intensity values of the red, green and blue channels of each pixel in the original color image, and M represents the calculated grayscale value.
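As an illustration of step S1, the sketch below converts an RGB scan to grayscale with the weighted formula above and then binarizes it. The fixed threshold of 128 and the use of NumPy are assumptions made for the example; any thresholding strategy (for example Otsu's method) could be substituted.

```python
# Minimal sketch of step S1 (grayscale conversion + binarization), assuming NumPy.
# The fixed threshold of 128 is an illustrative choice, not part of the method.
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Apply M = 0.299*R + 0.587*G + 0.114*B to every pixel of an HxWx3 image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize(gray: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Map text pixels to 1 and background pixels to 0 to sharpen the contrast."""
    return (gray < threshold).astype(np.uint8)  # dark ink on light paper

if __name__ == "__main__":
    scan = np.random.randint(0, 256, size=(64, 64, 3)).astype(np.float32)
    simplified = binarize(to_grayscale(scan))
    print(simplified.shape, simplified.dtype, simplified.max())
```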

S2: Crop and scale the simplified handwritten image to obtain handwritten images of uniform size, generalize the resized images through rotation and translation operations to obtain expanded handwritten images, and combine these with the simplified handwritten images to form a standard handwriting image set.
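A possible realization of the resizing and rotation/translation generalization of step S2 is sketched below using SciPy's ndimage routines; the target size of 64×64, the ±5 degree rotation angles and the 2-pixel shifts are illustrative values, not parameters prescribed by the method.

```python
# Sketch of step S2 (resize to a uniform size + rotation/translation augmentation).
# Target size, angles and offsets are illustrative assumptions.
import numpy as np
from scipy import ndimage

def resize_to(img: np.ndarray, size: tuple[int, int] = (64, 64)) -> np.ndarray:
    """Scale a binary image to a uniform size with bilinear interpolation."""
    zoom_factors = (size[0] / img.shape[0], size[1] / img.shape[1])
    return ndimage.zoom(img.astype(float), zoom_factors, order=1)

def augment(img: np.ndarray) -> list[np.ndarray]:
    """Generalize one image into several by small rotations and translations."""
    variants = [img]
    for angle in (-5, 5):                  # small rotations in degrees
        variants.append(ndimage.rotate(img, angle, reshape=False, order=1))
    for shift in ((2, 0), (0, 2)):         # small pixel translations
        variants.append(ndimage.shift(img, shift, order=1))
    return variants

if __name__ == "__main__":
    simplified = (np.random.rand(80, 50) > 0.5).astype(float)
    standard_set = augment(resize_to(simplified))
    print(len(standard_set), standard_set[0].shape)
```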

S3: Perform high-dimensional feature extraction on the images in the standard handwriting image set to obtain characterization vectors, and integrate the characterization vectors extracted from handwriting images of the same handwriting type into a characterization vector sequence.

Performing high-dimensional feature extraction on the images in the standard handwriting image set to obtain characterization vectors in step S3 includes:

using an improved convolutional neural network for high-dimensional feature extraction, comprising:

S31: Extract features from the input image through a convolutional layer. In the convolutional layer, a convolution kernel slides over the input image, takes the element-wise product with the local region it covers and sums the result to generate a feature map, with the calculation formula:

F_ij = Σ_m Σ_n I_(i+m)(j+n) × K_mn + b

where:

F_ij represents the value of the feature map at position (i, j);

I_(i+m)(j+n) represents the pixel value of the input image at position (i+m, j+n);

K_mn represents the weight of the convolution kernel at position (m, n);

b represents the bias term;

m and n are index variables that traverse all positions of the convolution kernel;

S32: Reduce the spatial dimension of the feature map through a pooling layer, reducing the number of parameters and the amount of computation while keeping the features unchanged, with the calculation formula:

P_ij = (1 / (M×N)) Σ_m Σ_n F_(i+m)(j+n)

where:

P_ij represents the value of the pooled feature map at position (i, j);

F_(i+m)(j+n) represents the value of the feature map output by the convolutional layer at position (i+m, j+n);

M×N is the size of the pooling window;

S33: Concatenate the extracted feature map matrices into a global feature vector and obtain the characterization vector through a residual connection.

Obtaining the characterization vector through a residual connection in step S33 includes:

learning the residual mapping between the global feature vector and the characterization vector by introducing a shortcut connection, with the calculation formula:

F(x) = H(x) + x

where:

x represents the input of the residual connection;

H(x) represents the result of processing the input x by the stack of convolutional layers;

F(x) represents the output of the residual module.
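Steps S31 to S33 can be illustrated with a plain NumPy sketch of convolution, average pooling and a residual addition. The kernel values, the 2×2 pooling window and the single linear transform standing in for the stacked convolutional layers H(x) are assumptions chosen to keep the example small; they are not the patent's network configuration.

```python
# NumPy sketch of S31 (convolution), S32 (average pooling) and S33 (residual add).
# Kernel, window size and vector dimensions are illustrative assumptions.
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray, bias: float = 0.0) -> np.ndarray:
    """F_ij = sum_m sum_n I_(i+m)(j+n) * K_mn + b (valid convolution, stride 1)."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return out

def avg_pool(fmap: np.ndarray, m: int = 2, n: int = 2) -> np.ndarray:
    """P_ij = (1/(M*N)) * sum of the values inside each MxN pooling window."""
    h, w = fmap.shape[0] // m, fmap.shape[1] // n
    return fmap[:h * m, :w * n].reshape(h, m, w, n).mean(axis=(1, 3))

def residual_block(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """F(x) = H(x) + x, with H reduced to one linear transform for illustration."""
    return weight @ x + x

if __name__ == "__main__":
    img = np.random.rand(32, 32)
    kernel = np.array([[1.0, 0.0, -1.0]] * 3)       # simple edge-like kernel (illustrative)
    features = avg_pool(conv2d(img, kernel))
    global_vec = features.reshape(-1)                # concatenation into a global vector
    W = np.eye(global_vec.size) * 0.1                # stand-in for learned layers
    characterization_vec = residual_block(global_vec, W)
    print(characterization_vec.shape)
```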

S4: Construct a continuous dependency analysis model that takes the characterization vector sequence as input and outputs a recognized-font semantic vector sequence, where an LSTM fused with an attention mechanism is the main implementation of the continuous dependency analysis model.

Constructing the continuous dependency analysis model in step S4 includes:

S51: Extract context information from the characterization vector sequence to obtain the context vector c_(t-1), and feed the extracted context vector together with the characterization vector x_t input at the current time step into the LSTM to obtain the LSTM output at the current time step;

S52: Use the LSTM output h_t at the current time step and the context vector c_t to generate the semantic feature vector y_t of the current time step through a fully connected layer.

Generating the semantic feature vector y_t of the current time step through a fully connected layer using the LSTM output h_t and the context vector c_t in step S52 includes:

S61: Compute the attention weights from the characterization vector sequence, with the calculation formulas:

e_t,s = a(h_(t-1), h_s)

α_t,s = exp(e_t,s) / Σ_s' exp(e_t,s')

where e_t,s is the attention score; α_t,s represents the normalized attention weight; t and s are time-step indices; a() is the self-attention scoring function;

S62: Compute the context vector from the attention weights, with the calculation formula:

c_t = Σ_s α_t,s × h_s

where c_t represents the weighted-average context vector and h_s represents the LSTM output at time step s.
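The attention computation of S61 and S62 reduces to a softmax over alignment scores followed by a weighted sum of the LSTM outputs. The sketch below uses a simple dot-product score as the scoring function a(), which the patent does not specify; that choice and the vector dimensions are assumptions.

```python
# Sketch of S61-S62: attention scores, softmax normalization, weighted context vector.
# Dot-product scoring and the dimensions are illustrative assumptions.
import numpy as np

def attention_context(h_prev: np.ndarray, encoder_states: np.ndarray) -> np.ndarray:
    """Return c_t = sum_s alpha_{t,s} * h_s, with alpha = softmax(e) and e_{t,s} = a(h_{t-1}, h_s)."""
    scores = encoder_states @ h_prev                    # e_{t,s}: dot-product score
    scores -= scores.max()                              # numerical stability
    alphas = np.exp(scores) / np.exp(scores).sum()      # alpha_{t,s}
    return alphas @ encoder_states                      # c_t

if __name__ == "__main__":
    hidden_dim, seq_len = 8, 5
    rng = np.random.default_rng(0)
    h_states = rng.normal(size=(seq_len, hidden_dim))   # h_s for s = 1..seq_len
    h_prev = rng.normal(size=hidden_dim)                 # h_{t-1}
    c_t = attention_context(h_prev, h_states)
    print(c_t.shape)                                      # (8,)
```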

S5: Decode and map the output semantic vector sequence to obtain the recognized font content, where recognition correction combined with a language model is the main implementation of the decoding and mapping.

Decoding and mapping the output semantic vector sequence in step S5 to obtain the recognized font content includes:

S71: For the semantic vector of each time step, obtain the output of that step through a fully connected mapping, with the calculation formula:

z_t = W·y_t + b

where z_t represents the output of the fully connected layer, W represents the weight matrix of the fully connected layer, and b represents the bias term;

S72: Convert the output of the fully connected layer into a probability distribution through Softmax. For the fully connected layer output z_t at each time step t, the probability distribution is computed as:

P(y_t = k | z_t) = exp(z_t,k) / Σ_(j=1..K) exp(z_t,j)

where:

P(y_t = k | z_t) represents the probability of predicting class k at time step t given the input z_t;

z_t,k represents the component of the vector z_t corresponding to class k;

K is the total number of classes, i.e. the number of different fonts to be recognized.
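Steps S71 and S72 amount to a linear projection followed by a softmax at every time step. The sketch below shows that computation on random inputs; the semantic-vector dimension and the number of classes are placeholder values.

```python
# Sketch of S71-S72: fully connected mapping z_t = W*y_t + b, then softmax over classes.
# Dimensions (semantic_dim, num_classes) are placeholder values for illustration.
import numpy as np

def decode_step(y_t: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Return P(y_t = k | z_t) for all k as a probability vector."""
    z_t = W @ y_t + b
    z_t -= z_t.max()                       # numerical stability
    exp_z = np.exp(z_t)
    return exp_z / exp_z.sum()

if __name__ == "__main__":
    semantic_dim, num_classes = 16, 50
    rng = np.random.default_rng(1)
    W = rng.normal(size=(num_classes, semantic_dim))
    b = np.zeros(num_classes)
    y_seq = rng.normal(size=(4, semantic_dim))           # semantic vectors for 4 time steps
    probs = np.stack([decode_step(y_t, W, b) for y_t in y_seq])
    print(probs.shape, probs.sum(axis=1))                # each row sums to 1
```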

The recognition correction combined with a language model in step S5 includes:

S81: Construct a language scoring model, where an N-gram model is the main implementation of the scoring model, with the calculation formula:

P(S) = Π_(i=1..N) P(s_i | s_1, s_2, ..., s_(i-1))

where S = (s_1, s_2, ..., s_N) represents the font sequence over the first N time steps, and P(s_i | s_1, s_2, ..., s_(i-1)) represents the conditional probability of the i-th font appearing given the previous i-1 fonts;

S82: Fuse the probability distribution of font recognition with the score of the language model, compute a combined score for each possible sequence, and select the sequence with the highest score as the final recognition result.

Fusing the probability distribution of font recognition with the score of the language model and computing the combined score for each possible sequence in step S82 includes:

computing the scores of all possible sequences with the formula:

Score(S) = α·P_font(S) + β·P_lang(S)

where:

P_font(S) represents the probability of the sequence S computed by the decoding mapping;

P_lang(S) represents the probability computed by the language scoring model for the sequence S;

α and β represent weight parameters used to balance the font recognition probability and the language model probability.
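One way to realize S81 and S82 is sketched below: a bigram language model estimated from a tiny corpus and a linear fusion of recognition and language scores. The toy corpus, the bigram order (N = 2), the add-one smoothing and the weights α = β = 0.5 are illustrative assumptions only.

```python
# Sketch of S81-S82: bigram language scoring plus linear fusion with recognition scores.
# Corpus, bigram order, smoothing and fusion weights are illustrative assumptions.
from collections import Counter

def train_bigram(corpus: list[str]):
    """Return a function seq -> P_lang(seq) built from simple bigram counts."""
    unigrams, bigrams = Counter(), Counter()
    for text in corpus:
        unigrams.update(text)
        bigrams.update(zip(text, text[1:]))

    def p_lang(seq: str) -> float:
        p = 1.0
        for a, b in zip(seq, seq[1:]):
            p *= (bigrams[(a, b)] + 1) / (unigrams[a] + len(unigrams))  # add-one smoothing
        return p

    return p_lang

def best_sequence(candidates: dict[str, float], p_lang, alpha=0.5, beta=0.5) -> str:
    """Score(S) = alpha * P_font(S) + beta * P_lang(S); return the highest-scoring sequence."""
    return max(candidates, key=lambda s: alpha * candidates[s] + beta * p_lang(s))

if __name__ == "__main__":
    p_lang = train_bigram(["handwriting", "handwritten", "hand"])
    candidates = {"hand": 0.40, "hamd": 0.41}   # recognizer slightly prefers the wrong string
    print(best_sequence(candidates, p_lang))     # the language model pulls it back to "hand"
```

In practice the candidate sequences would typically come from a beam search over the per-step distributions produced in S72 rather than from a hand-written dictionary as in this toy example.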

Embodiment 2:

FIG. 2 is a functional module diagram of the system for recognizing handwritten fonts based on AI technology provided by an embodiment of the present invention; the system can implement the method for recognizing handwritten fonts based on AI technology of Embodiment 1.

The system 100 for recognizing handwritten fonts based on AI technology of the present invention can be installed in an electronic device. According to the functions implemented, the system may include a data acquisition module 101, a continuous dependency analysis module 102 and a font recognition module 103. The modules of the present invention may also be called units; they refer to series of computer program segments that can be executed by the processor of the electronic device, can complete fixed functions, and are stored in the memory of the electronic device.

The data acquisition module 101 is configured to acquire original handwritten images and perform grayscale processing to obtain simplified handwritten images that form a standard handwritten image set, perform high-dimensional feature extraction on the images in the standard handwritten image set to obtain characterization vectors, and integrate the characterization vectors extracted from handwritten images of the same handwriting type into a characterization vector sequence;

the continuous dependency analysis module 102 is configured to construct a continuous dependency analysis model that takes the characterization vector sequence as input and produces a font semantic vector sequence;

the font recognition module 103 is configured to decode and map the output semantic vector sequence to obtain the recognized font content.

In detail, when used, the modules of the system 100 for recognizing handwritten fonts based on AI technology in this embodiment adopt the same technical means as the method for recognizing handwritten fonts based on AI technology described above with reference to FIG. 1, and can produce the same technical effects, which will not be repeated here.

Embodiment 3:

FIG. 3 is a schematic structural diagram of an electronic device implementing the method for recognizing handwritten fonts based on AI technology provided by an embodiment of the present invention.

The electronic device 1 may include a processor 10, a memory 11, a communication interface 13 and a bus, and may also include a computer program, such as a program 12, stored in the memory 11 and executable on the processor 10.

The memory 11 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card-type memory (for example SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, for example a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the program 12, but also to temporarily store data that has been output or is to be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more combinations of central processing units (CPUs), microprocessors, digital processing chips, graphics processors and various control chips. The processor 10 is the control unit of the electronic device; it connects the components of the entire electronic device through various interfaces and lines, and executes the various functions of the electronic device 1 and processes data by running or executing programs or modules stored in the memory 11 (for example the program 12 for recognizing handwritten fonts based on AI technology) and calling data stored in the memory 11.

The communication interface 13 may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices and to realize connection and communication between internal components of the electronic device.

The bus may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to realize connection and communication between the memory 11, the at least one processor 10 and other components.

FIG. 3 only shows an electronic device with certain components. Those skilled in the art will appreciate that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) for supplying power to each component. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators and other components. The electronic device 1 may also include various sensors, a Bluetooth module, a Wi-Fi module and the like, which will not be described here.

Optionally, the electronic device 1 may also include a user interface. The user interface may be a display or an input unit (such as a keyboard), and optionally may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, and the like. The display may also be appropriately referred to as a display screen or a display unit, and is used to display the information processed in the electronic device 1 and a visualized user interface.

It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.

It should be noted that the serial numbers of the above embodiments of the present invention are only for description and do not represent the relative merits of the embodiments. In addition, the terms "comprising", "including" or any other variants thereof herein are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, device, article or method. In the absence of further restrictions, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, device, article or method that includes that element.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server or a network device, etc.) to execute the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the patent protection scope of the present invention.

Claims (10)

1. A method and system for recognizing handwriting fonts based on AI technology, characterized in that the method comprises the following steps:
S1: collecting an original handwriting image and carrying out grayscale processing, simplifying the grayscale-processed handwriting image by adopting a binarization technique, and highlighting the contrast between characters and background to obtain a simplified handwriting image;
S2: cropping and scaling the simplified handwriting image to obtain handwriting images of uniform size, generalizing the resized handwriting images through rotation and translation operations to obtain expanded handwriting images, and forming a standard handwriting image set together with the simplified handwriting images;
S3: extracting high-dimensional features of the images in the standard handwriting image set to obtain characterization vectors, and integrating the characterization vectors extracted from handwriting images of the same handwriting type to obtain a characterization vector sequence;
S4: constructing a continuous dependency analysis model, wherein the model takes the characterization vector sequence as input and outputs a recognized-font semantic vector sequence, and an LSTM fused with an attention mechanism is the main implementation of the continuous dependency analysis model;
S5: decoding and mapping the output semantic vector sequence to obtain the recognized font content, wherein recognition correction combined with a language model is the main implementation of the decoding and mapping.
2. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 1, wherein collecting an original handwriting image and performing grayscale processing in step S1 includes:
S21: scanning the handwritten text to obtain an original handwriting image, wherein 300 dpi is the resolution of the scanned image;
S22: performing grayscale processing on each pixel in the original handwriting image, with the calculation formula:
M = 0.299×R + 0.587×G + 0.114×B
wherein R, G and B represent the intensity values of the red, green and blue channels, respectively, of each pixel in the original color image, and M represents the calculated grayscale value.
3. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 2, wherein extracting high-dimensional features of the images in the standard handwriting image set to obtain characterization vectors in step S3 includes:
high-dimensional feature extraction using an improved convolutional neural network, comprising:
S31: extracting features from an input image through a convolutional layer, wherein in the convolutional layer a convolution kernel slides over the input image, takes the element-wise product with the local region it covers and sums the result to generate a feature map, with the calculation formula:
F_ij = Σ_m Σ_n I_(i+m)(j+n) × K_mn + b
wherein:
F_ij represents the value of the feature map at position (i, j);
I_(i+m)(j+n) denotes the pixel value of the input image at position (i+m, j+n);
K_mn represents the weight of the convolution kernel at position (m, n);
b represents a bias term;
m and n are index variables traversing all positions of the convolution kernel;
S32: reducing the spatial dimension of the feature map through a pooling layer, reducing the number of parameters and the amount of computation while keeping the features unchanged, with the calculation formula:
P_ij = (1 / (M×N)) Σ_m Σ_n F_(i+m)(j+n)
wherein:
P_ij represents the value of the pooled feature map at position (i, j);
F_(i+m)(j+n) represents the value of the feature map output by the convolutional layer at position (i+m, j+n);
M×N is the size of the pooling window;
S33: concatenating the extracted feature map matrices into a global feature vector, and obtaining the characterization vector through a residual connection.
4. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 3, wherein obtaining the characterization vector through a residual connection in step S33 comprises:
learning the residual mapping between the global feature vector and the characterization vector by introducing a shortcut connection, with the calculation formula:
F(x) = H(x) + x
wherein:
x represents the input of the residual connection;
H(x) represents the result of processing the input x by the stack of convolutional layers;
F(x) represents the output of the residual module.
5. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 1, wherein constructing the continuous dependency analysis model in step S4 includes:
S51: extracting context information from the characterization vector sequence to obtain a context vector c_(t-1), and taking the extracted context vector and the characterization vector x_t input at the current time step as inputs of the LSTM to obtain the output of the LSTM at the current time step;
S52: generating the semantic feature vector y_t of the current time step through a fully connected layer using the LSTM output h_t at the current time step and the context vector c_t.
6. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 5, wherein generating the semantic feature vector y_t of the current time step through the fully connected layer using the LSTM output h_t at the current time step and the context vector c_t in step S52 comprises:
S61: calculating attention weights from the characterization vector sequence, with the calculation formulas:
e_t,s = a(h_(t-1), h_s)
α_t,s = exp(e_t,s) / Σ_s' exp(e_t,s')
wherein: e_t,s is the attention score;
α_t,s represents the normalized attention weight;
t and s are time-step indices; a() is the self-attention scoring function;
S62: calculating the context vector from the attention weights, with the calculation formula:
c_t = Σ_s α_t,s × h_s
wherein: c_t denotes the weighted-average context vector;
h_s represents the LSTM output at time step s.
7. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 1, wherein decoding and mapping the output semantic vector sequence to obtain the recognized font content in step S5 includes:
S71: for the semantic vector of each time step, obtaining the output of that step through a fully connected mapping, with the calculation formula:
z_t = W·y_t + b
wherein: z_t represents the output of the fully connected layer;
W represents the weight matrix of the fully connected layer;
b represents a bias term;
S72: converting the output of the fully connected layer into a probability distribution through Softmax, wherein for the fully connected layer output z_t of each time step t, the probability distribution is calculated as:
P(y_t = k | z_t) = exp(z_t,k) / Σ_(j=1..K) exp(z_t,j)
wherein:
P(y_t = k | z_t) represents the probability of predicting class k at time step t given the input z_t;
z_t,k represents the component of the vector z_t corresponding to class k;
K is the total number of classes, i.e. the number of different fonts to be recognized.
8. The method and system for recognizing handwriting fonts based on AI technology as claimed in claim 1, wherein the recognition correction combined with a language model in step S5 comprises:
S81: constructing a language scoring model, wherein an N-gram model is the main implementation of the scoring model, with the calculation formula:
P(S) = Π_(i=1..N) P(s_i | s_1, s_2, ..., s_(i-1))
wherein:
S = (s_1, s_2, ..., s_N) represents the font sequence over the first N time steps;
P(s_i | s_1, s_2, ..., s_(i-1)) represents the conditional probability of the i-th font appearing given the previous i-1 fonts;
S82: fusing the probability distribution of font recognition with the score of the language model, calculating a combined score for each possible sequence, and selecting the sequence with the highest score as the final recognition result.
9. The method and system for recognizing handwritten fonts based on AI technology as claimed in claim 8, wherein fusing the probability distribution of font recognition with the score of the language model and calculating a combined score for each possible sequence in step S82 includes:
calculating the scores of all possible sequences as:
Score(S) = α·P_font(S) + β·P_lang(S)
wherein:
P_font(S) represents the probability of the sequence S computed by the decoding mapping;
P_lang(S) represents the probability that the language scoring model calculates for the sequence S;
α and β represent weight parameters for balancing the font recognition probability and the language model probability.
10. A system for recognizing handwriting fonts based on AI technology, the system comprising:
a data acquisition module, configured to acquire an original handwriting image and carry out grayscale processing to obtain a simplified handwriting image forming a standard handwriting image set, carry out high-dimensional feature extraction on the images in the standard handwriting image set to obtain characterization vectors, and integrate the characterization vectors extracted from handwriting images of the same handwriting type to obtain a characterization vector sequence;
a continuous dependency analysis module, configured to construct a continuous dependency analysis model and obtain a font semantic vector sequence from the characterization vector sequence as input;
a font recognition module, configured to decode and map the output semantic vector sequence to obtain the recognized font content, so as to implement the method for recognizing handwriting fonts based on AI technology according to any one of claims 1-9.
CN202410302264.6A (priority date: 2024-03-15; filing date: 2024-03-15): Method and system for recognizing handwriting fonts based on AI technology, Pending, published as CN118314588A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410302264.6A CN118314588A (en) 2024-03-15 2024-03-15 Method and system for recognizing handwriting fonts based on AI technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410302264.6A CN118314588A (en) 2024-03-15 2024-03-15 Method and system for recognizing handwriting fonts based on AI technology

Publications (1)

Publication Number Publication Date
CN118314588A 2024-07-09

Family

ID=91724697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410302264.6A Pending CN118314588A (en) 2024-03-15 2024-03-15 Method and system for recognizing handwriting fonts based on AI technology

Country Status (1)

Country Link
CN (1) CN118314588A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050226512A1 (en) * 2001-10-15 2005-10-13 Napper Jonathon L Character string identification
CN109492679A (en) * 2018-10-24 2019-03-19 杭州电子科技大学 Based on attention mechanism and the character recognition method for being coupled chronological classification loss
WO2021025290A1 (en) * 2019-08-06 2021-02-11 삼성전자 주식회사 Method and electronic device for converting handwriting input to text
US20220350998A1 (en) * 2021-04-30 2022-11-03 International Business Machines Corporation Multi-Modal Learning Based Intelligent Enhancement of Post Optical Character Recognition Error Correction
CN113297986A (en) * 2021-05-27 2021-08-24 新东方教育科技集团有限公司 Handwritten character recognition method, device, medium and electronic equipment
CN115761764A (en) * 2022-11-21 2023-03-07 中国科学院合肥物质科学研究院 A Chinese Handwritten Text Line Recognition Method Based on Visual-Language Joint Reasoning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Jie et al.: "Key Technologies and Standard Construction for Cross-Hospital Electronic Medical Record Data Fusion", 31 May 2022, Hunan Science and Technology Press, pages 65-67 *

Similar Documents

Publication Publication Date Title
CN112580643B (en) License plate recognition method and device based on deep learning and storage medium
WO2021012494A1 (en) Deep learning-based face recognition method and apparatus, and computer-readable storage medium
CN110738203B (en) Field structured output method, device and computer readable storage medium
CN111898544B (en) Text image matching method, device and equipment and computer storage medium
US12277801B2 (en) Gesture recognition method, device and computer-readable storage medium
WO2021208617A1 (en) Method and apparatus for recognizing station entering and exiting, terminal, and storage medium
CN111414916A (en) Method and device for extracting and generating text content in image and readable storage medium
CN116343237A (en) Bill identification method based on deep learning and knowledge graph
CN110991279B (en) Document Image Analysis and Recognition Method and System
CN114677526A (en) Image classification method, device, equipment and medium
CN112883980A (en) Data processing method and system
CN114155540A (en) Character recognition method, device and equipment based on deep learning and storage medium
CN114241499A (en) Table picture identification method, device and equipment and readable storage medium
CN116841893A (en) Improved GPT 2-based automatic generation method and system for Robot Framework test cases
CN117831056A (en) Bill information extraction method, device and bill information extraction system
CN118227773A (en) Question answering method and device based on multi-mode large model
CN115810197A (en) Multi-mode electric power form recognition method and device
CN112749639B (en) Model training method and device, computer equipment and storage medium
CN118628457A (en) A molecular biology image analysis method and system
CN118314588A (en) Method and system for recognizing handwriting fonts based on AI technology
CN116895074A (en) Digital verification method, device, equipment and medium based on optical character recognition
CN117373076A (en) Attribute identification method, attribute identification system and related device
CN116484844A (en) Document OCR recognition result error correction method, system, equipment and medium
CN112950749B (en) Handwriting picture generation method based on generation countermeasure network
CN116912872A (en) Drawing identification method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination