
WO2018099194A1 - Character recognition method and apparatus - Google Patents

Character recognition method and apparatus

Info

Publication number
WO2018099194A1
WO2018099194A1 (application PCT/CN2017/105843)
Authority
WO
WIPO (PCT)
Prior art keywords
target image
character
neural network
feature map
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/105843
Other languages
English (en)
French (fr)
Inventor
郑钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to US16/464,922 (US11003941B2)
Priority to EP17877227.3A (EP3550473A4)
Publication of WO2018099194A1
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to a character recognition method and apparatus.
  • By recognizing characters in an image, the license plate number of a vehicle, building signage, and the like can be acquired; or, by recognizing a delivery order, an express tracking number can be obtained.
  • The existing character recognition method mainly detects a character region containing characters according to artificially designed features; then the character region is segmented to obtain individual character blocks; finally, each character block is recognized by a classifier, thereby obtaining the characters in the image.
  • When the character region in the image is detected and segmented, the detection is based on artificially designed features.
  • However, different imaging environments, shooting conditions, and the like may result in large differences in image quality.
  • Artificially designed features do not adapt well to images of varying quality, resulting in lower accuracy of character region detection and segmentation, which in turn leads to lower accuracy of the character recognition results.
  • the purpose of the embodiment of the present application is to provide a character recognition method and device to improve the accuracy of character recognition.
  • the specific technical solutions are as follows:
  • an embodiment of the present application provides a character recognition method, where the method includes:
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • the step of determining a feature map corresponding to the character region of the target image includes:
  • a feature map including characters is identified based on the feature map corresponding to each candidate region, and the identified feature map is determined as a feature map corresponding to the character region of the target image.
  • the method further includes:
  • the position and/or shape of each candidate area is adjusted.
  • the step of determining a feature map corresponding to the character region of the target image includes:
  • the deep neural network at least includes: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; and the step of performing character recognition on the feature map corresponding to each character region by using the deep neural network, to obtain the characters included in the target image, includes:
  • the extracted feature map is classified and identified by the classifier and the sequence decoder to obtain characters included in the target image.
  • the training process of the deep neural network includes:
  • the deep neural network is trained by using each sample image, the character region calibration result of each sample image, and the characters included in each sample image as training samples.
  • an embodiment of the present application provides a character recognition apparatus, where the apparatus includes:
  • a first acquiring module configured to acquire a target image including characters to be analyzed
  • a determining module configured to input the target image into a pre-trained deep neural network, and determine a feature map corresponding to a character region of the target image
  • An identification module configured to perform character recognition on the feature map corresponding to each character region by using the deep neural network, to obtain characters included in the target image
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • the determining module includes:
  • Determining a sub-module configured to determine each candidate region included in the target image according to a preset dividing rule
  • a first extraction sub-module configured to perform feature extraction on each candidate region to obtain a feature map corresponding to each candidate region
  • the first identification sub-module is configured to identify a feature map including characters according to the feature map corresponding to each candidate region, and determine the identified feature map as a feature map corresponding to the character region of the target image.
  • the device further includes:
  • An adjustment module for adjusting the position and/or shape of each candidate region.
  • the determining module includes:
  • a second extraction sub-module configured to perform feature extraction on the target image to obtain a feature map corresponding to the target image
  • a second identification sub-module configured to perform pixel-level analysis on the feature map corresponding to the target image, identify the area containing characters, and determine the feature map corresponding to the identified area as the feature map corresponding to the character region in the target image.
  • the deep neural network at least includes: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder;
  • the identification module includes:
  • a third extraction sub-module configured to perform character-level feature extraction on each character region by using the convolutional neural network
  • a fourth extraction sub-module configured to perform context feature extraction on each character region by using the recurrent neural network
  • a third identification sub-module configured to classify and decode the extracted features by using the classifier and the sequence decoder to obtain the characters included in the target image.
  • the device further includes:
  • a second acquiring module configured to acquire a sample image, a character region calibration result of each sample image, and characters included in each sample image
  • a training module configured to use the sample image, the character region calibration result of each sample image, and the characters included in each sample image as training samples to train the deep neural network.
  • the embodiment of the present application further provides an electronic device, including:
  • a processor, a memory, a communication interface, and a bus;
  • the processor, the memory, and the communication interface are connected by the bus and complete communication with each other;
  • the memory stores executable program code
  • the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, for performing the character recognition method according to the first aspect of the present application at runtime.
  • the present application provides a storage medium for storing executable program code which, at runtime, performs the character recognition method according to the first aspect of the present application.
  • the present application provides an application for performing a character recognition method according to the first aspect of the present application at runtime.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The target image including the characters to be analyzed is then acquired.
  • The target image is input into the deep neural network, and the feature maps corresponding to the character regions of the target image can be accurately determined; the deep neural network then performs character recognition on the feature map corresponding to each character region, thereby accurately obtaining the characters included in the target image.
  • FIG. 1 is a flowchart of a character recognition method according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of a target image including characters according to an embodiment of the present application.
  • FIG. 3(a) is a schematic diagram of a character area according to an embodiment of the present application.
  • Figure 3 (b) is a schematic diagram showing the result of adjusting the character area shown in Figure 3 (a);
  • FIG. 4 is another flowchart of a character recognition method according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a character recognition apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is another schematic structural diagram of a character recognition apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • The embodiment of the present application provides a character recognition method and device.
  • the embodiment of the present application provides a character recognition method process. As shown in FIG. 1, the process may include the following steps:
  • the method provided by the embodiment of the present application can be applied to an electronic device.
  • the electronic device can be a desktop computer, a portable computer, a smart mobile terminal, or the like.
  • the electronic device may identify an image including characters to obtain characters included therein.
  • the electronic device may identify an image collected by the image capturing device on the road to obtain a license plate number included therein; or, the image captured by the user may be subjected to character recognition to obtain character information included therein.
  • a wired or wireless connection may be established between the image capturing device and the electronic device, so that the image capturing device may send the captured image to the electronic device.
  • The connection between the image acquisition device and the electronic device can be established through a wireless connection method such as WiFi (Wireless Fidelity), NFC (Near Field Communication), or Bluetooth, which is not limited in this embodiment of the present application.
  • the electronic device may receive the target image transmitted by the image capturing device or the target image input by the user to identify the characters included in the target image.
  • FIG. 2 shows a schematic diagram of a target image including characters acquired by an electronic device.
  • The electronic device may also obtain the target image by other means, which is not limited in this embodiment of the present application.
  • S102 Input the target image into a pre-trained deep neural network, and determine a feature map corresponding to the character region of the target image.
  • For example, the electronic device may train the deep neural network in advance from a certain number of sample images, such as 100, 500, or 1000 images, the pre-calibrated character region of each sample image, and the characters included in each sample image.
  • the deep neural network may determine a feature map corresponding to a character region of the target image, and according to the feature map, obtain characters included in the target image.
  • the electronic device may input the target image into the pre-trained deep neural network, and then detect each region of the target image to identify the character region including the character. And, the feature map corresponding to each character region can be determined.
  • the character region in the target image can be determined as the region 210 through the deep neural network.
  • feature extraction can be performed through a deep neural network to obtain a feature map corresponding to the character region.
  • The electronic device may further perform character recognition on the feature map corresponding to each character region through the deep neural network to obtain the characters included in the target image. For example, through the deep neural network, each character region can be recognized separately, and each character included in each character region can be recognized, thereby obtaining the characters included in the target image.
  • The electronic device may recognize the characters included therein as: ⁇FC508.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The target image including the characters to be analyzed is then acquired.
  • The target image is input into the deep neural network, and the feature maps corresponding to the character regions of the target image can be accurately determined; the deep neural network then performs character recognition on the feature map corresponding to each character region, thereby accurately obtaining the characters included in the target image.
  • In one implementation, when the electronic device determines the feature map corresponding to the character region of the target image, each candidate region included in the target image may first be determined according to a preset dividing rule, such as a rule specifying the size and shape of each candidate region. For example, rectangular candidate regions with a size of 20 pixels * 30 pixels can be determined from the target image.
  • The candidate regions may or may not overlap each other, which is not limited in this embodiment of the present application.
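As a concrete illustration of the dividing rule described above, the sketch below enumerates fixed-size rectangular candidate regions over the image. The 20 pixel * 30 pixel window matches the example in the text, but the stride and function names are illustrative assumptions, not details fixed by the application:

```python
# Sketch: enumerate fixed-size candidate regions over a target image
# under a preset dividing rule. The stride is an illustrative assumption;
# with stride < window size, regions overlap, which the text allows.

def candidate_regions(img_w, img_h, win_w=30, win_h=20, stride=10):
    """Return (x, y, w, h) boxes covering the image."""
    boxes = []
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            boxes.append((x, y, win_w, win_h))
    return boxes

boxes = candidate_regions(100, 40)  # 24 overlapping 30x20 windows
```

Each box would then be passed to feature extraction, and the boxes whose feature maps contain characters are kept.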
  • Then, the electronic device may perform feature extraction on each candidate region to obtain a feature map corresponding to each candidate region; finally, among the feature maps corresponding to the candidate regions, the feature maps containing characters are identified, and the identified feature maps are determined as the feature maps corresponding to the character regions of the target image.
  • In practice, the character formats in the target image may be diverse, and a character region may not be a regular rectangle or square but another shape, such as a parallelogram. Therefore, after the target image is divided into regions of regular shape, the detected character region may not be particularly accurate.
  • the obtained character region may be an area as shown in FIG. 3(a).
  • the obtained character area does not contain all the character contents very accurately.
  • the position and/or shape of each candidate region may be adjusted. For example, operations such as rotation, translation, and the like can be performed on each candidate region.
  • a vector for adjusting the character region can be trained according to the irregularly shaped character region included in the sample image.
  • the position and/or shape of each candidate region can be adjusted according to the vector obtained by the training.
  • a character area as shown in FIG. 3(b) can be obtained.
  • the adjusted character area can contain all the character contents very accurately.
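The adjustment step above can be sketched as applying a translation/rotation vector to a candidate region's corner points. The parameter vector (dx, dy, theta) here is a hypothetical stand-in for the trained adjustment vector the text describes:

```python
import math

# Sketch: adjust a candidate region by a translation/rotation vector
# (dx, dy, theta). In the described scheme this vector would be learned
# from irregularly shaped character regions in the sample images; here
# it is supplied directly for illustration.

def adjust_region(corners, dx, dy, theta):
    """Rotate the region's corners by theta (radians) about its centroid,
    then translate by (dx, dy)."""
    cx = sum(p[0] for p in corners) / len(corners)
    cy = sum(p[1] for p in corners) / len(corners)
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y in corners:
        rx = c * (x - cx) - s * (y - cy) + cx
        ry = s * (x - cx) + c * (y - cy) + cy
        out.append((rx + dx, ry + dy))
    return out

# A pure translation (theta = 0) shifts every corner without reshaping.
shifted = adjust_region([(0, 0), (30, 0), (30, 20), (0, 20)], 5, -2, 0.0)
```

A sheared (parallelogram) region would additionally need shear terms; this sketch covers only the rotation and translation operations named in the text.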
  • In another implementation, feature extraction may first be performed on the target image to obtain a feature map corresponding to the target image; then pixel-level analysis is performed on the feature map corresponding to the target image to identify the area containing characters, and the feature map corresponding to the identified area is determined as the feature map corresponding to the character region in the target image.
  • For example, the electronic device may analyze each pixel in turn according to a set analysis order, such as from left to right and from top to bottom. After the analysis is completed, the pixels belonging to characters are determined, and the region formed by those pixels is determined; finally, the feature map corresponding to that region is determined as the feature map corresponding to the character region in the target image.
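The pixel-level analysis can be sketched as follows, assuming a hypothetical per-pixel character-score map and threshold (neither is specified by the application); pixels are scanned left to right, top to bottom, and the bounding box of the character pixels is taken as the character region:

```python
# Sketch: pixel-level analysis of a per-pixel character score map.
# The score map and the 0.5 threshold are illustrative assumptions.

def character_region(score_map, threshold=0.5):
    """Scan pixels top-to-bottom, left-to-right; return the bounding box
    (x0, y0, x1, y1) of pixels scored as character, or None if none."""
    xs, ys = [], []
    for y, row in enumerate(score_map):          # top to bottom
        for x, score in enumerate(row):          # left to right
            if score >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))

scores = [
    [0.1, 0.2, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.9, 0.1],
]
region = character_region(scores)  # (1, 1, 2, 2)
```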
  • The deep neural network in this embodiment may at least include: a convolutional neural network, a recurrent neural network, a classifier, a sequence decoder, and the like.
  • The Convolutional Neural Network (CNN) is a feedforward artificial neural network. Its neurons respond to surrounding units within a limited receptive field, and the network effectively extracts the structural information of an image through weight sharing and feature aggregation.
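A minimal illustration of the weight sharing just described: a single small kernel is reused at every image position, so the same few weights detect the same local pattern everywhere. The kernel and image values are illustrative, not taken from the application:

```python
# Sketch: a single 2-D convolution (valid padding, no bias), showing
# weight sharing: one kernel slides over the whole image.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw)
            )
    return out

# A vertical difference kernel responds where adjacent rows differ,
# e.g. at a stroke edge.
edge = conv2d([[0, 0, 0], [1, 1, 1], [1, 1, 1]], [[-1], [1]])
```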
  • The Recurrent Neural Network (RNN) is an artificial neural network with a cyclic structure. By passing hidden-layer features along the sequence direction, the feature computation at the current sequence point can be supported by context information. Through weight sharing and feature aggregation, it is suitable for deep-learning modeling of complex sequence problems (such as temporal or spatial sequences).
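The context propagation just described can be sketched with a scalar recurrence. Real recurrent networks use learned weight matrices and often gated cells such as LSTM or GRU, so the scalar weights here are purely illustrative:

```python
import math

# Sketch: the hidden-state recurrence of a plain recurrent network.
# The hidden state h carries context forward, so the feature at the
# current step depends on earlier steps. Scalar weights are illustrative.

def rnn_scalar(inputs, w_in=0.5, w_rec=0.5):
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)  # current input + prior context
        states.append(h)
    return states

# Even zero inputs after the first step yield nonzero states: context
# from the first input is carried forward.
states = rnn_scalar([1.0, 0.0, 0.0])
```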
  • When the electronic device performs character recognition on the feature map corresponding to each character region through the deep neural network to obtain the characters included in the target image, character-level feature extraction may first be performed on each character region through the convolutional neural network; then context feature extraction is performed on each character region through the recurrent neural network; finally, the extracted features are classified and decoded by the classifier and the sequence decoder to obtain the characters included in the target image.
  • the electronic device may be pre-trained to obtain a deep neural network for performing character recognition.
  • the character recognition method provided by the embodiment of the present application may further include the following steps:
  • When training the deep neural network, the electronic device may first acquire the sample images.
  • The electronic device can acquire as many sample images as possible, such as 100, 500, or 1000 images, each of which includes characters.
  • The format of the characters included in each sample image may be diverse; for example, the sample images may include characters of different fonts, sizes, and the like.
  • the user can perform calibration of the character area on each sample image, and input the calibration result into the electronic device. Also, characters included in each sample image can be input into the electronic device. Therefore, the electronic device can acquire each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The deep neural network is trained by using each sample image, the character region calibration result of each sample image, and the characters included in each sample image as training samples.
  • That is, the electronic device may use each sample image, the character region calibration result of each sample image, and the characters included in each sample image as training samples, and train the deep neural network accordingly.
  • the training process of the deep neural network may adopt any existing method.
  • the electronic device may use a back propagation algorithm to train the deep neural network.
  • The network parameter gradients can be computed using the stochastic gradient descent method.
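The stochastic-gradient-descent update can be sketched on a single parameter. The quadratic loss and the learning rate below are illustrative assumptions standing in for the network's actual loss and hyperparameters:

```python
# Sketch: the SGD parameter update w <- w - lr * grad, applied to a
# toy loss(w) = (w - 3)^2 with gradient 2 * (w - 3). Learning rate and
# loss are illustrative; a real network updates millions of weights
# with gradients obtained by back propagation.

def sgd_step(w, grad, lr=0.1):
    return w - lr * grad

w = 0.0
for _ in range(100):
    w = sgd_step(w, 2 * (w - 3))  # w converges toward the minimum at 3
```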
  • other methods may be used to train the deep neural network. This embodiment of the present application does not describe the process.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The target image including the characters to be analyzed is then acquired.
  • The target image is input into the deep neural network, and the feature maps corresponding to the character regions of the target image can be accurately determined; the deep neural network then performs character recognition on the feature map corresponding to each character region, thereby accurately obtaining the characters included in the target image.
  • the embodiment of the present application also provides a corresponding device embodiment.
  • FIG. 5 is a character recognition apparatus according to an embodiment of the present disclosure, where the apparatus includes:
  • a first acquiring module 510 configured to acquire a target image that includes a character to be analyzed
  • a determining module 520 configured to input the target image into a pre-trained deep neural network, and determine a feature map corresponding to the character region of the target image;
  • the identification module 530 is configured to perform character recognition on the feature map corresponding to each character region by using the deep neural network to obtain characters included in the target image;
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The target image including the characters to be analyzed is then acquired.
  • The target image is input into the deep neural network, and the feature maps corresponding to the character regions of the target image can be accurately determined; the deep neural network then performs character recognition on the feature map corresponding to each character region, thereby accurately obtaining the characters included in the target image.
  • the determining module 520 includes:
  • a first extraction sub-module (not shown) for performing feature extraction on each candidate region to obtain a feature map corresponding to each candidate region
  • a first identification sub-module (not shown) configured to identify the feature maps containing characters according to the feature map corresponding to each candidate region, and to determine the identified feature maps as the feature maps corresponding to the character regions of the target image.
  • the device further includes:
  • An adjustment module (not shown) for adjusting the position and/or shape of each candidate region.
  • the determining module 520 includes:
  • a second identification sub-module (not shown) configured to perform pixel-level analysis on the feature map corresponding to the target image, identify the area containing characters, and determine the feature map corresponding to the identified area as the feature map corresponding to the character region in the target image.
  • the deep neural network includes at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder;
  • the identification module 530 includes:
  • a third extraction sub-module (not shown) configured to perform character-level feature extraction on each character region by using the convolutional neural network
  • a fourth extraction sub-module (not shown) configured to perform context feature extraction on each character region by using the recurrent neural network
  • a third identification sub-module (not shown) configured to classify and decode the extracted features by using the classifier and the sequence decoder to obtain the characters included in the target image.
  • the device further includes:
  • a second obtaining module 540 configured to acquire a sample image, a character region calibration result of each sample image, and a character included in each sample image;
  • the training module 550 is configured to train the deep neural network by using each sample image, the character region calibration result of each sample image, and the characters included in each sample image as training samples.
  • an electronic device which may include:
  • a processor 710, a memory 720, a communication interface 730, and a bus 740;
  • the processor 710, the memory 720, and the communication interface 730 are connected by the bus 740 and complete communication with each other;
  • the memory 720 stores executable program code
  • the processor 710 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 720, for performing the character recognition method according to the embodiment of the present application at runtime, wherein the character recognition method includes:
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • The target image including the characters to be analyzed is then acquired.
  • The target image is input into the deep neural network, and the feature maps corresponding to the character regions of the target image can be accurately determined; the deep neural network then performs character recognition on the feature map corresponding to each character region, thereby accurately obtaining the characters included in the target image.
  • The embodiment of the present application further provides a storage medium for storing executable program code, where the executable program code is used to execute, at runtime, the character recognition method described in the embodiment of the present application, wherein the character recognition method includes:
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • In the solution provided by the embodiment of the present application, the deep neural network may be trained in advance from each sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • After the target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can be performed on the feature map corresponding to each character region through the deep neural network, so as to accurately obtain the characters included in the target image.
  • the embodiment of the present application further provides an application program, where the application is used to execute a character recognition method according to an embodiment of the present application, wherein the character recognition method includes:
  • the deep neural network is obtained by training each sample image, a character region calibration result of each sample image, and characters included in each sample image.
  • the depth neural network may be obtained in advance according to the sample image, the character region calibration result of each sample image, and the characters included in each sample image.
  • the target image including the character is obtained.
  • the target image is input into the depth neural network, and the feature map corresponding to the character region of the target image can be accurately determined, and then the feature map corresponding to each character region can be recognized by the deep neural network, thereby accurately obtaining the target image. character of.
  • the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

A character recognition method and device, the method comprising: acquiring a target image to be analyzed that contains characters (S101); inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image (S102); and performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image (S103); wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image. The method can improve the accuracy of character recognition.

Description

A character recognition method and device
This application claims priority to Chinese Patent Application No. 201611082212.4, filed with the Chinese Patent Office on November 30, 2016 and entitled "A character recognition method and device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of image processing technology, and in particular to a character recognition method and device.
Background
With the development of image processing technology, more and more information can be obtained from images. For example, by recognizing a surveillance image, the license plate number of a vehicle or a building sign in the image can be obtained; or, by recognizing an express waybill, the waybill number can be obtained.
Existing character recognition methods mainly detect character regions containing characters from an image according to manually designed features, then segment the character regions to obtain character blocks, and finally classify each character block with a classifier to obtain the characters contained in the image.
However, in the above methods, the character regions in the image are detected, and the character regions are segmented, according to manually designed features. In practical applications, different scenes, different shooting conditions, and so on lead to large differences in image quality. Manually designed features cannot adapt well to images of varying quality, which results in low accuracy of character region detection and character region segmentation, and further leads to low accuracy of the character recognition results.
Summary
The purpose of the embodiments of the present application is to provide a character recognition method and device, so as to improve the accuracy of character recognition. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present application provides a character recognition method, the method comprising:
acquiring a target image to be analyzed that contains characters;
inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
Optionally, the step of determining the feature maps corresponding to the character regions of the target image comprises:
determining candidate regions contained in the target image according to a preset division rule;
performing feature extraction on each candidate region to obtain a feature map corresponding to each candidate region;
identifying, from the feature maps corresponding to the candidate regions, the feature maps that contain characters, and determining the identified feature maps as the feature maps corresponding to the character regions of the target image.
Optionally, after determining the candidate regions contained in the target image, the method further comprises:
adjusting the position and/or shape of each candidate region.
Optionally, the step of determining the feature maps corresponding to the character regions of the target image comprises:
performing feature extraction on the target image to obtain a feature map corresponding to the target image;
performing pixel-level analysis on the feature map corresponding to the target image to identify regions containing characters, and determining the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
Optionally, the deep neural network comprises at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; the step of performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image comprises:
performing character-level feature extraction on each character region through the convolutional neural network;
performing context feature extraction on each character region through the recurrent neural network;
classifying and recognizing the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
Optionally, the training process of the deep neural network comprises:
acquiring sample images, the character region calibration result of each sample image, and the characters contained in each sample image;
training the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
In a second aspect, an embodiment of the present application provides a character recognition device, the device comprising:
a first acquisition module, configured to acquire a target image to be analyzed that contains characters;
a determination module, configured to input the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
a recognition module, configured to perform character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
Optionally, the determination module comprises:
a determination submodule, configured to determine candidate regions contained in the target image according to a preset division rule;
a first extraction submodule, configured to perform feature extraction on each candidate region to obtain a feature map corresponding to each candidate region;
a first recognition submodule, configured to identify, from the feature maps corresponding to the candidate regions, the feature maps that contain characters, and determine the identified feature maps as the feature maps corresponding to the character regions of the target image.
Optionally, the device further comprises:
an adjustment module, configured to adjust the position and/or shape of each candidate region.
Optionally, the determination module comprises:
a second extraction submodule, configured to perform feature extraction on the target image to obtain a feature map corresponding to the target image;
a second recognition submodule, configured to perform pixel-level analysis on the feature map corresponding to the target image to identify regions containing characters, and determine the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
Optionally, the deep neural network comprises at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; the recognition module comprises:
a third extraction submodule, configured to perform character-level feature extraction on each character region through the convolutional neural network;
a fourth extraction submodule, configured to perform context feature extraction on each character region through the recurrent neural network;
a third recognition submodule, configured to classify and recognize the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
Optionally, the device further comprises:
a second acquisition module, configured to acquire sample images, the character region calibration result of each sample image, and the characters contained in each sample image;
a training module, configured to train the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
In a third aspect, an embodiment of the present application further provides an electronic device, comprising:
a processor, a memory, a communication interface, and a bus;
wherein the processor, the memory, and the communication interface are connected through the bus and communicate with each other;
the memory stores executable program code;
the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform, at runtime, the character recognition method described in the first aspect of the present application.
In a fourth aspect, the present application provides a storage medium, wherein the storage medium is configured to store executable program code, and the executable program code, when run, performs the character recognition method described in the first aspect of the present application.
In a fifth aspect, the present application provides an application program, wherein the application program, when run, performs the character recognition method described in the first aspect of the present application.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
Brief Description of the Drawings
In order to describe the technical solutions of the embodiments of the present application and of the prior art more clearly, the drawings required for the embodiments and the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flowchart of a character recognition method provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a target image containing characters according to an embodiment of the present application;
Fig. 3(a) is a schematic diagram of a character region according to an embodiment of the present application;
Fig. 3(b) is a schematic diagram of the result of adjusting the character region shown in Fig. 3(a);
Fig. 4 is another flowchart of a character recognition method provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a character recognition device provided by an embodiment of the present application;
Fig. 6 is another schematic structural diagram of a character recognition device provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
In order to improve the accuracy of character recognition, embodiments of the present application provide a character recognition method and device.
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
It should be noted that the embodiments of the present application and the features in the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the drawings and in combination with the embodiments.
In order to improve the accuracy of character recognition, an embodiment of the present application provides a character recognition method. As shown in Fig. 1, the process may comprise the following steps:
S101: acquiring a target image to be analyzed that contains characters.
The method provided by the embodiment of the present application may be applied to an electronic device. Specifically, the electronic device may be a desktop computer, a portable computer, an intelligent mobile terminal, or the like.
In the embodiment of the present application, the electronic device may recognize an image containing characters to obtain the characters contained therein. For example, the electronic device may recognize images captured by an image capture device on a road to obtain the license plate numbers contained therein; or it may perform character recognition on an image taken by a user to obtain the character information contained therein.
When the electronic device performs character recognition on images captured by an image capture device, a wired or wireless connection may be established between the image capture device and the electronic device, so that the image capture device can send its captured images to the electronic device. For example, the connection between the image capture device and the electronic device may be established through a wireless connection such as WIFI (Wireless Fidelity), NFC (Near Field Communication), or Bluetooth, which is not limited in the embodiment of the present application. When the electronic device performs character recognition on an image taken by a user, the user may input the taken image into the electronic device.
Therefore, in the embodiment of the present application, the electronic device may receive the target image sent by the image capture device, or the target image input by the user, so as to recognize the characters contained in the target image. Referring to Fig. 2, it shows a schematic diagram of a target image containing characters acquired by the electronic device.
It should be noted that, in the embodiment of the present application, the electronic device may also acquire the target image in other ways, which is not limited in the embodiment of the present application.
S102: inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image.
In the embodiment of the present application, in order to improve the accuracy of character recognition, the electronic device may train the deep neural network in advance from a certain number of sample images, such as 100, 500, or 1000, the character region calibration result of each sample image, and the characters contained in each sample image. With the trained deep neural network, when a target image containing characters is input, the deep neural network can determine the feature maps corresponding to the character regions of the target image and, from these feature maps, obtain the characters contained in the target image.
In the embodiment of the present application, after acquiring the target image containing characters, the electronic device may input the target image into the pre-trained deep neural network, detect each region of the target image, identify the character regions that contain characters, and determine the feature map corresponding to each character region.
For example, when the target image acquired by the electronic device is as shown in Fig. 2, it can be determined through the deep neural network that the character region in the target image is region 210, and feature extraction can be performed through the deep neural network to obtain the feature map corresponding to the character region.
S103: performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image.
In the embodiment of the present application, after determining the feature maps corresponding to the character regions of the target image, the electronic device may further perform character recognition on the feature map of each character region through the deep neural network to obtain the characters contained in the target image. For example, through the deep neural network, each character region may be recognized separately to identify the characters contained in each character region, thereby obtaining the characters contained in the target image.
For example, for the target image shown in Fig. 2, the characters recognized by the electronic device may be: 冀FC508.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
As an implementation of the embodiment of the present application, when determining the feature maps corresponding to the character regions of the target image, the electronic device may first determine the candidate regions contained in the target image according to a preset division rule, such as the size and shape of each candidate region. For example, rectangular candidate regions each of size 20 pixels by 30 pixels may be determined from the target image. The candidate regions may be non-overlapping or overlapping, which is not limited in the embodiment of the present application.
Then, the electronic device may perform feature extraction on each candidate region to obtain the feature map corresponding to each candidate region; finally, it recognizes the feature maps corresponding to the candidate regions, identifies the feature maps that contain characters, and determines the identified feature maps as the feature maps corresponding to the character regions of the target image.
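The division rule described above can be pictured as enumerating fixed-size windows over the image. The following minimal sketch uses the 20-pixel-by-30-pixel rectangle from the example in the text; the stride parameters, which decide whether the windows tile or overlap, are an illustrative assumption, since the patent leaves the exact rule open:

```python
def candidate_regions(img_w, img_h, win_w=30, win_h=20, stride_x=30, stride_y=20):
    """Enumerate rectangular candidate windows over an img_w x img_h image.

    Returns (x, y, w, h) tuples. With strides equal to the window size the
    windows tile the image without overlap; smaller strides overlap them.
    The 30x20 default mirrors the 20px*30px example region in the text.
    """
    regions = []
    for y in range(0, img_h - win_h + 1, stride_y):
        for x in range(0, img_w - win_w + 1, stride_x):
            regions.append((x, y, win_w, win_h))
    return regions
```

Each returned window would then be passed to the feature-extraction step to obtain its feature map.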
As an implementation of the embodiment of the present application, in some cases the character formats in the target image may be diverse, and the character regions may not be regular rectangles or squares but other shapes, such as parallelograms. Therefore, after the target image is divided into multiple regular shapes, the detected character regions may not be particularly accurate.
For example, as shown in Fig. 3(a), when the actual characters are italic and the preset rule is to divide the target image into rectangular candidate regions, the obtained character region may be the region shown in Fig. 3(a). As can be seen from Fig. 3(a), the obtained character region does not accurately contain all of the character content.
In the embodiment of the present application, after the candidate regions contained in the target image are determined, the position and/or shape of each candidate region may be adjusted. For example, operations such as rotation and translation may be performed on each candidate region.
Specifically, when training the deep neural network, a vector for adjusting character regions may be trained from the irregularly shaped character regions contained in the sample images. When character recognition is performed, the position and/or shape of each candidate region can be adjusted according to the trained vector.
For example, after the character region shown in Fig. 3(a) is adjusted, the character region shown in Fig. 3(b) can be obtained. As can be seen from Fig. 3(b), the adjusted character region accurately contains all of the character content.
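The adjustment step can be illustrated by applying a learned geometric transform to the corners of each candidate region. The patent does not fix the exact form of the trained "vector"; the sketch below assumes, purely for illustration, that it is a 2x3 affine transform, whose shear component turns an axis-aligned rectangle into a parallelogram that follows italic characters, as in Fig. 3:

```python
def adjust_region(corners, transform):
    """Apply a 2x3 affine transform (an assumed form of the learned
    adjustment vector) to the corner points of a candidate region.

    corners: list of (x, y) points, e.g. the four rectangle corners.
    transform: ((a, b, tx), (c, d, ty)) so that
               x' = a*x + b*y + tx,  y' = c*x + d*y + ty.
    A nonzero b shears the rectangle into a parallelogram.
    """
    (a, b, tx), (c, d, ty) = transform
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in corners]
```

Rotation and translation, mentioned in the text, are the special cases where the 2x2 part is a rotation matrix and (tx, ty) is the shift.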
As an implementation of the embodiment of the present application, when determining the feature maps corresponding to the character regions of the target image, the electronic device may also first perform feature extraction on the target image to obtain the feature map corresponding to the target image, then perform pixel-level analysis on the feature map corresponding to the target image to identify the regions containing characters, and determine the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
For example, after obtaining the feature map corresponding to the target image, the electronic device may analyze each pixel of the feature map in turn in a set analysis order, such as from left to right and from top to bottom, to identify the pixels that contain characters; after the analysis is completed, the region composed of the character-containing pixels is determined, and finally the feature map corresponding to that region is determined as the feature map corresponding to the character region in the target image.
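A minimal sketch of this pixel-level pass, assuming for illustration that the per-pixel character/non-character decision has already been made and is available as a binary mask (in the network itself that decision comes from the learned features):

```python
def character_region(mask):
    """Scan a per-pixel character mask in the set order (left to right,
    top to bottom) and return the bounding box (x0, y0, x1, y1) of all
    pixels flagged as character, or None if no pixel is flagged.

    mask: list of rows, each a list of 0/1 values.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):          # top to bottom
        for x, value in enumerate(row):     # left to right
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

The feature map restricted to the returned box would then play the role of the character region's feature map.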
As an implementation of the embodiment of the present application, the deep neural network in this embodiment may comprise at least: a convolutional neural network, a recurrent neural network, a classifier, a sequence decoder, and the like.
A Convolutional Neural Network (CNN) is a feed-forward artificial neural network whose neurons respond to surrounding units within a limited coverage area, and which effectively extracts the structural information of an image through weight sharing and feature pooling.
A Recurrent Neural Network (RNN) is an artificial neural network with a recurrent structure. By passing hidden-layer features along the sequence direction, the feature computation at the current sequence point is supported by contextual information. Through weight sharing and feature pooling, it is suitable for deep learning modeling of complex sequence problems (such as time and space).
When the electronic device performs character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image, it may first perform character-level feature extraction on each character region through the convolutional neural network, then perform context feature extraction on each character region through the recurrent neural network, and finally classify and recognize the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
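The patent does not spell out how the sequence decoder operates. A common choice for this kind of CNN + RNN + classifier pipeline is CTC-style greedy decoding, sketched below under that assumption: the classifier emits one label (or a blank) per feature-map column, repeated labels are collapsed, and blanks are dropped:

```python
BLANK = "-"  # assumed blank symbol emitted by the classifier

def greedy_decode(frame_labels):
    """CTC-style greedy sequence decoding (an assumed decoder design).

    frame_labels: the classifier's most probable label for each column
    of the character region's feature map, in left-to-right order.
    Collapses runs of identical labels, then removes blanks.
    """
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return "".join(decoded)
```

On the license-plate example from the text, a per-column output such as `--冀冀F--CC-5-0-8` would decode to `冀FC508`.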
In the embodiment of the present application, the electronic device may train the deep neural network used for character recognition in advance. Specifically, as shown in Fig. 4, the character recognition method provided by the embodiment of the present application may further comprise the following steps:
S201: acquiring sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
In the embodiment of the present application, when training the deep neural network, the electronic device may first acquire sample images. For example, the electronic device may acquire as many sample images as possible, such as 100, 500, or 1000, each of which may contain characters. Moreover, in order to adapt to images of different quality and to characters of different formats during character recognition, the formats of the characters contained in the sample images may be diversified; for example, the sample images may contain characters that differ in font, size, glyph, and so on.
In the embodiment of the present application, a user may calibrate the character regions of each sample image and input the calibration results into the electronic device, and may also input the characters contained in each sample image into the electronic device. The electronic device can thus acquire the sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
S202: training the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
After acquiring the sample images, the character region calibration result of each sample image, and the characters contained in each sample image, the electronic device may train the deep neural network by using them as training samples.
It should be noted that, in the embodiment of the present application, the training process of the deep neural network may adopt any existing method. For example, the electronic device may train the deep neural network using the back-propagation algorithm, wherein the gradients of the network parameters may be computed by stochastic gradient descent. Alternatively, other methods may also be used to train the deep neural network, which will not be described in detail in the embodiment of the present application.
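To make the back-propagation / stochastic-gradient-descent option concrete, here is a deliberately tiny sketch: a single logistic unit trained with per-sample gradient steps. It stands in for the much larger deep-network training only; the one-dimensional features, learning rate, and epoch count are illustrative assumptions:

```python
import math
import random

def train_sgd(samples, lr=0.5, epochs=200, seed=0):
    """Train a single logistic unit by stochastic gradient descent.

    samples: list of (x, target) pairs with target in {0, 1}.
    Each sample produces one gradient step, the stochastic analogue of
    the back-propagation training the text refers to. Returns (w, b).
    """
    random.seed(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        random.shuffle(samples)             # stochastic: random sample order
        for x, target in samples:
            y = 1.0 / (1.0 + math.exp(-(w * x + b)))  # forward pass
            grad = y - target               # dLoss/dz for sigmoid + cross-entropy
            w -= lr * grad * x              # backward pass: one gradient step
            b -= lr * grad
    return w, b

def predict(w, b, x):
    """Sigmoid output of the trained unit."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))
```

A real implementation would update millions of parameters per step, but the forward/backward/update cycle is the same.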
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
Corresponding to the above method embodiments, the embodiments of the present application further provide corresponding device embodiments.
Fig. 5 shows a character recognition device provided by an embodiment of the present application, the device comprising:
a first acquisition module 510, configured to acquire a target image to be analyzed that contains characters;
a determination module 520, configured to input the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
a recognition module 530, configured to perform character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
As an implementation of the embodiment of the present application, the determination module 520 comprises:
a determination submodule (not shown in the figure), configured to determine candidate regions contained in the target image according to a preset division rule;
a first extraction submodule (not shown in the figure), configured to perform feature extraction on each candidate region to obtain a feature map corresponding to each candidate region;
a first recognition submodule (not shown in the figure), configured to identify, from the feature maps corresponding to the candidate regions, the feature maps that contain characters, and determine the identified feature maps as the feature maps corresponding to the character regions of the target image.
As an implementation of the embodiment of the present application, the device further comprises:
an adjustment module (not shown in the figure), configured to adjust the position and/or shape of each candidate region.
As an implementation of the embodiment of the present application, the determination module 520 comprises:
a second extraction submodule (not shown in the figure), configured to perform feature extraction on the target image to obtain a feature map corresponding to the target image;
a second recognition submodule (not shown in the figure), configured to perform pixel-level analysis on the feature map corresponding to the target image to identify regions containing characters, and determine the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
As an implementation of the embodiment of the present application, the deep neural network comprises at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; the recognition module 530 comprises:
a third extraction submodule (not shown in the figure), configured to perform character-level feature extraction on each character region through the convolutional neural network;
a fourth extraction submodule (not shown in the figure), configured to perform context feature extraction on each character region through the recurrent neural network;
a third recognition submodule (not shown in the figure), configured to classify and recognize the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
As an implementation of the embodiment of the present application, as shown in Fig. 6, the device further comprises:
a second acquisition module 540, configured to acquire sample images, the character region calibration result of each sample image, and the characters contained in each sample image;
a training module 550, configured to train the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
Correspondingly, as shown in Fig. 7, an embodiment of the present application further provides an electronic device, which may comprise:
a processor 710, a memory 720, a communication interface 730, and a bus 740;
wherein the processor 710, the memory 720, and the communication interface 730 are connected through the bus 740 and communicate with each other;
the memory 720 stores executable program code;
the processor 710 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 720, so as to perform, at runtime, the character recognition method described in the embodiment of the present application, wherein the character recognition method comprises:
acquiring a target image to be analyzed that contains characters;
inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
Correspondingly, an embodiment of the present application further provides a storage medium, wherein the storage medium is configured to store executable program code, and the executable program code, when run, performs the character recognition method described in the embodiment of the present application, wherein the character recognition method comprises:
acquiring a target image to be analyzed that contains characters;
inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
Correspondingly, an embodiment of the present application further provides an application program, wherein the application program, when run, performs the character recognition method described in the embodiment of the present application, wherein the character recognition method comprises:
acquiring a target image to be analyzed that contains characters;
inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
In the embodiments of the present application, the deep neural network may be trained in advance from the sample images, the character region calibration result of each sample image, and the characters contained in each sample image. When character recognition is performed, after a target image containing characters is acquired, the target image is input into the deep neural network, the feature maps corresponding to the character regions of the target image can be accurately determined, and character recognition can then be performed on the feature map of each character region through the deep neural network, thereby accurately obtaining the characters contained in the target image.
As for the device/electronic device/storage medium/application program embodiments, since they are substantially similar to the method embodiments, the description is relatively simple; for relevant details, refer to the description of the method embodiments.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "contain", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that includes that element.
Each embodiment in this specification is described in a related manner; for identical or similar parts between the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, as for the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple; for relevant details, refer to the description of the method embodiments.
Those of ordinary skill in the art can understand that all or part of the steps of the above method implementations can be completed by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present application and are not intended to limit the protection scope of the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included in the protection scope of the present application.

Claims (15)

  1. A character recognition method, wherein the method comprises:
    acquiring a target image to be analyzed that contains characters;
    inputting the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
    performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
    wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
  2. The method according to claim 1, wherein the step of determining the feature maps corresponding to the character regions of the target image comprises:
    determining candidate regions contained in the target image according to a preset division rule;
    performing feature extraction on each candidate region to obtain a feature map corresponding to each candidate region;
    identifying, from the feature maps corresponding to the candidate regions, the feature maps that contain characters, and determining the identified feature maps as the feature maps corresponding to the character regions of the target image.
  3. The method according to claim 2, wherein, after determining the candidate regions contained in the target image, the method further comprises:
    adjusting the position and/or shape of each candidate region.
  4. The method according to claim 1, wherein the step of determining the feature maps corresponding to the character regions of the target image comprises:
    performing feature extraction on the target image to obtain a feature map corresponding to the target image;
    performing pixel-level analysis on the feature map corresponding to the target image to identify regions containing characters, and determining the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
  5. The method according to claim 1, wherein the deep neural network comprises at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; and the step of performing character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image comprises:
    performing character-level feature extraction on each character region through the convolutional neural network;
    performing context feature extraction on each character region through the recurrent neural network;
    classifying and recognizing the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
  6. The method according to any one of claims 1-5, wherein the training process of the deep neural network comprises:
    acquiring sample images, the character region calibration result of each sample image, and the characters contained in each sample image;
    training the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
  7. A character recognition device, wherein the device comprises:
    a first acquisition module, configured to acquire a target image to be analyzed that contains characters;
    a determination module, configured to input the target image into a pre-trained deep neural network to determine feature maps corresponding to character regions of the target image;
    a recognition module, configured to perform character recognition on the feature maps corresponding to the character regions through the deep neural network to obtain the characters contained in the target image;
    wherein the deep neural network is trained from sample images, the character region calibration result of each sample image, and the characters contained in each sample image.
  8. The device according to claim 7, wherein the determination module comprises:
    a determination submodule, configured to determine candidate regions contained in the target image according to a preset division rule;
    a first extraction submodule, configured to perform feature extraction on each candidate region to obtain a feature map corresponding to each candidate region;
    a first recognition submodule, configured to identify, from the feature maps corresponding to the candidate regions, the feature maps that contain characters, and determine the identified feature maps as the feature maps corresponding to the character regions of the target image.
  9. The device according to claim 8, wherein the device further comprises:
    an adjustment module, configured to adjust the position and/or shape of each candidate region.
  10. The device according to claim 7, wherein the determination module comprises:
    a second extraction submodule, configured to perform feature extraction on the target image to obtain a feature map corresponding to the target image;
    a second recognition submodule, configured to perform pixel-level analysis on the feature map corresponding to the target image to identify regions containing characters, and determine the feature maps corresponding to the identified regions as the feature maps corresponding to the character regions in the target image.
  11. The device according to claim 7, wherein the deep neural network comprises at least: a convolutional neural network, a recurrent neural network, a classifier, and a sequence decoder; and the recognition module comprises:
    a third extraction submodule, configured to perform character-level feature extraction on each character region through the convolutional neural network;
    a fourth extraction submodule, configured to perform context feature extraction on each character region through the recurrent neural network;
    a third recognition submodule, configured to classify and recognize the extracted feature maps through the classifier and the sequence decoder to obtain the characters contained in the target image.
  12. The device according to any one of claims 7-11, wherein the device further comprises:
    a second acquisition module, configured to acquire sample images, the character region calibration result of each sample image, and the characters contained in each sample image;
    a training module, configured to train the deep neural network by using the sample images, the character region calibration result of each sample image, and the characters contained in each sample image as training samples.
  13. An electronic device, comprising:
    a processor, a memory, a communication interface, and a bus;
    wherein the processor, the memory, and the communication interface are connected through the bus and communicate with each other;
    the memory stores executable program code;
    and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the character recognition method according to any one of claims 1-6.
  14. A storage medium, wherein the storage medium is configured to store executable program code, and the executable program code, when run, performs the character recognition method according to any one of claims 1-6.
  15. An application program, wherein the application program, when run, performs the character recognition method according to any one of claims 1-6.
PCT/CN2017/105843 2016-11-30 2017-10-12 一种字符识别方法及装置 Ceased WO2018099194A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/464,922 US11003941B2 (en) 2016-11-30 2017-10-12 Character identification method and device
EP17877227.3A EP3550473A4 (en) 2016-11-30 2017-10-12 METHOD AND DEVICE FOR CHARACTER IDENTIFICATION

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611082212.4A CN108121984B (zh) 2016-11-30 2016-11-30 一种字符识别方法及装置
CN201611082212.4 2016-11-30

Publications (1)

Publication Number Publication Date
WO2018099194A1 true WO2018099194A1 (zh) 2018-06-07

Family

ID=62226299

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/105843 Ceased WO2018099194A1 (zh) 2016-11-30 2017-10-12 一种字符识别方法及装置

Country Status (4)

Country Link
US (1) US11003941B2 (zh)
EP (1) EP3550473A4 (zh)
CN (1) CN108121984B (zh)
WO (1) WO2018099194A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827247A (zh) * 2019-10-28 2020-02-21 上海悦易网络信息技术有限公司 一种识别标签的方法及设备
CN110866530A (zh) * 2019-11-13 2020-03-06 云南大学 一种字符图像识别方法、装置及电子设备
CN110956170A (zh) * 2019-09-30 2020-04-03 京东数字科技控股有限公司 生成护照机读码样本的方法、装置、设备及存储介质
CN111027555A (zh) * 2018-10-09 2020-04-17 杭州海康威视数字技术股份有限公司 一种车牌识别方法、装置及电子设备
CN111046859A (zh) * 2018-10-11 2020-04-21 杭州海康威视数字技术股份有限公司 字符识别方法及装置
CN111325194A (zh) * 2018-12-13 2020-06-23 杭州海康威视数字技术股份有限公司 一种文字识别方法、装置及设备、存储介质
CN111401289A (zh) * 2020-03-24 2020-07-10 国网上海市电力公司 一种变压器部件的智能识别方法和装置
CN111414908A (zh) * 2020-03-16 2020-07-14 湖南快乐阳光互动娱乐传媒有限公司 一种视频中字幕字符的识别方法及装置
CN112287932A (zh) * 2019-07-23 2021-01-29 上海高德威智能交通系统有限公司 一种确定图像质量的方法、装置、设备及存储介质
CN113298188A (zh) * 2021-06-28 2021-08-24 深圳市商汤科技有限公司 字符识别及神经网络训练方法和装置

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276342B (zh) * 2018-03-14 2023-04-18 台达电子工业股份有限公司 车牌辨识方法以及其系统
CN110717366A (zh) * 2018-07-13 2020-01-21 杭州海康威视数字技术股份有限公司 文本信息的识别方法、装置、设备及存储介质
CN109117738A (zh) * 2018-07-19 2019-01-01 江苏黄金屋教育发展股份有限公司 基于人工智能的阅卷方法
CN109447080B (zh) * 2018-11-12 2020-04-17 北京奇艺世纪科技有限公司 一种字符识别方法及装置
CN111210399B (zh) * 2018-11-22 2023-10-17 杭州海康威视数字技术股份有限公司 一种成像质量评价方法、装置及设备
CN109495784A (zh) * 2018-11-29 2019-03-19 北京微播视界科技有限公司 信息推送方法、装置、电子设备及计算机可读存储介质
CN111274845B (zh) * 2018-12-04 2023-09-05 杭州海康威视数字技术股份有限公司 商店货架陈列情况的识别方法、装置、系统及电子设备
CN109871521A (zh) * 2019-01-08 2019-06-11 平安科技(深圳)有限公司 一种电子文档的生成方法及设备
CN111027557B (zh) * 2019-03-11 2024-03-19 广东小天才科技有限公司 一种基于题目图像的科目识别方法及电子设备
CN111753814B (zh) * 2019-03-26 2023-07-25 杭州海康威视数字技术股份有限公司 样本生成方法、装置及设备
CN111767908B (zh) * 2019-04-02 2024-07-02 顺丰科技有限公司 字符检测方法、装置、检测设备及存储介质
US10984279B2 (en) * 2019-06-13 2021-04-20 Wipro Limited System and method for machine translation of text
CN110458011A (zh) * 2019-07-05 2019-11-15 北京百度网讯科技有限公司 端到端的文字识别方法及装置、计算机设备及可读介质
JP7479925B2 (ja) * 2020-05-14 2024-05-09 キヤノン株式会社 画像処理システム、画像処理方法、及びプログラム
CN112101343A (zh) * 2020-08-17 2020-12-18 广东工业大学 一种车牌字符分割与识别方法
CN113205511B (zh) * 2021-05-25 2023-09-29 中科芯集成电路有限公司 基于深层神经网络的电子元器件批量信息检测方法及系统
TWI847218B (zh) * 2022-08-12 2024-07-01 台灣大哥大股份有限公司 文字圖像辨識系統及其方法
CN115171129A (zh) * 2022-09-06 2022-10-11 京华信息科技股份有限公司 文字识别纠错方法、装置、终端设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298976A (zh) * 2014-10-16 2015-01-21 电子科技大学 基于卷积神经网络的车牌检测方法
CN105184312A (zh) * 2015-08-24 2015-12-23 中国科学院自动化研究所 一种基于深度学习的文字检测方法及装置
CN105608454A (zh) * 2015-12-21 2016-05-25 上海交通大学 基于文字结构部件检测神经网络的文字检测方法及系统
CN105787524A (zh) * 2014-12-26 2016-07-20 中国科学院沈阳自动化研究所 基于OpenCV的车牌识别方法及系统

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5299269A (en) * 1991-12-20 1994-03-29 Eastman Kodak Company Character segmentation using an associative memory for optical character recognition
US20070058856A1 (en) * 2005-09-15 2007-03-15 Honeywell International Inc. Character recoginition in video data
CN102184395B (zh) * 2011-06-08 2012-12-19 天津大学 基于字符串核的草图识别方法
US8965112B1 (en) * 2013-12-09 2015-02-24 Google Inc. Sequence transcription with deep neural networks
US10043112B2 (en) * 2014-03-07 2018-08-07 Qualcomm Incorporated Photo management
CN107430677B (zh) * 2015-03-20 2022-04-12 英特尔公司 基于对二进制卷积神经网络特征进行提升的目标识别
CN105335760A (zh) * 2015-11-16 2016-02-17 南京邮电大学 一种图像数字字符识别方法
CN105678293A (zh) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 一种基于cnn-rnn的复杂图像字序列识别方法
US9911055B2 (en) * 2016-03-08 2018-03-06 Conduent Business Services, Llc Method and system for detection and classification of license plates
CN107220579B (zh) * 2016-03-21 2020-02-04 杭州海康威视数字技术股份有限公司 一种车牌检测方法及装置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298976A (zh) * 2014-10-16 2015-01-21 电子科技大学 基于卷积神经网络的车牌检测方法
CN105787524A (zh) * 2014-12-26 2016-07-20 中国科学院沈阳自动化研究所 基于OpenCV的车牌识别方法及系统
CN105184312A (zh) * 2015-08-24 2015-12-23 中国科学院自动化研究所 一种基于深度学习的文字检测方法及装置
CN105608454A (zh) * 2015-12-21 2016-05-25 上海交通大学 基于文字结构部件检测神经网络的文字检测方法及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3550473A4

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027555A (zh) * 2018-10-09 2020-04-17 杭州海康威视数字技术股份有限公司 一种车牌识别方法、装置及电子设备
CN111027555B (zh) * 2018-10-09 2023-09-26 杭州海康威视数字技术股份有限公司 一种车牌识别方法、装置及电子设备
CN111046859B (zh) * 2018-10-11 2023-09-29 杭州海康威视数字技术股份有限公司 字符识别方法及装置
CN111046859A (zh) * 2018-10-11 2020-04-21 杭州海康威视数字技术股份有限公司 字符识别方法及装置
CN111325194A (zh) * 2018-12-13 2020-06-23 杭州海康威视数字技术股份有限公司 一种文字识别方法、装置及设备、存储介质
CN111325194B (zh) * 2018-12-13 2023-12-29 杭州海康威视数字技术股份有限公司 一种文字识别方法、装置及设备、存储介质
CN112287932A (zh) * 2019-07-23 2021-01-29 上海高德威智能交通系统有限公司 一种确定图像质量的方法、装置、设备及存储介质
CN112287932B (zh) * 2019-07-23 2024-05-10 上海高德威智能交通系统有限公司 一种确定图像质量的方法、装置、设备及存储介质
CN110956170A (zh) * 2019-09-30 2020-04-03 京东数字科技控股有限公司 生成护照机读码样本的方法、装置、设备及存储介质
CN110827247B (zh) * 2019-10-28 2024-03-15 上海万物新生环保科技集团有限公司 一种识别标签的方法及设备
CN110827247A (zh) * 2019-10-28 2020-02-21 上海悦易网络信息技术有限公司 一种识别标签的方法及设备
CN110866530A (zh) * 2019-11-13 2020-03-06 云南大学 一种字符图像识别方法、装置及电子设备
CN111414908A (zh) * 2020-03-16 2020-07-14 湖南快乐阳光互动娱乐传媒有限公司 一种视频中字幕字符的识别方法及装置
CN111414908B (zh) * 2020-03-16 2023-08-29 湖南快乐阳光互动娱乐传媒有限公司 一种视频中字幕字符的识别方法及装置
CN111401289A (zh) * 2020-03-24 2020-07-10 国网上海市电力公司 一种变压器部件的智能识别方法和装置
CN111401289B (zh) * 2020-03-24 2024-01-23 国网上海市电力公司 一种变压器部件的智能识别方法和装置
CN113298188A (zh) * 2021-06-28 2021-08-24 深圳市商汤科技有限公司 字符识别及神经网络训练方法和装置

Also Published As

Publication number Publication date
US20200311460A1 (en) 2020-10-01
EP3550473A4 (en) 2019-12-11
US11003941B2 (en) 2021-05-11
CN108121984A (zh) 2018-06-05
CN108121984B (zh) 2021-09-21
EP3550473A1 (en) 2019-10-09

Similar Documents

Publication Publication Date Title
WO2018099194A1 (zh) 一种字符识别方法及装置
CN112200081B (zh) 异常行为识别方法、装置、电子设备及存储介质
CN105574513B (zh) 文字检测方法和装置
US9098888B1 (en) Collaborative text detection and recognition
KR102279456B1 (ko) 문서의 광학 문자 인식 기법
CN103679168B (zh) 文字区域检测方法及装置
US8965117B1 (en) Image pre-processing for reducing consumption of resources
WO2019238063A1 (zh) 文本检测分析方法、装置及设备
JP7287823B2 (ja) 情報処理方法及び情報処理システム
WO2021047484A1 (zh) 文字识别方法和终端设备
CN112001406A (zh) 一种文本区域检测方法及装置
CN111738252B (zh) 图像中的文本行检测方法、装置及计算机系统
CN110889421A (zh) 目标物检测方法及装置
CN107918767B (zh) 目标检测方法、装置、电子设备及计算机可读介质
CN114067401B (zh) 目标检测模型的训练及身份验证方法和装置
CN104239873A (zh) 图像处理装置及处理方法
CN112329810B (zh) 一种基于显著性检测的图像识别模型训练方法及装置
CN115861400A (zh) 目标对象检测方法、训练方法、装置以及电子设备
CN106709490B (zh) 一种字符识别方法和装置
CN113902041A (zh) 目标检测模型的训练及身份验证方法和装置
CN115482509A (zh) 烟火识别方法、装置、电子设备及存储介质
CN116246161B (zh) 领域知识引导下的遥感图像目标精细类型识别方法及装置
EP2866171A2 (en) Object detection method and device
CN111062377A (zh) 一种题号检测方法、系统、存储介质及电子设备
KR20230030907A (ko) 가짜 영상 탐지 방법 및 이를 수행하기 위한 장치

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17877227

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017877227

Country of ref document: EP

Effective date: 20190701