CN111832423A

CN111832423A - Bill information identification method, device and system

Info

Publication number: CN111832423A
Application number: CN202010566121.8A
Authority: CN
Inventors: 齐勇刚; 苏国耀; 刘军; 杨洁
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2020-06-19
Filing date: 2020-06-19
Publication date: 2020-10-27

Abstract

Embodiments of the present invention provide a bill information identification method, device and system. The scheme is as follows: receive an information identification request for a bill image to be identified; preprocess the bill image to be identified; use a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image, obtaining the bill information of the bill image to be identified, where the bill information recognition model is trained on a preset training set that includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image; and return a first response message to the information identification request, where the first response message includes the bill information of the bill image to be identified. The technical solutions provided by the embodiments of the present invention address the problems of existing bill information identification methods, reducing the dependence on the quantity of human resources and improving the generality of bill identification.

Description

Bill information identification method, device and system

Technical field

The present invention relates to the technical field of image recognition, and in particular to a bill information identification method, device and system.

Background

In production and daily life, bills take many forms, for example drafts, promissory notes, checks, bills of lading and certificates of deposit. Each bill contains a large amount of bill information. Taking an invoice as an example, the bill information may include the invoice code, the invoice number, the invoicing date and the buyer's name.

At present, the commonly used methods for recognizing the bill information contained in such bills are manual identification and traditional Optical Character Recognition (OCR) technology. Traditional OCR technology recognizes each character in a bill image and thereby determines the bill information contained in the bill image, that is, the bill information contained in the bill to which the image corresponds. A key step in traditional OCR is character recognition: either each character in the bill image is matched against each character in a preset candidate template to identify the characters in the image, or character feature information is extracted from the characters in the bill image and a Support Vector Machine (SVM) uses that feature information to identify the characters.

Manual identification is limited by the number of people available, so it is unsuitable for scenarios with a large number of bills. Traditional OCR is limited by the preset candidate templates and character feature information used in character recognition: the preset candidate templates restrict the bill types that can be recognized, and the extraction of character feature information restricts the number of characters that can be recognized. As a result, traditional OCR technology generalizes poorly.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a bill information identification method, device and system that solve the problems of existing bill information identification methods, thereby reducing the dependence on the quantity of human resources and improving the generality of bill identification. The specific technical solutions are as follows:

An embodiment of the present invention provides a bill information identification method, applied to an OCR device in a bill information identification system. The method includes:

receiving an information identification request for a bill image to be identified;

based on the information identification request, preprocessing the bill image to be identified to obtain a preprocessed bill image to be identified;

using a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image to be identified, obtaining the bill information of the bill image to be identified, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image;

returning a first response message to the information identification request, where the first response message includes the bill information of the bill image to be identified.

An embodiment of the present invention also provides a bill information identification device, applied to an OCR device in a bill information identification system. The device includes:

a receiving module, configured to receive an information identification request for a bill image to be identified;

a preprocessing module, configured to preprocess the bill image to be identified based on the information identification request to obtain a preprocessed bill image to be identified;

a first identification module, configured to use a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image to be identified, obtaining the bill information of the bill image to be identified, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image;

a first return module, configured to return a first response message to the information identification request, where the first response message includes the bill information of the bill image to be identified.

An embodiment of the present invention also provides a bill information identification system, which includes an OCR device and a storage device;

the OCR device is configured to: receive an information identification request for a bill image to be identified; based on the information identification request, preprocess the bill image to be identified to obtain a preprocessed bill image to be identified; use a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image to be identified, obtaining the bill information of the bill image to be identified, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image; and return a first response message to the information identification request, where the first response message includes the bill information of the bill image to be identified;

the storage device is configured to store the bill information of the bill image to be identified.

An embodiment of the present invention further provides an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory communicate with each other through the communication bus;

the memory is configured to store a computer program;

the processor is configured to implement the steps of any of the bill information identification methods described above when executing the program stored in the memory.

An embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the steps of any of the bill information identification methods described above are implemented.

An embodiment of the present invention also provides a computer program product containing instructions which, when run on a computer, causes the computer to execute any of the bill information identification methods described above.

Beneficial effects of the embodiments of the present invention:

In the bill information identification method, device and system provided by the embodiments of the present invention, the OCR device in the bill information identification system, after receiving an information identification request for a bill image to be identified, uses a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image to be identified, thereby obtaining the bill information of the bill image to be identified. Compared with the prior art, using an OCR device to identify bill information removes the dependence of manual identification on the quantity of human resources. Moreover, because the bill information recognition model is trained on sample bill images of different bill types, the model can directly identify the bill information of bill images of different bill types. This reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The problems of existing bill information identification methods are thereby addressed: the dependence on the quantity of human resources is reduced and the generality of bill identification is improved.

Of course, a product or method implementing the present invention does not necessarily need to achieve all of the advantages described above at the same time.

Brief description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other embodiments from these drawings without creative effort.

FIG. 1 is a schematic flowchart of a method for training a bill information recognition model provided by an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a preset deep learning network provided by an embodiment of the present invention;

FIG. 3 is a first schematic flowchart of a bill information identification method provided by an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a method for preprocessing a bill image to be identified provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of image white-border cropping provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of character recognition provided by an embodiment of the present invention;

FIG. 7 is a second schematic flowchart of a bill information identification method provided by an embodiment of the present invention;

FIG. 8 is a third schematic flowchart of a bill information identification method provided by an embodiment of the present invention;

FIG. 9 is a fourth schematic flowchart of a bill information identification method provided by an embodiment of the present invention;

FIG. 10 is a signaling diagram of a bill information identification process provided by an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a bill identification device provided by an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of a bill identification system provided by an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

In order to solve the problems of existing bill information identification methods, an embodiment of the present invention provides a bill information identification method. The method is applied to an OCR device in a bill information identification system. The OCR device may be any electronic device used for bill information identification. In the method provided by the embodiment of the present invention, an information identification request for a bill image to be identified is received; based on the information identification request, the bill image to be identified is preprocessed to obtain a preprocessed bill image to be identified; a pre-trained bill information recognition model is used to identify the bill information in the preprocessed bill image to be identified, obtaining the bill information of the bill image to be identified, where the bill information recognition model is trained on a preset training set that includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image; and a first response message to the information identification request is returned, where the first response message includes the bill information of the bill image to be identified.
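As a non-limiting illustration, the four steps above (receive, preprocess, recognize, respond) can be sketched in Python as follows. Every name in the sketch (`RecognitionRequest`, `FirstResponse`, `handle_request`, the stand-in preprocessing function and model) is hypothetical and is not defined by the embodiments; the sketch only shows the order in which the steps are chained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: the embodiments do not prescribe a concrete data format.
Image = List[List[int]]      # grayscale image as rows of pixel values
BillInfo = Dict[str, str]    # field name -> recognized text


@dataclass
class RecognitionRequest:
    request_id: str
    image: Image             # the bill image to be identified


@dataclass
class FirstResponse:
    request_id: str
    bill_info: BillInfo      # bill information of the bill image


def handle_request(
    request: RecognitionRequest,
    preprocess: Callable[[Image], Image],
    model: Callable[[Image], BillInfo],
) -> FirstResponse:
    """Preprocess the image, run the trained model, and build the response."""
    prepared = preprocess(request.image)
    bill_info = model(prepared)
    return FirstResponse(request_id=request.request_id, bill_info=bill_info)


# Stand-ins so the sketch runs end to end; a real system would plug in
# its own preprocessing and its trained bill information recognition model.
def identity_preprocess(image: Image) -> Image:
    return image


def dummy_model(image: Image) -> BillInfo:
    return {"invoice_number": "00000000"}  # placeholder value


response = handle_request(
    RecognitionRequest(request_id="req-1", image=[[255, 255], [255, 255]]),
    identity_preprocess,
    dummy_model,
)
```

Passing the preprocessing function and the model in as parameters keeps the request handling independent of any particular network, which matches the statement that the OCR device may be any electronic device used for bill information identification.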

In the embodiment of the present invention, the bill information identification system may include the OCR device and a storage device. The bill information identification system contains at least one OCR device.

With the method provided by the embodiment of the present invention, the OCR device in the bill information identification system, after receiving an information identification request for a bill image to be identified, uses a pre-trained bill information recognition model to identify the bill information in the preprocessed bill image to be identified, thereby obtaining the bill information of the bill image to be identified. Compared with the prior art, using an OCR device to identify bill information removes the dependence of manual identification on the quantity of human resources. Moreover, because the bill information recognition model is trained on sample bill images of different bill types, the model can directly identify the bill information of bill images of different bill types. This reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The problems of existing bill information identification methods are thereby addressed: the dependence on the quantity of human resources is reduced and the generality of bill identification is improved.

The embodiments of the present invention are described below through specific examples.

FIG. 1 is a schematic flowchart of a method for training a bill information recognition model provided by an embodiment of the present invention. The method includes the following steps.

Step S101: obtain a preset training set.

The preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image.

A sample bill image may be an image obtained by photographing a sample bill, an image obtained by scanning a sample bill, or an image obtained by preprocessing a photographed or scanned image of a sample bill. The bill types of the sample bills include, but are not limited to, drafts, promissory notes, checks, bills of lading and certificates of deposit. The preprocessing method is described below and is not detailed here.

The sample bill information corresponding to each sample bill image differs according to the bill type of the sample bill in the preset training set. For example, when the sample bill is an invoice, its bill information includes, but is not limited to, the invoice code, the invoice number, the invoicing date and the buyer's name. As another example, when the sample bill is a bank draft, its bill information includes, but is not limited to, the payee, the draft amount, the actual settlement amount and the date of issue.

In the embodiment of the present invention, the number of sample bill images for each bill type in the preset training set, the bill types of the sample bills, the sample bill images and the bill information of the sample bill images are not specifically limited.
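By way of example only, a preset training set of this kind could be represented as a list of records, each pairing a sample bill image with its bill type and its sample bill information. The `Sample` class, the field names, the file paths and the field values below are all made up for illustration; the embodiments do not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    image_path: str            # where the sample bill image is stored (hypothetical path)
    bill_type: str             # e.g. "invoice" or "bank_draft"
    bill_info: Dict[str, str]  # sample bill information; keys depend on the bill type


# Placeholder records: the values are illustrative, not real bill data.
training_set: List[Sample] = [
    Sample(
        image_path="samples/invoice_0001.png",
        bill_type="invoice",
        bill_info={
            "invoice_code": "<code>",
            "invoice_number": "<number>",
            "invoicing_date": "<date>",
            "buyer_name": "<name>",
        },
    ),
    Sample(
        image_path="samples/bank_draft_0001.png",
        bill_type="bank_draft",
        bill_info={
            "payee": "<payee>",
            "draft_amount": "<amount>",
            "settlement_amount": "<amount>",
            "date_of_issue": "<date>",
        },
    ),
]


def bill_types(samples: List[Sample]) -> List[str]:
    """Return the distinct bill types present in the training set, sorted."""
    return sorted({sample.bill_type for sample in samples})
```

Keeping the bill information as a free-form mapping reflects the point made above that different bill types carry different fields.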

Step S102: for each sample bill image, use a preset deep learning network to identify the bill information in the sample bill image, obtaining the predicted bill information of the sample bill image.

In this step, for each sample bill image, the OCR device may input the sample bill image into the preset deep learning network, use the preset deep learning network to identify the bill information included in the sample bill image, and output the identified bill information as the predicted bill information of the sample bill image.

In an optional embodiment, in order to improve the accuracy of the trained bill information recognition model, the preset deep learning network may be built from multiple deep learning algorithms. That is, the preset deep learning network may be composed of the deep learning networks corresponding to several deep learning algorithms.

For example, the preset deep learning network may be a deep learning network built from the Connectionist Text Proposal Network (CTPN) algorithm and the Convolutional Recurrent Neural Network (CRNN) algorithm. That is, the preset deep learning network may be composed of a CTPN network and a CRNN network. The CTPN algorithm/network may be used to determine the character regions in which the characters of the sample bill image are located. The CRNN algorithm/network may be used to recognize the characters in those character regions, yielding the set of character strings in the sample bill image.

As another example, the preset deep learning network may be a deep learning network built from the Progressive Scale Expansion Network (Shape Robust Text Detection with Progressive Scale Expansion Network, PSENET) algorithm and the Convolutional Recurrent Neural Network (CRNN) algorithm. That is, the preset deep learning network may be composed of a PSENET network and a CRNN network. The PSENET algorithm/network may be used to determine the character regions in which the characters of the sample bill image are located. The CRNN algorithm/network may be used to recognize the characters in those character regions, yielding the set of character strings in the sample bill image.
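Both combinations follow the same two-stage pattern: a detection network locates character regions, and a recognition network reads the characters in each region. The following Python sketch illustrates only that composition. The `detect` and `recognize` callables are stand-ins for a trained CTPN or PSENET network and a trained CRNN network respectively; their signatures and the box format are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # (left, top, right, bottom); right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the character region described by `box` out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def recognize_strings(
    image: Image,
    detect: Callable[[Image], List[Box]],   # stage 1: e.g. CTPN or PSENET
    recognize: Callable[[Image], str],      # stage 2: e.g. CRNN
) -> List[str]:
    """Detect character regions, then recognize the string in each region."""
    return [recognize(crop(image, box)) for box in detect(image)]


# Stand-in stages so the sketch runs without trained networks.
def fake_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def fake_recognize(region: Image) -> str:
    # Reports the region size instead of real text.
    return f"{len(region)}x{len(region[0])}"


page = [[0] * 4 for _ in range(4)]
strings = recognize_strings(page, fake_detect, fake_recognize)
```

Because the detector is passed in as a parameter, swapping CTPN for PSENET changes only the first stage and leaves the recognition stage untouched.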

For ease of understanding, the preset deep learning network is described below taking as an example a deep learning network based on the CTPN algorithm and the CRNN algorithm. FIG. 2 is a schematic structural diagram of a preset deep learning network provided by an embodiment of the present invention. The preset deep learning network shown in FIG. 2 mainly includes a convolutional layer 201, a logistic regression layer 202, a fully connected layer 203, a convolutional layer 204, a recurrent layer 205 and a transcription layer 206. The convolutional layer 201, the logistic regression layer 202 and the fully connected layer 203 make up the CTPN network; the convolutional layer 204, the recurrent layer 205 and the transcription layer 206 make up the CRNN network. The preset deep learning network may also include an input layer, an output layer and so on (not shown in FIG. 2), which are not described in detail here.

The convolutional layer 201 performs feature extraction on the input sample bill image. That is, it extracts the features of all channels, such as the red, green and blue (RGB) channels, at the position of each pixel in the sample bill image.

The logistic regression layer 202 performs logistic regression processing on the features extracted by the convolutional layer 201, for example bidirectional long short-term memory (Bi-directional Long Short-Term Memory, BLSTM) regression processing.

The fully connected layer 203 applies a fully connected operation to the output of the logistic regression layer 202 and produces 2k vertical coordinates, 2k scores and k side-refinement values, where k is the number of anchors and typically defaults to 10. The vertical coordinates comprise two parameter values per anchor: the vertical offset y and the region height h. The scores comprise two values per anchor: one for the foreground and one for the background of the sample bill image. The side-refinement values contain the horizontal offset x.
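The counts above mean that the fully connected layer 203 emits 2k + 2k + k = 5k values per position, that is, 50 values when k = 10. The following Python sketch splits such a vector into per-anchor predictions. The memory layout (all vertical coordinates first, then all scores, then all side-refinement values, with the two values of each anchor adjacent) and the order of the two scores are assumptions made for the example; the embodiments do not specify them.

```python
from typing import List, NamedTuple


class AnchorPrediction(NamedTuple):
    y_offset: float          # vertical offset y
    height: float            # region height h
    foreground_score: float  # score that the anchor covers text
    background_score: float  # score that the anchor covers background
    x_offset: float          # side-refinement: horizontal offset x


def split_outputs(values: List[float], k: int = 10) -> List[AnchorPrediction]:
    """Split a flat vector of 5k values into k per-anchor predictions.

    Assumed layout: [2k vertical coordinates | 2k scores | k side-refinements].
    """
    if len(values) != 5 * k:
        raise ValueError(f"expected {5 * k} values, got {len(values)}")
    coords = values[: 2 * k]
    scores = values[2 * k : 4 * k]
    sides = values[4 * k :]
    return [
        AnchorPrediction(
            y_offset=coords[2 * i],
            height=coords[2 * i + 1],
            foreground_score=scores[2 * i],
            background_score=scores[2 * i + 1],
            x_offset=sides[i],
        )
        for i in range(k)
    ]


# Dummy vector 0.0 .. 49.0 makes the slicing easy to follow.
predictions = split_outputs([float(v) for v in range(50)])
```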

The convolutional layer 204 performs convolution on the output of the fully connected layer 203, extracts the feature information in the image and obtains a feature sequence.

The recurrent layer 205 is a BLSTM network that makes further predictions on the feature sequence extracted by the convolutional layer 204 to obtain a preliminary prediction result. Unlike a traditional recurrent neural network (RNN), the recurrent layer 205 uses three gate structures, namely a forget gate, an input gate and an output gate, to manage the cell state. The forget gate uses the sigmoid (logistic, S-shaped) function to decide how much of the previous cell state to retain. The input gate uses the sigmoid and hyperbolic tangent (tanh) functions to add new information to the cell state, which is combined with the old cell state to update the current cell state. The output gate uses the sigmoid function to determine the output features. Through the forget gate, the input gate and the output gate, each character can be related to the characters before it.
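For illustration, one time step of a standard LSTM cell with the three gates described above can be written in plain Python as follows. This is the textbook LSTM formulation, not code from the embodiments; the parameter names (`W_f`, `b_f` and so on) and the dictionary used to hold them are assumptions. A BLSTM runs one such cell over the sequence from left to right and another from right to left, then combines the two outputs.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights: Matrix, bias: Vector, inputs: Vector) -> Vector:
    """Compute weights @ inputs + bias with plain lists."""
    return [
        sum(w * v for w, v in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, list]
) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x  # concatenate previous hidden state and current input
    # Forget gate: how much of the previous cell state to keep.
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]
    # Input gate and candidate values: what new information to add.
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]
    g = [math.tanh(v) for v in affine(params["W_g"], params["b_g"], z)]
    # Output gate: which part of the cell state becomes the output.
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


# Toy parameters: one hidden unit, one input feature, all weights zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate to tanh(0) = 0.
zero_params = {
    name: value
    for gate in "figo"
    for name, value in ((f"W_{gate}", [[0.0, 0.0]]), (f"b_{gate}", [0.0]))
}
h, c = lstm_step(x=[1.0], h_prev=[0.0], c_prev=[2.0], params=zero_params)
```

With the zero-weight toy parameters the forget gate halves the previous cell state, which makes the role of each gate easy to check by hand.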

In the embodiment of the present invention, a two-layer BLSTM network is used so that each character in the sample bill image is associated with its adjacent characters (that is, the characters before and after it). After the two BLSTM layers, the recurrent layer 205 outputs the predicted character probability values for each feature sequence, which is the preliminary prediction result described above.

The transcription layer 206 performs transcription on the preliminary prediction result to obtain the final prediction result.

In the embodiment of the present invention, the transcription layer 206 may perform transcription in either of the following two ways:

Method 1: dictionary-based transcription. During transcription, the output is compared with a preset dictionary, and the candidate with the highest probability is taken as the final predicted character string, that is, the final prediction result.

Method 2: dictionary-free transcription. No comparison string or preset dictionary exists; during transcription, the characters with the highest probability are taken directly as the final predicted character string, that is, the final prediction result.
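The two transcription methods can be sketched as follows. The sketch makes several assumptions that the embodiments do not state: the per-step probabilities include a blank symbol and repeated symbols are merged, as in CTC-style decoding commonly paired with CRNN; and the dictionary-based method is approximated by choosing the dictionary entry closest in edit distance to the dictionary-free result rather than by scoring every entry with its exact probability. The function names are hypothetical.

```python
from typing import List, Sequence

BLANK = "-"  # assumed blank symbol; not specified in the embodiments


def greedy_decode(probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Dictionary-free: take the most probable symbol at each step,
    merge consecutive repeats, and drop blanks."""
    best = [
        alphabet[max(range(len(step)), key=step.__getitem__)] for step in probs
    ]
    chars: List[str] = []
    previous = None
    for symbol in best:
        if symbol != previous and symbol != BLANK:
            chars.append(symbol)
        previous = symbol
    return "".join(chars)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def dictionary_decode(
    probs: Sequence[Sequence[float]], alphabet: str, dictionary: Sequence[str]
) -> str:
    """Dictionary-based (approximation): snap the greedy result to the
    nearest dictionary entry by edit distance."""
    raw = greedy_decode(probs, alphabet)
    return min(dictionary, key=lambda word: edit_distance(raw, word))


alphabet = "-abc"  # index 0 is the blank
probs = [
    [0.10, 0.70, 0.10, 0.10],  # a
    [0.10, 0.60, 0.20, 0.10],  # a (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.10, 0.70],  # c
    [0.10, 0.10, 0.70, 0.10],  # b
]
free_text = greedy_decode(probs, alphabet)
snapped = dictionary_decode(probs, alphabet, ["abc", "ace", "cab"])
```

Dictionary-based transcription is useful for fields with a closed vocabulary, while dictionary-free transcription suits open fields such as invoice numbers, where no list of valid strings exists in advance.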

In this embodiment of the present invention, the convolutional layer 201 and the convolutional layer 204 each contain multiple convolution units and pooling units. Taking the convolutional layer 201 as an example, it may apply multiple convolution and pooling operations to the input sample bill image to obtain features across all channels. The convolution and pooling operations are not described in detail here.

In an optional embodiment, before the convolutional layer 204 convolves the output of the fully connected layer 203, the sample bill image may be scaled proportionally, according to its size, so that its height equals a preset height. For example, if the width of the sample bill image is W_0 and its height is H_0, denoted (W_0, H_0), and the preset height is 32, the size of the proportionally scaled sample bill image can be expressed as (W, 32), where W = 32 × W_0 / H_0 follows from keeping the aspect ratio.
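As a small worked example of the proportional scaling above, the following Python sketch computes the target size (W, 32) from (W_0, H_0). The function name and the rounding to whole pixels are assumptions made for illustration.

```python
def scaled_size(width, height, target_height=32):
    """Return (W, target_height) so that the aspect ratio W_0 / H_0 is preserved."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Assumption: round to the nearest whole pixel and keep at least one column.
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height
```

For instance, a 200 × 64 image maps to 100 × 32, since halving the height also halves the width.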

Adjusting the size of the sample bill image allows the receptive field of the feature sequence extracted by the convolutional layer 204 to be a rectangular region. Specifically, the kernel size in the convolutional layer 204 may be changed from square to rectangular, so that the convolutional layer 204 produces a feature sequence corresponding to rectangular regions. This facilitates subsequent character recognition and improves the accuracy of the recognized characters.

The structure of the preset deep learning network varies with the deep learning algorithm it adopts. Neither the deep learning algorithm adopted by the preset deep learning network nor its structure is described in detail here.

Step S103: calculate the loss value of the preset deep learning network based on the sample bill information and the predicted bill information corresponding to each sample bill image.

In this step, the OCR device may use a preset loss function to calculate the loss value of the deep learning network from the sample bill information and the predicted bill information corresponding to each sample bill image in the preset training set.

The preset loss function includes, but is not limited to, the Softmax cross-entropy loss function, the logarithmic loss function, and the L2 regression loss function. Here, Softmax in the Softmax cross-entropy loss function denotes the Softmax function, also called the normalized exponential function. The L2 regression loss function is also called the mean square error (MSE) loss function. The preset loss function is not specifically limited here.
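For reference, two of the loss functions named above can be written out in a few lines of Python. This is a generic sketch of the standard definitions for a single sample, not the loss computation of the embodiment; the function names are chosen for illustration.

```python
import math


def softmax(logits):
    """Normalized exponential function, shifted by the maximum for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, target_index):
    """Negative log of the Softmax probability assigned to the true class."""
    peak = max(logits)
    log_sum = math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - (logits[target_index] - peak)


def mse_loss(predicted, target):
    """L2 regression (mean square error) loss over paired values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)
```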

In this embodiment of the present invention, because the preset deep learning network may be built from different algorithms, the OCR device may calculate a separate loss value for each algorithm when calculating the loss value of the network. Taking a preset deep learning network built from the CTPN algorithm and the CRNN algorithm as an example, the OCR device may use the L2 regression loss function to calculate the loss value of the network corresponding to the CTPN algorithm, and the Softmax cross-entropy loss function to calculate the loss value of the network corresponding to the CRNN algorithm. The calculation of these loss values is not described in detail here.

Step S104: when the loss value is greater than a preset loss value threshold, adjust the parameters of the preset deep learning network and return to step S102.

In this step, the OCR device may compare the loss value of the preset deep learning network with the preset loss value threshold to determine whether the network has converged. When the loss value is greater than the threshold, the OCR device may determine that the network has not converged. In that case, the OCR device may adjust the parameters of the network and return to step S102, that is, the step of using the preset deep learning network to recognize, for each sample bill image, the bill information in that image and obtain its predicted bill information.

In this embodiment of the present invention, adjusting the parameters of the preset deep learning network includes, but is not limited to, adjusting its biases and weights. The parameter adjustment methods used by the OCR device include, but are not limited to, gradient descent and the reverse adjustment (back-propagation) method.

Step S105: when the loss value is not greater than the preset loss value threshold, determine the current preset deep learning network as the trained bill information recognition model.

In this step, when the loss value of the preset deep learning network is not greater than the preset loss value threshold, the OCR device may determine that the network has converged. The OCR device may then take the current preset deep learning network as the trained bill information recognition model.
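The control flow of steps S102 to S105 can be sketched with a toy model. The example below fits a single weight and bias by gradient descent on an MSE loss and stops once the loss is not greater than the threshold. The linear model, the learning rate, and the `max_rounds` safety cap are all assumptions made for illustration; the embodiment trains a deep network instead.

```python
def train(samples, loss_threshold=1e-4, learning_rate=0.05, max_rounds=10_000):
    """Toy threshold-controlled training loop over (x, y) pairs."""
    weight, bias = 0.0, 0.0
    loss = float("inf")
    for _ in range(max_rounds):  # assumption: cap the rounds to avoid an endless loop
        # S102: predict with the current parameters.
        errors = [weight * x + bias - y for x, y in samples]
        # S103: compute the loss value (mean square error).
        loss = sum(e * e for e in errors) / len(samples)
        # S105: loss not greater than the threshold, so the model is taken as trained.
        if loss <= loss_threshold:
            break
        # S104: adjust weight and bias by gradient descent, then repeat S102.
        grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, samples)) / len(samples)
        grad_b = 2 * sum(errors) / len(samples)
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias, loss
```

The loop returns the parameters together with the last loss value so that a caller can check whether the threshold was actually reached or the round cap was hit.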

With the method shown in FIG. 1, the bill information recognition model trained on the preset training set can recognize bills of different bill types. This improves the generality of the trained bill information recognition model and, in turn, the generality of the bill identification method.

Based on the bill information recognition model trained by the method shown in FIG. 1, an embodiment of the present invention provides a bill information identification method. FIG. 3 is a first schematic flowchart of the bill information identification method provided by an embodiment of the present invention. The method is applied to the OCR device in a bill information identification system and includes the following steps.

Step S301: receive an information identification request for a bill image to be identified.

The information identification request may include the bill image to be identified of the bill to be identified. For the bill image to be identified, refer to the description of the sample bill image above; details are not repeated here.

In an optional embodiment, a user may send an information identification request for the bill image to be identified to the OCR device through a client, and the OCR device receives the information identification request sent by the user.

In another optional embodiment, the OCR device may include multiple OCR computing instance units (that is, service instances). To balance the load among the OCR computing instance units, a request proxy middleware, such as a reverse proxy, may be provided in the bill information identification system. When a user sends an information identification request for the bill image to be identified to the bill information identification system, the request proxy middleware may receive the request and, according to the computing resources of each OCR computing instance unit in the OCR device, forward it to an OCR computing instance unit whose computing resources are not occupied. That OCR computing instance unit receives the information identification request forwarded by the request proxy middleware.

The request proxy middleware may be integrated in the OCR device or in another electronic device. For ease of description, the embodiments of the present invention take the case in which the request proxy middleware is integrated in the OCR device as an example; this is not limiting.
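The forwarding rule of the request proxy middleware described above can be sketched as follows. The class and function names are hypothetical, a single `busy` flag stands in for "computing resources occupied", and raising an error when no unit is idle is an assumption, since the text does not say what happens in that case.

```python
class OcrInstance:
    """Hypothetical stand-in for one OCR computing instance unit."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy


def pick_idle_instance(instances):
    """Return the first unit whose computing resources are not occupied, or None."""
    for instance in instances:
        if not instance.busy:
            return instance
    return None


def dispatch(request, instances):
    """Forward a request to an idle unit and mark that unit as occupied."""
    instance = pick_idle_instance(instances)
    if instance is None:
        # Assumption: the behavior with no idle unit is not specified in the text.
        raise RuntimeError("no idle OCR computing instance unit")
    instance.busy = True
    return instance.name
```

A real deployment would typically delegate this to an off-the-shelf reverse proxy and would release the unit once the response has been returned.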

Step S302: based on the information identification request, preprocess the bill image to be identified to obtain a preprocessed bill image to be identified.

In this step, the preprocessing includes, but is not limited to, image contrast enhancement, image grayscale conversion, image binarization, white border cropping, and skew correction. That is, on receiving the information identification request, the OCR device may preprocess the bill image to be identified carried in the request in a preset order to obtain the preprocessed bill image to be identified. For example, the OCR device may apply contrast enhancement, grayscale conversion, binarization, white border cropping, and skew correction to the bill image to be identified, in that order. The preprocessing is described in detail below.

In this embodiment of the present invention, the preprocessing of the bill image to be identified varies with the bill type of the bill to be identified and with the bill image itself.

For example, in one common type of invoice, the QR code is located in the upper-left corner. When the bill to be identified is such an invoice, the preprocessing may further include text orientation correction. That is, the OCR device may convolve the bill image to be identified corresponding to the invoice with a preset filter, such as a Gaussian filter, locate the position of the maximum value, and take that position as the upper-left corner of the invoice, thereby determining the orientation of the text in the bill image to be identified.
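The orientation check above can be sketched in Python once the filter response is available. The sketch assumes the filtering has already been done and that `response` is a two-dimensional list of filter outputs. The function names, the quadrant test, and the restriction to multiples of 90 degrees are assumptions for illustration.

```python
def argmax_2d(response):
    """Return (row, col) of the largest value in a 2-D list."""
    best = max(
        ((value, row, col)
         for row, line in enumerate(response)
         for col, value in enumerate(line)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Counter-clockwise rotation (degrees) that moves each image corner to the top-left.
ROTATION_CCW = {"top-left": 0, "top-right": 90, "bottom-right": 180, "bottom-left": 270}


def qr_corner(response):
    """Name the image quadrant that contains the maximum filter response."""
    row, col = argmax_2d(response)
    vertical = "top" if row < len(response) / 2 else "bottom"
    horizontal = "left" if col < len(response[0]) / 2 else "right"
    return f"{vertical}-{horizontal}"


def rotation_to_upright(response):
    """Rotation that brings the detected QR-code corner to the upper-left."""
    return ROTATION_CCW[qr_corner(response)]
```

If the maximum falls in the top-right quadrant, for example, a 90-degree counter-clockwise rotation puts the QR code back in the upper-left corner.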

As another example, when the bill image to be identified is obtained by photographing the bill to be identified, that is, when it is a color image, the preprocessing may include image grayscale conversion. When the bill image to be identified is obtained by scanning the bill to be identified, that is, when it is already a grayscale image, the preprocessing may omit image grayscale conversion.

In this embodiment of the present invention, the preprocessing of the bill image to be identified is not specifically limited.

In an optional embodiment, when the bill image to be identified contains the QR code, the OCR device may directly recognize the QR code in the image, obtain the bill information of the bill image to be identified, and store that bill information. When the bill image to be identified does not contain the QR code, the OCR device performs step S302, that is, the step of preprocessing the bill image to be identified based on the information identification request to obtain the preprocessed bill image to be identified.

Step S303: use the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be identified and obtain the bill information of the bill image to be identified. The bill information recognition model is trained on a preset training set, and the preset training set includes multiple sample bill images of different bill types and the sample bill information of each sample bill image.

The recognition of the bill information of the bill image to be identified is described below.

Step S304: return a first response message for the information identification request, where the first response message includes the bill information of the bill image to be identified.

In this step, the OCR device may return the first response message for the information identification request according to the recognition result output by the bill information recognition model, that is, the bill information of the bill image to be identified.

In an optional embodiment, if the information identification request was sent by the user directly to the OCR device, the OCR device may send the first response message, which includes the bill information of the bill image to be identified, to the client used by the user.

In another optional embodiment, if the information identification request was forwarded by the request proxy middleware, the OCR device may send the first response message, which includes the bill information of the bill image to be identified, to the request proxy middleware. The request proxy middleware then sends the first response message to the client corresponding to the information identification request, that is, the client that sent the request.

With the method shown in FIG. 3, after receiving the information identification request for the bill image to be identified, the OCR device in the bill information identification system uses the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be identified, and thereby obtains the bill information of the bill image to be identified. Compared with the prior art, using the OCR device to recognize bill information removes the dependence of manual identification on the quantity of human resources. In addition, because the bill information recognition model is trained on sample bill images of different bill types, it can directly recognize the bill information in bill images of different bill types. This reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The method therefore solves the problems of existing bill information identification methods, reducing the dependence on the quantity of human resources and improving the generality of the bill identification method.

In an optional embodiment, factors such as the hardware characteristics of the central processing unit (CPU), memory, and graphics processing unit (GPU) in the OCR device make it impossible to start multiple processes on the OCR device at the same time to provide the service. Therefore, in this embodiment of the present invention, the OCR device may process information identification requests in a single-process, multi-threaded mode, which allows it to process multiple information identification requests simultaneously.

In an optional embodiment, for step S302, that is, preprocessing the bill image to be identified based on the information identification request to obtain the preprocessed bill image to be identified, an embodiment of the present invention provides a method for preprocessing the bill image to be identified. FIG. 4 is a schematic flowchart of this preprocessing method. Specifically, step S302 may be refined into the following steps.

Step S3021: apply a gamma transform with a preset gamma coefficient to the bill image to be identified in the information identification request to obtain a first image.

In this step, when the bill image to be identified in the information identification request is a color image, it contains, in addition to the bill information, other information such as border information. For example, in common bills the border is generally brown and the text is generally blue or red. To make the bill information stand out more clearly from the other information, the OCR device may apply image contrast enhancement to the bill image to be identified. That is, the OCR device may apply a gamma transform with a preset gamma coefficient to the bill image to be identified in the information identification request to obtain the first image, which is the contrast-enhanced image.

In an optional embodiment, the preset gamma coefficient may be 2.5. It may also take other values; for example, it may lie in the range from 1 to 3. A preset gamma coefficient of 1 means that no contrast enhancement is applied, that is, the resulting image is identical to the bill image to be identified.
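A gamma transform of the kind described in step S3021 is commonly written as out = 255 × (in / 255)^γ. The Python sketch below applies this form to one channel of 8-bit values. The formula, the lookup table, and the function name are assumptions, since the text only names the transform and the coefficient.

```python
def gamma_transform(channel, gamma=2.5):
    """Apply out = 255 * (in / 255) ** gamma to a 2-D list of 8-bit values."""
    # Precompute the mapping once for all 256 possible input values.
    lookup = [round(255 * (value / 255) ** gamma) for value in range(256)]
    return [[lookup[value] for value in row] for row in channel]
```

For a color image, the same mapping would be applied to each of the R, G, and B channels. With `gamma=1.0` the lookup table is the identity, which matches the statement that a coefficient of 1 leaves the image unchanged.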

In an optional embodiment, when the bill image to be identified in the information identification request is a grayscale image, the OCR device may use another method to apply image contrast enhancement to it and obtain the first image. For example, the OCR device may use a grayscale transformation method to apply image contrast enhancement to the bill image to be identified and obtain the first image.

In this embodiment of the present invention, the method used for image contrast enhancement is not specifically limited.

Step S3022: convert the first image into a grayscale image to obtain a second image.

In this step, the OCR device may apply image grayscale conversion to the first image, that is, convert the first image into a grayscale image to obtain the second image. Converting the RGB value of each pixel in the first image into a grayscale value removes the color information from the first image and enhances the text information, that is, the bill information. The grayscale conversion is not described in detail here.
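Since the text does not specify how RGB values are mapped to grayscale values in step S3022, the sketch below uses the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) as an assumption. Other weightings, or a plain average, would also fit the description.

```python
def to_grayscale(image):
    """Convert a 2-D list of (R, G, B) tuples into a 2-D list of gray values."""
    return [
        # Assumption: BT.601 luma weights; the text does not name a formula.
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]
```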

Step S3023: binarize the second image based on the average of the pixel values of its pixels to obtain a third image.

In this step, to further improve the accuracy of recognizing the preprocessed bill image to be identified, the OCR device may binarize the second image so as to separate the foreground of the bill image to be identified from its background. That is, the second image is binarized based on the average of the pixel values of its pixels to obtain the third image, which is the binarized second image.

Specifically, the OCR device may calculate the average of the pixel values in the second image from the pixel value, that is, the grayscale value, of each pixel in the second image. Taking a second image that contains 5 pixels, pixels 1 to 5, as an example, assume that their pixel values are gray_1 to gray_5. The average Z of the pixel values in the second image can then be expressed as:

Z = (gray_1 + gray_2 + gray_3 + gray_4 + gray_5) / 5

The OCR device may compare the pixel value of each pixel in the second image with this average. For each pixel, when its pixel value is less than the average, the OCR device may set its pixel value to 0, so that the pixel is black and belongs to the foreground of the bill image to be identified. When its pixel value is not less than the average, the OCR device may set its pixel value to 255, so that the pixel is white and belongs to the background of the bill image to be identified. The result is the third image.
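The mean-based binarization of step S3023 translates almost directly into code. The following Python sketch follows the rule stated above (values below the average become 0, all others become 255); the function name is chosen for illustration.

```python
def binarize_by_mean(gray):
    """Binarize a 2-D list of gray values against the mean of the whole image."""
    pixels = [value for row in gray for value in row]
    mean = sum(pixels) / len(pixels)
    # Below the mean: foreground (black, 0). Otherwise: background (white, 255).
    return [[0 if value < mean else 255 for value in row] for row in gray]
```

For five pixels with values 10, 20, 30, 40, and 50, the average is 30, so the first two become 0 and the remaining three become 255.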

Step S3024: determine the outer border position of the third image according to the pixel values of the pixels in each row in the horizontal direction, the pixel values of the pixels in each column in the vertical direction, and a threshold on the number of pixels having a preset pixel value; then crop the image within the outer border position to obtain a fourth image.

For ease of understanding, the following description takes the pixels along one horizontal row of the third image as an example and refers to FIG. 5, which is a schematic diagram of white border cropping provided by an embodiment of the present invention.

FIG. 5 contains 10 pixels, pixels a to j, which are all the pixels in one horizontal row of the third image. The pixel values of pixels a, c, d, h, i, and j are 0, and the pixel values of pixels b, e, f, and g are 255.

For each row in the horizontal direction of the third image, the OCR device may traverse the pixels from left to right and count the pixels whose pixel value equals the preset pixel value. When this count equals the pixel number threshold, the OCR device traverses backward by that threshold number of pixels and takes the position of the last pixel reached as the left boundary position of that row. By analogy with the method for determining the left boundary position, the OCR device may determine the right boundary position of each row of pixels, as well as the upper boundary position and the lower boundary position of each column of pixels.

For ease of understanding, the determination of the left boundary position is illustrated with reference to FIG. 5, taking a preset pixel value of 0 and a pixel number threshold of 4 as an example. The OCR device traverses the pixels of the row from left to right. When it reaches pixel h, the number of pixels with a pixel value of 0 is 4. The OCR device then traverses backward by 4 pixels, that is, from pixel h back to pixel d, and may take the left edge of pixel d as the left boundary position of that row.

The OCR device may determine the outer border position of the third image from the left and right boundary positions determined for each row in the horizontal direction and the upper and lower boundary positions determined for each column in the vertical direction. For example, the OCR device may take the smallest rectangular region, or the smallest polygonal region, that covers all the left and right boundary positions in the horizontal direction and all the upper and lower boundary positions in the vertical direction as the outer border position of the third image.

The OCR device may crop the image within the outer border position determined in the third image to obtain the fourth image, that is, the image after white border cropping.
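For the rectangular case, combining the per-row and per-column boundaries and cropping can be sketched as below. The function names and the inclusive index convention are assumptions; the polygonal variant mentioned above is not covered.

```python
def outer_rectangle(lefts, rights, tops, bottoms):
    """Smallest rectangle (left, top, right, bottom) covering all boundary positions."""
    return min(lefts), min(tops), max(rights), max(bottoms)


def crop(image, box):
    """Crop a 2-D list to an inclusive (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Here `lefts` and `rights` hold one column index per row, and `tops` and `bottoms` hold one row index per column, with rows or columns that have no boundary left out beforehand.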

In this embodiment of the present invention, step S3024 crops the white border from the third image. Discarding the irrelevant white-border information in the third image also reduces the noise in the resulting fourth image, for example the effect that pixel a and pixel c in FIG. 5 would otherwise have on later bill information recognition, which improves the accuracy of bill information recognition.

Step S3025: perform tilt correction on the fourth image to obtain the preprocessed bill image to be recognized.

In this embodiment of the present invention, when the bill to be recognized is photographed or scanned, the shooting angle or the position in which the bill is placed for scanning may cause the bill in the bill image to be recognized to appear tilted, for example with slanted text. To reduce the effect of this tilt on later bill information recognition, the OCR device may perform tilt correction on the fourth image. The tilt-corrected image is the preprocessed bill image to be recognized, and using it improves the accuracy of later bill information recognition.

In the embodiment shown in FIG. 4, the preprocessing of the bill image to be recognized includes, but is not limited to, the image contrast enhancement, grayscale conversion, binarization, white-border cropping and tilt correction described above. The preprocessed bill image to be recognized is the image produced by the last step of the preprocessing, and is not limited to the tilt-corrected image.

With the method shown in FIG. 4, the OCR device preprocesses the bill image to be recognized, which improves the accuracy of the later recognition of the preprocessed image and thereby the accuracy of the bill information identification method.

In an optional embodiment, step S303, in which the pre-trained bill information recognition model is used to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized, may specifically include the following steps.

Step 1: based on the pre-trained bill information recognition model, determine the region in which each character of the preprocessed bill image to be recognized is located as a character region, and recognize the characters in the character regions to obtain a character string set.

For ease of understanding, the following description refers to FIG. 6 and takes as an example a bill information recognition model that is a deep neural network built from the CTPN algorithm and the CRNN algorithm described above. FIG. 6 is a schematic diagram of character recognition provided by an embodiment of the present invention.

The deep network corresponding to the CTPN algorithm combines a convolutional neural network (Convolutional Neural Networks, CNN) with a long short-term memory network (Long Short-Term Memory, LSTM). Given the characteristics of the text in the preprocessed bill image to be recognized, each piece of text can be treated as a sequence made up of one character, part of a character, or several characters. As shown in FIG. 6, the character S is a character in the preprocessed bill image to be recognized, and the CTPN algorithm can determine that the character S is made up of the partial characters contained in regions 601 to 605. The region formed jointly by regions 601 to 605 is therefore the character region in which the character S is located in the preprocessed bill image to be recognized. In the same way, the OCR device can locate the region of each character in the preprocessed bill image to be recognized and so obtain the character regions of that image.

In this embodiment of the present invention, because the LSTM can infer from context, it can extract features that relate the characters in the preprocessed bill image to be recognized to one another. When determining character regions, the OCR device can therefore place characters that are contextually related in the same character region. For example, the characters of the buyer's name on the invoice mentioned above are contextually related, so the whole buyer's name can be assigned to one character region.

In this embodiment of the present invention, the preprocessed bill image to be recognized may contain multiple character regions. Each character region contains at least one character.

In an optional embodiment, a character region may be recorded as the coordinates of its upper-left vertex together with its width and height. FIG. 6 is again used as an example. Suppose the region in which the character S of FIG. 6 is located in the preprocessed bill image to be recognized is one character region. That character region can be recorded as the coordinates of the upper-left vertex of region 601, the sum of the widths of regions 601 to 605, and the height of either region 601 or region 605.

In an optional embodiment, because the preprocessed bill image to be recognized may contain a large amount of text, the OCR device may filter the determined character regions by the width-to-height ratio of each region. Taking the case where each character occupies a square region (a width-to-height ratio of 1:1) as an example, the OCR device may select the character regions whose width-to-height ratio is greater than 2 and discard those whose ratio is not greater than 2. In other words, it selects the character regions that contain at least two characters.
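
The recording and filtering of character regions described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the names `Region`, `merge_slices` and `keep_wide_regions` are hypothetical, and the slices are assumed to be horizontally adjacent boxes of equal height, as regions 601 to 605 appear to be in FIG. 6.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Axis-aligned box recorded as upper-left vertex plus width and height."""
    x: int
    y: int
    width: int
    height: int

    @property
    def aspect_ratio(self) -> float:
        return self.width / self.height


def merge_slices(slices: List[Region]) -> Region:
    """Merge horizontally adjacent slices into one character region.

    Mirrors the recording scheme in the text: upper-left vertex of the
    leftmost slice, sum of the slice widths, height of one slice.
    """
    ordered = sorted(slices, key=lambda r: r.x)
    first = ordered[0]
    return Region(first.x, first.y, sum(r.width for r in ordered), first.height)


def keep_wide_regions(regions: List[Region], min_ratio: float = 2.0) -> List[Region]:
    """Keep only regions whose width-to-height ratio is strictly greater than min_ratio."""
    return [r for r in regions if r.aspect_ratio > min_ratio]
```

For example, merging five 16x32 slices starting at (10, 20) gives a region of width 80 and height 32 (ratio 2.5), which the filter keeps, while a single 32x32 square region (ratio 1) is discarded.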

In this embodiment of the present invention, compared with the way traditional OCR recognition methods extract characters using candidate templates, the CTPN algorithm greatly relaxes the constraints placed on the bill image to be recognized, which improves the generality of the bill information identification method.

In an optional embodiment, after determining the character regions in the preprocessed bill image to be recognized, the OCR device may use the CRNN algorithm to recognize the text in each character region and obtain the character string corresponding to that region, and thereby obtain the character string set corresponding to the preprocessed bill image to be recognized.

In this embodiment of the present invention, the deep learning network corresponding to the CRNN algorithm is trained end to end, which effectively reduces the complexity of model training and saves labor costs. In addition, the CRNN algorithm can recognize character sequences of arbitrary length. Because it involves neither character segmentation nor horizontal scale normalization, it preserves the contextual features between the characters of the bill to be recognized, which improves the accuracy with which the bill recognition model recognizes bill information. Furthermore, the CRNN algorithm is suitable both for lexicon-free character recognition tasks and for lexicon-based text recognition tasks, which improves the generality of character recognition and thereby the generality of the bill information identification method.
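
The description does not state how the per-frame outputs of the CRNN are turned into a character string. The CRNN architecture as originally published uses a connectionist temporal classification (CTC) transcription layer, and the following Python sketch assumes that design purely for illustration. It shows greedy CTC decoding, which is one way a sequence of arbitrary length can be read out without segmenting characters. The function name, the blank index and the alphabet layout are assumptions.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Pick the best label per frame, collapse repeated labels, then drop blanks.

    frame_scores[t][k] is the score of label k at frame t; label k > 0
    stands for alphabet[k - 1].
    """
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    chars: List[str] = []
    previous = BLANK
    for label in best:
        if label != previous and label != BLANK:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)
```

A blank frame between two identical labels is what allows a doubled character, such as the "11" at the start of an invoice code, to survive the collapsing step.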

Step 2: perform regular-expression matching on the character string set to obtain the bill information of the bill image to be recognized.

In this step, the OCR device may perform regular-expression matching on the character string set according to preset matching rules to obtain the bill information of the bill image to be recognized. For example, a known rule is that the invoice code of a value-added tax invoice is generally a 10-digit or 12-digit number. The OCR device may adopt this rule as one of the preset matching rules and use it to identify the invoice code in a value-added tax invoice.
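
A preset matching rule of this kind could be expressed as follows. This is a minimal sketch: the rule table `MATCH_RULES`, the field name `invoice_code` and the function `match_bill_info` are hypothetical, and only the 10-digit or 12-digit rule is taken from the description.

```python
import re
from typing import Dict, List, Pattern

# Hypothetical rule table: field name -> compiled pattern.
# Only the invoice-code rule (10 or 12 digits) comes from the description.
MATCH_RULES: Dict[str, Pattern[str]] = {
    "invoice_code": re.compile(r"[0-9]{10}(?:[0-9]{2})?"),
}


def match_bill_info(strings: List[str],
                    rules: Dict[str, Pattern[str]] = MATCH_RULES) -> Dict[str, str]:
    """For each rule, return the first recognized string that matches it in full."""
    result: Dict[str, str] = {}
    for field, pattern in rules.items():
        for text in strings:
            candidate = text.strip()
            if pattern.fullmatch(candidate):
                result[field] = candidate
                break
    return result
```

Using `fullmatch` rather than `search` keeps an 11-digit string, or a longer number that merely contains ten digits, from being accepted as an invoice code.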

In this embodiment of the present invention, the preset matching rules may be set according to the characteristics of the bill information contained in bills of different bill types. The preset matching rules are not specifically limited here.

Through step 1 and step 2 above, the OCR device can accurately recognize the bill information in the preprocessed bill image to be recognized, which improves both the accuracy of bill information recognition and the generality of the bill information identification method.

In an optional embodiment, based on the method shown in FIG. 3, an embodiment of the present invention further provides a bill information identification method. FIG. 7 is a second schematic flowchart of the bill information identification method provided by an embodiment of the present invention. The method includes the following steps.

Step S701: receive an information identification request for a bill image to be recognized.

Step S701 is the same as step S301 above.

Step S702: authenticate the information identification request to obtain an authentication result.

In this step, after receiving the information identification request, the OCR device may authenticate the request according to the information it carries other than the bill image to be recognized.

In this embodiment of the present invention, the information identification request is transmitted over the network in the form of a message, so the request may also contain other information. Taking an information identification request that is a HyperText Transfer Protocol (HTTP) message as an example, the request may also include the address information of the sender, the headers (Headers) field of the message, the content length (Content-Length) of the message, and so on.

For ease of understanding, the authentication of the information identification request is described using the content length. When the content length of the information identification request is greater than a preset content length threshold, the OCR device may determine that the request is illegitimate. When the content length is not greater than the preset content length threshold, the OCR device may determine that the request is legitimate.
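
The content-length check can be sketched in a few lines of Python. The threshold value, the function name `is_legitimate` and the decision to reject a missing or malformed Content-Length are assumptions made for illustration; the description only specifies the comparison against a preset threshold.

```python
from typing import Mapping

MAX_CONTENT_LENGTH = 10 * 1024 * 1024  # assumed preset threshold (10 MiB)


def is_legitimate(headers: Mapping[str, str],
                  max_length: int = MAX_CONTENT_LENGTH) -> bool:
    """Return True when Content-Length is not greater than the preset threshold.

    Assumption: the mapping uses the exact key "Content-Length", and a
    missing, non-numeric or negative value is treated as illegitimate.
    """
    try:
        length = int(headers.get("Content-Length", ""))
    except ValueError:
        return False
    return 0 <= length <= max_length
```

A request whose content length equals the threshold is accepted, matching the "not greater than" wording of the description.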

In this embodiment of the present invention, the process of authenticating the legitimacy of the information identification request is not specifically limited.

Step S703: when the authentication result indicates that the information identification request is legitimate, preprocess the bill image to be recognized based on the information identification request to obtain the preprocessed bill image to be recognized.

In this embodiment of the present invention, authenticating the legitimacy of the information identification request keeps the OCR device from recognizing the bill image carried in an illegitimate request. This saves the computing resources of the OCR device, reduces the number of bill images the OCR device has to recognize, and improves the efficiency with which the OCR device recognizes bill information.

Step S704: use the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image.

Step S705: return a first response message for the information identification request, the first response message including the bill information of the bill image to be recognized.

Steps S704 to S705 are the same as steps S303 to S304 above.

In an optional embodiment, based on the method shown in FIG. 3, an embodiment of the present invention further provides a bill information identification method. FIG. 8 is a third schematic flowchart of the bill information identification method provided by an embodiment of the present invention. The method includes the following steps.

Step S801: receive an information identification request for a bill image to be recognized.

Step S801 is the same as step S301 above.

Step S802: if the current computing resources are occupied, return a second response message for the information identification request, the second response message including an identification code indicating that the computing resources are occupied.

In this step, on receiving the information identification request, the OCR device may determine whether its computing resources are currently occupied, that is, whether the OCR device is currently processing another identification request it has received. When it determines that the computing resources are occupied, the OCR device may return the second response message for the information identification request. The second response message may be sent in the same way as the first response message described above, so the details are not repeated here.

In traditional OCR technology, request messages are generally handled through a message queue. Because of the nature of OCR services, however, long processing delays make the queue grow too long and cause unpredictable errors, such as request-message overflow. In this embodiment of the present invention, by acting on whether its computing resources are currently occupied, the OCR device ensures that each OCR computing instance unit in the OCR device promptly rejects other information identification requests, which keeps the responses to information identification requests orderly.

Step S803: based on the information identification request, preprocess the bill image to be recognized to obtain the preprocessed bill image to be recognized.

In this step, when it determines that the computing resources are not currently occupied, the OCR device may preprocess the bill image to be recognized based on the information identification request to obtain the preprocessed bill image to be recognized.
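
The fail-fast behavior of steps S802 and S803 can be sketched with a non-blocking lock. This is only one possible realization: the class `OcrInstance`, the response dictionaries and the value of `BUSY_CODE` are hypothetical, and the description does not say how occupancy is detected.

```python
import threading
from typing import Callable, Dict

BUSY_CODE = 503  # hypothetical identification code for "computing resources occupied"


class OcrInstance:
    """One OCR computing instance that serves a single request at a time."""

    def __init__(self, recognize: Callable[[bytes], Dict[str, str]]) -> None:
        self._recognize = recognize
        self._lock = threading.Lock()

    def handle(self, image: bytes) -> Dict[str, object]:
        # Non-blocking acquire: reject immediately instead of queueing the request.
        if not self._lock.acquire(blocking=False):
            return {"code": BUSY_CODE}  # second response message
        try:
            # Preprocessing and recognition would run here.
            return {"code": 0, "bill_info": self._recognize(image)}  # first response message
        finally:
            self._lock.release()
```

Because a rejected request returns at once, no queue builds up inside the instance, which is the property the description contrasts with message-queue handling.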

Step S804: use the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image.

Step S805: return a first response message for the information identification request, the first response message including the bill information of the bill image to be recognized.

Steps S804 to S805 are the same as steps S303 to S304 above.

In an optional embodiment, based on the method shown in FIG. 3, an embodiment of the present invention further provides a bill information identification method. FIG. 9 is a fourth schematic flowchart of the bill information identification method provided by an embodiment of the present invention. The method includes the following steps.

Step S901: receive an information identification request for a bill image to be recognized.

Step S902: based on the information identification request, preprocess the bill image to be recognized to obtain the preprocessed bill image to be recognized.

Step S903: use the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image.

Steps S901 to S903 are the same as steps S301 to S303 above.

Step S904: store the bill image to be recognized and the bill information of the bill image to be recognized in the storage device of the bill information identification system.

In this step, the OCR device may send the bill image to be recognized and its bill information to the storage device in the bill information identification system. After receiving them, the storage device may perform a hash calculation on the bill image to be recognized to obtain the hash value of the image, and store the image. The storage device may then store the hash value, the absolute address and the bill information corresponding to the bill image to be recognized.

In an optional embodiment, the storage device may use a preset hash algorithm to perform the hash calculation on the bill image to be recognized and obtain its hash value. The preset hash algorithm includes, but is not limited to, the message-digest algorithm (Message-Digest Algorithm, MD5) and the secure hash algorithm (Secure Hash Algorithm, SHA).

In an optional embodiment, when storing the hash value, absolute address and bill information corresponding to the bill image to be recognized, the storage device may establish a correspondence among the three and store that correspondence in a preset correspondence list in the storage device. Table 1 shows one way of storing the correspondence provided by an embodiment of the present invention.

Table 1

Hash value      Absolute address    Bill information
Hash value 1    Address 1           Bill information 1
Hash value 2    Address 2           Bill information 3

The storage device may store the hash value, absolute address and bill information corresponding to the bill image to be recognized in the manner shown in Table 1. In this embodiment of the present invention, the manner of storing the hash value, absolute address and bill information corresponding to the bill image to be recognized is not specifically limited.
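
The correspondence list can be sketched as a mapping keyed by the image hash. In this minimal example an in-memory dictionary stands in for the storage device, SHA-256 is chosen as one member of the SHA family named above, and the function name `store_bill` and the address format are hypothetical.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for the preset correspondence list:
# hash value -> (absolute address, bill information)
CorrespondenceList = Dict[str, Tuple[str, Dict[str, str]]]


def store_bill(table: CorrespondenceList, image: bytes, address: str,
               bill_info: Dict[str, str]) -> str:
    """Hash the image bytes and record hash value, absolute address and bill information."""
    digest = hashlib.sha256(image).hexdigest()
    table[digest] = (address, bill_info)
    return digest
```

Identical image bytes always produce the same hash value, so a later query can recompute the hash of an image and look up its address and bill information directly.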

In this embodiment of the present invention, persistently storing the bill information once it has been determined for the bill image to be recognized facilitates later operations such as querying and compiling statistics on bill information.

In an optional embodiment, after receiving the bill image to be recognized and its bill information from the OCR device, the storage device may also store them directly. The storage of the bill information of the bill image to be recognized is not specifically limited here.

Step S905: return a first response message for the information identification request, the first response message including the bill information of the bill image to be recognized.

Step S905 is the same as step S304 above.

In this embodiment of the present invention, the order in which step S904 and step S905 are executed is not specifically limited.

For ease of understanding, the bill information identification method provided by the embodiments of the present invention is described below with reference to FIG. 10. FIG. 10 is a signaling diagram of the bill information identification process provided by an embodiment of the present invention. The process may specifically include the following steps.

Step S1001: the client sends the OCR device an information identification request that includes the bill image to be recognized.

Step S1002: the OCR device receives the information identification request.

Step S1003: the OCR device authenticates the information identification request to obtain an authentication result.

Step S1004: when the authentication result indicates that the information identification request is legitimate, the OCR device determines whether its computing resources are currently occupied. If so, step S1005 is executed. If not, step S1006 is executed.

When the authentication result indicates that the information identification request is illegitimate, the OCR device may decline to respond to the request, for example by discarding it.

Step S1005: the OCR device returns a second response message for the information identification request.

The second response message includes an identification code indicating that the computing resources are occupied.

In this embodiment of the present invention, steps S1002 to S1005 may be executed by the request proxy middleware in the OCR device.

Step S1006: based on the information identification request, the OCR device preprocesses the bill image to be recognized to obtain the preprocessed bill image to be recognized.

Step S1007: the OCR device uses the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized.

Step S1008: the OCR device sends the bill image to be recognized and its bill information to the storage device.

Step S1009: the storage device stores the received bill image to be recognized and its bill information in correspondence with each other.

Step S1010: the OCR device returns a first response message for the information identification request to the client.

The first response message includes the bill information of the bill image to be recognized.

In this embodiment of the present invention, the order in which step S1008 and step S1010 are executed is not specifically limited.

Based on the same inventive concept, and corresponding to the bill information identification method provided by the embodiments of the present invention described above, an embodiment of the present invention further provides a bill information identification device. FIG. 11 is a schematic structural diagram of the bill information identification device provided by an embodiment of the present invention. The device is applied to the OCR device in a bill information identification system and specifically includes the following modules.

A receiving module 1101, configured to receive an information identification request for a bill image to be recognized;

A preprocessing module 1102, configured to preprocess the bill image to be recognized based on the information identification request to obtain the preprocessed bill image to be recognized;

A first identification module 1103, configured to use the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized and obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image;

A first return module 1104, configured to return a first response message for the information identification request, the first response message including the bill information of the bill image to be recognized.

Optionally, the first identification module 1103 may specifically be configured to: based on the pre-trained bill information recognition model, determine the region in which each character of the preprocessed bill image to be recognized is located as a character region, and recognize the characters in the character regions to obtain a character string set; and perform regular-expression matching on the character string set to obtain the bill information of the bill image to be recognized.

Optionally, the preprocessing module 1102 may specifically be configured to: perform a gamma transformation on the bill image to be recognized in the information identification request using a preset gamma coefficient to obtain a first image;

convert the first image into a grayscale image to obtain a second image;

binarize the second image based on the average of the pixel values of the pixels in the second image to obtain a third image;

determine the outer border position of the third image according to the pixel values of the pixels in each horizontal row of the third image, the pixel values of the pixels in each vertical column, and a threshold on the number of pixels having a preset pixel value, and crop the image within the outer border position to obtain a fourth image;

perform tilt correction on the fourth image to obtain the preprocessed bill image to be recognized.
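
The first four preprocessing operations can be sketched in plain Python on an image held as nested lists of RGB tuples. This is an illustration under stated assumptions rather than the disclosed implementation: the gamma formula, the BT.601 grayscale weights, the choice of white for pixels above the mean, and the rule that a row or column belongs to the bill when it holds at least `min_dark` black pixels are all assumptions, and tilt correction is omitted.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def gamma_transform(image: List[List[RGB]], gamma: float) -> List[List[RGB]]:
    """First image: apply out = 255 * (in / 255) ** gamma to every channel."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[(lut[r], lut[g], lut[b]) for r, g, b in row] for row in image]


def to_gray(image: List[List[RGB]]) -> List[List[int]]:
    """Second image: weighted sum of channels (assumed BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in image]


def binarize(gray: List[List[int]]) -> List[List[int]]:
    """Third image: threshold at the mean; above the mean becomes 255, otherwise 0."""
    pixels = [v for row in gray for v in row]
    mean = sum(pixels) / len(pixels)
    return [[255 if v > mean else 0 for v in row] for row in gray]


def crop_white_border(binary: List[List[int]], min_dark: int = 1) -> List[List[int]]:
    """Fourth image: crop to the outermost rows and columns with >= min_dark black pixels."""
    rows = [i for i, row in enumerate(binary) if row.count(0) >= min_dark]
    cols = [j for j in range(len(binary[0]))
            if sum(row[j] == 0 for row in binary) >= min_dark]
    if not rows or not cols:
        return binary  # no row or column qualifies: leave the image unchanged
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]
```

Raising `min_dark` above 1 makes the crop ignore rows and columns that contain only a few stray dark pixels, which is one way isolated noise near the border could be kept from widening the outer border.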

Optionally, the bill information identification device may further include:

An authentication module, configured to authenticate the information identification request to obtain an authentication result;

The preprocessing module 1102 may specifically be configured to, when the authentication result of the authentication module indicates that the information identification request is legitimate, preprocess the bill image to be recognized based on the information identification request to obtain the preprocessed bill image to be recognized.

A second return module, configured to, after the information identification request for the bill image to be recognized is received, return a second response message for the information identification request if the current computing resources are occupied, the second response message including an identification code indicating that the computing resources are occupied.

A sending module, configured to store the bill image to be recognized and the bill information of the bill image to be recognized in the storage device of the bill information identification system before the first response message for the information identification request is returned.

an acquisition module, configured to acquire the preset training set;

a second recognition module, configured to, for each sample bill image, recognize the bill information in the sample bill image using a preset deep learning network to obtain predicted bill information of the sample bill image;

a calculation module, configured to calculate a loss value of the preset deep learning network based on the sample bill information and the predicted bill information corresponding to each sample bill image;

an adjustment module, configured to, when the loss value is greater than a preset loss value threshold, adjust the parameters of the preset deep learning network and return to the step of, for each sample bill image, recognizing the bill information in the sample bill image using the preset deep learning network to obtain the predicted bill information of the sample bill image;

a determining module, configured to determine the current preset deep learning network as the trained bill information recognition model when the loss value is not greater than the preset loss value threshold.
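The control flow shared by the second recognition, calculation, adjustment, and determining modules can be sketched as a loop that stops once the loss is no greater than the threshold. The sketch below is only a stand-in: a one-parameter linear model trained by gradient descent replaces the preset deep learning network, and the names `train_until_threshold`, `lr`, and `max_rounds` (a safety cap that the text does not mention) are assumptions.

```python
def train_until_threshold(samples, loss_threshold=1e-4, lr=0.05, max_rounds=10_000):
    """samples: list of (feature, label) pairs standing in for
    (sample bill image, sample bill information).

    Returns the trained parameter and the final loss value.
    """
    w = 0.0  # single parameter of the stand-in "network"
    for _ in range(max_rounds):
        # Second recognition module: predict for every sample.
        predictions = [w * x for x, _ in samples]

        # Calculation module: mean squared error between labels and predictions.
        loss = sum((p - y) ** 2 for p, (_, y) in zip(predictions, samples)) / len(samples)

        # Determining module: accept the current network once the loss
        # is not greater than the preset threshold.
        if loss <= loss_threshold:
            return w, loss

        # Adjustment module: update the parameter, then predict again.
        grad = sum(2 * (p - y) * x for p, (x, y) in zip(predictions, samples)) / len(samples)
        w -= lr * grad

    raise RuntimeError("loss did not fall to the threshold within max_rounds")
```

For example, `train_until_threshold([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])` drives `w` toward 2.0 and returns as soon as the loss reaches the threshold.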

With the apparatus provided by the embodiment of the present invention, after receiving an information recognition request for a bill image to be recognized, the OCR device in the bill information recognition system uses the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized, thereby obtaining the bill information of the bill image to be recognized. Compared with the prior art, using the OCR device to recognize bill information removes the dependence of manual recognition on the amount of human resources. Moreover, because the bill information recognition model is trained on sample bill images of different bill types, the model can directly recognize the bill information in bill images of different bill types. This effectively reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The problems of existing bill information recognition methods are thereby effectively solved, the dependence on the amount of human resources is reduced, and the generality of the bill recognition method is improved.

Based on the same inventive concept, and in accordance with the bill information recognition method provided by the above embodiments of the present invention, an embodiment of the present invention further provides a bill information recognition system. FIG. 12 is a schematic structural diagram of the bill information recognition system provided by an embodiment of the present invention. The bill information recognition system includes an OCR device 1201 and a storage device 1202;

The OCR device 1201 is configured to: receive an information recognition request for a bill image to be recognized; preprocess the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized; recognize the bill information in the preprocessed bill image to be recognized using a pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image; and return a first response message for the information recognition request, the first response message including the bill information of the bill image to be recognized;

The storage device 1202 is configured to store the bill information of the bill image to be recognized.

With the system provided by the embodiment of the present invention, after receiving an information recognition request for a bill image to be recognized, the OCR device in the bill information recognition system uses the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized, thereby obtaining the bill information of the bill image to be recognized. Compared with the prior art, using the OCR device to recognize bill information removes the dependence of manual recognition on the amount of human resources. Moreover, because the bill information recognition model is trained on sample bill images of different bill types, the model can directly recognize the bill information in bill images of different bill types. This effectively reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The problems of existing bill information recognition methods are thereby effectively solved, the dependence on the amount of human resources is reduced, and the generality of the bill recognition method is improved.

Based on the same inventive concept, and in accordance with the bill information recognition method provided by the above embodiments of the present invention, an embodiment of the present invention further provides an electronic device, which is the OCR device described above. As shown in FIG. 13, the electronic device includes a processor 1301, a communication interface 1302, a memory 1303, and a communication bus 1304, where the processor 1301, the communication interface 1302, and the memory 1303 communicate with one another through the communication bus 1304;

The memory 1303 is configured to store a computer program;

The processor 1301 is configured to implement the following steps when executing the program stored in the memory 1303:

preprocessing the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized;

recognizing the bill information in the preprocessed bill image to be recognized using a pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized, where the bill information recognition model is trained on a preset training set, and the preset training set includes a plurality of sample bill images of different bill types and the sample bill information of each sample bill image;

returning a first response message for the information recognition request, the first response message including the bill information of the bill image to be recognized.

With the electronic device provided by the embodiment of the present invention, after receiving an information recognition request for a bill image to be recognized, the OCR device in the bill information recognition system uses the pre-trained bill information recognition model to recognize the bill information in the preprocessed bill image to be recognized, thereby obtaining the bill information of the bill image to be recognized. Compared with the prior art, using the OCR device to recognize bill information removes the dependence of manual recognition on the amount of human resources. Moreover, because the bill information recognition model is trained on sample bill images of different bill types, the model can directly recognize the bill information in bill images of different bill types. This effectively reduces the dependence on preset candidate templates and character feature information and overcomes the limitations of character recognition in traditional OCR technology. The problems of existing bill information recognition methods are thereby effectively solved, the dependence on the amount of human resources is reduced, and the generality of the bill recognition method is improved.

The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, it is drawn as a single thick line in the figure, which does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the above electronic device and other devices.

The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Based on the same inventive concept, and in accordance with the bill information recognition method provided by the above embodiments of the present invention, an embodiment of the present invention further provides a computer-readable storage medium. A computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of any one of the above bill information recognition methods are implemented.

Based on the same inventive concept, and in accordance with the bill information recognition method provided by the above embodiments of the present invention, an embodiment of the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to execute any one of the bill information recognition methods of the above embodiments.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, because the apparatus, system, electronic device, computer-readable storage medium, and computer program product embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments.

The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims

1. A bill information recognition method, applied to an Optical Character Recognition (OCR) device in a bill information recognition system, the method comprising:

receiving an information recognition request for a bill image to be recognized;

preprocessing the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized;

recognizing bill information in the preprocessed bill image to be recognized by using a pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized; wherein the bill information recognition model is trained on a preset training set, and the preset training set comprises a plurality of sample bill images of different bill types and sample bill information of each sample bill image;

and returning a first response message for the information recognition request, wherein the first response message comprises the bill information of the bill image to be recognized.

2. The method according to claim 1, wherein the step of recognizing bill information in the preprocessed bill image to be recognized by using the pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized comprises:

determining, based on the pre-trained bill information recognition model, the area where each character is located in the preprocessed bill image to be recognized as a character area, and recognizing the characters in the character area to obtain a character string set;

and performing regular expression matching on the character string set to obtain the bill information of the bill image to be recognized.

3. The method according to claim 1, wherein the step of preprocessing the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized comprises:

performing gamma transformation on the bill image to be recognized in the information recognition request by using a preset gamma coefficient to obtain a first image;

converting the first image into a grayscale image to obtain a second image;

performing image binarization on the second image based on the average of the pixel values of all pixels in the second image to obtain a third image;

determining the position of the outer border of the third image according to the pixel values of the pixels in each row in the horizontal direction of the third image, the pixel values of the pixels in each column in the vertical direction, and a threshold on the number of pixels having a preset pixel value, and cropping the image within the outer border to obtain a fourth image;

and performing tilt correction on the fourth image to obtain the preprocessed bill image to be recognized.

4. The method according to claim 1, further comprising, before preprocessing the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized:

authenticating the information recognition request to obtain an authentication result;

wherein the step of preprocessing the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized comprises:

when the authentication result indicates that the information recognition request is a legitimate request, preprocessing the bill image to be recognized based on the information recognition request to obtain the preprocessed bill image to be recognized.

5. The method according to any one of claims 1 to 4, further comprising, after receiving the information recognition request for the bill image to be recognized:

returning a second response message for the information recognition request if computing resources are currently occupied, wherein the second response message comprises an identification code indicating that the computing resources are occupied.

6. The method according to claim 1, further comprising, before returning the first response message for the information recognition request:

storing the bill image to be recognized and the bill information of the bill image to be recognized in a storage device in the bill information recognition system.

7. The method according to claim 1, wherein the bill information recognition model is trained by the following steps:

acquiring the preset training set;

for each sample bill image, recognizing bill information in the sample bill image by using a preset deep learning network to obtain predicted bill information of the sample bill image;

calculating a loss value of the preset deep learning network based on the sample bill information and the predicted bill information corresponding to each sample bill image;

when the loss value is greater than a preset loss value threshold, adjusting parameters of the preset deep learning network, and returning to the step of, for each sample bill image, recognizing bill information in the sample bill image by using the preset deep learning network to obtain predicted bill information of the sample bill image;

and when the loss value is not greater than the preset loss value threshold, determining the current preset deep learning network as the trained bill information recognition model.

8. A bill information recognition apparatus, applied to an Optical Character Recognition (OCR) device in a bill information recognition system, comprising:

a receiving module, configured to receive an information recognition request for a bill image to be recognized;

a preprocessing module, configured to preprocess the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized;

a first recognition module, configured to recognize bill information in the preprocessed bill image to be recognized by using a pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized; wherein the bill information recognition model is trained on a preset training set, and the preset training set comprises a plurality of sample bill images of different bill types and sample bill information of each sample bill image;

and a first returning module, configured to return a first response message for the information recognition request, wherein the first response message comprises the bill information of the bill image to be recognized.

9. The apparatus according to claim 8, wherein the first recognition module is specifically configured to determine, based on the pre-trained bill information recognition model, the area where each character is located in the preprocessed bill image to be recognized as a character area, and recognize the characters in the character area to obtain a character string set; and to perform regular expression matching on the character string set to obtain the bill information of the bill image to be recognized.

10. A bill information recognition system, comprising an Optical Character Recognition (OCR) device and a storage device;

the OCR device is configured to: receive an information recognition request for a bill image to be recognized; preprocess the bill image to be recognized based on the information recognition request to obtain a preprocessed bill image to be recognized; recognize bill information in the preprocessed bill image to be recognized by using a pre-trained bill information recognition model to obtain the bill information of the bill image to be recognized, wherein the bill information recognition model is trained on a preset training set, and the preset training set comprises a plurality of sample bill images of different bill types and sample bill information of each sample bill image; and return a first response message for the information recognition request, wherein the first response message comprises the bill information of the bill image to be recognized;

and the storage device is configured to store the bill information of the bill image to be recognized.