CN109657683A - Text region modeling method and device, character recognition method and electronic equipment

Info

Publication number: CN109657683A
Application number: CN201811559356.3A
Authority: CN
Inventors: 王彦皓
Original assignee: Beijing Pixel Software Technology Co Ltd
Current assignee: Beijing Pixel Software Technology Co Ltd
Priority date: 2018-12-19
Filing date: 2018-12-19
Publication date: 2019-04-19

Abstract

An embodiment of the invention provides a text recognition modeling method and device, a character recognition method, and electronic equipment. The text recognition modeling method performs picture processing on each of a plurality of prestored letter symbols, generates a letter symbol picture corresponding to each letter symbol, and stores the pictures in a picture file. The storage path of each letter symbol picture in the picture file is extracted and a label is set, and a database file is generated and stored. A model definition file is set to generate a model description file, and the storage path of the database file is written into the model definition file. A parameter definition file is set, and the storage path of the model definition file is written into the parameter definition file. The parameter definition file is then input to the model description file to generate an initial model, the initial model is trained repeatedly to obtain a caffemodel file, and the caffemodel file is input to a configured testing tool to obtain a text recognition model. A text recognition model with high recognition efficiency is thereby obtained.

Description

Text recognition modeling method and device, character recognition method, and electronic equipment

Technical field

The present invention relates to the technical field of character recognition, and in particular to a text recognition modeling method and device, a character recognition method, and electronic equipment.

Background technique

With the rapid development of the Internet, text recognition is being applied in more and more fields and brings great convenience to people's production and daily life. However, the recognition accuracy of current text recognition still needs to be improved.

Summary of the invention

In view of this, the present invention provides a text recognition modeling method and device, a character recognition method, and electronic equipment.

In a first aspect, an embodiment of the invention provides a text recognition modeling method, which includes:

Performing picture processing on each of a plurality of prestored letter symbols, generating a letter symbol picture corresponding to each letter symbol, and storing the letter symbol pictures in a picture file.

Extracting the storage path of each letter symbol picture in the picture file and setting a label, so as to generate and store a database file.

Setting a model definition file, so as to generate a model description file.

Writing the storage path of the database file into the model definition file, and storing the model definition file into which the storage path of the database file has been written.

Setting a parameter definition file, and writing the storage path of the model definition file into the parameter definition file.

Inputting the parameter definition file to the model description file to generate an initial model.

Training the initial model repeatedly to obtain a caffemodel file.

Inputting the caffemodel file to a configured testing tool to obtain a text recognition model.

Optionally, in this embodiment, the step of performing picture processing on each of the prestored letter symbols, generating a letter symbol picture corresponding to each letter symbol, and storing the letter symbol pictures in a picture file includes:

Generating, according to the shape of each prestored letter symbol, a coordinate file corresponding to that letter symbol.

Generating the letter symbol picture corresponding to each letter symbol according to each coordinate file.

Storing the letter symbol pictures in the picture file.

Optionally, in this embodiment, the step of extracting the storage path of each letter symbol picture in the picture file and setting a label includes:

Extracting the storage path of each letter symbol picture in the picture file.

Setting the same numeric label for all letter symbol pictures corresponding to the same letter symbol.

Letter symbol pictures of different letter symbols correspond to different numeric labels.

Optionally, in this embodiment, the step of setting a model definition file so as to generate a model description file includes:

Defining the network layers in the model definition file to obtain the model description file.

Optionally, in this embodiment, the model description file includes a plurality of convolutional layers, a plurality of downsampling layers, a plurality of fully connected layers, and an activation layer.

Optionally, in this embodiment, the step of setting a parameter definition file includes:

Setting the number of iterations, the output interval, the test interval, the weights, and the training mode in the parameter definition file.

Optionally, in this embodiment, after the text recognition model is obtained, the text recognition model is used to recognize text.

Further, the step of recognizing text with the text recognition model includes:

Inputting a letter symbol picture to be tested into the text recognition model.

The text recognition model recognizes the letter symbol picture to be tested, and obtains and outputs a recognition result.

In a second aspect, an embodiment of the invention further provides a text recognition modeling device, which includes:

A first operation module, configured to perform picture processing on each of a plurality of prestored letter symbols, generate a letter symbol picture corresponding to each letter symbol, and store the letter symbol pictures in a picture file.

A second operation module, configured to extract the storage path of each letter symbol picture in the picture file and set a label, so as to generate and store a database file.

A first setup module, configured to set a model definition file so as to generate a model description file, write the storage path of the database file into the model definition file, and store the model definition file into which the storage path of the database file has been written.

A second setup module, configured to set a parameter definition file and write the storage path of the model definition file into the parameter definition file.

A generation module, configured to input the parameter definition file to the model description file to generate an initial model.

A first obtaining module, configured to train the initial model repeatedly to obtain a caffemodel file.

A second obtaining module, configured to input the caffemodel file to a configured testing tool to obtain a text recognition model.

In a third aspect, an embodiment of the invention further provides a character recognition method applied to electronic equipment. The character recognition method performs text recognition using the text recognition model built as described above, and includes:

Inputting a letter symbol picture to be tested into the text recognition model.

In a fourth aspect, an embodiment of the invention further provides electronic equipment, which includes:

A memory;

A processor; and

A character recognition device, in which the text recognition model built as described above is prestored. The character recognition device is stored in the memory and includes software function modules executed by the processor. The character recognition device includes:

An input module, configured to input a letter symbol picture to be tested into the text recognition model.

An identification module, configured to have the text recognition model recognize the letter symbol picture to be tested, and to obtain and output a recognition result.

In a fifth aspect, an embodiment of the invention further provides a readable storage medium in which a computer program is stored. When executed, the computer program implements the character recognition method described above.

In the text recognition modeling method and device, character recognition method, and electronic equipment provided by the embodiments of the present invention, the text recognition modeling method performs picture processing on each of a plurality of prestored letter symbols, generates a letter symbol picture corresponding to each letter symbol, and stores the letter symbol pictures in a picture file. A database file is generated from the picture file and stored. A model definition file is set to generate a model description file, the storage path of the database file is written into the model definition file, and the model definition file is stored. A parameter definition file is set, and the storage path of the model definition file is written into the parameter definition file. The parameter definition file is then input to the model description file to generate an initial model, the initial model is trained repeatedly to obtain a caffemodel file, and the caffemodel file is input to a configured testing tool to obtain a text recognition model. A text recognition model with a short training time and high recognition accuracy is thereby obtained.

Detailed description of the invention

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only certain embodiments of the present invention and therefore should not be regarded as limiting its scope. A person of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.

Fig. 1 is a flow diagram of a text recognition modeling method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of some of the letter symbol pictures provided by an embodiment of the present invention;

Fig. 3 is another flow diagram of the text recognition modeling method provided by an embodiment of the present invention;

Fig. 4 is a block diagram of a text recognition modeling device provided by an embodiment of the present invention;

Fig. 5 is a flow diagram of a character recognition method provided by an embodiment of the present invention;

Fig. 6 is a block diagram of the electronic equipment, provided by an embodiment of the present invention, for implementing the above character recognition method.

Reference numerals: 100 - text recognition modeling device; 110 - first operation module; 120 - second operation module; 130 - first setup module; 140 - second setup module; 150 - generation module; 160 - first obtaining module; 170 - second obtaining module; 200 - electronic equipment; 210 - memory; 220 - processor; 300 - character recognition device; 310 - input module; 320 - identification module.

Specific embodiment

Research has found that text recognition is being applied in more and more fields and brings great convenience to people's production and daily life. At present, some text recognition methods use a Modified Quadratic Discriminant Function (MQDF) classifier to classify the letter symbols to be recognized and thereby obtain a recognition result. However, such recognition methods take a long time to train on the data, and their recognition accuracy is not high.

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. The components of the embodiments of the present invention that are generally described and illustrated in the drawings herein can be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

It should also be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.

Please refer to Fig. 1, which is a flow diagram of a text recognition modeling method provided by an embodiment of the present invention. The text recognition model disclosed by the present invention is trained, tested, fine-tuned, and deployed using the convolutional neural network framework Caffe (Convolutional Architecture for Fast Feature Embedding). It should be noted that the text recognition modeling method provided by the embodiment of the present invention is not limited to Fig. 1 or to the specific order described below. The text recognition modeling method can be implemented through the following steps:

S10, perform picture processing on each of a plurality of prestored letter symbols, generate a letter symbol picture corresponding to each letter symbol, and store the letter symbol pictures in a picture file.

Please refer to Fig. 2, which is a schematic diagram of some of the letter symbol pictures provided by an embodiment of the present invention. In this embodiment, the prestored letter symbols are divided into a training set and a test set. The prestored letter symbols are multiple classes of letter symbols collected from people of different occupations, genders, and ages, and include special symbols, punctuation marks, digits, text, letters, and Chinese character stroke coordinates. It should be noted that the letter symbol pictures in Fig. 2 are only some of the letter symbol pictures used in the embodiment of the present invention, not all of them.

S20, extract the storage path of each letter symbol picture in the picture file and set a label, so as to generate and store a database file.

In detail, this embodiment uses the conversion tool provided by Caffe to convert the storage path of each letter symbol picture in the picture file, together with its label, into a database file. The database file can be, but is not limited to, a Lightning Memory-Mapped Database (LMDB) file, which has a simple structure and whose data can be accessed through the file path.
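As a rough illustration of the input to step S20, the sketch below builds the kind of plain-text listing (one "path label" pair per line) that Caffe's image conversion tooling typically consumes before producing an LMDB. The function name `listing_lines`, the folder-per-symbol layout, and the sample file names are assumptions made for this example and do not come from the patent.

```python
def listing_lines(picture_paths, label_of_symbol):
    """Return 'path label' lines, one per letter symbol picture.

    Assumes (hypothetically) that the first path component names the symbol.
    """
    lines = []
    for path in sorted(picture_paths):
        symbol = path.split("/")[0]
        lines.append(f"{path} {label_of_symbol[symbol]}")
    return lines


# Hypothetical picture paths and numeric labels.
pictures = ["sym_b/0002.png", "sym_a/0001.png", "sym_a/0002.png"]
labels = {"sym_a": 0, "sym_b": 1}
listing = listing_lines(pictures, labels)
print("\n".join(listing))
```

The resulting text could be saved to a file and handed to the conversion tool, which would then write the database file whose storage path is recorded in step S40.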

S30, set a model definition file, so as to generate a model description file.

Optionally, in this embodiment, the model definition file can be a lenet_train_test.prototxt file. In detail, the batch size (batch_size) for model training is set in the lenet_train_test.prototxt file, thereby defining the number of letter symbol pictures used in each training pass.

Further, after the model description file is generated, the model description file is written into the model definition file.

S40, write the storage path of the database file into the model definition file, and store the model definition file into which the storage path of the database file has been written.

S50, set a parameter definition file, and write the storage path of the model definition file into the parameter definition file.

In detail, the parameter definition file can be a lenet_solver.prototxt file, in which the training parameters can be set.
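To make the role of the parameter definition file more concrete, the following sketch renders a few solver settings as prototxt-style text. The field names `net`, `max_iter`, `display`, `test_interval`, `weight_decay`, and `solver_mode` are standard Caffe solver fields, but mapping them onto the patent's iteration count, output interval, test interval, weight, and training mode is an assumption. The defaults 1000, 100, and 500 mirror the training schedule described for step S70, the `weight_decay` value is purely illustrative, and a usable solver file would need further fields (for example a learning rate) that are omitted here.

```python
def render_solver(net_path, max_iter=1000, display=100, test_interval=500,
                  weight_decay=0.0005, solver_mode="GPU"):
    """Render a partial, illustrative solver configuration as text."""
    fields = [
        ("net", f'"{net_path}"'),          # storage path of the model definition file
        ("max_iter", max_iter),            # number of training iterations
        ("display", display),              # output interval for the loss
        ("test_interval", test_interval),  # how often the test set is evaluated
        ("weight_decay", weight_decay),    # assumed value, not from the patent
        ("solver_mode", solver_mode),      # training mode: CPU or GPU
    ]
    return "\n".join(f"{key}: {value}" for key, value in fields)


solver_text = render_solver("lenet_train_test.prototxt")
print(solver_text)
```

Writing the model definition file's path into the `net` field corresponds to step S50, where the storage path of the model definition file is written into the parameter definition file.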

S60, input the parameter definition file to the model description file to generate an initial model.

S70, train the initial model repeatedly to obtain a caffemodel file.

Still optionally, the step of repeatedly training the initial model to obtain a caffemodel file can be performed by training the initial model 1000 times using the letter symbol pictures in the training set and outputting the loss function value once every 100 training iterations. The neural network adjusts the model by feedback according to the loss function so that the loss is minimized. In addition, a test is run once every 500 training iterations using the data of the test set. The test result shows the accuracy of text recognition at the current test count. The caffemodel file is obtained through the above training.
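The schedule described above (1000 training iterations, a loss report every 100, a test every 500) can be sketched as a simple event list. This only enumerates when reports and tests would occur under those numbers; it does not train anything, and the function name `training_schedule` and the event labels are made up for illustration.

```python
def training_schedule(max_iter=1000, display=100, test_interval=500):
    """List (iteration, event) pairs for loss reports and test runs."""
    events = []
    for iteration in range(1, max_iter + 1):
        if iteration % display == 0:
            events.append((iteration, "report loss"))
        if iteration % test_interval == 0:
            events.append((iteration, "run test set"))
    return events


events = training_schedule()
loss_reports = [it for it, kind in events if kind == "report loss"]
test_runs = [it for it, kind in events if kind == "run test set"]
print(len(loss_reports), "loss reports; tests at", test_runs)
```

Under these settings the loss is reported ten times and the test set is evaluated twice, at iterations 500 and 1000.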

S80, input the caffemodel file to a configured testing tool to obtain a text recognition model.

A text recognition model with high recognition accuracy and a short training time can thereby be obtained.

Please refer to Fig. 3. In the embodiment of the present invention, performing picture processing on each of the prestored letter symbols, generating a letter symbol picture corresponding to each letter symbol, and storing the letter symbol pictures in a picture file can be implemented as follows:

S11, generate, according to the shape of each prestored letter symbol, a coordinate file corresponding to that letter symbol.

S12, generate the letter symbol picture corresponding to each letter symbol according to each coordinate file.

S13, store the letter symbol pictures in the picture file.
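Steps S11 to S13 can be pictured with the sketch below, which draws stroke coordinates onto a small grayscale grid and serializes it as a plain-text PGM image. The patent does not specify the coordinate file format, the picture size, or the image format, so the stroke representation (lists of (x, y) points), the 28-pixel grid, and the helper names `rasterize` and `to_pgm` are all assumptions for illustration.

```python
def rasterize(strokes, size=28):
    """Draw polyline strokes onto a size x size grid (0 = background, 255 = ink)."""
    grid = [[0] * size for _ in range(size)]
    for stroke in strokes:
        # Connect each pair of consecutive points with a straight segment.
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for i in range(steps + 1):
                x = round(x0 + (x1 - x0) * i / steps)
                y = round(y0 + (y1 - y0) * i / steps)
                if 0 <= x < size and 0 <= y < size:
                    grid[y][x] = 255
    return grid


def to_pgm(grid):
    """Serialize the grid as a plain-text PGM (P2) image."""
    rows = [" ".join(str(value) for value in row) for row in grid]
    return f"P2\n{len(grid[0])} {len(grid)}\n255\n" + "\n".join(rows) + "\n"


# Hypothetical coordinate data: a horizontal and a vertical stroke forming a cross.
cross = [[(4, 14), (23, 14)], [(14, 4), (14, 23)]]
picture = rasterize(cross)
pgm_text = to_pgm(picture)
```

The returned text could be written to a file under the picture folder, giving one picture per coordinate file.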

In the embodiment of the present invention, extracting the storage path of each letter symbol picture in the picture file and setting a label can be implemented as follows:

Extract the storage path of each letter symbol picture in the picture file.

In detail, for example, the numeric label of the character "angstrom" (埃) is set to 173, so that every picture of that character is represented by this numeric label, and the multiple prestored pictures of the character are distinguished by appending a numeric suffix.
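The labeling rule above, one shared numeric label per symbol with a suffix number separating its individual pictures, might be sketched as follows. The file naming pattern `<symbol>_<suffix>.png`, the function name `assign_labels`, and the labels starting from 0 are assumptions; the patent only gives the single example of label 173.

```python
def assign_labels(picture_names):
    """Give every picture of the same symbol the same numeric label.

    Assumes (hypothetically) names of the form '<symbol>_<suffix>.png'.
    """
    label_of_symbol = {}
    labelled = []
    for name in picture_names:
        symbol = name.rsplit("_", 1)[0]
        # A new symbol receives the next unused label; a known symbol reuses its label.
        label = label_of_symbol.setdefault(symbol, len(label_of_symbol))
        labelled.append((name, label))
    return labelled, label_of_symbol


names = ["ai_1.png", "ai_2.png", "bo_1.png"]
labelled, label_of_symbol = assign_labels(names)
print(labelled)
```

Different symbols therefore never share a label, while the suffix keeps the pictures of one symbol apart on disk.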

Further, in the embodiment of the present invention, the step of setting a model definition file so as to generate a model description file includes:

Defining the network layers in the model definition file to obtain the model description file.

Optionally, in the embodiment of the present invention, the model description file includes a plurality of convolutional layers, a plurality of downsampling layers, a plurality of fully connected layers, and an activation layer.

Optionally, the model description file includes a first convolutional layer, a first downsampling layer, a second convolutional layer, a second downsampling layer, a first fully connected layer, an activation function, and a second fully connected layer. In detail, the convolution kernel of the first convolutional layer can be, but is not limited to, 5*5, and its stride can be, but is not limited to, 1; the first convolutional layer is used to extract image features. The kernel of the first downsampling layer can be, but is not limited to, 2*2, and its stride can be, but is not limited to, 2; the first downsampling layer is used to reduce the size of the parameter matrix and thus the number of parameters. The convolution kernel of the second convolutional layer can be, but is not limited to, 2*2, and its stride can be, but is not limited to, 2; the second convolutional layer is used to further extract image features. The kernel of the second downsampling layer can be, but is not limited to, 2*2, and its stride can be, but is not limited to, 2; the second downsampling layer is used to further reduce the size of the parameter matrix and thus further reduce the number of parameters. The first fully connected layer is used to integrate the highly abstracted features produced by the repeated convolutions and obtain highly refined features. The activation function is used to add non-linearity to the model, and can be, but is not limited to, the ReLU function. The second fully connected layer is used to further integrate the features, so as to obtain features with a higher degree of refinement.
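The effect of the kernel sizes and strides above on the feature map can be checked with the usual output-size formula, out = (in - kernel) / stride + 1, with no padding. The sketch assumes that the 5*5, stride-1 kernel belongs to the first convolutional layer, and it uses a 28-pixel input, which is an illustrative value not given in the patent. The layer names in `LAYERS` are shorthand introduced here.

```python
def out_size(size, kernel, stride):
    """Spatial output size of a convolution or pooling layer without padding."""
    return (size - kernel) // stride + 1


# (name, kernel, stride) for the four spatial layers described in the text.
LAYERS = [("conv1", 5, 1), ("pool1", 2, 2), ("conv2", 2, 2), ("pool2", 2, 2)]


def trace_sizes(input_size, layers=LAYERS):
    """Return the feature-map side length after each layer."""
    sizes = []
    size = input_size
    for name, kernel, stride in layers:
        size = out_size(size, kernel, stride)
        sizes.append((name, size))
    return sizes


sizes = trace_sizes(28)  # 28 is an assumed input side length
print(sizes)
```

With these assumptions the side length shrinks from 28 to 24, 12, 6, and finally 3 before the fully connected layers, which shows how the downsampling layers cut the number of values the later layers must handle.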

Optionally, in the embodiment of the present invention, the step of setting a parameter definition file includes:

Setting the parameters for the number of iterations, the output interval, the test interval, the weights, and the training mode in the parameter definition file.

Please refer to Fig. 4, which is a block diagram of a text recognition modeling device 100 provided by an embodiment of the present invention. The text recognition modeling device 100 includes:

A first operation module 110, configured to perform picture processing on each of a plurality of prestored letter symbols, generate a letter symbol picture corresponding to each letter symbol, and store the letter symbol pictures in a picture file.

A second operation module 120, configured to extract the storage path of each letter symbol picture in the picture file and set a label, so as to generate and store a database file.

A first setup module 130, configured to set a model definition file so as to generate a model description file, write the storage path of the database file into the model definition file, and store the model definition file into which the storage path of the database file has been written.

A second setup module 140, configured to set a parameter definition file and write the storage path of the model definition file into the parameter definition file.

A generation module 150, configured to input the parameter definition file to the model description file to generate an initial model.

A first obtaining module 160, configured to train the initial model repeatedly to obtain a caffemodel file.

A second obtaining module 170, configured to input the caffemodel file to a configured testing tool to obtain a text recognition model.

Please refer to Fig. 5, which is a flow diagram of a character recognition method provided by an embodiment of the present invention. The method is applied to electronic equipment 200 and performs text recognition using the text recognition model built as described above. The character recognition method includes:

S91, input a letter symbol picture to be tested into the text recognition model.

S92, the text recognition model recognizes the letter symbol picture to be tested, and obtains and outputs a recognition result.
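The last part of step S92, turning the model's raw class scores into a recognition result, might look like the sketch below. It assumes the model emits one score per numeric label and that a label-to-symbol table is available; the softmax normalization, the function name `recognize`, and the sample scores are illustrative choices and are not described in the patent. Running the trained network itself is outside the scope of this sketch.

```python
import math


def recognize(scores, symbol_of_label):
    """Return (symbol, probability) for the highest-scoring class."""
    peak = max(scores)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    probs = [value / total for value in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return symbol_of_label[best], probs[best]


# Hypothetical scores for three classes and a hypothetical label table.
symbol, confidence = recognize([0.1, 2.0, -1.0], {0: "A", 1: "B", 2: "C"})
print(symbol, round(confidence, 3))
```

The numeric label chosen here is the same kind of label assigned when the database file was built, so the table simply reverses that mapping.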

Please refer to Fig. 6, which is a block diagram of the electronic equipment 200, provided by an embodiment of the present invention, for implementing the above character recognition method. In this embodiment, the electronic equipment 200 may be, but is not limited to, a smart phone, a personal computer (PC), a laptop, a monitoring device, a server, or other computer equipment with video feature learning and processing capabilities.

The electronic equipment 200 further includes a character recognition device 300, a memory 210, and a processor 220. The character recognition device 300 prestores the text recognition model built as described above. In the embodiment of the present invention, the character recognition device 300 includes at least one software function module that can be stored in the memory 210 in the form of software or firmware, or solidified in the operating system (OS) of the electronic equipment 200. The processor 220 is used to execute the executable software modules stored in the memory 210, for example the software function modules and computer programs included in the character recognition device 300. In this embodiment, the character recognition device 300 can also be integrated into the operating system as a part of the operating system. Specifically, the character recognition device 300 includes:

An input module 310, configured to input a letter symbol picture to be tested into the text recognition model.

An identification module 320, configured to have the text recognition model recognize the letter symbol picture to be tested, and to obtain and output a recognition result.

Still optionally, an embodiment of the invention further provides a readable storage medium in which a computer program is stored. When executed, the computer program implements the character recognition method described above.

In conclusion, in the text recognition modeling method and device, character recognition method, and electronic equipment provided by the embodiments of the present invention, the method performs picture processing on each of a plurality of prestored letter symbols, generates a letter symbol picture corresponding to each letter symbol, and stores the letter symbol pictures in a picture file. A database file is generated from the picture file and stored. A model definition file is set to generate a model description file, the storage path of the database file is written into the model definition file, and the model definition file is stored. A parameter definition file is set, and the storage path of the model definition file is written into the parameter definition file. The parameter definition file is then input to the model description file to generate an initial model, the initial model is trained repeatedly to obtain a caffemodel file, and the caffemodel file is input to a configured testing tool to obtain a text recognition model. A text recognition model with a short training time and high recognition accuracy is thereby obtained.

In the embodiments provided by the present invention, it should be understood that the disclosed device and method can also be implemented in other ways. The device and method embodiments described above are only illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions, and operations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram can represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes can occur in an order different from that marked in the drawings. For example, two consecutive boxes can in fact be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

In addition, the functional modules in the embodiments of the present invention can be integrated together to form an independent part, each module can exist separately, or two or more modules can be integrated to form an independent part.

It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.

It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, from whichever point of view, the embodiments are to be regarded as illustrative and not restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes falling within the meaning and scope of equivalents of the claims are therefore intended to be included in the present invention.

Claims

1. A text recognition modeling method, characterized in that the text recognition modeling method includes:

performing picture processing on each of a plurality of prestored letter symbols, generating a letter symbol picture corresponding to each letter symbol, and storing the letter symbol pictures in a picture file;

extracting the storage path of each letter symbol picture in the picture file and setting a label, so as to generate and store a database file;

setting a model definition file, so as to generate a model description file;

writing the storage path of the database file into the model definition file, and storing the model definition file into which the storage path of the database file has been written;

setting a parameter definition file, and writing the storage path of the model definition file into the parameter definition file;

inputting the parameter definition file to the model description file to generate an initial model;

training the initial model repeatedly to obtain a caffemodel file;

2. The text recognition modeling method according to claim 1, characterized in that the step of performing picture processing on each of the prestored letter symbols, generating a letter symbol picture corresponding to each letter symbol, and storing the letter symbol pictures in a picture file includes:

generating, according to the shape of each prestored letter symbol, a coordinate file corresponding to that letter symbol;

generating the letter symbol picture corresponding to each letter symbol according to each coordinate file;

storing the letter symbol pictures in the picture file.

3. The text recognition modeling method according to claim 2, characterized in that the step of extracting the storage path of each letter symbol picture in the picture file and setting a label includes:

extracting the storage path of each letter symbol picture in the picture file;

setting the same numeric label for all letter symbol pictures corresponding to the same letter symbol;

4. The text recognition modeling method according to claim 1, characterized in that the step of setting a model definition file so as to generate a model description file includes:

5. The text recognition modeling method according to claim 4, characterized in that the model description file includes a plurality of convolutional layers, a plurality of downsampling layers, a plurality of fully connected layers, and an activation layer.

6. The text recognition modeling method according to claim 1, characterized in that the step of setting a parameter definition file includes:

setting the number of iterations, the output interval, the test interval, the weights, and the training mode in the parameter definition file.

7. A text recognition modeling device, characterized in that the text recognition modeling device includes:

a first operation module, configured to perform picture processing on each of a plurality of prestored letter symbols, generate a letter symbol picture corresponding to each letter symbol, and store the letter symbol pictures in a picture file;

a second operation module, configured to extract the storage path of each letter symbol picture in the picture file and set a label, so as to generate and store a database file;

a first setup module, configured to set a model definition file so as to generate a model description file, write the storage path of the database file into the model definition file, and store the model definition file into which the storage path of the database file has been written;

a second setup module, configured to set a parameter definition file and write the storage path of the model definition file into the parameter definition file;

a generation module, configured to input the parameter definition file to the model description file to generate an initial model;

a first obtaining module, configured to train the initial model repeatedly to obtain a caffemodel file;

a second obtaining module, configured to input the caffemodel file to a configured testing tool to obtain a text recognition model.

8. A character recognition method, characterized in that it is applied to electronic equipment, the character recognition method performs text recognition using the text recognition model built according to any one of claims 1-6, and the character recognition method includes:

inputting a letter symbol picture to be tested into the text recognition model;

9. Electronic equipment, characterized in that the electronic equipment includes:

a memory;

a processor; and

a character recognition device, in which the text recognition model built according to any one of claims 1-6 is prestored, the character recognition device being stored in the memory and including software function modules executed by the processor, the character recognition device including:

an input module, configured to input a letter symbol picture to be tested into the text recognition model;

an identification module, configured to have the text recognition model recognize the letter symbol picture to be tested, and to obtain and output a recognition result.

10. A readable storage medium, characterized in that a computer program is stored in the readable storage medium, and the computer program, when executed, implements the character recognition method according to claim 8.