CN119478964A - Logistics order invoice registration and identification method, device, equipment and storage medium - Google Patents

Info

Publication number
CN119478964A
Authority
CN
China
Prior art keywords
invoice
verification
character information
image
logistics order
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411674888.7A
Other languages
Chinese (zh)
Inventor
潘秒秒
冯晓明
龚鹏大
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Qianzhen Information Technology Co ltd
Original Assignee
Shanghai Qianzhen Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Qianzhen Information Technology Co ltd
Priority to CN202411674888.7A
Publication of CN119478964A

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/08Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087Inventory or stock management, e.g. order filling, procurement or balancing against orders
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/40Image enhancement or restoration using histogram techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1916Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Character Input (AREA)

Abstract

The present invention relates to the field of image recognition and processing technology, and in particular to a logistics order invoice registration and identification method, device, equipment and storage medium. The method comprises the following steps: obtaining an unregistered invoice picture and preprocessing it to obtain a preprocessed picture; constructing a character recognition model and using it to process the preprocessed picture so as to extract the character information it contains; and establishing a verification rule, verifying the character information according to the verification rule, marking the character information that fails verification, and associating the character information that passes verification with the logistics order. The method can complete invoice registration automatically and accurately identify problematic invoices, which not only improves registration efficiency and reduces manual registration errors, but also helps reduce financial risk and avoid problematic invoices.

Description

Logistics order invoice registration and identification method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image recognition and processing technology, and in particular to a method, a device, equipment and a storage medium for registering and identifying a logistics order invoice.
Background
The traditional invoice registration approach relies mainly on manual entry: staff key the information on each invoice into the system one item at a time. This approach has three shortcomings. First, it is inefficient; when a large number of invoices must be handled, manual entry consumes considerable time and labor. Second, it is error-prone; human factors lead to entry errors and missing information, so the recorded invoice information is inaccurate. Third, the timeliness of the data is hard to guarantee; manual entry lags behind, so invoice information cannot be registered in the system promptly, which affects the enterprise's financial management and decision-making. With the continuous growth of logistics business volume and the rapid development of information technology, the traditional invoice registration method can no longer meet the needs of modern logistics enterprises. On the one hand, logistics enterprises need a more efficient and accurate invoice registration method to improve working efficiency, reduce cost, and ensure the accuracy and integrity of invoice information. On the other hand, as tax administration becomes increasingly strict, logistics enterprises need a more standardized and systematic way of managing invoices to avoid tax risks.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method, a device, equipment and a storage medium for registering and identifying a logistics order invoice, which aim to solve the technical problems of low efficiency, high error rate and the like in the traditional invoice input mode in the prior art.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A first aspect of the invention provides a logistics order invoice registration and identification method, comprising: obtaining an unregistered invoice picture and preprocessing the unregistered invoice picture to obtain a preprocessed picture; constructing a character recognition model and processing the preprocessed picture with the character recognition model to extract the character information in the preprocessed picture; and establishing a verification rule, verifying the character information according to the verification rule, marking the character information that fails verification, and associating the character information that passes verification with a logistics order.
Optionally, in a first implementation manner of the first aspect of the present invention, the obtaining an unregistered invoice picture, preprocessing the unregistered invoice picture to obtain a preprocessed picture, specifically includes obtaining the unregistered invoice picture, denoising the unregistered invoice picture by using a filtering algorithm to obtain a first picture, processing the first picture by using a histogram equalization method to enhance a contrast ratio of the first picture to obtain a second picture, and processing the second picture by using a thresholding method to convert the second picture into a black-white binary image to obtain the preprocessed picture.
Optionally, in a second implementation manner of the first aspect of the present invention, constructing the character recognition model and processing the preprocessed picture with it to extract the character information specifically includes: obtaining invoice sample images and labeling the characters in them to obtain training data; taking a convolutional neural network model as the base model and training it with a deep learning framework and the training data to obtain a preliminary model; and verifying the accuracy of the preliminary model and iteratively optimizing it according to the verification result to obtain the character recognition model.
Optionally, in a third implementation manner of the first aspect of the present invention, establishing the verification rule, verifying the character information according to the verification rule, marking the character information that fails verification, and associating the character information that passes verification with the logistics order specifically includes: obtaining the format specification and business rules of invoices and establishing the verification rule from them; verifying the character information against the verification rule and a pre-built enterprise database to obtain a verification result; if the verification result is that the character information does not conform to the verification rule or cannot be matched with the enterprise database, marking it as abnormal; and if the verification result is that the character information conforms to the verification rule and can be matched with the enterprise database, treating the verification as passed and associating the character information that passed verification with the logistics order.
Optionally, in a fourth implementation manner of the first aspect of the present invention, verifying the character information against the verification rule and the pre-built enterprise database to obtain a verification result specifically includes: obtaining paper invoice images of different types and building a sharpness judgment model from them; judging from the character information whether the invoice is a paper invoice; if it is a paper invoice, evaluating the unregistered invoice picture with the sharpness judgment model to obtain a judgment result; if the judgment result is a fail, marking the corresponding unregistered invoice picture; and if the judgment result is a pass, verifying the character information against the verification rule and the pre-built enterprise database to obtain the verification result.
Optionally, in a fifth implementation manner of the first aspect of the present invention, obtaining paper invoice images of different types and building a sharpness judgment model from them specifically includes: obtaining paper invoice images of different types and labeling whether the sharpness of each image meets requirements; forming a training set from the paper invoice images and their labels; taking a machine learning algorithm as the base model and training it with the training set to obtain the sharpness judgment model; and formulating an automatic optimization iteration strategy, automatically updating the training set according to that strategy, and using the updated training set to optimize and iterate the sharpness judgment model.
Optionally, in a sixth implementation manner of the first aspect of the present invention, verifying the character information against the verification rule and the pre-built enterprise database when the judgment result is a pass specifically includes: obtaining key information of the enterprise and its related companies, the key information including the taxpayer identification number and the enterprise name, and building the enterprise database from the key information; verifying the character information, which includes taxpayer information, invoicing date, amount and tax rate, against the verification rule and the enterprise database; matching the taxpayer information in the character information against the enterprise database to obtain a first verification result; and verifying the invoicing date, amount or tax rate in the character information against the verification rule to obtain a second verification result.
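As an illustration of the first verification result described above, the following is a minimal Python sketch of matching recognized taxpayer information against an enterprise database. It is not the patented implementation: the field names `taxpayer_id` and `name`, the helper functions, and the sample records are all hypothetical, and the enterprise database is assumed to be a simple in-memory dictionary keyed by taxpayer identification number.

```python
def build_enterprise_db(companies):
    """Index company records by taxpayer identification number.

    `companies` is a list of dicts with hypothetical keys
    "taxpayer_id" and "name" (the key information named in the text).
    """
    return {c["taxpayer_id"].strip(): c["name"].strip() for c in companies}


def match_taxpayer(char_info, enterprise_db):
    """First check: True only if the recognized taxpayer ID is known
    and the recognized enterprise name equals the stored name."""
    taxpayer_id = char_info.get("taxpayer_id", "").strip()
    name = char_info.get("name", "").strip()
    return enterprise_db.get(taxpayer_id) == name and name != ""


# Made-up records standing in for the enterprise and its related companies.
db = build_enterprise_db([
    {"taxpayer_id": "TAXID-0001", "name": "Example Logistics Co"},
    {"taxpayer_id": "TAXID-0002", "name": "Example Freight Co"},
])

# Recognized text may carry stray whitespace, so it is stripped before lookup.
first_ok = match_taxpayer(
    {"taxpayer_id": " TAXID-0001 ", "name": "Example Logistics Co"}, db)
first_bad = match_taxpayer(
    {"taxpayer_id": "TAXID-9999", "name": "Unknown Co"}, db)
```

An exact-match lookup is the simplest possible choice here; a production system would likely need to tolerate recognition errors in names, which this sketch does not attempt.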
A second aspect of the invention provides a logistics order invoice registration and identification device, comprising a preprocessing module, an extraction module and a verification module. The preprocessing module is configured to obtain an unregistered invoice picture and preprocess it to obtain a preprocessed picture. The extraction module is configured to construct a character recognition model and process the preprocessed picture with it to extract the character information in the preprocessed picture. The verification module is configured to establish a verification rule, verify the character information according to the verification rule, mark the character information that fails verification, and associate the character information that passes verification with a logistics order.
Optionally, in a first implementation manner of the second aspect of the present invention, the preprocessing module includes a denoising unit, an adjusting unit, and a converting unit, wherein the denoising unit is used for obtaining an unregistered invoice picture, denoising the unregistered invoice picture by adopting a filtering algorithm to obtain a first picture, the adjusting unit is used for processing the first picture by adopting a histogram equalization method to enhance the contrast of the first picture to obtain a second picture, and the converting unit is used for processing the second picture by adopting a threshold method to convert the second picture into a black-white binary image to obtain a preprocessed picture.
Optionally, in a second implementation manner of the second aspect of the present invention, the extraction module includes an obtaining unit, a training unit and an optimizing unit. The obtaining unit is configured to obtain invoice sample images and label the characters in them to obtain training data. The training unit is configured to take a convolutional neural network model as the base model and train it with a deep learning framework and the training data to obtain a preliminary model. The optimizing unit is configured to verify the accuracy of the preliminary model and iteratively optimize it according to the verification result to obtain the character recognition model.
Optionally, in a third implementation manner of the second aspect of the present invention, the verification module includes a creation sub-module, a marking sub-module and an association sub-module. The creation sub-module is configured to obtain the format specification and business rules of invoices, establish the verification rule from them, and verify the character information against the verification rule and a pre-built enterprise database to obtain a verification result. The marking sub-module is configured to mark the character information as abnormal if the verification result is that it does not conform to the verification rule or cannot be matched with the enterprise database. The association sub-module is configured to treat the verification as passed if the verification result is that the character information conforms to the verification rule and can be matched with the enterprise database, and to associate the character information that passed verification with the logistics order.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the verification sub-module includes a construction unit, a judgment unit, a marking unit and a verification unit. The construction unit is configured to obtain paper invoice images of different types and build a sharpness judgment model from them. The judgment unit is configured to judge from the character information whether the invoice is a paper invoice and, if it is, to evaluate the unregistered invoice picture with the sharpness judgment model to obtain a judgment result. The marking unit is configured to mark the corresponding unregistered invoice picture if the judgment result is a fail. The verification unit is configured to verify the character information against the verification rule and a pre-built enterprise database to obtain a verification result if the judgment result is a pass.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the building unit includes an obtaining subunit, configured to obtain different types of paper invoice images, label whether the sharpness of the paper invoice images meets the requirement, form a training set based on the paper invoice images and the labeling results thereof, train the basic model with the training set based on the machine learning algorithm to obtain a sharpness judgment model, and an optimizing subunit, configured to formulate an automatic optimization iteration strategy, automatically update the training set according to the automatic optimization iteration strategy, and optimize and iterate the sharpness judgment model with the updated training set.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the verification unit includes a construction subunit, a first verification subunit and a second verification subunit. The construction subunit is configured to, if the judgment result is a pass, obtain key information of the enterprise and its related companies, the key information including the taxpayer identification number and the enterprise name, and build the enterprise database from the key information. The first verification subunit is configured to verify the character information, which includes taxpayer information, invoicing date, amount and tax rate, against the verification rule and the enterprise database, and to match the taxpayer information in the character information against the enterprise database to obtain a first verification result. The second verification subunit is configured to verify the invoicing date, amount or tax rate in the character information against the verification rule to obtain a second verification result.
A third aspect of the present invention provides a logistics order invoice registration recognition device comprising a memory having computer readable instructions stored therein and at least one processor invoking the computer readable instructions in the memory to perform the steps of the logistics order invoice registration recognition method as described above.
A fourth aspect of the present invention provides a computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the method for registering and identifying a logistics order invoice as described above.
The beneficial effects of the invention are as follows. First, the method preprocesses the unregistered invoice picture to obtain a preprocessed picture, which improves picture quality and makes the results of the subsequent model more accurate. Next, the character recognition model processes the preprocessed picture and extracts the character information in it without manual operation, giving higher efficiency and a lower error rate. Finally, a verification rule is established, the character information is verified according to the verification rule, the character information that fails verification is marked, and the character information that passes verification is associated with the logistics order. Abnormal invoices are thus verified and identified automatically, and successfully verified invoices are associated automatically, which effectively improves working efficiency and the timeliness of information collection.
Drawings
FIG. 1 is a first flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 5 is a fifth flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 6 is a sixth flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 7 is a seventh flowchart of a logistics order invoice registration and identification method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a logistics order invoice registration and identification device according to an embodiment of the present invention;
FIG. 9 is another schematic structural diagram of a logistics order invoice registration and identification device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of logistics order invoice registration and identification equipment according to an embodiment of the present invention.
Detailed Description
The invention provides a method, a device, equipment and a storage medium for registering and identifying a logistics order invoice. The method first obtains an unregistered invoice picture and preprocesses it to obtain a preprocessed picture; it then constructs a character recognition model and processes the preprocessed picture with that model to extract the character information in the picture; finally, it establishes a verification rule, verifies the character information according to the rule, marks the character information that fails verification, and associates the character information that passes verification with a logistics order. Because the information in the invoice is extracted and verified automatically, invoice registration becomes more efficient. In addition, preprocessing the unregistered invoice picture improves the accuracy of character recognition, and extracting the character information with a character recognition model helps avoid extraction errors, so that the registration of logistics order invoices proceeds quickly and efficiently.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, the following description will describe a specific flow of an embodiment of the present invention, and it should be noted that, in the present invention, the steps related to obtaining personal information of a user are all performed under the condition that authorization of the user is obtained.
Referring to fig. 1, a first embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S101, obtaining an unregistered invoice picture, and preprocessing the unregistered invoice picture to obtain a preprocessed picture;
Specifically, each outlet of the logistics enterprise can upload invoice pictures to the system directly from its place of business, so that the system obtains the unregistered invoice pictures. Because unregistered invoice pictures taken by different people or with different devices vary widely in quality, they need to be preprocessed so that the subsequent character recognition model can extract character information accurately. Preprocessing may include, for example, denoising, contrast enhancement, brightness adjustment, and conversion to a black-and-white picture. Several preprocessing methods can be combined to improve the result.
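Two of the preprocessing operations named above, contrast enhancement and conversion to a black-and-white picture, can be sketched as follows. This is a minimal pure-Python illustration on a tiny grayscale image stored as a list of rows, not the patented implementation. Contrast enhancement is shown here as textbook histogram equalization, and the fixed threshold of 128 is an assumption chosen for the example; the source does not specify a threshold value or a thresholding variant.

```python
def equalize_histogram(img, levels=256):
    """Histogram-equalize a grayscale image (list of rows, values 0..levels-1)."""
    pixels = [p for row in img for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Only one gray level is present, so there is nothing to stretch.
        return [row[:] for row in img]
    scale = (levels - 1) / (n - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in img]


def binarize(img, threshold=128):
    """Map every pixel to 0 (black) or 255 (white) using a fixed threshold."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


# A low-contrast patch: all gray values lie between 100 and 130.
low_contrast = [
    [100, 100, 110, 110],
    [100, 100, 110, 110],
    [120, 120, 130, 130],
    [120, 120, 130, 130],
]
equalized = equalize_histogram(low_contrast)  # values spread over 0..255
binary = binarize(equalized)                  # top half black, bottom half white
```

After equalization the four gray levels are spread to 0, 85, 170 and 255, so a mid-range threshold separates the darker and lighter halves cleanly, which it would not do on the original narrow range.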
S102, constructing a character recognition model, and processing the preprocessed picture by adopting the character recognition model to extract character information in the preprocessed picture;
The character recognition model can process a large number of pictures automatically and extract the character information in them, which significantly improves working efficiency and reduces the need for manual intervention. A trained character recognition model generally has high recognition accuracy, particularly on clear, well-formed character pictures, which helps reduce the errors that can occur in manual identification. The character recognition model can also handle character pictures in various complex conditions, such as blurring, rotation and warping, which makes the model more flexible in practical applications.
The character information extracted from the preprocessed picture includes the key information on the invoice, such as the enterprise name, taxpayer identification number, amount, tax amount and invoiced items. The extracted character information can be used for further verification and analysis, such as checking whether the invoice is valid or counterfeit.
S103, establishing a verification rule, verifying the character information according to the verification rule, marking character information which does not pass the verification, and associating the character information which passes the verification with the logistics order.
An invoice has strict requirements on the format of each item of content, such as the font format, the way dates are written, the invoiced items and the tax amount. By establishing verification rules, the system can check each item of content obtained from the invoice picture against the rules one by one, ensuring that the invoice content is accurate and meets the requirements.
When invoice content is found not to meet the requirements, the system marks the abnormal character content and outputs it to the interface for display to the user. If the character information conforms to the verification rule, it is associated with the logistics order, completing the registration and identification of the logistics order invoice.
With this method, the efficiency of invoice registration can be improved. In a specific implementation, an outlet can upload the invoice attachment directly, the corresponding invoice information is filled in automatically through the background interface, and the user selects the month to which the invoice belongs, which shortens the invoice registration operation time.
In addition, the invention can improve the accuracy and validity of registration. During verification, the authenticity of the invoice can also be checked through a third-party interface and the invoice information fed back, which avoids errors caused by entering invoice information manually. The automatic invoice registration and identification method reduces the workload of manual entry and improves invoice registration efficiency. At the same time, image recognition technology and a data checking algorithm are used to ensure the accuracy and integrity of the invoice information. Accurately registered invoice information provides reliable data support for the tax declaration and financial management of logistics enterprises, and helps enterprises comply with tax regulations and reduce tax risks.
Referring to fig. 2, a second embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S201, obtaining an unregistered invoice picture, and denoising the unregistered invoice picture by adopting a filtering algorithm to obtain a first picture;
Specifically, during denoising, algorithms such as median filtering and Gaussian filtering can be used to remove noise points in the invoice image. Median filtering is a nonlinear filtering method: the gray value of each pixel is replaced by the median of the gray values of all pixels in its neighborhood, which effectively removes salt-and-pepper noise and similar artifacts. Gaussian filtering is a linear filtering method that removes noise by taking a weighted average over the neighborhood of each pixel, and it works well against Gaussian noise.
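The median filtering described above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used by the invention; the function name `median_filter`, the `radius` parameter, and the border handling (using only the neighbors that fall inside the image) are assumptions made for the example.

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel with the median of its neighborhood.

    `image` is a list of rows of integer gray values. The neighborhood is a
    (2*radius+1) x (2*radius+1) window; at the borders only the neighbors
    inside the image are used (an assumption, the source does not say).
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # For an even-sized window the median is the mean of the two
            # middle values; it is truncated back to an integer gray value.
            result[y][x] = int(median(window))
    return result


# A single bright "salt" pixel in a flat dark region is removed.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter(noisy)
```

In practice an image library would normally be used for speed; the nested loops here only make the neighborhood-median idea explicit.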
S202, processing the first picture by adopting a histogram equalization method to enhance the contrast of the first picture and obtain a second picture;
Histogram equalization is a method of enhancing image contrast by adjusting the gray-level histogram of an image. It redistributes the gray levels so that they spread more evenly over the available range, which increases the difference between dark and bright areas and thereby improves the readability of the image.
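As a rough illustration of histogram equalization, the sketch below maps each gray level through the normalized cumulative histogram. The function name `equalize_histogram` and the 256-level assumption are illustrative choices, not details given in the source.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of `image` over the full range 0..levels-1.

    `image` is a list of rows of integer gray values in 0..levels-1.
    """
    total = len(image) * len(image[0])

    # Count how often each gray level occurs.
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if cdf_min == total:
        # Only one gray level is present; there is nothing to stretch.
        return [row[:] for row in image]

    # Map each level through the normalized cumulative distribution.
    lookup = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


# A low-contrast patch (values 100..130) is stretched to the full range.
low_contrast = [[100, 110], [120, 130]]
equalized = equalize_histogram(low_contrast)
```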
S203, processing the second picture by adopting a threshold method to convert the second picture into a black-white binary image, and obtaining a preprocessed picture.
The second picture is converted into a black-and-white binary image for subsequent character recognition. Binarization may use a fixed threshold method or an adaptive threshold method. The fixed threshold method sets pixels whose gray value is greater than a fixed threshold to white and pixels below the threshold to black. The adaptive threshold method determines the threshold automatically from the local characteristics of the image, and works well on invoice images with uneven illumination.
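The difference between the two binarization methods can be seen in a small sketch. The local-mean rule used for the adaptive variant, and the `radius` and `offset` parameters, are assumptions chosen for illustration; the source does not specify which adaptive rule is used.

```python
def fixed_threshold(image, threshold=128):
    """White (255) where the gray value exceeds `threshold`, else black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in image]


def adaptive_threshold(image, radius=1, offset=15):
    """Compare each pixel with the mean of its local window minus `offset`."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(255 if image[y][x] > local_mean - offset else 0)
        result.append(out_row)
    return result


# One image row whose background brightens from left to right, with two
# darker "ink" pixels at positions 2 and 6.
uneven = [[60, 80, 50, 120, 140, 160, 130, 200, 220]]
fixed = fixed_threshold(uneven)
adaptive = adaptive_threshold(uneven)
```

With the fixed threshold, the dim left background turns black and the ink pixel on the bright side is lost; the local-mean rule keeps both ink pixels black and the background white.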
Referring to fig. 3, a third embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S301, acquiring an invoice sample image, and labeling characters in the invoice sample image to obtain training data; manual or automatic labeling may be used, provided that the labels are accurate and complete.
S302, training a basic model by using a Convolutional Neural Network (CNN) model as the basic model and using a deep learning frame and training data to obtain a preliminary model;
The deep learning framework may be, for example, TensorFlow or PyTorch. During training, hyperparameters such as the learning rate, batch size, and network structure need to be adjusted continually to improve the accuracy of character recognition. Techniques such as cross-validation and early stopping can also be used to prevent the model from overfitting.
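Early stopping, mentioned above, can be sketched independently of any framework. In the minimal example below, `train_one_epoch` and `validation_loss` are hypothetical callables standing in for the real training and validation steps, and `patience` is an assumed parameter name.

```python
def train_with_early_stopping(train_one_epoch, validation_loss,
                              max_epochs=100, patience=3):
    """Stop training once validation loss has not improved for `patience` epochs.

    Returns (best_epoch, best_loss, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch in range(max_epochs):
        train_one_epoch(epoch)
        loss = validation_loss(epoch)
        epochs_run += 1
        if loss < best_loss:
            # New best model; a real system would save a checkpoint here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss, epochs_run


# Simulated validation losses: improvement stops after epoch 2.
simulated = [0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.5]
summary = train_with_early_stopping(
    train_one_epoch=lambda epoch: None,
    validation_loss=lambda epoch: simulated[epoch],
    max_epochs=len(simulated),
    patience=2,
)
```

Note that with `patience=2` the run stops before the later, lower loss at epoch 6 is reached; choosing the patience is itself a tuning decision.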
S303, verifying the accuracy of the preliminary model, and continuously optimizing the preliminary model according to a verification result to obtain a character recognition model.
After training is completed, the character recognition model can be deployed into an invoice registration recognition system, so that automatic character recognition of an invoice image is realized. In practical application, the model can be further optimized and adjusted according to the characteristics and the identification effect of the invoice.
Further, post-processing can be performed on the recognized characters to remove some erroneous recognition results. For example, the recognized characters can be checked and corrected by dictionary inquiry, grammar analysis and other methods, so that the accuracy of the recognition result is ensured.
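A dictionary-based correction step of the kind described above might look like the following sketch. It uses `difflib.get_close_matches` from the Python standard library; the vocabulary, the `cutoff` value, and the function name `correct_tokens` are illustrative assumptions.

```python
from difflib import get_close_matches


def correct_tokens(tokens, vocabulary, cutoff=0.8):
    """Replace each recognized token with its closest vocabulary entry.

    Tokens with no sufficiently similar entry are kept unchanged so that
    they can be flagged for manual review instead of being silently altered.
    """
    corrected = []
    for token in tokens:
        matches = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


# Hypothetical field labels and typical OCR confusions (1 for i, 0 for o).
vocabulary = ["Invoice", "Amount", "Tax", "Date"]
recognized = ["Invo1ce", "Am0unt", "Qzx"]
cleaned = correct_tokens(recognized, vocabulary)
```

This kind of correction suits fixed labels and names; numeric fields such as amounts would need format checks rather than dictionary lookup.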
In addition to constructing the character recognition model in the above manner, a well-performing optical character recognition (OCR) engine, such as Tesseract OCR or Baidu OCR, may be selected. Through extensive training and optimization, these OCR engines can accurately recognize characters in a variety of fonts and languages.
When the method is applied specifically, parameters can be adjusted and optimized according to the characteristics of the invoice. For example, for a particular font, font size, color, etc. on the invoice, the recognition parameters of the OCR engine may be adjusted to improve the accuracy of character recognition. Meanwhile, the invoice image can be segmented, characters in different areas are respectively identified, and the identification accuracy and efficiency are improved.
Referring to fig. 4, a fourth embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S401, acquiring format specifications and business rules of an invoice, and establishing a verification rule according to the format specifications and the business rules of the invoice;
The invoice has a specific format specification, covering items such as the invoice name, invoice code and number, the copies and their uses, customer name, bank of deposit and account number, name of goods or business item, unit of measure, quantity, unit price, amount in words and in figures, tax rate (levy rate), tax amount, issuer, invoicing date, and the name (seal) of the invoicing unit (individual).
In addition, the invoice must be issued in full, in one pass, according to the specified time limit, sequence, and columns, and stamped with the special invoice seal. The text is written in Chinese; the amount in words and the invoicing date are also written in Chinese. Columns such as the purchasing unit name, name of goods or service item, specification, unit, quantity, and unit price must be filled in according to the specification. All items must be filled in completely and the handwriting must be clear. All copies should be written through or printed in one pass and filled in according to the number sequence.
By establishing the verification rule according to the format specification and the business rule of the invoice, the system can verify according to the related specified requirements, and the verification result is ensured to be correct.
S402, checking the character information according to the checking rule and a pre-constructed enterprise database to obtain a checking result;
S403, if the verification result is that the character information does not conform to the verification rule or cannot be matched with the enterprise database, marking it as abnormal;
When an abnormality is found, the corresponding content is marked so that the user is reminded to handle it manually in time. If manual inspection finds that the problem lies in the unregistered invoice picture itself, the user is required to provide a new invoice. If the problem lies in the model's recognition, verification can be completed through a manual channel.
S404, if the verification result is that the character information conforms to the verification rule and can be matched with the enterprise database, the verification is considered passed, and the character information that passed the verification is associated with the logistics order.
After association, the invoice information may be stored using a database or in the form of an electronic document. During registration, the integrity and accuracy of invoice information are ensured, and the information of invoice registration time, registration personnel and the like is recorded at the same time so as to facilitate subsequent inquiry and management. Invoice information may be categorized and archived for ease of query and management. For example, invoice information may be stored in different folders or database tables, classified by invoice type, date of invoicing, tax payer identification number, etc. Meanwhile, an index can be established, so that specific invoice information can be conveniently and rapidly inquired.
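As a minimal sketch of database storage with an index, the example below uses Python's built-in `sqlite3` module. The table name, column names, and sample values are hypothetical; the source only states that invoice information, registration time, and registration personnel are recorded and that an index can be established.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per registered invoice, linked to an order.
SCHEMA = """
CREATE TABLE invoice_registration (
    invoice_number TEXT PRIMARY KEY,
    invoice_type   TEXT NOT NULL,
    invoice_date   TEXT NOT NULL,
    taxpayer_id    TEXT NOT NULL,
    order_id       TEXT NOT NULL,
    registered_at  TEXT NOT NULL,
    registered_by  TEXT NOT NULL
);
CREATE INDEX idx_invoice_taxpayer_date
    ON invoice_registration (taxpayer_id, invoice_date);
"""


def register_invoice(conn, invoice, order_id, registered_by):
    """Store verified invoice fields together with who registered them and when."""
    conn.execute(
        "INSERT INTO invoice_registration VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            invoice["number"],
            invoice["type"],
            invoice["date"],
            invoice["taxpayer_id"],
            order_id,
            datetime.now(timezone.utc).isoformat(),
            registered_by,
        ),
    )
    conn.commit()


def find_by_taxpayer(conn, taxpayer_id):
    """Look up registered invoices for one taxpayer, ordered by invoice date."""
    rows = conn.execute(
        "SELECT invoice_number, order_id FROM invoice_registration "
        "WHERE taxpayer_id = ? ORDER BY invoice_date",
        (taxpayer_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
register_invoice(
    conn,
    {"number": "INV-0001", "type": "paper", "date": "2024-11-21",
     "taxpayer_id": "TAXPAYER-EXAMPLE-001"},
    order_id="ORDER-001",
    registered_by="clerk-01",
)
```

The composite index on taxpayer and date is one possible choice that matches the classification criteria named in the text; other query patterns would call for other indexes.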
Referring to fig. 5, a fifth embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S501, acquiring paper invoice images of different types, and constructing a definition judgment model according to the paper invoice images of different types;
Compared with electronic invoices, paper invoices may have many printing or storage problems. For example, if key information is illegible because of stains on the paper invoice, the system cannot recognize it and the invoice has to be provided again. As another example, paper invoices are printed by a printer and may be printed unclearly. An electronic invoice is generated directly as an electronic file, so it does not have the problem of blurred or unclear content.
S502, judging whether the invoice is a paper invoice according to the character information, and if the invoice is the paper invoice, judging unregistered invoice pictures by adopting a definition judgment model to obtain a judgment result;
The character information comprises all character content on the invoice, including the invoice header. The header indicates whether the invoice is an electronic invoice, so whether the invoice is a paper invoice can be determined by checking whether the character information contains the electronic-invoice designation;
S503, if the judgment result is not passed, marking the corresponding unregistered invoice picture;
S504, if the judgment result is passed, checking the character information according to the checking rule and a pre-constructed enterprise database to obtain a checking result.
In this embodiment, if the definition judgment model finds that the definition of the paper invoice picture is too low, the picture is marked and a manual check is prompted to determine what is unsatisfactory. If the problem lies in how the picture was taken, the user is required to provide the picture again; if the problem lies in the paper invoice itself, the user is required to provide a new invoice.
Because not all character information is checked during verification, passing verification does not guarantee that the paper invoice has no other problems. By screening out such problems with the definition judgment model before the character information is checked, the paper invoices subsequently received can be expected to meet financial requirements.
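The control flow of steps S502 to S504 can be sketched as below. The header keywords used to detect an electronic invoice, the result labels, and the callables `is_clear` and `verify` (stand-ins for the definition judgment model and the rule-based verification) are all assumptions made for illustration.

```python
def is_paper_invoice(header_text):
    """Treat the invoice as paper unless the header names it as electronic.

    Assumption: electronic invoices carry "电子" or "electronic" in the header.
    """
    return "电子" not in header_text and "electronic" not in header_text.lower()


def process_invoice(character_info, picture, is_clear, verify):
    """Run the definition check for paper invoices before rule verification.

    `is_clear(picture)` stands in for the definition judgment model and
    `verify(character_info)` for verification against the rules and the
    enterprise database; both are hypothetical callables.
    """
    if is_paper_invoice(character_info["header"]) and not is_clear(picture):
        return "marked_unclear"
    if verify(character_info):
        return "verified"
    return "marked_abnormal"


# A blurry paper invoice is stopped before verification is attempted.
blurry_paper = process_invoice(
    {"header": "增值税专用发票"}, picture=None,
    is_clear=lambda pic: False, verify=lambda info: True,
)
# An electronic invoice skips the definition check entirely.
electronic = process_invoice(
    {"header": "电子发票(普通发票)"}, picture=None,
    is_clear=lambda pic: False, verify=lambda info: True,
)
```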
Referring to fig. 6, a sixth embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S601, acquiring paper invoice images of different types, marking whether the definition of the paper invoice images meets the requirements, and forming a training set by the paper invoice images and marking results thereof;
Specifically, to form a training set with more complete data, paper invoice images with different defects need to be obtained, such as invoices with stains that make part of the key information unrecognizable, damaged invoices, unclearly printed invoices, and invoices with incomplete content. By manually labeling in advance which types or degrees of defect are acceptable and then forming the corresponding training set, the resulting definition judgment model achieves higher judgment accuracy.
S602, training a basic model by using a machine learning algorithm as the basic model and adopting a training set to obtain a definition judgment model;
S603, formulating an automatic optimization iteration strategy, automatically updating the training set according to the automatic optimization iteration strategy, and optimizing and iterating the definition judgment model with the updated training set.
Specifically, the strategy can specify that each new batch of paper invoice images acquired through the system is added to the training set, enlarging the training data and thereby improving the accuracy of the definition judgment model.
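One possible form of such an update strategy is sketched below: newly labeled images are buffered and merged into the training set once a full batch has accumulated, at which point retraining is triggered. The class name, the `batch_size` parameter, and the `retrain` callback are hypothetical, and the sketch assumes that labels for new images are available (for example from manual review), which the source does not spell out.

```python
class TrainingSetUpdater:
    """Buffer newly labeled invoice images and retrain once a batch is full."""

    def __init__(self, retrain, batch_size=100):
        self.training_set = []   # (image_id, is_clear) pairs used so far
        self.pending = []        # new pairs not yet merged
        self.retrain = retrain   # hypothetical callback: retrain(training_set)
        self.batch_size = batch_size

    def add_labeled_image(self, image_id, is_clear):
        self.pending.append((image_id, is_clear))
        if len(self.pending) >= self.batch_size:
            # Merge the full batch and trigger one optimization iteration.
            self.training_set.extend(self.pending)
            self.pending = []
            self.retrain(self.training_set)


# Record the training-set size at each retraining call.
retrain_sizes = []
updater = TrainingSetUpdater(
    retrain=lambda data: retrain_sizes.append(len(data)), batch_size=2
)
for image_id, label in [("img-1", True), ("img-2", False), ("img-3", True)]:
    updater.add_labeled_image(image_id, label)
```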
Referring to fig. 7, a seventh embodiment of a method for registering and identifying a logistic order invoice according to an embodiment of the present invention includes:
S701, if the judgment result is passed, acquiring key information of the enterprise and its related companies, wherein the key information comprises a taxpayer identification number and an enterprise name, and constructing an enterprise database from the key information;
In order to reduce the system development cost of the enterprise, all subsidiaries or associated companies can use the same invoice registration and identification system, with the related information of all subsidiaries or associated companies stored in the enterprise database. As long as an enterprise name and its taxpayer identification number meet the requirements, they can be matched and found in the enterprise database.
S702, checking character information according to the checking rule and an enterprise database, wherein the character information comprises taxpayer information, billing date, amount and tax rate, and matching the taxpayer information in the character information with the enterprise database to obtain a first checking result;
Specifically, the character information may further include key information such as the name of the goods or taxable labor and services, unit, quantity, unit price, and tax amount. The taxpayer information comprises the enterprise name, the taxpayer identification number, and similar information; if any one of these cannot be matched correctly, the taxpayer information is regarded as failing the verification.
S703, checking the billing date, the amount or the tax rate in the character information according to the checking rule to obtain a second checking result.
For example, the verification fails in any of the following cases: the taxpayer information in the character information cannot be matched with the enterprise database; the billing date in the character information does not conform to the date format; the amount in the character information is not positive; or the tax rate in the character information does not correspond to the name of the goods or taxable labor and services.
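The first and second checks can be sketched as two small functions. Everything concrete here is an assumption for illustration: the field names, the date format `%Y年%m月%d日`, and in particular the `ALLOWED_TAX_RATES` table, whose entries are placeholders rather than authoritative tax rates.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Placeholder mapping from item name to permitted tax rates (illustrative only).
ALLOWED_TAX_RATES = {
    "transport service": {Decimal("0.09")},
    "warehousing service": {Decimal("0.06")},
}


def check_taxpayer(info, enterprise_db):
    """First check: both the taxpayer ID and the enterprise name must match."""
    return enterprise_db.get(info["taxpayer_id"]) == info["enterprise_name"]


def check_fields(info):
    """Second check: return the names of the fields that break a rule."""
    failures = []
    try:
        datetime.strptime(info["invoice_date"], "%Y年%m月%d日")
    except ValueError:
        failures.append("invoice_date")
    try:
        if Decimal(info["amount"]) <= 0:
            failures.append("amount")
    except InvalidOperation:
        failures.append("amount")
    try:
        rate = Decimal(info["tax_rate"])
    except InvalidOperation:
        rate = None
    if rate not in ALLOWED_TAX_RATES.get(info["item_name"], set()):
        failures.append("tax_rate")
    return failures


enterprise_db = {"TAXPAYER-EXAMPLE-001": "Example Logistics Co."}
good = {
    "taxpayer_id": "TAXPAYER-EXAMPLE-001",
    "enterprise_name": "Example Logistics Co.",
    "invoice_date": "2024年11月21日",
    "amount": "1200.50",
    "tax_rate": "0.09",
    "item_name": "transport service",
}
bad = dict(good, invoice_date="2024/11/21", amount="-5", tax_rate="0.13")
```

Returning the list of failing field names, rather than a single pass/fail flag, makes it straightforward to mark the specific abnormal content for the user.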
The logistics order invoice registration and identification method in the embodiment of the invention has been described above; the logistics order invoice registration and identification device in the embodiment of the invention is described below. Referring to fig. 8, one embodiment of the device includes:
the preprocessing module 10 is used for acquiring unregistered invoice pictures and preprocessing the unregistered invoice pictures to acquire preprocessed pictures;
The extracting module 20 is configured to construct a character recognition model, and process the preprocessed picture by using the character recognition model to extract character information in the preprocessed picture;
And the verification module 30 is used for establishing a verification rule, verifying the character information according to the verification rule, marking character information which does not pass the verification, and associating the character information which passes the verification with the logistics order.
Referring to fig. 9, another embodiment of a device for registering and identifying a logistics order invoice according to an embodiment of the present invention includes:
the preprocessing module 10 is used for acquiring unregistered invoice pictures and preprocessing the unregistered invoice pictures to acquire preprocessed pictures;
The extracting module 20 is configured to construct a character recognition model, and process the preprocessed picture by using the character recognition model to extract character information in the preprocessed picture;
The verification module 30 is configured to establish a verification rule, verify the character information according to the verification rule, mark character information that fails to pass the verification, and associate the character information that passes the verification with the logistics order;
in this embodiment, the preprocessing module 10 includes:
The denoising unit 11 is used for acquiring unregistered invoice pictures, and denoising the unregistered invoice pictures by adopting a filtering algorithm to acquire first pictures;
an adjusting unit 12, configured to process the first picture by using a histogram equalization method, so as to enhance a contrast ratio of the first picture, and obtain a second picture;
a conversion unit 13, configured to process the second picture by using a threshold method, so as to convert the second picture into a black-white binary image, thereby obtaining a preprocessed picture;
In this embodiment, the extracting module 20 includes:
An acquiring unit 21, configured to acquire an invoice sample image, and label characters in the invoice sample image to obtain training data;
a training unit 22, configured to train the basic model with the deep learning framework and training data by using the convolutional neural network model as the basic model, so as to obtain a preliminary model;
An optimizing unit 23, configured to verify the accuracy of the preliminary model, and continuously optimize the preliminary model according to the verification result, so as to obtain a character recognition model;
in this embodiment, the verification module 30 includes:
the establishing sub-module 31 is used for acquiring the format specification and the business rule of the invoice, and establishing a verification rule according to the format specification and the business rule of the invoice;
A verification sub-module 32, configured to verify the character information according to the verification rule and a pre-constructed enterprise database, so as to obtain a verification result;
A marking sub-module 33, configured to mark as abnormal if the verification result indicates that the character information does not conform to the verification rule or cannot be matched with the enterprise database;
The association sub-module 34 is configured to, if the verification result indicates that the character information accords with the verification rule and can be matched with the enterprise database, consider that the verification is passed, and associate the character information passed by the verification with the logistics order;
In this embodiment, the verification sub-module 32 includes:
the construction unit 321 is configured to obtain different types of paper invoice images, and construct a definition judgment model according to the different types of paper invoice images;
the judging unit 322 is configured to judge whether the invoice is a paper invoice according to the character information, and if the invoice is a paper invoice, judge an unregistered invoice picture by using a definition judgment model to obtain a judgment result;
a marking unit 323, configured to mark the corresponding unregistered invoice picture if the determination result is not passed;
the checking unit 324 is configured to check the character information according to the checking rule and a pre-constructed enterprise database if the determination result is passed, so as to obtain a checking result;
in this embodiment, the building unit 321 includes:
the obtaining subunit 3211 is configured to obtain paper invoice images of different types, label whether the sharpness of the paper invoice images meets the requirement, and form a training set according to the paper invoice images and the labeling results thereof;
a training subunit 3212, configured to train the basic model with a training set by using a machine learning algorithm as the basic model, so as to obtain a definition judgment model;
An optimizing subunit 3213, configured to formulate an automatic optimization iteration strategy, automatically update the training set according to the automatic optimization iteration strategy, and optimize and iterate the sharpness judgment model by using the updated training set;
In this embodiment, the verification unit 324 includes:
A construction subunit 3241, configured to obtain key information of the enterprise and its related companies if the determination result is passed, where the key information includes a taxpayer identification number and an enterprise name, and construct an enterprise database according to the key information;
The first checking subunit 3242 is configured to check character information according to the checking rule and the enterprise database, where the character information includes taxpayer information, date of invoicing, amount of money, and tax rate, and match the taxpayer information in the character information with the enterprise database to obtain a first checking result;
and a second checking subunit 3243, configured to check the billing date, amount, or tax rate in the character information according to the checking rule, so as to obtain a second checking result.
The logistics order invoice registration and identification apparatus improves the quality of the acquired unregistered invoice picture by preprocessing it automatically. By constructing a character recognition model and processing the preprocessed picture with it, the apparatus can extract the character information in the preprocessed picture accurately and efficiently without manual processing, so registration of the invoice content can be completed quickly. Finally, by establishing a verification rule, the apparatus verifies the character information according to that rule, marks character information that fails the verification, and associates character information that passes the verification with the logistics order, so that the invoice information is stored.
The logistics order invoice registration and identification apparatus in the embodiment of the present invention has been described in detail above from the point of view of modularized functional entities; the logistics order invoice registration and identification device in the embodiment of the present invention is described in detail below from the point of view of hardware processing.
Fig. 10 is a schematic diagram of a configuration of a device for registering and identifying a logistics order invoice according to an embodiment of the present invention, where the device 900 may have a relatively large difference due to different configurations or performances, and may include one or more processors (central processing units, CPU) 910 (e.g., one or more processors) and a memory 920, one or more storage media 930 (e.g., one or more mass storage devices) storing application programs 933 or data 932. Wherein the memory 920 and storage medium 930 may be transitory or persistent storage. The program stored on the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations in the logistics order invoice registration identification apparatus 900. Still further, the processor 910 may be configured to communicate with the storage medium 930 and execute a series of instruction operations in the storage medium 930 on the logistics order invoice registration recognition device 900 to implement the steps of the logistics order invoice registration recognition method provided by the above-described method embodiments.
The logistics order invoice registration identification apparatus 900 may also include one or more power supplies 940, one or more wired or wireless network interfaces 950, one or more input/output interfaces 960, and/or one or more operating systems 931, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the configuration illustrated in fig. 10 does not constitute a limitation of the logistics order invoice registration recognition device, which may include more or fewer components than illustrated, may combine certain components, or may use a different arrangement of components.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and which may also be a volatile computer readable storage medium, having stored therein instructions that, when executed on a computer, cause the computer to perform the steps of a method for registering and identifying a logistic order invoice.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus or device described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
It will be understood that equivalents and modifications will occur to those skilled in the art in light of the present invention and their spirit, and all such modifications and substitutions are intended to be included within the scope of the present invention as defined in the following claims.

Claims (10)

1. A logistics order invoice registration and identification method, characterized by comprising the following steps:
acquiring an unregistered invoice picture, and preprocessing the unregistered invoice picture to obtain a preprocessed picture;
constructing a character recognition model, and processing the preprocessed picture with the character recognition model to extract character information from the preprocessed picture;
establishing a verification rule, verifying the character information according to the verification rule, marking character information that fails the verification, and associating character information that passes the verification with the logistics order.

2. The logistics order invoice registration and identification method according to claim 1, characterized in that acquiring the unregistered invoice picture and preprocessing the unregistered invoice picture to obtain the preprocessed picture specifically comprises:
acquiring an unregistered invoice picture, and denoising the unregistered invoice picture with a filtering algorithm to obtain a first picture;
processing the first picture with a histogram equalization method to enhance the contrast of the first picture and obtain a second picture;
processing the second picture with a threshold method to convert the second picture into a black-and-white binary image and obtain the preprocessed picture.

3. The logistics order invoice registration and identification method according to claim 1, characterized in that constructing the character recognition model and processing the preprocessed picture with the character recognition model to extract character information from the preprocessed picture specifically comprises:
acquiring an invoice sample image, and labeling characters in the invoice sample image to obtain training data;
taking a convolutional neural network model as a basic model, and training the basic model with a deep learning framework and the training data to obtain a preliminary model;
verifying the accuracy of the preliminary model, and continuously optimizing the preliminary model according to the verification result to obtain the character recognition model.

4. The logistics order invoice registration and identification method according to claim 1, characterized in that establishing the verification rule, verifying the character information according to the verification rule, marking character information that fails the verification, and associating character information that passes the verification with the logistics order specifically comprises:
acquiring the format specification and business rules of the invoice, and establishing the verification rule according to the format specification and business rules of the invoice;
verifying the character information according to the verification rule and a pre-constructed enterprise database to obtain a verification result;
if the verification result is that the character information does not conform to the verification rule or cannot be matched with the enterprise database, marking it as abnormal;
if the verification result is that the character information conforms to the verification rule and can be matched with the enterprise database, considering the verification passed, and associating the character information that passed the verification with the logistics order.

5. The logistics order invoice registration and identification method according to claim 4, characterized in that verifying the character information according to the verification rule and the pre-constructed enterprise database to obtain the verification result specifically comprises:
acquiring paper invoice images of different types, and constructing a definition judgment model according to the paper invoice images of different types;
judging whether the invoice is a paper invoice according to the character information, and if the invoice is a paper invoice, judging the unregistered invoice picture with the definition judgment model to obtain a judgment result;
if the judgment result is not passed, marking the corresponding unregistered invoice picture;
if the judgment result is passed, verifying the character information according to the verification rule and the pre-constructed enterprise database to obtain the verification result.

6. The logistics order invoice registration and identification method according to claim 5, characterized in that acquiring paper invoice images of different types and constructing the definition judgment model according to the paper invoice images of different types specifically comprises:
acquiring paper invoice images of different types, labeling whether the definition of the paper invoice images meets the requirements, and forming a training set from the paper invoice images and their labeling results;
taking a machine learning algorithm as a basic model, and training the basic model with the training set to obtain the definition judgment model;
formulating an automatic optimization iteration strategy, automatically updating the training set according to the automatic optimization iteration strategy, and optimizing and iterating the definition judgment model with the updated training set.

7. The logistics order invoice registration and identification method according to claim 5, characterized in that, if the judgment result is passed, verifying the character information according to the verification rule and the pre-constructed enterprise database to obtain the verification result specifically comprises:
The method for registering and identifying a logistics order invoice according to claim 5 is characterized in that, if the judgment result is passed, the character information is verified according to the verification rules and the pre-built enterprise database to obtain the verification result, specifically including: 若判断结果为通过,获取企业及其关联公司的关键信息,所述关键信息包括纳税人识别号、企业名称,以所述关键信息构建企业数据库;If the judgment result is passed, key information of the enterprise and its affiliated companies is obtained, the key information includes the taxpayer identification number and the enterprise name, and the enterprise database is constructed with the key information; 根据所述校验规则以及企业数据库对字符信息进行校验,所述字符信息包括:纳税人信息、开票日期、金额以及税率,将字符信息中的纳税人信息与企业数据库匹配,以获得第一校验结果;Verifying the character information according to the verification rule and the enterprise database, the character information including: taxpayer information, invoice date, amount and tax rate, matching the taxpayer information in the character information with the enterprise database to obtain a first verification result; 根据校验规则对字符信息中的开票日期、金额或税率进行校验,以获得第二校验结果。The invoicing date, amount or tax rate in the character information is verified according to the verification rule to obtain a second verification result. 8.一种物流订单发票登记识别装置,其特征在于,包括:8. 
A logistics order invoice registration and identification device, characterized by comprising: 预处理模块,用于获取未登记发票图片,对未登记发票图片进行预处理,以获得预处理图片;A preprocessing module, used for obtaining an unregistered invoice image, and preprocessing the unregistered invoice image to obtain a preprocessed image; 提取模块,用于构建字符识别模型,采用所述字符识别模型对预处理图片进行处理,以提取预处理图片中的字符信息;An extraction module, used to construct a character recognition model, and use the character recognition model to process the pre-processed image to extract character information in the pre-processed image; 校验模块,用于建立校验规则,根据所述校验规则对字符信息进行校验,对校验不通过的字符信息进行标记,将校验通过的字符信息与物流订单进行关联。The verification module is used to establish verification rules, verify the character information according to the verification rules, mark the character information that fails the verification, and associate the character information that passes the verification with the logistics order. 9.一种物流订单发票登记识别设备,其特征在于,包括存储器和至少一个处理器,所述存储器中存储有计算机可读指令;9. A logistics order invoice registration and identification device, characterized in that it includes a memory and at least one processor, wherein the memory stores computer-readable instructions; 所述至少一个处理器调用所述存储器中的所述计算机可读指令,以执行如权利要求1-7中任一项所述物流订单发票登记识别方法的各个步骤。The at least one processor calls the computer-readable instructions in the memory to execute the various steps of the logistics order invoice registration and identification method as described in any one of claims 1-7. 10.一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机可读指令,其特征在于,所述计算机可读指令被处理器执行时实现如权利要求1-7中任一项所述物流订单发票登记识别方法的各个步骤。10. A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement the various steps of the logistics order invoice registration and identification method as described in any one of claims 1 to 7.
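
As an illustration only, the verification flow of claims 4, 5 and 7 might be sketched in Python as below. The claims do not disclose concrete rules, so everything named here is an assumption: the field names, the `ALLOWED_TAX_RATES` values, the `clarity_ok` flag standing in for the output of the clarity judgment model, and the dictionary standing in for the enterprise database are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal

# Hypothetical list of acceptable tax rates; the claims do not specify any.
ALLOWED_TAX_RATES = {Decimal("0.00"), Decimal("0.06"), Decimal("0.09"), Decimal("0.13")}


@dataclass
class InvoiceFields:
    """Character information extracted from one invoice image (assumed layout)."""
    taxpayer_id: str
    taxpayer_name: str
    invoice_date: date
    amount: Decimal
    tax_rate: Decimal
    is_paper: bool = False


@dataclass
class VerificationResult:
    passed: bool
    reasons: list = field(default_factory=list)


def verify_invoice(fields, enterprise_db, today, clarity_ok=True):
    """Return a VerificationResult; a failed result means 'mark as abnormal'.

    enterprise_db maps taxpayer identification number -> enterprise name.
    clarity_ok stands in for the clarity judgment model's verdict.
    """
    # Clarity gate (claim 5): only paper invoices are subject to it.
    if fields.is_paper and not clarity_ok:
        return VerificationResult(False, ["image clarity check failed"])

    reasons = []

    # First verification (claim 7): match taxpayer info against the database.
    if enterprise_db.get(fields.taxpayer_id) != fields.taxpayer_name:
        reasons.append("taxpayer not matched in enterprise database")

    # Second verification (claim 7): rule checks on date, amount and tax rate.
    if fields.invoice_date > today:
        reasons.append("invoice date is in the future")
    if fields.amount <= 0:
        reasons.append("amount must be positive")
    if fields.tax_rate not in ALLOWED_TAX_RATES:
        reasons.append("tax rate not in the allowed set")

    return VerificationResult(not reasons, reasons)


def register(fields, order_id, result, associations, flagged):
    """Associate a verified invoice with its order, or flag it (claim 4)."""
    if result.passed:
        associations[order_id] = fields
    else:
        flagged.append((order_id, result.reasons))
```

A caller would run `verify_invoice` on each extracted record and pass the result to `register`, so that flagged invoices keep their failure reasons for manual review. Collecting every failed rule, rather than stopping at the first, is a design choice of this sketch and is not taken from the claims.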
CN202411674888.7A 2024-11-21 2024-11-21 Logistics order invoice registration and identification method, device, equipment and storage medium Pending CN119478964A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411674888.7A CN119478964A (en) 2024-11-21 2024-11-21 Logistics order invoice registration and identification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411674888.7A CN119478964A (en) 2024-11-21 2024-11-21 Logistics order invoice registration and identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN119478964A true CN119478964A (en) 2025-02-18

Family

ID=94567805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411674888.7A Pending CN119478964A (en) 2024-11-21 2024-11-21 Logistics order invoice registration and identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN119478964A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119785369A (en) * 2025-03-11 2025-04-08 陕西交通电子工程科技有限公司 Invoice information positioning and interception system and method based on image recognition

Similar Documents

Publication Publication Date Title
US11676185B2 (en) System and methods of an expense management system based upon business document analysis
US9342741B2 (en) Systems, methods and computer program products for determining document validity
US11436852B2 (en) Document information extraction for computer manipulation
CN112395996A (en) Financial bill OCR recognition and image processing method, system and readable storage medium
CN108717545B (en) Bill identification method and system based on mobile phone photographing
US8879846B2 (en) Systems, methods and computer program products for processing financial documents
CN111476109A (en) Bill processing method, bill processing apparatus, and computer-readable storage medium
CN103617415A (en) Device and method for automatically identifying invoice
CN109002768A (en) Medical bill class text extraction method based on the identification of neural network text detection
JP2015146075A (en) accounting data input support system, method, and program
CN111539414B (en) Method and system for character recognition and character correction of OCR (optical character recognition) image
CN112418812A (en) Distributed full-link automatic intelligent clearance system, method and storage medium
US20220292861A1 (en) Docket Analysis Methods and Systems
CN110610175A (en) OCR data mislabeling cleaning method
CN109840520A (en) A kind of invoice key message recognition methods and system
CN119478964A (en) Logistics order invoice registration and identification method, device, equipment and storage medium
JP2019191665A (en) Financial statements reading device, financial statements reading method and program
CN116612479A (en) A lightweight bill OCR recognition method and system
CN111768565B (en) Method for identifying and post-processing invoice codes in value-added tax invoices
CN120317999A (en) Account reporting method, device and electronic device based on OCR and Transformer decoder
CN111881880A (en) Bill text recognition method based on novel network
CN117831052A (en) Identification method and device for financial form, electronic equipment and storage medium
CN110503094A (en) Occupational certificate photo name plate identification method and device
TWI879700B (en) Methods for extracting OCR data of purchase and sales items or other documents
US20240419742A1 (en) Systems and methods for automated document ingestion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination