
CN112559817A - Report content checking method, system, computer equipment and storage medium - Google Patents

Report content checking method, system, computer equipment and storage medium

Info

Publication number
CN112559817A
Authority
CN
China
Prior art keywords
report
regular expression
content
data
regular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011266412.1A
Other languages
Chinese (zh)
Other versions
CN112559817B (en
Inventor
柴源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haichuanghui Technology Entrepreneurship Development Co ltd
Original Assignee
Beijing Chuangye Guangrong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chuangye Guangrong Information Technology Co ltd filed Critical Beijing Chuangye Guangrong Information Technology Co ltd
Priority to CN202011266412.1A priority Critical patent/CN112559817B/en
Publication of CN112559817A publication Critical patent/CN112559817A/en
Application granted granted Critical
Publication of CN112559817B publication Critical patent/CN112559817B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a report content verification method, which comprises the steps of determining key attention fields in a report according to all acquired data, designing a regular expression for each key attention field according to the type and the attribute of each key attention field, and establishing a regular expression set; importing N known correct reports, extracting a plurality of key data characteristics of the reports, correcting regular expressions according to the extracted key data characteristics, outputting the corrected regular expressions, and establishing a standard regular expression set; and importing a target report, checking the text content of the target report, and displaying an error or prompting an abnormal position when the content of the target report is not consistent with the regular expression, and highlighting the error or prompting the abnormal position. The invention also discloses a report content checking system, computer equipment and a storage medium. The invention improves the flexibility and accuracy of the verification.

Description

Report content checking method, system, computer equipment and storage medium
Technical Field
The invention relates to the technical field of computer software, in particular to a method and a system for checking report contents, computer equipment and a storage medium.
Background
Reports present data dynamically in tables, charts and other formats. Before computers were available, people recorded data with pen and paper, and errors generally arose either when entries were misread while being copied line by line into the summary sheet or when mistakes were missed during review. With the advent of the computer age, data came to be recorded by computer software; efficiency improved, but errors still occurred. To avoid or minimize these errors, verification is an important step in report preparation.
With the development of computer technology, the cloud era has arrived. In daily life, work and entertainment, more and more data is generated over networks, and mass data holds great value, so data can be aggregated and processed into reference indicators along different dimensions to form data reports that allow more intuitive analysis. For example, an enterprise can build financial reports that reflect its operating status and provide effective decision support. However, as data reports grow more complex and more numerous, verification becomes particularly important. Verification by hand alone entails a very large workload and requires substantial manpower and material resources, which not only wastes resources but also makes human error and the error rate difficult to control.
To improve verification efficiency and accuracy, various verification systems and platforms have appeared. For example, CN 201910602951.9 discloses a report data verification method in which a data verification rule is obtained according to a report data verification request, the report data corresponding to that rule is obtained, and the report data is then verified directly against the rule to produce a verification result. No manual processing is required, which improves the efficiency of report data verification. However, the verification rules of that method are fixed: once the layout of the report content changes, the rules must be rebuilt, so flexibility is limited and verification accuracy cannot be guaranteed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a report content verification method, which does not need to adjust and correct the verification rules for the same type of table for many times, improves the verification flexibility, can automatically integrate and verify the contents of the subsequently uploaded report, and improves the verification accuracy.
In order to solve the problems, the technical scheme adopted by the invention is as follows:
A report content verification method comprises
Step 1: traversing users and report types to obtain all data in the user reports; analyzing, from all the obtained data, the rules that govern the data required by a single item in a report and the data required for joint analysis of several items; determining the key focus fields in the report; designing a regular expression for each key focus field according to its type and attributes; and establishing a regular expression set;
Step 2: importing N known-correct reports; extracting a plurality of key data features from them; associating the key data features with the regular expressions according to the attribute information of the features; when a key data feature is inconsistent with its regular expression, correcting the regular expression according to the extracted feature and outputting the corrected regular expression; and correcting every regular expression in the same way to establish a standard regular expression set;
Step 3: importing a target report; extracting text content from it; matching the report content against each regular expression to verify the text content; and, when the target report content does not conform to a regular expression, displaying an error or indicating the abnormal position and highlighting it.
As a further and optional solution, the step 1 of establishing the regular expression set specifically includes,
s11, according to the type and the attribute of the focus field, combining at least one specific symbol for matching any character as a matching item of the regular expression, and sequencing the matching item to form at least one regular expression;
s12 takes each kind of data in the report content as a node, numbers each node to clarify the data type represented by the node, calculates the regular expression of each node in the table content according to the connection relation of each node and the number, and takes all the regular expressions as the character string codes of the data structure to construct a regular expression set.
As a further alternative, the single item of the present invention includes one of a row, a column, a header, and a grid position.
As a further and optional solution, the specific steps of verifying the text content described in the present invention include:
determining the matching degree of the text content;
determining a verification result for verifying the text content based on the matching degree;
judging whether the matching degree is greater than or equal to a preset matching degree;
if so, judging that the text content passes the verification;
if not, the text content check is judged not to pass.
As a further and optional scheme, the regular expression set described in the present invention includes a regular expression for a header, a regular expression for a column name, a regular expression for a row name, a regular expression for a lattice attribute, and a regular expression for a lattice content.
As a further and optional scheme, the key data features of the present invention are one or more of data format, text length, numerical size, decimal floating point, language, row numerical subtotal, and column numerical subtotal.
As a further and optional aspect, the method of the present invention further comprises
Step 4: providing a correction suggestion for the report under verification in response to the displayed error or indicated abnormality; specifically, retrieving the regular expression according to the position of the displayed error or indicated abnormality, taking from that regular expression the features corresponding to the erroneous or abnormal feature, and outputting those features.
The invention also provides a report content verification method, which comprises
Step 1: importing N known-correct reports; extracting a plurality of key data features from them; associating the key data features with automatically generated regular expressions according to the attribute information of the features; and establishing a set of key data features and regular expressions;
Step 2: importing new target reports in bulk and reporting errors in their contents: selecting a feature and comparing it with the corresponding key data feature, reporting an error if they do not match, retrieving regular expressions according to the feature for which the error was reported, and taking the regular expressions with a sufficiently high matching degree as the regular expressions for verification;
Step 3: importing a target report to be verified; extracting text content from it; matching the report content against each regular expression to verify the text content; and, when the target report content does not conform to a regular expression, displaying an error or indicating the abnormal position and highlighting it.
The invention also provides a report content verification system, which comprises
a report import module, used for importing a target report according to a user instruction;
an analysis and construction module, used for acquiring the data of the target report; analyzing, from all the acquired data and user instructions, the rules that govern the data required by a single item in the report and the data required for joint analysis of several items; determining the key focus fields in the report; designing a regular expression for each key focus field according to its type and attributes; and establishing a regular expression set;
a verification rule correction module, used for acquiring a plurality of key data features from known-correct reports; associating the key data features with the regular expressions according to the attribute information of the features and a user instruction; when a key data feature is inconsistent with its regular expression, correcting the regular expression according to the extracted feature and outputting the corrected regular expression; and correcting every regular expression in the same way to establish a standard regular expression set;
and a report verification processing module, used for verifying the content of the target report according to the regular expressions.
The invention also provides computer equipment which comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the steps of the report content checking method when executing the computer program.
The invention also provides a storage medium, wherein the storage medium is stored with a computer program, and the computer program realizes the steps of the report content verification method when being executed by a processor.
Compared with the prior art, the invention has the beneficial effects that:
the report content verification method provided by the invention verifies the target report content through the regular expression, corrects the regular expression through the key data characteristics, introduces the key data characteristics into the regular expression, does not need to re-edit the regular expression as long as the key data items corresponding to the report do not change when the layout of the report changes, and improves the flexibility of the verification rule. Because the key data characteristics come from the known and correct report contents, errors are not easy to occur in verification, and the verification accuracy is improved.
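As a rough illustration of correcting a regular expression from known-correct data, the sketch below derives digit bounds for a numeric field from sample cell texts. The function name `corrected_number_pattern` and the idea of expressing the correction as quantifier bounds are assumptions made for illustration; the source does not specify how the correction is computed.

```python
import re

def corrected_number_pattern(samples):
    """Build a numeric pattern whose digit bounds cover every known-correct sample.

    `samples` holds cell texts such as "12.50"; this helper is hypothetical.
    """
    # Longest integer part and longest fractional part seen in the samples.
    max_int = max(len(s.partition(".")[0]) for s in samples)
    max_dec = max(len(s.partition(".")[2]) for s in samples)
    if max_dec == 0:
        return re.compile(rf"[0-9]{{1,{max_int}}}")
    return re.compile(rf"[0-9]{{1,{max_int}}}(\.[0-9]{{1,{max_dec}}})?")

pattern = corrected_number_pattern(["12.50", "1234.5", "7"])
print(pattern.pattern)
```

Because the bounds depend only on the values of the field, not on where the field sits in the sheet, a layout change alone would not require the pattern to be rebuilt under this reading.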
The present invention will be described in further detail with reference to specific embodiments.
Detailed Description
The report content checking method comprises the following steps
Step 1: traversing the types of the user and the report to obtain all data in the user report, analyzing data required by a single item in the report and data required by the joint analysis of a plurality of items according to all the obtained data to form rules, determining key focus fields in the report, designing a regular expression for each key focus field according to the type and the attribute of the key focus field, establishing a regular expression set, specifically comprising the steps of,
s11, according to the type and the attribute of the focus field, combining at least one specific symbol for matching any character as a matching item of the regular expression, and sequencing the matching item to form at least one regular expression;
s12, taking each type of data in the report content as a node, numbering each node to clarify the data type represented by the node, calculating the regular expression of each node in the table content according to the connection relation and the number of each node, taking all the regular expressions as the character string codes of the data structure, and constructing a regular expression set;
step 2: importing N known correct reports (wherein the number of N is greater than or equal to 5), extracting a plurality of key data characteristics of the reports, associating the key data characteristics with the regular expression according to attribute information of the key data characteristics, correcting the regular expression according to the extracted key data characteristics when a certain key data characteristic is not consistent with the regular expression, and outputting the corrected regular expression; correcting each regular expression according to the same steps, and establishing a standard regular expression set;
Step 3: importing a target report (starting from the (N + 1)-th report); extracting text content from it; matching the report content against each regular expression to verify the text content; and, when the target report content does not conform to a regular expression, displaying an error or indicating the abnormal position and highlighting it; the specific steps of verifying the text content comprise:
determining the matching degree of the text content;
determining a verification result for verifying the text content based on the matching degree;
judging whether the matching degree is greater than or equal to a preset matching degree;
if so, judging that the text content passes the verification;
if not, the text content check is judged not to pass.
As a further and optional scheme, the regular expression set described in the present invention includes a regular expression for a header, a regular expression for a column name, a regular expression for a row name, a regular expression for a lattice attribute, and a regular expression for a lattice content.
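A minimal sketch of such a regular expression set, assuming Python's `re` module. The dictionary keys, the concrete patterns, and the helper `check_cell_value` are hypothetical; only the two-decimal format and the 0 to 100,000 value range are taken from Example 1.

```python
import re

# Hypothetical regular expression set; keys follow the field kinds named above,
# but every pattern here is an illustrative assumption.
REGEX_SET = {
    "header": re.compile(r".{1,50}"),                 # non-empty title, at most 50 characters
    "column_name": re.compile(r"\S.{0,29}"),          # starts with a non-space character
    "row_name": re.compile(r"\S.{0,29}"),
    "cell_content": re.compile(r"[0-9]+\.[0-9]{2}"),  # two digits after the decimal point
}

# Value range used in Example 1 (0 to 100,000).
VALUE_RANGE = (0.0, 100_000.0)

def check_cell_value(text):
    """Return True when the text has two decimals and lies inside VALUE_RANGE."""
    if REGEX_SET["cell_content"].fullmatch(text) is None:
        return False
    low, high = VALUE_RANGE
    return low <= float(text) <= high
```

The range check is kept outside the pattern because numeric ranges are awkward to express as regular expressions; whether the method itself encodes the range in the expression is not stated in the source.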
As a further and optional scheme, the key data features of the present invention are one or more of data format, text length, numerical size, decimal floating point, language, row numerical subtotal, and column numerical subtotal.
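The following sketch shows one way such key data features might be accumulated over the known-correct reports, as running maxima per cell plus row and column subtotals. All names (`CellFeatures`, `learn_cell`, `max_row_subtotal`, `max_col_subtotal`) are hypothetical, and counting decimal places from the characters after the decimal point in the cell text is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class CellFeatures:
    """Running key data features for one cell position (r, c)."""
    data_format: str = ""
    max_text_length: int = 0
    max_value: float = 0.0
    max_decimals: int = 0
    languages: set = field(default_factory=set)

def learn_cell(feat, text, data_format, language):
    """Update the features of a cell with one known-correct observation."""
    feat.data_format = data_format
    feat.languages.add(language)
    if data_format == "number":
        feat.max_value = max(feat.max_value, float(text))
        # Assumption: decimal places are the characters after the decimal point.
        feat.max_decimals = max(feat.max_decimals, len(text.partition(".")[2]))
    else:
        feat.max_text_length = max(feat.max_text_length, len(text))

def max_row_subtotal(reports, r):
    """Largest sum of row r across reports given as lists of numeric rows."""
    return max(sum(report[r]) for report in reports)

def max_col_subtotal(reports, c):
    """Largest sum of column c across reports given as lists of numeric rows."""
    return max(sum(row[c] for row in report) for report in reports)
```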
As a further and optional aspect, the method of the present invention further comprises
Step 4: providing a correction suggestion for the report under verification in response to the displayed error or indicated abnormality; specifically, retrieving the regular expression according to the position of the displayed error or indicated abnormality, taking from that regular expression the features corresponding to the erroneous or abnormal feature, and outputting those features.
The invention also provides a report content verification system, which comprises
a report import module, used for importing a target report according to a user instruction;
an analysis and construction module, used for acquiring the data of the target report; analyzing, from all the acquired data and user instructions, the rules that govern the data required by a single item in the report and the data required for joint analysis of several items; determining the key focus fields in the report; designing a regular expression for each key focus field according to its type and attributes; and establishing a regular expression set;
a verification rule correction module, used for acquiring a plurality of key data features from known-correct reports; associating the key data features with the regular expressions according to the attribute information of the features and a user instruction; when a key data feature is inconsistent with its regular expression, correcting the regular expression according to the extracted feature and outputting the corrected regular expression; and correcting every regular expression in the same way to establish a standard regular expression set;
and a report verification processing module, used for verifying the content of the target report according to the regular expressions.
The invention also provides computer equipment which comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the steps of the report content checking method when executing the computer program.
The invention also provides a storage medium, wherein the storage medium is stored with a computer program, and the computer program realizes the steps of the report content verification method when being executed by a processor.
Example 1
Taking an EXCEL template report as an example, the report content verification method comprises the steps of
Step 1: traversing users and report types to obtain all data in the user reports; analyzing, from all the obtained data, the rules that govern the data required by a single item in a report (such as a header, a column, a row, a cell position or cell content) and the data required for joint analysis of several items; determining the key focus fields in the report; designing a regular expression for each key focus field according to its type and attributes; and establishing a regular expression set, which specifically comprises the steps of,
s11, according to the type and the attribute of the focus field, combining at least one specific symbol for matching any character as a matching item of the regular expression, and sequencing the matching item to form at least one regular expression;
S12, taking each type of data in the report content as a node, numbering each node to make clear the data type it represents, calculating the regular expression of each node in the table content according to the connection relations and numbers of the nodes, and taking all the regular expressions as the string encoding of the data structure to construct the regular expression set; in this step, each type of data means data grouped by row, column, cell content, cell position and the like, and the established regular expressions include a regular expression for the header, for the name of a given column, for the name of a given row, for the attributes of a given cell, and for the value of a given cell, the last typically covering the format of the value (for example, two digits after the decimal point) and its range (for example, 0 to 100,000);
step 2: importing N known correct reports, extracting a plurality of key data characteristics of the reports, wherein the key data characteristics comprise data formats, text lengths, numerical values, decimal floating points, languages, row numerical values, column numerical values and the like, associating the key data characteristics with the regular expressions according to attribute information of the key data characteristics, and when a certain key data characteristic is not consistent with the regular expressions, correcting the regular expressions according to the extracted key data characteristics and outputting the corrected regular expressions; correcting each regular expression according to the same steps, and establishing a standard regular expression set; in this step, the specific method of extraction includes:
S21, extracting cell content: given a cell whose address is (r, c), i.e. row r and column c, (1) record the data format of the cell, such as date, currency or text; (2) if the cell is text, record the length of the longest text so far; (3) if the cell is a number, record the largest numerical value so far; (4) if the cell is a number, record the largest number of decimal floating point digits so far; (5) record the language used, such as Simplified Chinese, Traditional Chinese, British English or American English; the recorded data format, text length, numerical size, decimal floating point and language are taken as key data features;
S22, extracting row content: given row r, record the row numerical subtotal, namely, if the cells of the row are numbers, record the sum of all cell values in the row and the largest such sum so far; the maximum value is taken as a key data feature;
S23, extracting column content: given column c, record the column numerical subtotal, namely, if the cells of the column are numbers, record the sum of all cell values in the column and the largest such sum so far; the maximum value is taken as a key data feature;
Step 3: importing a target report; extracting text content from it; matching the report content against each regular expression to verify the text content; and, when the target report content does not conform to a regular expression, displaying an error or indicating the abnormal position and highlighting it; the specific steps of verifying the text content comprise:
determining the matching degree of the text content;
determining a verification result for verifying the text content based on the matching degree;
judging whether the matching degree is greater than or equal to a preset matching degree; the matching degree is calculated as follows: matching degree = similarity coefficient - data format difference coefficient - language difference coefficient;
wherein similarity coefficient = text length (x, y) / text length (r, c) + numerical size (x, y) / numerical size (r, c) + decimal floating point (x, y) / decimal floating point (r, c); here (x, y) refers to the features in the target report to be verified, and (r, c) refers to the features taken from the regular expression (the same below);
the data format difference coefficient is given by: (i) if the data format (r, c) differs from the data format (x, y), the coefficient takes the value 0.5; (ii) if the data format (r, c) is the same as the data format (x, y), the coefficient takes the value 0.0;
the language difference coefficient is given by: (i) if the language (r, c) does not contain the language (x, y), the coefficient takes the value 0.5; (ii) if the language (r, c) contains the language (x, y), the coefficient takes the value 0.0;
if the matching degree is greater than or equal to 2.0, it is considered to be greater than or equal to the preset matching degree, and the text content is judged to pass verification;
if the matching degree is less than 2.0, it is considered to be less than the preset matching degree, and the text content is judged not to pass verification.
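The matching degree calculation can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the two difference coefficients are assumed to be subtracted from the similarity coefficient (the only reading under which a format or language mismatch lowers the score), a zero reference feature is assumed to contribute 0.0, and the names `Features`, `matching_degree` and `passes` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Features:
    text_length: float
    value: float
    decimals: float
    data_format: str
    languages: frozenset

def _ratio(target, reference):
    # Assumption: a zero reference contributes 0.0; the source does not cover this case.
    return target / reference if reference else 0.0

def matching_degree(target, reference):
    """target plays the role of (x, y); reference plays the role of (r, c)."""
    similarity = (
        _ratio(target.text_length, reference.text_length)
        + _ratio(target.value, reference.value)
        + _ratio(target.decimals, reference.decimals)
    )
    format_diff = 0.0 if target.data_format == reference.data_format else 0.5
    # 0.0 when the reference languages contain every target language, else 0.5.
    language_diff = 0.0 if target.languages <= reference.languages else 0.5
    # Assumption: the difference coefficients are subtracted.
    return similarity - format_diff - language_diff

def passes(target, reference, threshold=2.0):
    """Verification passes when the matching degree reaches the preset threshold."""
    return matching_degree(target, reference) >= threshold
```

With identical features the similarity coefficient is 3.0, so a cell can differ in both format and language (3.0 - 0.5 - 0.5 = 2.0) and still sit exactly on the 2.0 threshold.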
Example 2
On the basis of the scheme of the embodiment 1, the method further comprises the following step
Step 4: providing a correction suggestion for the report under verification in response to the displayed error or indicated abnormality; specifically, retrieving the regular expression according to the position of the displayed error or indicated abnormality, taking from that regular expression the features corresponding to the erroneous or abnormal feature, and outputting those features. The user then corrects the table contents based on the output features.
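A small sketch of how a correction suggestion might be produced from the position of an error. The rule store `RULES`, its patterns and descriptions, and the function `suggest` are all hypothetical; the source only states that the regular expression is retrieved by position and that the corresponding features are output.

```python
import re

# Hypothetical store: cell position (row, column) -> (pattern, expected feature description).
RULES = {
    (2, 3): (re.compile(r"[0-9]+\.[0-9]{2}"), "number with exactly two decimal places"),
    (1, 1): (re.compile(r"[A-Za-z ]{1,20}"), "Latin text of at most 20 characters"),
}

def suggest(position, text):
    """Return a suggestion string for a failing cell, or None when nothing is wrong."""
    rule = RULES.get(position)
    if rule is None:
        return None  # no rule registered for this position
    pattern, description = rule
    if pattern.fullmatch(text):
        return None  # content conforms, nothing to suggest
    return f"Cell {position}: expected {description}, found {text!r}"

print(suggest((2, 3), "12.5"))
```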
Example 3
Taking an EXCEL template report as an example, the report content verification method comprises the steps of
Step 1: importing N known correct reports, extracting a plurality of key data characteristics of the reports, associating the key data characteristics with an automatically generated regular expression according to attribute information of the key data characteristics, and establishing a set of the key data characteristics and the regular expression; in this step, the specific method of extraction includes:
S11, extracting cell content: given a cell whose address is (r, c), i.e. row r and column c, (1) record the data format of the cell, such as date, currency or text; (2) if the cell is text, record the length of the longest text so far; (3) if the cell is a number, record the largest numerical value so far; (4) if the cell is a number, record the largest number of decimal floating point digits so far; (5) record the language used, such as Simplified Chinese, Traditional Chinese, British English or American English; the recorded data format, text length, numerical size, decimal floating point and language are taken as key data features;
S12, extracting row content: given row r, record the row numerical subtotal, namely, if the cells of the row are numbers, record the sum of all cell values in the row and the largest such sum so far; the maximum value is taken as a key data feature;
S13, extracting column content: given column c, record the column numerical subtotal, namely, if the cells of the column are numbers, record the sum of all cell values in the column and the largest such sum so far; the maximum value is taken as a key data feature;
Step 2: importing new target reports in bulk and reporting errors in their contents: selecting a feature and comparing it with the corresponding key data feature, reporting an error if they do not match, retrieving regular expressions according to the feature for which the error was reported, and taking the regular expressions with a sufficiently high matching degree (a matching degree greater than or equal to 2 is considered sufficiently high; the calculation formula is the same as in Example 1) as the regular expressions for verification; the specific steps of reporting errors in the table contents include
S21, given a cell in the table, (a) compare it with the extracted data format feature and, if they do not match, report "data format error"; (b) compare it with the extracted text length feature and, if the feature is less than 50% of the text length of the cell, report "text length is longer, suspected error"; (c) compare it with the extracted numerical size feature and, if the feature is less than 50% of the numerical value of the cell, report "numerical value is larger, suspected error"; (d) compare it with the extracted decimal floating point feature and, if the feature falls short of the decimal floating point of the cell by one digit or more, report "decimal floating point has more digits, suspected error"; (e) compare it with the extracted language feature and, if the feature does not contain the language of the cell, report "unseen language, suspected error";
S22, given a row in the table, (a) determine whether all cells of the row are numbers; (b) compare the row with the extracted row numerical subtotal feature and, if the feature is less than 50% of the row's subtotal, report "row subtotal is larger, suspected error";
S23, given a column in the table, (a) determine whether all cells of the column are numbers; (b) compare the column with the extracted column numerical subtotal feature and, if the feature is less than 50% of the column's subtotal, report "column subtotal is larger, suspected error";
and step 3: importing a target report to be verified, extracting text content from the target report, matching the report content according to each regular expression, verifying the text content, and displaying an error or prompting an abnormal position and highlighting when the target report content does not accord with the regular expression.
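The cell, row, and column checks of S21 to S23 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `CellFeatures` container, the helper names, the rough language guess, the use of a plain sum as the row or column value, and the `decimal_margin` parameter are all assumptions made for illustration, and the 50% ratio is taken from the text above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CellFeatures:
    """Hypothetical features extracted from known-correct reports."""
    format_pattern: str   # regex for the expected data format
    text_length: int      # text length feature
    value: float          # numerical size feature
    decimals: int         # decimal places feature
    languages: set = field(default_factory=set)


def guess_language(text):
    """Rough guess (assumption): 'zh' for CJK, 'en' for Latin letters, else None."""
    if re.search(r"[\u4e00-\u9fff]", text):
        return "zh"
    if re.search(r"[A-Za-z]", text):
        return "en"
    return None


def decimal_places(text):
    """Count digits after the decimal point of a plain decimal string."""
    match = re.fullmatch(r"-?\d+(?:\.(\d+))?", text)
    return len(match.group(1)) if match and match.group(1) else 0


def to_number(text):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def check_cell(text, feats, ratio=0.5, decimal_margin=1):
    """S21 (a)-(e): compare one cell with the extracted features."""
    errors = []
    if re.fullmatch(feats.format_pattern, text) is None:
        errors.append("data format error")
    if feats.text_length < ratio * len(text):
        errors.append("text length is longer, suspected error")
    number = to_number(text)
    if number is not None:
        if feats.value < ratio * abs(number):
            errors.append("value is larger, suspected error")
        # decimal_margin is an assumed reading of the "more than 1 digit" rule
        if decimal_places(text) - feats.decimals > decimal_margin:
            errors.append("decimal floating point has more digits, suspected error")
    language = guess_language(text)
    if language is not None and language not in feats.languages:
        errors.append("unseen language, suspected error")
    return errors


def check_line(cells, subtotal_feature, kind="row", ratio=0.5):
    """S22/S23: if all cells are numeric, compare their sum with the subtotal feature."""
    numbers = [to_number(c) for c in cells]
    if any(n is None for n in numbers):
        return []  # (a) not all cells are numbers, so the subtotal check is skipped
    if subtotal_feature < ratio * abs(sum(numbers)):
        return [f"{kind} value is larger, suspected error"]
    return []
```

Keeping each check as an independent comparison lets a single cell collect several messages at once, which matches the way S21 lists (a) to (e) as separate reports.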
Further, on the basis of embodiment 3 above, when an error is reported, a modification suggestion for the new form is proposed based on the regular expression described in the preceding steps.
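As a rough illustration of step 3 together with the modification suggestion, the following Python sketch matches each cell against a per-column regular expression, records the abnormal positions, and marks the offending values. The row layout (a list of dictionaries), the function names, the `>>…<<` marker, and the wording of the suggestion are hypothetical and are not taken from the patent.

```python
import re


def verify_report(rows, column_patterns):
    """Match every cell against its column's regex and collect mismatches."""
    findings = []
    for row_index, row in enumerate(rows):
        for column, pattern in column_patterns.items():
            value = row.get(column, "")
            if re.fullmatch(pattern, value) is None:
                findings.append({
                    "row": row_index,
                    "column": column,
                    "value": value,
                    # The suggestion simply echoes the rule the value failed.
                    "suggestion": f"expected a value matching {pattern}",
                })
    return findings


def highlight(rows, findings):
    """Return a copy of the rows with each abnormal value wrapped in >> <<."""
    marked = [dict(row) for row in rows]
    for finding in findings:
        marked[finding["row"]][finding["column"]] = f">>{finding['value']}<<"
    return marked


rows = [
    {"date": "2020-11-13", "amount": "125.50"},
    {"date": "13/11/2020", "amount": "12a"},
]
patterns = {
    "date": r"\d{4}-\d{2}-\d{2}",
    "amount": r"-?\d+(\.\d{1,2})?",
}
findings = verify_report(rows, patterns)
marked = highlight(rows, findings)
```

Here `re.fullmatch` is used so that a cell passes only when the whole value conforms to the expression, rather than merely containing a conforming substring.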
The above embodiments are only preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereby, and any insubstantial changes and substitutions made by those skilled in the art based on the present invention are within the protection scope of the present invention.

Claims (10)

1. A report content verification method, characterized by comprising:
Step 1: traversing users and report types to obtain all data in the user reports; according to all the obtained data, analyzing the composition rules of the data required for a single item in a report and of the data required for joint analysis of multiple items, determining the key fields of interest in the report, designing a regular expression for each key field according to its type and attributes, and establishing a regular expression set;
Step 2: importing N reports known to be correct, extracting multiple key data features of the reports, and associating the attribute information of the key data features with the above regular expressions; when a key data feature does not conform to a regular expression, correcting the regular expression according to the extracted key data feature and outputting the corrected regular expression; correcting each regular expression by the same steps to establish a standardized regular expression set;
Step 3: importing the target report to be verified, extracting text content from the target report, matching the report content against each regular expression, and verifying the text content; when the target report content does not conform to the regular expression, displaying an error or indicating the abnormal position, and highlighting it.

2. The report content verification method according to claim 1, characterized in that establishing the regular expression set in step 1 specifically comprises:
S11: according to the type and attributes of the key fields, combining at least one specific symbol for matching any character as a matching item of the regular expression, and sorting the matching items to form at least one regular expression;
S12: taking each category of data in the report content as a node, numbering each node to identify the data type the node represents, calculating the regular expression of each node in the table content according to the connection relationships among the nodes and the numbers, and using all the regular expressions as the structure string codes of that category of data to construct the regular expression set.

3. The report content verification method according to claim 1, characterized in that the specific steps of verifying the text content comprise:
determining the matching degree of the text content;
determining, based on the matching degree, the verification result of the text content;
judging whether the matching degree is greater than or equal to a preset matching degree;
if so, determining that the text content passes verification;
if not, determining that the text content fails verification.

4. The report content verification method according to any one of claims 1-4, characterized in that the regular expression set includes a regular expression for the table header, a regular expression for column names, a regular expression for row names, a regular expression for cell attributes, and a regular expression for cell content.

5. The report content verification method according to any one of claims 1-4, characterized in that, in step 2, the key data features are one, or two or more, of data format, text length, numerical size, decimal floating point, language, row-value subtotal, and column-value subtotal.

6. The report content verification method according to claim 1, characterized in that the method further comprises step 4: for the above case of a displayed error or indicated abnormality, proposing a correction suggestion for the report to be verified; specifically, retrieving the regular expression according to the position of the displayed error or indicated abnormality, extracting from the regular expression the partial feature corresponding to the feature for which the error was displayed or the abnormality indicated, and outputting that partial feature.

7. A report content verification method, characterized by comprising:
Step 1: importing N reports known to be correct, extracting multiple key data features of the reports, associating the attribute information of the key data features with automatically generated regular expressions, and establishing a set of key data features and regular expressions;
Step 2: importing new target reports in large batches and checking the form contents for errors: a feature is selected and compared with the above key data features, and an error is reported if they do not match; regular expressions are retrieved according to the features reported as erroneous, and a regular expression whose matching degree is high enough is taken as the regular expression used for verification;
Step 3: importing the target report to be verified, extracting text content from the target report, matching the report content against each regular expression, and verifying the text content; when the target report content does not conform to the regular expression, displaying an error or indicating the abnormal position, and highlighting it.

8. A report content verification system, characterized by comprising:
a report import module, used for importing a target report according to user instructions;
an analysis and construction module, used for obtaining the data of the target report and, according to all the obtained data and following user instructions, analyzing the composition rules of the data required for a single item in the report and of the data required for joint analysis of multiple items, determining the key fields of interest in the report, designing a regular expression for each key field according to its type and attributes, and establishing a regular expression set;
a verification rule correction module, used for obtaining multiple key data features from reports known to be correct and, according to user instructions, associating the attribute information of the key data features with the above regular expressions; when a key data feature does not conform to a regular expression, correcting the regular expression according to the extracted key data feature and outputting the corrected regular expression; correcting each regular expression by the same steps to establish a standardized regular expression set;
a report verification processing module, used for verifying the target report content according to the regular expressions.

9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-7.

10. A storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202011266412.1A 2020-11-13 2020-11-13 Report content verification method, system, computer equipment and storage medium Active CN112559817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011266412.1A CN112559817B (en) 2020-11-13 2020-11-13 Report content verification method, system, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011266412.1A CN112559817B (en) 2020-11-13 2020-11-13 Report content verification method, system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112559817A true CN112559817A (en) 2021-03-26
CN112559817B CN112559817B (en) 2024-11-29

Family

ID=75042132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011266412.1A Active CN112559817B (en) 2020-11-13 2020-11-13 Report content verification method, system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112559817B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682109A (en) * 2012-05-09 2012-09-19 北京彼速信息技术有限公司 Patent information analysis method and device
US20120271827A1 (en) * 2007-12-31 2012-10-25 Merz Christopher J Methods and systems for implementing approximate string matching within a database
CN107862258A (en) * 2017-10-24 2018-03-30 广东小天才科技有限公司 Method, device and equipment for checking text content in video and storage medium
CN108205732A (en) * 2017-12-26 2018-06-26 云南电网有限责任公司 A kind of method of calibration of the new energy prediction data access based on file
CN109726312A (en) * 2018-12-25 2019-05-07 广州虎牙信息科技有限公司 A kind of regular expression detection method, device, equipment and storage medium
CN109767177A (en) * 2018-12-20 2019-05-17 北京航空航天大学 A data processing and input method for public security traffic management business based on regular expressions

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065338A (en) * 2021-04-08 2021-07-02 银清科技有限公司 XML message recombination method and device
CN113065338B (en) * 2021-04-08 2024-06-04 银清科技有限公司 XML message reorganization method and device
CN114896955A (en) * 2022-05-26 2022-08-12 中国平安人寿保险股份有限公司 Data report processing method and device, computer equipment and storage medium
CN114896955B (en) * 2022-05-26 2025-04-25 中国平安人寿保险股份有限公司 Data report processing method, device, computer equipment and storage medium
CN115098382A (en) * 2022-06-29 2022-09-23 中国银行股份有限公司 Bank business parameter evaluation method, device and equipment
CN115907631A (en) * 2022-09-20 2023-04-04 用友网络科技股份有限公司 Data system control method, device, readable storage medium and data system
CN117371412A (en) * 2023-12-06 2024-01-09 信通院(江西)科技创新研究院有限公司 Form-based filling methods, systems, equipment and storage media
CN117371412B (en) * 2023-12-06 2024-03-12 信通院(江西)科技创新研究院有限公司 Form-based filling methods, systems, equipment and storage media
CN119227658A (en) * 2024-12-05 2024-12-31 水发大正科技服务有限公司 A business form intelligent processing system

Also Published As

Publication number Publication date
CN112559817B (en) 2024-11-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231012

Address after: 266000 floor 5, block B, building 1, No. 151, huizhiqiao Road, high tech Zone, Qingdao, Shandong

Applicant after: Haichuanghui Technology Entrepreneurship Development Co.,Ltd.

Address before: 100022 unit 02, 10 / F, building 108, building a 108, building B 108, building 110, building 112, building 116, building 118, building a 118, building B 118

Applicant before: Beijing Chuangye Guangrong Information Technology Co.,Ltd.

GR01 Patent grant