CN111209736A - Text file analysis method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN111209736A CN202010006856.5A
- Authority
- CN
- China
- Prior art keywords
- data
- text
- text file
- analyzed
- field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention relates to the technical field of the Internet and provides a text file parsing method and apparatus, a computer device, and a storage medium. The method comprises: acquiring a text file to be parsed, wherein the text file comprises text data to be parsed; parsing the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, wherein the mapping table represents the mapping relation between the text data to be parsed and the data object; and storing the data object in a database. Compared with the prior art, when data columns in the text format are added or removed, the code does not need to be modified; only the mapping table needs to be modified, which reduces maintenance cost and improves the compatibility of the text file parsing method.
Description
Technical Field
The invention relates to the technical field of internet, in particular to a text file parsing method and device, computer equipment and a storage medium.
Background
In many industries, data is often processed and transmitted in a text format and stored in a database in a structured form, and therefore, during use, the data in the text format is often required to be parsed into structured data or converted into the text format.
In the prior art, when data in a text format is analyzed, a mainstream open source function library is mainly used for reading and writing the data in the text format, and then the data is organized into a preset format.
Disclosure of Invention
The invention aims to provide a text file parsing method and apparatus, a computer device, and a storage medium with which, when data columns in the text format are added or removed, only the mapping table needs to be modified and the code does not, thereby reducing maintenance cost and improving the compatibility of the text file parsing method.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, this embodiment provides a text file parsing method applied to a computer device, where the method includes: acquiring a text file to be analyzed, wherein the text file comprises text data to be analyzed; analyzing the text data to be analyzed according to a mapping table to obtain a data object corresponding to the text data to be analyzed, wherein the mapping table is used for representing the mapping relation between the text data to be analyzed and the data object; and storing the data object into a database.
In a second aspect, the present embodiment provides a text file parsing apparatus, which is applied to a computer device, and includes an obtaining module, a parsing module, and a storage module, where the obtaining module is configured to obtain a text file to be parsed, where the text file includes text data to be parsed; the analysis module is used for analyzing the text data to be analyzed according to a mapping table to obtain a data object corresponding to the text data to be analyzed, wherein the mapping table is used for representing the mapping relation between the text data to be analyzed and the data object; and the storage module is used for storing the data object into the database.
In a third aspect, the present embodiment provides a computer device, including: one or more processors; memory storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement a text file parsing method as in any one of the preceding embodiments.
In a fourth aspect, the present embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the text file parsing method according to any one of the preceding embodiments.
Compared with the prior art, the embodiment of the invention provides a text file analysis method, a text file analysis device, computer equipment and a storage medium, wherein a mapping relation between text data to be analyzed and a data object is established, the text data to be analyzed is analyzed according to the mapping relation to obtain the data object corresponding to the text data to be analyzed, and finally the data object is stored in a database.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 shows a schematic diagram of the processing of a text file in the prior art, provided by an embodiment of the present invention.
Fig. 2 shows an exemplary diagram of data organization during the processing of a text file in the prior art, provided by an embodiment of the present invention.
Fig. 3 shows a flowchart of a text file parsing method according to an embodiment of the present invention.
Fig. 4 is a flowchart illustrating another text file parsing method according to an embodiment of the present invention.
Fig. 5 is a schematic diagram illustrating a text file parsing process provided by an embodiment of the present invention.
Fig. 6 is a flowchart illustrating another text file parsing method according to an embodiment of the present invention.
Fig. 7 is a flowchart illustrating another text file parsing method according to an embodiment of the present invention.
Fig. 8 shows a block diagram of a text file parsing apparatus according to an embodiment of the present invention.
Fig. 9 shows a block schematic diagram of a computer device provided by an embodiment of the present invention.
Reference numerals: 10-computer device; 11-memory; 12-communication interface; 13-processor; 14-bus; 100-text file parsing apparatus; 110-obtaining module; 120-parsing module; 130-storage module; 140-export module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. indicate an orientation or a positional relationship based on that shown in the drawings or that the product of the present invention is used as it is, this is only for convenience of description and simplification of the description, and it does not indicate or imply that the device or the element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
Generally, a text file is parsed in single-line mode: the data in the text file is read line by line, each line of data is organized according to a custom format to generate a set corresponding to that line, the code logic that writes the set into a database is implemented by the developer, and the data is finally written into the database. Referring to fig. 1, fig. 1 is a schematic diagram illustrating the processing of a text file in the prior art according to an embodiment of the present invention. The process of exporting data from a database to a text file is the inverse of this process and is not illustrated separately here.
It should be noted that a text file of the EXCEL type may either be read and parsed line by line in SAX (Simple API for XML) mode, or be read and parsed as a whole in user mode, which offers good encapsulation. SAX mode reads the file sequentially and can process a file of any size, but does not support random access to the file; user mode can access any data in the text file at will, but is not suitable for processing large text files.
Referring to fig. 2, fig. 2 is a diagram illustrating an example of data organization in the processing of a text file in the prior art according to an embodiment of the present invention. In this example the text file includes two rows of data, each with 4 columns. Each row is parsed to generate a corresponding linked list (LIST), the LISTs for the two rows are merged into a LIST set, a set of plain JAVA objects (POJO, Plain Old Java Object) corresponding to the LIST set is generated according to the mapping rule of the field name for each column, and the POJO set is then written into a database. When columns are added to or removed from the text file, the whole implementation has to be readjusted and the corresponding code modified to suit, so the maintenance cost is high and the compatibility is poor.
In addition, the whole process of generating the LISTs, merging them into a LIST set, and generating the POJO set involves a different preset interface function at each step. The user must know each preset interface function well in order to complete the parsing of the text file correctly, so the implementation is complex.
In view of these problems in the parsing process, embodiments of the present invention provide a text file parsing method and apparatus, a computer device, and a storage medium. When the mapping relation between the text data to be parsed and the data object changes because data columns in the text format are added or removed, the code does not need to be modified; only the mapping table needs to be modified, which reduces maintenance cost and improves the compatibility of the text file parsing method. This is described in detail below.
Referring to fig. 3, fig. 3 is a flowchart illustrating a text file parsing method according to an embodiment of the present invention, where the method includes:
Step S101, acquiring a text file to be parsed, wherein the text file comprises text data to be parsed.
In the present embodiment, the text file includes, but is not limited to, a TXT format file, a CSV format file, and an EXCEL format file.
Step S102, parsing the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, wherein the mapping table represents the mapping relation between the text data to be parsed and the data object.
In this embodiment, the data object is in a data format that can be stored in the database by directly calling a preset function, for example, when based on a JAVA structure, the data object may be a POJO object structure.
Step S103, storing the data object into a database.
According to the text file parsing method provided by the embodiment of the invention, the mapping relation between the text data to be parsed and the data object is abstracted into a mapping table. On the one hand, when this mapping relation changes because data columns in the text format are added or removed, the code does not need to be modified; only the mapping table needs to be modified, which reduces maintenance cost and improves the compatibility of the text file parsing method. On the other hand, the content to be read can be selected flexibly by field identifier or field number.
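As a minimal sketch of steps S101 to S103, assuming a delimited text line, a mapping table held as a map from column position to object attribute name, and an in-memory list standing in for the database, the flow might look as follows. The class name `MappingTableParser` and its methods are hypothetical and are not taken from the patent.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of steps S101 to S103; all names are illustrative. */
final class MappingTableParser {
    /** Mapping table (assumed form): 0-based column position -> object attribute name. */
    private final Map<Integer, String> columnToAttribute;

    MappingTableParser(Map<Integer, String> columnToAttribute) {
        this.columnToAttribute = columnToAttribute;
    }

    /** S102: turn one delimited line into a data object (here a simple attribute map). */
    Map<String, String> parseLine(String line, String delimiter) {
        // Pattern.quote treats the delimiter literally; -1 keeps trailing empty columns.
        String[] values = line.split(Pattern.quote(delimiter), -1);
        Map<String, String> dataObject = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> entry : columnToAttribute.entrySet()) {
            int column = entry.getKey();
            // Columns missing from the line are left unset rather than causing a failure.
            if (column >= 0 && column < values.length) {
                dataObject.put(entry.getValue(), values[column].trim());
            }
        }
        return dataObject;
    }

    /** S101 to S103: parse every non-empty line and add the result to a stand-in database. */
    List<Map<String, String>> parseAndStore(List<String> lines, String delimiter,
                                            List<Map<String, String>> database) {
        for (String line : lines) {
            if (!line.isEmpty()) {
                database.add(parseLine(line, delimiter));
            }
        }
        return database;
    }
}
```

In this sketch, reading a newly added column only requires one more entry in the map passed to the constructor; the parsing code itself is unchanged.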
Based on fig. 3, fig. 4 shows a flowchart of another text file parsing method provided by the embodiment of the present invention, and step S102 includes the following sub-steps:
Sub-step S1021, separating the field value corresponding to the field identifier from the text data to be parsed according to the mapping type.
In this embodiment, the text file may include a field identifier and a field value corresponding to the field identifier, for example, the EXCEL form is shown in table 1:
TABLE 1
Name | Age | Class |
Wang Ming | 8 | 3 |
Li Jian | 9 | 4 |
There are 3 field identifiers: name, age, and class. For the first row, the field value corresponding to name is Wang Ming, the field value corresponding to age is 8, and the field value corresponding to class is 3.
In this embodiment, the text file is read line by line. Under a JAVA architecture, the contents of the text file can be read with a mainstream open-source function library such as POI (an open-source library from the Apache Software Foundation that lets JAVA programs read and write Microsoft Office format files), the JAVA IO library (the IO interfaces provided by JAVA, covering file reading and writing, standard device output, and the like), or JavaCSV (a JAVA library for reading and writing CSV-format files).
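The line-by-line reading described here can be sketched with the standard JAVA IO classes alone. The sketch assumes comma-separated text whose first line carries the field identifiers and contains no quoted commas; the class `LineByLineReader` is hypothetical, and a real CSV or EXCEL file would more likely be read through one of the libraries named above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: read delimited text line by line and split out field values. */
final class LineByLineReader {
    /** Returns one map per data row, keyed by the field identifiers in the first line. */
    static List<Map<String, String>> readRows(Reader source) {
        List<Map<String, String>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                return rows; // empty input, nothing to parse
            }
            String[] fieldIds = headerLine.split(",", -1);
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] values = line.split(",", -1);
                Map<String, String> row = new LinkedHashMap<>();
                // Pair each field identifier with the value in the same column.
                for (int i = 0; i < fieldIds.length && i < values.length; i++) {
                    row.put(fieldIds[i].trim(), values[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

Because only one line is held at a time while reading, this style matches the sequential, single-line parsing described in this embodiment.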
In this embodiment, since the field value may be data wrongly written into the text file, in order to avoid that the wrongly written field value in the text file is continuously written into the database, the field value may be checked first, and only the field value that passes the check may be written into the database. Therefore, after separating the field value and before writing it into the database, the field value may be checked, and the checking step is shown in sub-step S1022 and sub-step S1023.
Sub-step S1022, when a target check rule corresponding to the field identifier exists in the check rules, determining whether the field value has already been checked.
In this embodiment, the check rules may be stored independently of the mapping table or stored in the mapping table. In a JAVA architecture, a common source of validation rules is JSR 303 (Bean Validation); Hibernate Validator implements all the built-in constraints of the JSR 303 specification and adds some constraints of its own.
In this embodiment, the data object further includes an object attribute, and the object attribute may be a representation of the field identifier in the database, for example, the field identifier is: name, and the corresponding object attribute is Name.
In this embodiment, the check rules include attribute check rules and object check rules. An attribute check rule is a rule for the field value corresponding to an object attribute and includes, but is not limited to, a non-null check, a dictionary value check, a numeric range check, and the like. An object check rule is a rule across data objects, for example that a certain field value must not repeat between two or more data objects; in the basic information table of the students of a class, for instance, no two students may have the same student number.
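The three attribute check rules named above can be sketched as plain predicates on the raw field value. This is only an illustration under the assumption that field values arrive as strings; the class `AttributeRules` is hypothetical, and a JSR 303 based implementation would express the same constraints as annotations instead.

```java
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical attribute check rules expressed as predicates on raw field values. */
final class AttributeRules {
    private AttributeRules() {
    }

    /** Non-null check: the value must be present and not blank. */
    static Predicate<String> nonEmpty() {
        return value -> value != null && !value.trim().isEmpty();
    }

    /** Dictionary value check: the value must be one of the allowed entries. */
    static Predicate<String> inDictionary(Set<String> allowed) {
        return value -> value != null && allowed.contains(value);
    }

    /** Numeric range check: the value must parse as an integer within [min, max]. */
    static Predicate<String> intInRange(int min, int max) {
        return value -> {
            if (value == null) {
                return false;
            }
            try {
                int number = Integer.parseInt(value.trim());
                return number >= min && number <= max;
            } catch (NumberFormatException e) {
                return false; // not a number, so the check fails
            }
        };
    }
}
```

Rules built this way can be combined with `Predicate.and`, for example `AttributeRules.nonEmpty().and(AttributeRules.intInRange(0, 150))` for an age column.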
In this embodiment, as a specific implementation of the object check rule, a uniqueness checker may be attached to each row to be checked and a hash code computed for each column, so that the object attributes of the data object to be checked are hash-encoded and the hash codes can be compared to determine whether the data object is a duplicate.
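One way to picture a hash-based uniqueness check is the sketch below. It is an assumption-laden illustration rather than the patented checker: data objects are modelled as attribute maps, the attributes that must be unique are passed in by name, and a `HashSet` does the comparison (it looks up by hash code first and then confirms with `equals`, so two different values that happen to share a hash code are not reported as duplicates). The class `UniquenessChecker` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical object check rule: selected attributes must not repeat across objects. */
final class UniquenessChecker {
    private final List<String> keyAttributes;
    /** Keys seen so far; HashSet compares by hashCode() first, then equals(). */
    private final Set<List<String>> seenKeys = new HashSet<>();

    UniquenessChecker(List<String> keyAttributes) {
        this.keyAttributes = new ArrayList<>(keyAttributes);
    }

    /** Returns true if no earlier data object had the same key attribute values. */
    boolean isUnique(Map<String, String> dataObject) {
        List<String> key = new ArrayList<>();
        for (String attribute : keyAttributes) {
            key.add(dataObject.get(attribute));
        }
        // Set.add returns false when an equal key is already present.
        return seenKeys.add(key);
    }
}
```

For the student table example, the checker would be created with the single key attribute holding the student number.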
In this embodiment, as a specific implementation, whether a field value has been checked may be determined as follows:
First, if the field value exists in the preset deduplication set corresponding to the field identifier, it is determined that the field value has been checked.
Second, if the field value does not exist in the preset deduplication set corresponding to the field identifier, it is determined that the field value has not been checked.
In this embodiment, each field identifier corresponds to a preset deduplication set, which ends up containing all the distinct field values of that field identifier. For example, suppose the field identifier is age, the text file has 4 rows of data, and the age values are 6, 7, 6, and 8. The preset deduplication set starts empty. The age in the first row, 6, is not in the set, so it is checked and, after passing the check, put into the set. The age in the second row, 7, is not in the set either, so it is checked and, after passing, put into the set, which is now (6, 7). The age in the third row, 6, is already in the set, so it is not checked again. The age in the fourth row, 8, is not in the set, so it is checked and, after passing, put into the set. The preset deduplication set corresponding to the field identifier then contains (6, 7, 8).
Sub-step S1023, when the field value has not been checked, checking the field value according to the target check rule.
In this embodiment, each field identifier may correspond to at least one rule, and the check rules may include a plurality of rules corresponding to a plurality of field identifiers. The rules corresponding to the field identifier of the field value currently being checked are the target check rule. For example, if field identifier A corresponds to rules a and b and field identifier B corresponds to rule c, the check rules are (A -> (a, b); B -> c), and for a field value of field identifier A the target check rule is (a, b).
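The lookup of a target check rule can be sketched as a table from field identifier to a list of rules. The class `CheckRuleTable` and its methods are hypothetical names chosen for illustration; rules are assumed to be predicates on the raw field value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical check rule table: field identifier -> rules for that identifier. */
final class CheckRuleTable {
    private final Map<String, List<Predicate<String>>> rulesByFieldId = new HashMap<>();

    /** Registers one more rule for a field identifier; returns this for chaining. */
    CheckRuleTable add(String fieldId, Predicate<String> rule) {
        rulesByFieldId.computeIfAbsent(fieldId, key -> new ArrayList<>()).add(rule);
        return this;
    }

    /** The target check rule for a field identifier; empty if none is registered. */
    List<Predicate<String>> targetRules(String fieldId) {
        return rulesByFieldId.getOrDefault(fieldId, Collections.emptyList());
    }

    /** A value passes when every rule registered for its field identifier accepts it. */
    boolean passesAll(String fieldId, String value) {
        for (Predicate<String> rule : targetRules(fieldId)) {
            if (!rule.test(value)) {
                return false;
            }
        }
        return true;
    }
}
```

A field identifier with no registered rule yields an empty list, so its values pass without any check, which corresponds to the case in which no target check rule exists.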
Sub-step S1024, when the field value has not previously been checked, putting the field value into the preset deduplication set corresponding to the field identifier after it has been checked.
In this embodiment, to ensure both the correctness of the text data to be parsed in the text file and the parsing efficiency, later occurrences of the same field value need not be checked again. Therefore, after a previously unchecked field value has been checked, the embodiment of the present invention also puts it into the preset deduplication set corresponding to its field identifier.
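Sub-steps S1022 to S1024 can be sketched together as a checker that keeps one deduplication set per field identifier. The sketch assumes one rule per field identifier and adds a value to the set only after it has passed, as in the age example above; the class `CachedFieldChecker` and the invocation counter are hypothetical additions for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch of sub-steps S1022 to S1024 with per-field deduplication sets. */
final class CachedFieldChecker {
    private final Map<String, Predicate<String>> ruleByFieldId;
    /** Preset deduplication sets: field identifier -> values that already passed. */
    private final Map<String, Set<String>> checkedValues = new HashMap<>();
    private int ruleInvocations = 0;

    CachedFieldChecker(Map<String, Predicate<String>> ruleByFieldId) {
        this.ruleByFieldId = ruleByFieldId;
    }

    /** Returns true if the value is acceptable for the given field identifier. */
    boolean check(String fieldId, String value) {
        Predicate<String> targetRule = ruleByFieldId.get(fieldId);
        if (targetRule == null) {
            return true; // no target check rule exists for this field identifier
        }
        Set<String> dedup = checkedValues.computeIfAbsent(fieldId, key -> new HashSet<>());
        if (dedup.contains(value)) {
            return true; // S1022: already checked and passed, skip the rule
        }
        ruleInvocations++;
        boolean passed = targetRule.test(value); // S1023: run the target check rule
        if (passed) {
            dedup.add(value); // S1024: remember the value for later rows
        }
        return passed;
    }

    /** Number of times a rule was actually executed (for illustration only). */
    int ruleInvocations() {
        return ruleInvocations;
    }
}
```

With the ages 6, 7, 6, and 8 from the example above, the rule is executed three times in this sketch, because the second 6 is found in the deduplication set.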
Sub-step S1025, generating an initial object according to the mapping relation, and assigning the field values to the object attributes of the initial object to obtain the data object corresponding to the text data to be parsed.
In this embodiment, for a field value that fails to be checked, the data of the line corresponding to the field value may be directly discarded, or the current text file parsing may be directly terminated.
In this embodiment, the mapping table includes a mapping type and the mapping relation between field identifiers and object attributes. The mapping type may specify parsing by the column number of the field identifiers or parsing by the name order of the field identifiers. In table 1, for example, the column numbers of name, age, and class are 1, 2, and 3 respectively. When parsing by column number, name, age, and class are parsed in that order; when parsing by name order (for example, a preset order of the first pinyin letter of each field), class, age, and name may be parsed in that order.
It should be noted that the mapping table may further include a format type of the text file to be parsed, for example, a TXT format, an EXCEL format, a CSV format, and the like.
In order to more clearly represent the parsing process of the present solution, fig. 5 shows a schematic diagram of a text file parsing process provided by an embodiment of the present invention. In fig. 5, the dashed line box corresponds to the processing procedure of the data organization in fig. 1, taking the above table 1 as an example, under the JAVA structure, the mapping table is shown in table 2:
TABLE 2
Wherein, the mapping type is 0, which means that the analysis is performed according to the identification number.
The POJO set corresponding to table 1 may be: { (Wang Ming, 8, 3); (Li Jian, 9, 4) }, and the corresponding POJO objects may be:
{
name: Wang Ming;
age: 8;
class: 3
}
and
{
name: Li Jian;
age: 9;
class: 4
}.
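The generation of an initial object and the assignment of its attributes (sub-step S1025) can be sketched with JAVA reflection. The sketch assumes the mapping relation is a map from field identifier to attribute name and that attributes are either strings or integers; the classes `ObjectAssembler` and `Student` are hypothetical, and the attribute is called `clazz` because `class` is a reserved word in JAVA.

```java
import java.lang.reflect.Field;
import java.util.Map;

/** Hypothetical sketch of sub-step S1025 using reflection. */
final class ObjectAssembler {
    /** Example data object; its field names play the role of object attributes. */
    static final class Student {
        String name;
        int age;
        String clazz; // "class" is a reserved word in Java
    }

    /**
     * Creates an empty object of the given type and fills the attributes named in
     * the mapping relation (field identifier -> attribute name) from the field values.
     */
    static <T> T assemble(Class<T> type, Map<String, String> fieldIdToAttribute,
                          Map<String, String> fieldValues) {
        try {
            T target = type.getDeclaredConstructor().newInstance(); // initial object
            for (Map.Entry<String, String> entry : fieldIdToAttribute.entrySet()) {
                String raw = fieldValues.get(entry.getKey());
                if (raw == null) {
                    continue; // no value for this field identifier in the current row
                }
                Field field = type.getDeclaredField(entry.getValue());
                field.setAccessible(true);
                if (field.getType() == int.class || field.getType() == Integer.class) {
                    field.set(target, Integer.parseInt(raw.trim()));
                } else {
                    field.set(target, raw);
                }
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Mapping relation does not match " + type.getName(), e);
        }
    }
}
```

Because the attribute names come from the mapping relation rather than from the code, adding a column in this sketch means adding a field to the data class and an entry to the map, without touching `assemble`.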
Compared with the prior art, the text file parsing method provided by the embodiment of the invention has the following effects. First, because a mapping table is used, the content to be read can be selected flexibly by field identifier or field number. Second, text files of different types can be handled in a uniform way, which reduces code duplication and further reduces maintenance cost. Third, the attribute check rules and the object check rules check the field values and the data objects respectively, which prevents text data wrongly written into the text file from being written into the database and affecting the system. Fourth, the preset deduplication set prevents the same field value from being checked repeatedly, so that the efficiency of parsing the text file is maintained while the reliability of the data written into the database is ensured.
In this embodiment, when the amount of data to be parsed in a text file is very large (the text file has too many rows or columns), reading all the data into memory and parsing it there occupies a large amount of memory and CPU resources. To reduce memory usage in this scenario, an embodiment of the present invention further provides another text file parsing method. Referring to fig. 6, fig. 6 shows a flowchart of another text file parsing method provided in the embodiment of the present invention, in which step S103 includes the following sub-steps:
Sub-step S1031, storing the data object in a temporary file.
In this embodiment, the temporary file may also be replaced with a cache database such as Redis, which is used to cache the data objects temporarily.
Sub-step S1032, reading a preset number of data objects from the temporary file in sequence and storing them into the database, until all the data objects in the temporary file have been stored into the database.
In this embodiment, the preset number is determined according to the size of the temporary file. For example, with the preset number set to 10,000 and an EXCEL file of 1,000,000 rows, 10,000 data objects (that is, 10,000 rows of data) are read at a time and put into the temporary file, the 10,000 data objects in the temporary file are stored in the database, and the next 10,000 are then processed, until all 1,000,000 rows of data have been stored in the database.
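Sub-steps S1031 and S1032 can be sketched with a temporary file and a fixed batch size. The sketch assumes each data object has already been serialized to one line of text and that the database write is represented by a callback; the class `BatchStore` and its parameters are hypothetical. For brevity the serialized objects are passed in as a list, whereas a memory-conscious implementation would append them to the temporary file as they are parsed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of sub-steps S1031 and S1032: stage in a file, store in batches. */
final class BatchStore {
    /** Writes the objects to a temporary file, then stores them batch by batch. */
    static int storeInBatches(List<String> serializedObjects, int batchSize,
                              Consumer<List<String>> databaseWriter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Path temp = null;
        int batches = 0;
        try {
            // S1031: store the data objects in a temporary file, one per line.
            temp = Files.createTempFile("data-objects", ".tmp");
            Files.write(temp, serializedObjects, StandardCharsets.UTF_8);
            // S1032: read a preset number of objects at a time and store them.
            try (BufferedReader reader = Files.newBufferedReader(temp, StandardCharsets.UTF_8)) {
                List<String> batch = new ArrayList<>();
                String line;
                while ((line = reader.readLine()) != null) {
                    batch.add(line);
                    if (batch.size() == batchSize) {
                        databaseWriter.accept(new ArrayList<>(batch));
                        batch.clear();
                        batches++;
                    }
                }
                if (!batch.isEmpty()) { // final, partially filled batch
                    databaseWriter.accept(new ArrayList<>(batch));
                    batches++;
                }
            }
            return batches;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (temp != null) {
                try {
                    Files.deleteIfExists(temp);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temporary file
                }
            }
        }
    }
}
```

Only one batch is held in memory during the second phase, which is the property the temporary file is introduced to provide.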
It should be noted that substep S1031 and substep S1032 in fig. 6 may be used in combination with fig. 4, instead of step S103 in fig. 4, or substeps S1021 to S1024 in fig. 4 may be used in combination with fig. 6, instead of step S102 in fig. 6.
According to the text file analysis method provided by the embodiment of the invention, the temporary files are introduced, the data objects are temporarily stored in batches, and then the temporary files are written into the database, so that the problem of excessive memory occupation in the scene that the text files are oversized files is avoided.
In this embodiment, besides writing a text file into a database, an application scenario may also require data in the database to be exported and written into a text file, which can be regarded as the reverse of the above process. An embodiment of the present invention therefore further provides another text file parsing method. Referring to fig. 7, fig. 7 shows a flowchart of another text file parsing method provided in this embodiment of the present invention, where the method includes the following steps:
step S201, reading a data object from a database.
Step S202, analyzing the data object according to the mapping table to generate corresponding text data.
In this embodiment, the field identifier corresponding to each object attribute of the data object is determined according to the mapping table, and the value of the object attribute is then written into the text file as the field value corresponding to that field identifier. The specific process is the inverse of the conversion from text file to data object described above; those skilled in the art can derive it from the method disclosed above without creative work, so the description is omitted here.
Step S203, writing the text data into the text file according to a preset format.
In this embodiment, the preset format may be, but is not limited to, a TXT format, an EXCEL format, a CSV format, and the like.
The text file parsing method provided by the embodiment of the invention can export data from the database and write it into a text file in a preset format, which makes the method more flexible in application.
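Steps S201 to S203 can be sketched as the reverse mapping from object attributes to field identifiers. The sketch assumes data objects are attribute maps already read from the database, that the mapping is an insertion-ordered map from attribute name to field identifier, and that the preset format is a delimited text line; the class `TextExporter` is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of steps S201 to S203: data objects back to delimited text. */
final class TextExporter {
    /**
     * Produces a header line of field identifiers followed by one line per data object.
     * The iteration order of the mapping decides the column order.
     */
    static List<String> export(List<Map<String, String>> dataObjects,
                               LinkedHashMap<String, String> attributeToFieldId,
                               String delimiter) {
        List<String> lines = new ArrayList<>();
        lines.add(String.join(delimiter, attributeToFieldId.values()));
        for (Map<String, String> dataObject : dataObjects) {
            List<String> cells = new ArrayList<>();
            for (String attribute : attributeToFieldId.keySet()) {
                String value = dataObject.get(attribute);
                cells.add(value == null ? "" : value); // missing attribute -> empty cell
            }
            lines.add(String.join(delimiter, cells));
        }
        return lines;
    }
}
```

The returned lines can then be written out with any writer for the chosen preset format; values that contain the delimiter would need quoting, which this sketch does not attempt.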
To execute the corresponding steps in the above embodiments and their possible implementations, an implementation of a text file parsing apparatus applied to a computer device is given below. Referring to fig. 8, fig. 8 shows a block diagram of a text file parsing apparatus 100 applied to a computer device according to an embodiment of the present invention. It should be noted that the basic principle and technical effect of the text file parsing apparatus 100 provided in this embodiment are the same as those of the above embodiments; for brevity, for anything not mentioned in this embodiment, reference may be made to the corresponding content in the above embodiments.
The text file parsing apparatus 100 includes an obtaining module 110, a parsing module 120, a storage module 130, and an export module 140.
The obtaining module 110 is configured to obtain a text file to be analyzed, where the text file includes text data to be analyzed.
The parsing module 120 is configured to parse the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, where the mapping table is used to represent a mapping relationship between the text data to be parsed and the data object.
Specifically, the text file further includes a field identifier and a field value corresponding to the field identifier, the data object further includes an object attribute, the mapping table includes a mapping type and a mapping relation between the field identifier and the object attribute, and the parsing module 120 is specifically configured to: separate the field value corresponding to the field identifier from the text data to be parsed according to the mapping type; and generate an initial object according to the mapping relation and assign the field values to the object attributes of the initial object to obtain the data object corresponding to the text data to be parsed.
Specifically, the mapping table further includes check rules. After the parsing module 120 separates the field value corresponding to the field identifier from the text data to be parsed according to the mapping type, when a target check rule corresponding to the field identifier exists in the check rules, the parsing module 120 determines whether the field value has been checked; when the field value has not been checked, the parsing module 120 checks the field value according to the target check rule.
Specifically, when determining whether the field value has been checked, the parsing module 120 is further configured as follows: if the field value exists in the preset deduplication set corresponding to the field identifier, the parsing module 120 determines that the field value has been checked; if the field value does not exist in the preset deduplication set corresponding to the field identifier, the parsing module 120 determines that the field value has not been checked.
Specifically, when the field value is not checked, the parsing module 120 is further configured to, after checking the field value according to the target checking rule: and putting the field value into a preset de-duplication set corresponding to the field identifier.
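The check-once behaviour described above can be sketched as follows. The rule table `CHECK_RULES`, the counter `rule_calls`, and the choice to raise an error on a failed check (and to record only values that pass) are assumptions made for illustration; the application does not specify them.

```python
import re
from collections import defaultdict

# Hypothetical check rules: field identifier -> predicate on the field value.
CHECK_RULES = {
    "AGE": lambda value: value.isdigit(),
    "EMAIL": lambda value: re.fullmatch(r"[^@\s]+@[^@\s]+", value) is not None,
}

# One deduplication set per field identifier, holding values already checked.
checked_values = defaultdict(set)
rule_calls = 0  # counts how often a rule actually runs (illustration only)


def check_field(identifier: str, value: str) -> None:
    """Run the target check rule at most once per distinct field value."""
    global rule_calls
    rule = CHECK_RULES.get(identifier)
    if rule is None:
        return  # no target check rule exists for this field identifier
    if value in checked_values[identifier]:
        return  # value is in the deduplication set: already checked
    rule_calls += 1
    if not rule(value):
        # Assumed failure handling; the source does not describe it.
        raise ValueError(f"{identifier}: invalid value {value!r}")
    checked_values[identifier].add(value)


for age in ["30", "30", "41", "30"]:
    check_field("AGE", age)
```

Because repeated values are common in columnar text data, the deduplication set lets the rule run once per distinct value rather than once per row.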
The storage module 130 is used for storing the data object into the database.
Specifically, the storage module 130 is further configured to: store the data objects into a temporary file; and sequentially read a preset number of data objects from the temporary file and store the read data objects into the database, until all the data objects in the temporary file have been stored into the database.
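A minimal sketch of this batched storage, assuming an SQLite database, JSON lines as the temporary file format, and a hypothetical `record` table with `name` and `age` columns (none of which are specified by the application):

```python
import json
import sqlite3
import tempfile
from itertools import islice

BATCH_SIZE = 2  # the "preset number"; the value is chosen only for illustration


def store_objects(objects, conn, batch_size=BATCH_SIZE):
    """Spool objects to a temporary file, then insert them batch by batch."""
    conn.execute("CREATE TABLE IF NOT EXISTS record (name TEXT, age TEXT)")
    batches = 0
    with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as tmp:
        for obj in objects:
            tmp.write(json.dumps(obj) + "\n")  # one data object per line
        tmp.seek(0)
        while True:
            # Read at most batch_size objects; the file keeps its position.
            batch = [json.loads(line) for line in islice(tmp, batch_size)]
            if not batch:
                break  # every object in the temporary file has been stored
            conn.executemany(
                "INSERT INTO record (name, age) VALUES (:name, :age)", batch
            )
            conn.commit()
            batches += 1
    return batches


conn = sqlite3.connect(":memory:")
batches = store_objects(
    [
        {"name": "Alice", "age": "30"},
        {"name": "Bob", "age": "41"},
        {"name": "Eve", "age": "25"},
    ],
    conn,
)
```

Writing in fixed-size batches bounds the memory used per insert, which is presumably the motivation for the temporary file.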
The export module 140 is configured to: read a data object from the database; parse the data object according to the mapping table to generate corresponding text data; and write the text data into a text file according to a preset format.
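The export direction can be sketched in the same spirit. The mapping table layout, the `record` table, and the "preset format" (a header line of field identifiers followed by one delimited line per data object) are assumptions for illustration; any writable text stream, such as one returned by `open(path, "w")`, could be passed as `out`.

```python
import io
import sqlite3

# Hypothetical mapping table, read here from object attribute back to field.
MAPPING = {"type": "csv", "fields": {"NAME": "name", "AGE": "age"}}
SEPARATORS = {"csv": ",", "tsv": "\t"}  # assumed mapping types


def export_objects(conn, mapping, out):
    """Read stored objects and write them as delimited text to `out`."""
    sep = SEPARATORS[mapping["type"]]
    identifiers = list(mapping["fields"])
    attributes = [mapping["fields"][ident] for ident in identifiers]
    out.write(sep.join(identifiers) + "\n")  # header line of field identifiers
    # Column names come from the trusted mapping table, not from user input.
    query = f"SELECT {', '.join(attributes)} FROM record ORDER BY rowid"
    count = 0
    for row in conn.execute(query):
        out.write(sep.join(str(value) for value in row) + "\n")
        count += 1
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (name TEXT, age TEXT)")
conn.executemany(
    "INSERT INTO record VALUES (?, ?)", [("Alice", "30"), ("Bob", "41")]
)
buffer = io.StringIO()
exported = export_objects(conn, MAPPING, buffer)
```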
Referring to Fig. 9, Fig. 9 is a block diagram of a computer device 10 according to an embodiment of the present invention. The computer device 10 may be a physical computer such as a host or a server, a host group composed of a plurality of hosts, or a server group composed of a plurality of servers; it may also be a virtual host, a virtual server, a virtual host group, or a virtual server group capable of realizing the same functions as the physical computer. The computer device 10 comprises a memory 11, a communication interface 12, a processor 13, and a bus 14. The memory 11, the communication interface 12, and the processor 13 are connected by the bus 14.
The memory 11 is used to store a program, such as the text file parsing apparatus 100 shown in Fig. 8. The text file parsing apparatus 100 includes at least one software functional module that can be stored in the memory 11 in the form of software or firmware. After receiving an execution instruction, the processor 13 executes the program to implement the text file parsing method disclosed in the above embodiments.
The memory 11 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. Alternatively, the memory 11 may be a storage device built into the processor 13, or a storage device independent of the processor 13.
The computer device 10 communicates with other external devices through at least one communication interface 12, which may be wired or wireless.
The bus 14 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus is represented in Fig. 9 by a single double-headed arrow, but this does not mean that there is only one bus or only one type of bus.
The processor 13 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits in hardware within the processor 13 or by instructions in the form of software. The processor 13 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In summary, embodiments of the present invention provide a text file parsing method, an apparatus, a computer device, and a storage medium, where the method includes: acquiring a text file to be parsed, wherein the text file comprises text data to be parsed; parsing the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, wherein the mapping table is used to represent the mapping relationship between the text data to be parsed and the data object; and storing the data object into a database. Compared with the prior art, when data columns in the text format are added or removed, the embodiments of the present invention do not require the code to be modified again; only the mapping table needs to be modified, which reduces the maintenance cost and improves the compatibility of the text file parsing method.
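The maintenance benefit can be illustrated with a small sketch in which a new column is picked up by editing the mapping table alone. The `parse` function, the dictionary-based data object, and the `CITY` column are hypothetical and serve only to illustrate the point.

```python
def parse(header, line, mapping, sep=","):
    """Build a dict-based data object using only the mapping table."""
    values = dict(zip(header.split(sep), line.split(sep)))
    return {
        attribute: values[identifier]
        for identifier, attribute in mapping.items()
        if identifier in values
    }


# Original mapping table, and a revised one after a CITY column is added.
mapping_v1 = {"NAME": "name", "AGE": "age"}
mapping_v2 = {**mapping_v1, "CITY": "city"}

header, line = "NAME,AGE,CITY", "Alice,30,Paris"
old = parse(header, line, mapping_v1)  # the new column is simply ignored
new = parse(header, line, mapping_v2)  # same code, extended mapping table
```

The `parse` function is identical in both calls; only the mapping table differs.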
The above description is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any change or substitution that can be easily conceived by those skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
Claims (11)
1. A text file parsing method, applied to a computer device, the method comprising:
acquiring a text file to be parsed, wherein the text file comprises text data to be parsed;
parsing the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, wherein the mapping table is used to represent the mapping relationship between the text data to be parsed and the data object;
and storing the data object into a database.
2. The text file parsing method according to claim 1, wherein the text file further includes a field identifier and a field value corresponding to the field identifier, the data object further includes an object attribute, the mapping table includes a mapping type and a mapping relationship between the field identifier and the object attribute, and the step of parsing the text data to be parsed according to the mapping table to obtain the data object corresponding to the text data to be parsed includes:
separating the field value corresponding to the field identifier from the text data to be parsed according to the mapping type;
and generating an initial object according to the mapping relationship, and assigning the field value to the object attribute of the initial object to obtain the data object corresponding to the text data to be parsed.
3. The text file parsing method according to claim 2, wherein the mapping table further includes check rules, and after the step of separating the field value corresponding to the field identifier from the text data to be parsed according to the mapping type, the method further includes:
when a target check rule corresponding to the field identifier exists among the check rules, determining whether the field value has been checked;
and when the field value has not been checked, checking the field value according to the target check rule.
4. The text file parsing method of claim 3, wherein the step of determining whether the field value is checked comprises:
if the field value exists in a preset deduplication set corresponding to the field identifier, judging that the field value is verified;
and if the field value does not exist in the preset deduplication set corresponding to the field identifier, judging that the field value is not verified.
5. The text file parsing method according to claim 4, wherein when the field value is not verified, after the step of verifying the field value according to the target verification rule, the method further comprises:
and putting the field value into a preset de-duplication set corresponding to the field identifier.
6. The text file parsing method according to any one of claims 3 to 5, wherein the check rule includes an attribute check rule and an object check rule.
7. The text file parsing method of claim 1, wherein the step of storing the data object into a database comprises:
storing the data object into a temporary file;
and sequentially reading a preset number of data objects from the temporary file, and storing the read preset number of data objects into the database until all data objects in the temporary file are stored into the database.
8. The text file parsing method of claim 1, the method further comprising:
reading a data object from the database;
analyzing the data object according to the mapping table to generate corresponding text data;
and writing the text data into a text file according to a preset format.
9. A text file parsing apparatus, applied to a computer device, the apparatus comprising:
an acquisition module, configured to acquire a text file to be parsed, wherein the text file comprises text data to be parsed;
a parsing module, configured to parse the text data to be parsed according to a mapping table to obtain a data object corresponding to the text data to be parsed, wherein the mapping table is used to represent the mapping relationship between the text data to be parsed and the data object;
and a storage module, configured to store the data object into a database.
10. A computer device, characterized in that the computer device comprises:
one or more processors;
memory storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the text file parsing method of any of claims 1-8.
11. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of parsing a text file according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010006856.5A CN111209736A (en) | 2020-01-03 | 2020-01-03 | Text file analysis method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111209736A (en) | 2020-05-29 |
Family
ID=70786628
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010006856.5A Pending CN111209736A (en) | 2020-01-03 | 2020-01-03 | Text file analysis method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111209736A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111949716A (en) * | 2020-08-11 | 2020-11-17 | 北京锐安科技有限公司 | Formatted data output field processing method, computer device and storage medium |
CN112632332A (en) * | 2021-01-04 | 2021-04-09 | 恩亿科(北京)数据科技有限公司 | Configurable verification method, system, equipment and storage medium for XML file |
CN112925749A (en) * | 2021-02-20 | 2021-06-08 | 北京同邦卓益科技有限公司 | Data processing method and device, electronic equipment and storage medium |
CN114090569A (en) * | 2020-09-14 | 2022-02-25 | 北京沃东天骏信息技术有限公司 | Method, apparatus, apparatus and computer readable medium for processing data |
CN114490848A (en) * | 2022-01-19 | 2022-05-13 | 北京明朝万达科技股份有限公司 | File analysis processing method and device, storage medium and electronic equipment |
WO2023277821A1 (en) * | 2021-07-01 | 2023-01-05 | Garena Online Private Limited | Platform to automate creation of serialised data objects for import into a game engine |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030208493A1 (en) * | 2002-04-12 | 2003-11-06 | Hall Bradley S. | Object relational database management system |
US20050086235A1 (en) * | 2003-10-17 | 2005-04-21 | International Business Machines Corporation | Configurable flat file data mapping to a datasbase |
US20070168380A1 (en) * | 2006-01-17 | 2007-07-19 | International Business Machines Corporation | System and method for storing text annotations with associated type information in a structured data store |
US20140082033A1 (en) * | 2012-09-14 | 2014-03-20 | Salesforce.Com, Inc. | Methods and systems for managing files in an on-demand system |
CN107145537A (en) * | 2017-04-21 | 2017-09-08 | 上海斐讯数据通信技术有限公司 | A kind of list data introduction method and system |
CN108009282A (en) * | 2017-12-22 | 2018-05-08 | 武汉楚鼎信息技术有限公司 | A kind of json data are synchronized to the method and system device of relevant database |
CN108984177A (en) * | 2018-06-21 | 2018-12-11 | 中国铁塔股份有限公司 | A kind of data processing method and system |
CN109299183A (en) * | 2018-11-20 | 2019-02-01 | 北京锐安科技有限公司 | A kind of data processing method, device, terminal device and storage medium |
CN109670053A (en) * | 2018-12-25 | 2019-04-23 | 北京锐安科技有限公司 | Data object mapping method, device, equipment and computer readable storage medium |
Non-Patent Citations (2)
Title |
---|
唐红梅,郑刚: "基于XML数据库的存储及映射研究" * |
陈静;何香玲;: "测震数据对象关系映射软件包设计" * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209736A (en) | Text file analysis method and device, computer equipment and storage medium | |
CN108427731B (en) | Page code processing method and device, terminal equipment and medium | |
CN110188135B (en) | File generation method and device | |
CN110990276A (en) | Automatic testing method and device for interface field and storage medium | |
CN111563218B (en) | Page repairing method and device | |
CN110795464B (en) | Method, device, terminal and storage medium for checking field of object marker data | |
CN110928802A (en) | Test method, device, equipment and storage medium based on automatic generation of case | |
CN114676040A (en) | Test coverage verification method and device and storage medium | |
CN113391972A (en) | Interface testing method and device | |
CN111273891A (en) | Business decision method and device based on rule engine and terminal equipment | |
JP2017174418A (en) | Data structure abstraction for model checking | |
CN112528307A (en) | Service request checking method and device, electronic equipment and storage medium | |
CN117493309A (en) | Standard model generation method, device, equipment and storage medium | |
CN110750440A (en) | Data testing method and terminal equipment | |
CN110888972A (en) | Sensitive content identification method and device based on Spark Streaming | |
CN114968725A (en) | Task dependency relationship correction method and device, computer equipment and storage medium | |
CN110633258A (en) | Log insertion method, device, computer device and storage medium | |
CN111078529B (en) | Client writing module testing method and device and electronic equipment | |
CN118606123A (en) | Test configuration file generation method, device, system, equipment and storage medium | |
CN109324838B (en) | Execution method and execution device of single chip microcomputer program and terminal | |
CN112632332A (en) | Configurable verification method, system, equipment and storage medium for XML file | |
CN112463633A (en) | Method, device, equipment and medium for checking address decoding of on-chip memory | |
CN112506783A (en) | Test method, test device and storage medium | |
CN115495082B (en) | TLV format data automatic conversion method and related equipment | |
CN114443101B (en) | System advanced audit strategy update method, system, terminal and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200529 |