CN110825771B - Batch data processing method, electronic device, computer equipment and storage medium - Google Patents

Info

Publication number
CN110825771B
Authority
CN
China
Prior art keywords
processing
unit modules
processed
unit
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910969164.8A
Other languages
Chinese (zh)
Other versions
CN110825771A (en)
Inventor
陈志超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN201910969164.8A
Publication of CN110825771A
Application granted
Publication of CN110825771B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24561 Intermediate data storage techniques for performance improvement
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a batch data processing method, an electronic device, computer equipment and a storage medium. An original data file loaded into a temporary database table is decomposed into unit modules of different granularities according to its data content; the unit modules are sent, according to their granularity, to the corresponding grid nodes of a batch processor for processing; the processing state of the unit modules is monitored; and a target data file is formed from the unit modules processed by the grid nodes and is output. Because the original data file is decomposed into unit modules of different granularities, with the coupling between them reduced as far as possible, the grid nodes of the batch processor can process their unit modules in parallel. Monitoring and displaying the processing state of each unit module lets a user see the processing state of the data at a glance and quickly locate data whose processing is abnormal, so that exceptions arising during batch processing can be located and resolved quickly, saving manpower.

Description

Batch data processing method, electronic device, computer equipment and storage medium
Technical Field
The present invention relates to the field of batch data processing technologies, and in particular, to a batch data processing method, an electronic device, a computer device, and a storage medium.
Background
As insurance business grows year by year, the core system generates more and more data, and a large amount of data awaits processing. At present, this data is processed and synchronized in batch mode. In daily operation and maintenance, however, batch processing is invisible: information such as processing progress and operation abnormalities is not exposed, so very little information is available for locating problems. This creates a heavy workload for operation and maintenance and greatly increases the difficulty of the work for operation and maintenance personnel.
Disclosure of Invention
In view of this, the present invention provides a batch data processing method, an electronic device, a computer device, and a storage medium, which allow a user to intuitively understand the processing state of data and to conveniently and rapidly locate data whose processing is abnormal.
First, to achieve the above object, the present invention provides a batch data processing method, which includes the steps of:
loading an original data file to be processed into a temporary database table;
decomposing the original data file loaded into the temporary database table into unit modules of different granularities according to the data content;
sending the unit modules to the corresponding grid nodes of a batch processor for processing according to the granularity of the unit modules;
monitoring the processing state of the unit modules; and
forming a target data file from the unit modules processed by the grid nodes and outputting the target data file.
Further, the processing state includes: processing exception, waiting for processing, in processing, and processing ended;
the step of monitoring the processing state of the unit modules further includes:
monitoring whether each unit module has been processed by a grid node and recording its processing state: a unit module that has not yet entered a grid node is marked as waiting for processing; a unit module that is being processed by a grid node is marked as in processing; a unit module that has been successfully processed by a grid node is marked as processing ended; and a unit module that the grid node cannot process, or whose processing result is abnormal, is marked as processing exception; and
setting the color of each unit module according to its processing state, so that the processing states of the unit modules are displayed in different colors.
Further, before the step of monitoring the processing state of the unit modules, the method further includes:
reading the processing state of the unit modules from a batch log generated by the operation of the batch processor.
Further, after the step of loading the original data file to be processed into the temporary database table, the method further includes:
counting the total amount A of the original data files to be processed;
recording the number B of the original data files loaded into the temporary database table; and
calculating the file loading progress from the ratio of the number B to the total amount A.
Further, after the step of sending the unit modules to the corresponding grid nodes of the batch processor, the method further includes:
counting the total amount C of unit modules to be processed by a grid node;
recording the number D of unit modules already processed by the grid node; and
calculating the processing progress from the ratio of the number D to the total amount C.
Further, after the step of calculating the processing progress from the ratio of the number D to the total amount C, the method further includes:
measuring the time the grid node spends processing unit modules.
Further, after the step of forming and outputting the target data file from the unit modules processed by the grid nodes, the method further includes:
counting the total amount E of the target data files;
calculating, from the unit modules whose state is processing exception, the total amount F of the corresponding original data files; and
checking whether the sum of the total amount E and the total amount F equals the total amount A, so as to determine whether data was lost during batch processing.
In addition, to achieve the above object, the present invention also provides an electronic device, including:
a loading module, configured to load an original data file to be processed into a temporary database table;
a decomposition module, configured to decompose the original data file loaded into the temporary database table into unit modules of different granularities according to the data content;
a sending module, configured to send the unit modules to the corresponding grid nodes of a batch processor for processing according to the granularity of the unit modules;
a monitoring module, configured to monitor the processing state of the unit modules; and
an output module, configured to form a target data file from the unit modules processed by the grid nodes and to output the target data file.
Further, the processing state includes: processing exception, waiting for processing, in processing, and processing ended. The monitoring module is further configured to monitor whether each unit module has been processed by a grid node and to record its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing; a unit module that is being processed by a grid node is recorded as in processing; a unit module that has been successfully processed by a grid node is recorded as processing ended; and a unit module that cannot be processed, or whose processing result is abnormal, is recorded as processing exception. The monitoring module also sets the color of each unit module according to its processing state, so that the processing states of the unit modules are displayed in different colors.
Further, the electronic device also comprises a reading module, configured to read the processing state of the unit modules from a batch log generated by the operation of the batch processor.
Further, the electronic device also comprises a statistics module, configured to count the total amount A of the original data files to be processed, to record the number B of the original data files loaded into the temporary database table, and to calculate the file loading progress from the ratio of the number B to the total amount A.
Further, the statistics module is further configured to count the total amount C of unit modules to be processed by a grid node, to record the number D of unit modules already processed by the grid node, and to calculate the processing progress from the ratio of the number D to the total amount C.
Further, the electronic device also comprises a time module, configured to measure the time the grid node spends processing unit modules.
Further, the statistics module is further configured to count the total amount E of the target data files, to calculate, from the unit modules whose state is processing exception, the total amount F of the corresponding original data files, and to check whether the sum of the total amount E and the total amount F equals the total amount A, so as to determine whether data was lost during batch processing.
To achieve the above object, the present invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
Compared with the prior art, the batch data processing method, electronic device, computer device and storage medium provided by the invention decompose the original data file to be processed into unit modules of different granularities and minimize the coupling between the unit modules, so that the grid nodes of the batch processor can process their unit modules in parallel. By monitoring and displaying the processing state of each unit module, a user can intuitively understand the processing state of the data and quickly locate abnormal data, so that exceptions arising during batch processing can be located and resolved quickly, saving manpower.
Drawings
FIG. 1 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 2 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 3 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 4 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 5 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 6 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 7 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 8 is a schematic diagram of the program modules of an electronic device according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic diagram of the hardware architecture of an electronic device according to an exemplary embodiment of the present invention.
Reference numerals:
Electronic device 20
Memory device 21
Processor 22
Network interface 23
Memory 24
Loading module 201
Decomposition module 202
Transmitting module 203
Monitoring module 204
Output module 205
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the terms "first", "second", etc. in this disclosure are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, that combination should be considered not to exist and falls outside the scope of protection claimed by the present invention.
Referring to FIG. 1, which shows a flow chart of a batch data processing method according to an exemplary embodiment of the invention, the method includes the following steps:
Step S110, loading an original data file to be processed into a temporary database table;
Step S120, decomposing the original data file loaded into the temporary database table into unit modules of different granularities according to the data content;
Step S130, sending the unit modules to the corresponding grid nodes of a batch processor for processing according to the granularity of the unit modules;
Step S140, monitoring the processing state of the unit modules; and
Step S150, forming a target data file from the unit modules processed by the grid nodes and outputting the target data file.
As insurance business grows year by year, the core system generates more and more data. This data must be uploaded to an application server of the target system, which applies its business logic through batch processing and then stores the results in a database.
In step S110, the original data file to be processed is loaded into the database temporary table. The temporary table is a table built in a temporary folder of the system, is a database object for temporarily storing temporary data (or intermediate data), and can perform various operations like a permanent table, but is automatically deleted when it is not used any more.
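The loading step can be illustrated with a minimal sketch. The patent does not name a database product, so SQLite (through Python's standard `sqlite3` module) is used here purely as an assumption; the table name `staging_policy`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

def load_to_temp_table(conn, rows):
    """Load raw rows into a temporary table and return how many rows it holds."""
    # A TEMP table exists only for the lifetime of this connection,
    # which matches the "automatically deleted when no longer used" behavior.
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staging_policy "
        "(policy_no TEXT, holder TEXT, term_years INTEGER)"
    )
    conn.executemany("INSERT INTO staging_policy VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM staging_policy").fetchone()[0]

# Hypothetical raw records standing in for the original data file.
raw_rows = [("P001", "holder-1", 20), ("P002", "holder-2", 10)]

conn = sqlite3.connect(":memory:")
loaded = load_to_temp_table(conn, raw_rows)
print(f"rows loaded: {loaded}")
conn.close()  # closing the connection drops the temporary table
```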
Part of the data generated by the core system (the original data to be processed) is selected for storage in the database, and an original data file to be processed is generated from it. For example, if policy data in the vehicle insurance underwriting system needs to be stored in the database, the original data to be processed is that policy data.
In step S120, the original data file loaded into the temporary database table is decomposed into unit modules of different granularities according to the data content. Take an insurance policy as the original data file. A policy must clearly and completely record the rights and obligations of both parties to the insurance; it mainly records the names of the insurer and the insured, the subject of insurance, the insured amount, the premium, the insurance period, the scope of liability for compensation or payment, and other agreed terms. The data content recorded in the policy is subdivided layer by layer into the smallest practical granularity, so that the coupling between the resulting unit modules is as low as possible. For example, for a policy whose premium is paid over 20 years, the installment-payment part can be identified by text recognition and subdivided by year (1st year, 2nd year, ..., 20th year) into 20 unit modules. The invention does not specifically limit the number of unit modules into which an original data file is decomposed; rather, according to the actual condition of the original data file, the coupling between the decomposed unit modules is reduced as far as possible, that is, the variables of any single unit module should influence the other unit modules as little as possible, apart from variables that cannot be separated out.
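As a rough illustration of the 20-year example, the sketch below splits one policy into one unit module per payment year. The `UnitModule` fields, the naming scheme, and the premium value are assumptions made for the example; the patent does not specify a data structure.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class UnitModule:
    name: str         # unique name, e.g. for later lookup in a batch log
    policy_no: str
    year: int         # payment year covered by this unit module
    premium: float

def decompose(policy_no: str, term_years: int, annual_premium: float) -> List[UnitModule]:
    """Split a policy into one independent unit module per payment year."""
    return [
        UnitModule(f"{policy_no}-Y{year:02d}", policy_no, year, annual_premium)
        for year in range(1, term_years + 1)
    ]

# Hypothetical policy paid over 20 years.
modules = decompose("P001", 20, 1200.0)
print(len(modules), modules[0].name, modules[-1].name)
```

Each module carries everything it needs, so no module depends on another; this is what allows the later parallel processing.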
In step S130, the unit modules are sent to the corresponding grid nodes of the batch processor for processing. Because the coupling between unit modules is low, the grid nodes can process their unit modules in parallel. The number of grid nodes is set automatically by the system according to the computing capacity of the batch processor for the current batch of original data files and the urgency with which the processing results are needed. For example, if an original data file X is decomposed into four unit modules a, b, c and d of different granularities, the batch processor may process them on grid nodes A, B, C and D respectively.
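The parallel dispatch can be sketched as follows. Worker threads from Python's `concurrent.futures` stand in for grid nodes A to D; this is an assumption for illustration only, since the patent does not describe how grid nodes are implemented, and `process_unit` is a hypothetical placeholder for the real business logic.

```python
from concurrent.futures import ThreadPoolExecutor

def process_unit(module_name: str) -> str:
    # Placeholder for the business logic a grid node would run.
    return f"{module_name}:done"

unit_modules = ["a", "b", "c", "d"]

# Four workers stand in for grid nodes A, B, C and D.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() runs the calls concurrently and returns results in input order.
    results = list(pool.map(process_unit, unit_modules))

print(results)
```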
In step S140, the processing state of the unit modules is monitored. While the batch processor is processing unit modules, their processing state can be monitored and displayed. The processing state may include: processing exception, waiting for processing, in processing, and processing ended.
As shown in FIG. 2, in an embodiment of the present invention, the step of monitoring the processing state of the unit modules may include the following steps:
Step S201, monitoring whether each unit module has been processed by a grid node and recording its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing; a unit module that is being processed by a grid node is recorded as in processing; a unit module that has been successfully processed by a grid node is recorded as processing ended; and a unit module that the grid node cannot process, or whose processing result is abnormal, is recorded as processing exception.
Step S202, setting the color of each unit module according to its processing state, so that the processing states of the unit modules are displayed in different colors.
The processing state of each unit module can be recorded from information such as whether the unit module has been processed by a grid node and what the processing result was. To let users check the current processing state at a glance, the unit modules can be shown in different colors. For example, a unit module that the grid node cannot process, or whose processing result is abnormal, is recorded as processing exception and can be shown in red; a unit module that has not yet entered a grid node is recorded as waiting for processing and can be shown in yellow; a unit module that is being processed by a grid node is recorded as in processing and can be shown in blue; and a unit module that has been successfully processed by a grid node is recorded as processing ended and can be shown in green. With these colors, a user can see the current processing state of the unit modules directly and can quickly pick out the unit modules with a processing exception from a large number of unit modules, which greatly speeds up troubleshooting when an exception occurs during batch processing.
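The state-to-color mapping described above can be sketched as a small lookup table. The colors follow the example in the text; the `State` enum, its member names, and the helper `abnormal_modules` are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Dict, List

class State(Enum):
    EXCEPTION = "processing exception"
    WAITING = "waiting for processing"
    IN_PROCESSING = "in processing"
    ENDED = "processing ended"

# Colors taken from the example in the text.
STATE_COLOR = {
    State.EXCEPTION: "red",
    State.WAITING: "yellow",
    State.IN_PROCESSING: "blue",
    State.ENDED: "green",
}

def abnormal_modules(states: Dict[str, State]) -> List[str]:
    """Return the names of unit modules that need attention, sorted by name."""
    return sorted(name for name, s in states.items() if s is State.EXCEPTION)

# Hypothetical snapshot of four unit modules.
snapshot = {
    "a": State.ENDED,
    "b": State.EXCEPTION,
    "c": State.IN_PROCESSING,
    "d": State.WAITING,
}
for name, state in snapshot.items():
    print(name, state.value, STATE_COLOR[state])
```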
As shown in FIG. 3, in an embodiment of the present invention, before the step of monitoring the processing state of the unit modules, the method may further include the following step:
Step S301, reading the processing state of the unit modules from a batch log generated by the operation of the batch processor.
When the original data file is decomposed into unit modules of different granularities, a distinct name is generated for each unit module. During batch processing, the batch processor generates a batch log in which each line records the unit module name, date, time, user, and action of the related operation. The actions include: processing exception, waiting for processing, in processing, and processing ended. The processing state of a unit module can therefore be read from the batch log.
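A minimal sketch of reading states from such a log follows. The patent lists the fields a log line records but not its layout, so the whitespace-separated format `date time user module action` and the single-token action codes used here are assumptions.

```python
from typing import Dict, Iterable

def latest_states(log_lines: Iterable[str]) -> Dict[str, str]:
    """Return the most recent action recorded for each unit module."""
    states: Dict[str, str] = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip lines that do not match the assumed layout
        _date, _time, _user, module, action = parts
        states[module] = action  # later lines overwrite earlier ones
    return states

# Hypothetical log excerpt in the assumed layout.
sample_log = [
    "2019-10-12 08:00:00 batch P001-Y01 WAITING",
    "2019-10-12 08:00:00 batch P001-Y02 WAITING",
    "2019-10-12 08:00:05 batch P001-Y01 IN_PROCESSING",
    "2019-10-12 08:00:09 batch P001-Y01 ENDED",
    "2019-10-12 08:00:10 batch P001-Y02 EXCEPTION",
    "malformed line",
]
print(latest_states(sample_log))
```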
The batch data processing method provided by the invention decomposes the original data file to be processed into unit modules of different granularities and reduces the coupling between the unit modules as far as possible, so that the grid nodes of the batch processor can process their unit modules in parallel. By monitoring and displaying the processing state of each unit module, a user can intuitively understand the processing state of the data, quickly locate data whose processing is abnormal, and quickly locate and resolve exceptions arising during batch processing, saving manpower.
To help the user intuitively understand the progress of batch processing, as shown in FIG. 4, in an embodiment of the present invention, after the step of loading the original data file to be processed into the temporary database table, the method may further include the following steps:
Step S401, counting the total amount A of the original data files to be processed;
Step S402, recording the number B of the original data files loaded into the temporary database table; and
Step S403, calculating the file loading progress from the ratio of the number B to the total amount A.
Abnormal data has already been removed from the original data to be processed. Abnormal data comprises data that does not meet the requirements and data that does not conform to the logic of the target system. An example of data that does not meet the requirements: a field is 8 characters long in the source system, but the corresponding field in the target database allows only 7 characters; such data cannot be stored and is filtered out. An example of data that does not conform to the target system logic: if only validated policy data is to be transferred to the target database, policy data that has not been validated is filtered out.
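The two filtering rules in the example can be sketched as a single predicate. The 7-character limit comes from the text; the record keys `code` and `validated` are hypothetical names chosen for illustration.

```python
MAX_FIELD_LEN = 7  # target database field length from the example

def is_loadable(record: dict) -> bool:
    """Keep a record only if it fits the target field and the policy is validated."""
    fits_target = len(record["code"]) <= MAX_FIELD_LEN
    return fits_target and record["validated"]

# Hypothetical source records.
records = [
    {"code": "ABC1234", "validated": True},    # 7 characters, validated: kept
    {"code": "ABC12345", "validated": True},   # 8 characters: cannot be stored
    {"code": "XYZ0001", "validated": False},   # not validated: filtered out
]
to_process = [r for r in records if is_loadable(r)]
print(len(to_process))
```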
The total amount A of the original data files to be processed is counted, the original data files are loaded into the temporary database table, the number B of files already loaded is recorded, and the file loading progress is calculated from the ratio of B to A. For example, when the total amount A of original data files is 10000 and the number B of original data files loaded into the temporary database table is 1000, file loading is 10% complete and the remaining 90% has not yet been loaded.
By calculating and displaying the file loading progress, the user can intuitively understand the progress of batch processing.
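The loading progress in the example above reduces to the ratio B / A expressed as a percentage. A minimal sketch, with the function name `loading_progress` chosen for illustration:

```python
def loading_progress(loaded_b: int, total_a: int) -> float:
    """Return the file loading progress B / A as a percentage."""
    if total_a <= 0:
        raise ValueError("total amount A must be positive")
    return 100.0 * loaded_b / total_a

# Figures from the example: A = 10000 files, B = 1000 loaded.
done = loading_progress(1000, 10000)
print(f"loaded {done:.0f}%, remaining {100 - done:.0f}%")
```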
As shown in FIG. 5, in an embodiment of the present invention, after the step of sending the unit modules to the corresponding grid nodes of the batch processor, the method may further include the following steps:
Step S501, counting the total amount C of unit modules to be processed by a grid node;
Step S502, recording the number D of unit modules already processed by the grid node; and
Step S503, calculating the processing progress from the ratio of the number D to the total amount C.
The original data file is decomposed into a number of small unit modules of different granularities, and each unit module is processed by its corresponding grid node, so the total amount C of unit modules to be processed by each grid node can be counted. The number D of unit modules already processed by the grid node is recorded at the same time, and the processing progress is calculated from the ratio of D to C. For example, if the total amount C of unit modules to be processed by grid node A is 10000 and the number D of processed unit modules is 1000, the processing progress is 10% and the remaining 90% is unprocessed. If the total amount C of unit modules to be processed by grid node B is 10000 and the number D of processed unit modules is 1500, the processing progress is 15% and the remaining 85% is unprocessed.
By calculating and displaying the processing progress of the unit modules, the user can intuitively understand the progress of batch processing.
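Because progress is tracked per grid node, a per-node snapshot is a natural representation. The sketch below reproduces the figures from the example for grid nodes A and B; the dictionary layout and the function name `node_progress` are assumptions.

```python
from typing import Dict, Tuple

def node_progress(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each grid node to its progress D / C as a percentage.

    counts maps a node name to (processed D, total C); nodes with
    no assigned unit modules (C == 0) are left out of the result.
    """
    return {node: 100.0 * d / c for node, (d, c) in counts.items() if c > 0}

# Figures from the example: node A has processed 1000 of 10000, node B 1500 of 10000.
progress = node_progress({"A": (1000, 10000), "B": (1500, 10000)})
for node, pct in progress.items():
    print(f"grid node {node}: {pct:.0f}% processed, {100 - pct:.0f}% remaining")
```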
As shown in FIG. 6, in an embodiment of the present invention, after the step of calculating the processing progress from the ratio of the number D to the total amount C, the method may further include the following step:
Step S601, measuring the time the grid node spends processing unit modules.
From the total amount C of unit modules to be processed by the grid node, the number D of unit modules processed so far, and the time spent processing those D unit modules, it is possible to calculate the time the grid node needs to process the remaining unit modules, the total time needed to process all of its unit modules, and so on.
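The patent does not give a formula for these estimates. One simple possibility, shown here as an assumption, is linear extrapolation: if D unit modules took `elapsed` seconds, the remaining C - D are assumed to proceed at the same average rate. The function name and the 60-second figure are illustrative.

```python
from typing import Tuple

def estimate_seconds(total_c: int, done_d: int, elapsed_s: float) -> Tuple[float, float]:
    """Return (estimated remaining seconds, estimated total seconds).

    Assumes the remaining unit modules are processed at the same
    average rate as the D unit modules processed so far.
    """
    if done_d <= 0:
        raise ValueError("at least one processed unit module is needed")
    remaining = elapsed_s * (total_c - done_d) / done_d
    return remaining, elapsed_s + remaining

# Hypothetical measurement: 1000 of 10000 unit modules took 60 seconds.
remaining_s, total_s = estimate_seconds(10000, 1000, 60.0)
print(remaining_s, total_s)
```

Real workloads are rarely uniform across unit modules of different granularities, so such an estimate is only a rough guide.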
The original data file is decomposed into a number of small unit modules of different granularities, and different unit modules are processed by different grid nodes. By calculating and displaying the processing time of the unit modules, the user can intuitively understand the progress of batch processing. Furthermore, when a grid node takes a long time to process its unit modules, the user can spot this quickly and intervene in time, for example by taking measures to shorten the processing time.
As shown in fig. 7, in an embodiment of the present invention, after the step of forming and outputting the target data file according to the unit modules processed by the grid node, the method may further include the following steps:
step S701, counting the total amount E of the target data file;
step S702, calculating the total amount F of the corresponding original data files according to the unit modules for processing the abnormality; and
step S703, determining whether the sum of the total amount E and the total amount F is equal to the total amount A, so as to judge whether data was lost during the batch processing.
The original data file is decomposed into a plurality of small unit modules of different granularities, and different unit modules are processed by different grid nodes. When all unit modules of an original data file have been processed by the grid nodes, the target data file can be composed from those processed unit modules, so the total amount E of target data files can be counted. There may of course be unit modules with processing exceptions, and several of the unit modules belonging to one original data file may have exceptions, so the total amount F of affected original data files must be counted from the unit modules with processing exceptions rather than taken as the number of such unit modules. A unit module with a processing exception also records detailed exception information, so that operation and maintenance staff can check whether manual intervention is needed. Under normal conditions, the total amount A of original data files at the database entry equals the sum of the total amount E of target data files and the total amount F of original data files with exceptions. If the sum of the total amount E and the total amount F is not equal to the total amount A, it can be concluded that data was lost during the batch processing.
To better illustrate the above-described batch data processing scheme, a specific explanation will be made below by way of one example.
Example:
100 original data files to be processed are loaded into the temporary database table, and the number of original data files already loaded is recorded in order to calculate the file loading progress. For example, when the number B of original data files loaded into the temporary database table is 10, the file loading progress is 10% and the remaining 90% is not yet loaded.
The original data files loaded into the temporary database table are decomposed into unit modules of different granularities. For convenience, the description below assumes that each original data file is decomposed into 4 unit modules: the 1st original data file is decomposed into unit modules a1, b1, c1 and d1, the 2nd original data file is decomposed into unit modules a2, b2, c2 and d2, and so on.
The 4 grid nodes in the batch processor process their corresponding unit modules in parallel: grid node A processes the 100 unit modules a1, a2, and so on; grid node B processes the 100 unit modules b1, b2, and so on; grid node C processes the 100 unit modules c1, c2, and so on; and grid node D processes the 100 unit modules d1, d2, and so on.
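The decomposition and dispatch in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the naming scheme (a granularity letter followed by the file index), the mapping table `NODE_FOR_GRANULARITY`, and the functions `decompose` and `dispatch` are hypothetical, and a real decomposition would depend on the data content rather than on a fixed list of four granularities.

```python
from collections import defaultdict

# Assumed for illustration: four granularities, one grid node per granularity.
GRANULARITIES = ("a", "b", "c", "d")
NODE_FOR_GRANULARITY = {"a": "A", "b": "B", "c": "C", "d": "D"}


def decompose(file_index: int) -> list[str]:
    """Split the file_index-th original data file into its unit modules."""
    return [f"{granularity}{file_index}" for granularity in GRANULARITIES]


def dispatch(file_count: int) -> dict[str, list[str]]:
    """Assign every unit module to the grid node that matches its granularity."""
    queues: dict[str, list[str]] = defaultdict(list)
    for file_index in range(1, file_count + 1):
        for unit in decompose(file_index):
            node = NODE_FOR_GRANULARITY[unit[0]]
            queues[node].append(unit)
    return dict(queues)


queues = dispatch(file_count=100)
for node, units in sorted(queues.items()):
    print(f"grid node {node}: {len(units)} unit modules, starting with {units[:2]}")
```

Because each grid node only receives unit modules of one granularity, the four queues can be worked through independently and in parallel.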
The batch processor generates a batch log during batch processing. The processing state of each unit module can be obtained from the batch log, and the color of the unit module is set according to that state, so that the processing states of the unit modules are displayed through different colors. For example, for grid node A, the unit module a1 whose processing has ended is set to green, the unit module a2 with a processing exception is set to red, the unit module a3 being processed is set to blue, and the unit modules a4 to a100 waiting for processing are set to yellow.
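A minimal Python sketch of the state-to-color mapping follows. The four states and the four colors are taken from the description; the enum `ProcessingState`, the table `STATE_COLORS`, and the dictionary of states are hypothetical stand-ins for whatever would actually be read from the batch log, whose format is not specified here.

```python
from enum import Enum


class ProcessingState(Enum):
    WAITING = "waiting for processing"
    IN_PROCESS = "in process"
    ENDED = "processing ended"
    EXCEPTION = "processing exception"


# Colors as used in the example for grid node A.
STATE_COLORS = {
    ProcessingState.ENDED: "green",
    ProcessingState.EXCEPTION: "red",
    ProcessingState.IN_PROCESS: "blue",
    ProcessingState.WAITING: "yellow",
}


def color_for(state: ProcessingState) -> str:
    """Return the display color for a processing state."""
    return STATE_COLORS[state]


# States of grid node A's unit modules at the moment described in the example.
states = {
    "a1": ProcessingState.ENDED,
    "a2": ProcessingState.EXCEPTION,
    "a3": ProcessingState.IN_PROCESS,
}
states.update({f"a{i}": ProcessingState.WAITING for i in range(4, 101)})

colors = {unit: color_for(state) for unit, state in states.items()}
print(colors["a1"], colors["a2"], colors["a3"], colors["a4"])
```

Keeping the mapping in a single table means that a display component only needs the state of a unit module to choose its color.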
The number of unit modules processed by each grid node is recorded in order to calculate the processing progress of each grid node. For example, if grid node A has processed 15 of its 100 unit modules after 15 minutes, its processing progress is 15% and the remaining 85% is unprocessed, and the remaining unit modules are expected to take a further 85 minutes.
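The 85-minute figure follows if the grid node is assumed to keep processing at the average rate observed so far. The Python sketch below makes that assumption explicit; the function name `estimate_remaining_minutes` is illustrative, and linear extrapolation is only one possible way of estimating the remaining time.

```python
def estimate_remaining_minutes(total: int, processed: int, elapsed_minutes: float) -> float:
    """Extrapolate the remaining time from the average time per processed unit module.

    Assumes a constant processing rate, which real workloads may not have.
    """
    if processed <= 0:
        raise ValueError("at least one unit module must have been processed")
    if processed > total:
        raise ValueError("processed cannot exceed total")
    minutes_per_unit = elapsed_minutes / processed
    return (total - processed) * minutes_per_unit


# Grid node A in the example: C = 100, D = 15 after 15 minutes.
remaining = estimate_remaining_minutes(total=100, processed=15, elapsed_minutes=15)
print(f"estimated remaining time: {remaining:.0f} minutes")
print(f"estimated total time: {15 + remaining:.0f} minutes")
```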
When every unit module decomposed from an original data file has been processed by its corresponding grid node, a target data file can be composed from the processed unit modules. For example, when unit modules a1, b1, c1 and d1 of the 1st original data file have each been processed by their corresponding grid nodes, the 1st target data file can be composed from the processed unit modules a1, b1, c1 and d1. At the end of the batch, a total of 91 target data files have been generated.
The unit modules with processing exceptions are: a2, a5, a10, b2, b6, b11, c3, c6, c12, d4 and d13. These unit modules correspond to 9 original data files: the 2nd, 3rd, 4th, 5th, 6th, 10th, 11th, 12th and 13th original data files.
Since the sum of the 91 target data files and the 9 original data files with processing exceptions is 100, which equals the number of original data files loaded, no data was lost during the batch processing.
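The counts in this example can be reproduced with the following Python sketch. It assumes, purely for illustration, that the file index can be read from the unit module name and that every unit module not listed as an exception finished normally; the helper `file_index` is hypothetical.

```python
GRANULARITIES = ("a", "b", "c", "d")
TOTAL_A = 100  # total amount A of original data files

# Unit modules with processing exceptions, as listed in the example.
exception_units = {"a2", "a5", "a10", "b2", "b6", "b11", "c3", "c6", "c12", "d4", "d13"}


def file_index(unit: str) -> int:
    """Return the index of the original data file a unit module belongs to."""
    return int(unit[1:])


# F: distinct original data files that contain at least one exception unit module.
exception_files = sorted({file_index(unit) for unit in exception_units})

# E: files whose unit modules all finished, so a target data file can be composed.
target_files = [
    index
    for index in range(1, TOTAL_A + 1)
    if all(f"{granularity}{index}" not in exception_units for granularity in GRANULARITIES)
]

total_e = len(target_files)
total_f = len(exception_files)
data_lost = total_e + total_f != TOTAL_A

print(f"E = {total_e}, F = {total_f}, A = {TOTAL_A}, data lost: {data_lost}")
```

Counting distinct files matters here: 11 unit modules have exceptions, but they belong to only 9 original data files, because the 2nd and 6th files each have two affected unit modules.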
The time each grid node spent processing all of its unit modules is obtained. For example, grid node A took 2 hours, grid node B took 2 hours and 10 minutes, grid node C took 2 hours and 30 minutes, and grid node D took 5 hours.
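With these times available, a grid node that takes unusually long can be flagged for the user's attention. The Python sketch below shows one possible rule; the comparison against the median and the factor of 1.5 are assumptions chosen for illustration, since the description leaves the criterion for "a long time" to the user.

```python
from statistics import median

# Processing times from the example, converted to minutes.
elapsed_minutes = {"A": 120, "B": 130, "C": 150, "D": 300}

SLOW_FACTOR = 1.5  # assumed threshold: 1.5 times the median processing time


def slow_nodes(times: dict[str, float], factor: float = SLOW_FACTOR) -> list[str]:
    """Return the grid nodes whose processing time exceeds factor * median."""
    typical = median(times.values())
    return sorted(node for node, minutes in times.items() if minutes > factor * typical)


flagged = slow_nodes(elapsed_minutes)
print(f"grid nodes needing attention: {flagged}")
```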
The invention further provides an electronic device. Referring to fig. 8, a schematic diagram of a program module of the electronic device 20 according to an exemplary embodiment of the invention is shown.
The electronic device 20 includes:
a loading module 201, configured to load an original data file to be processed into a temporary database table;
a decomposition module 202, configured to decompose the original data file loaded into the temporary database table into unit modules with different granularities according to the data content;
a sending module 203, configured to send the unit modules, according to their granularity, to the corresponding grid nodes of the batch processor for processing;
a monitoring module 204, configured to monitor the processing state of the unit modules; and
an output module 205, configured to compose a target data file from the unit modules processed by the grid nodes and to output the target data file.
The electronic device 20 provided by the invention decomposes the original data file to be processed into unit modules with different granularities and reduces the coupling among the unit modules as far as possible, so that the grid nodes of the batch processor can process their corresponding unit modules simultaneously and in parallel. By displaying the processing state of each unit module, it lets the user see the processing state of the data at a glance and locate abnormal data quickly, so that when an abnormal condition occurs during batch processing it can be located and resolved quickly, saving manpower.
Further, the processing state includes: processing exception, waiting for processing, in process, and processing ended. The monitoring module 204 is further configured to monitor whether each unit module is being processed by a grid node and to record its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing, a unit module being processed by a grid node is recorded as in process, a unit module that a grid node has processed successfully is recorded as processing ended, and a unit module that a grid node cannot process or whose processing result is abnormal is recorded as processing exception. The color of each unit module is set according to its processing state, so that the processing states of the unit modules are displayed through different colors.
Further, the electronic device 20 further includes a reading module for reading the processing status of the unit module from a batch log generated by the batch processor.
Further, the electronic device 20 further includes a statistics module configured to count the total amount A of original data files to be processed, record the number B of original data files loaded into the temporary database table, and calculate the file loading progress from the ratio of the number B to the total amount A.
Further, the statistics module is further configured to count the total amount C of unit modules to be processed by a grid node, record the number D of unit modules the grid node has processed, and calculate the processing progress from the ratio of the number D to the total amount C.
Further, the electronic device 20 further includes a time module configured to count the time the grid node spends processing unit modules.
Further, the statistics module is further configured to count a total amount E of the target data file; calculating the total quantity F of the corresponding original data files according to the unit modules for processing the exception; and calculating whether the sum of the total quantity E and the total quantity F is equal to the total quantity A so as to judge whether data are lost in the batch processing process.
To achieve the above object, the present invention also provides a computer device 20 comprising a memory 21, a processor 22 and a computer program stored on the memory 21 and executable on the processor 22, the processor 22 implementing the steps of the above method when executing the computer program. The computer program may be stored in the memory 21.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
The invention also provides a computer device capable of executing programs, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack-mounted server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster formed by a plurality of servers). The computer device of the present embodiment includes at least, but is not limited to, a memory and a processor that may be communicatively coupled to each other via a system bus.
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, or an App application store, on which a computer program is stored that performs the corresponding functions when executed by a processor. The computer-readable storage medium of the present embodiment is used to store the program of the electronic device 20, which, when executed by the processor 22, implements the batch data processing method of the present invention.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods of the embodiments of the present invention.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process derived from the contents disclosed herein, whether applied directly or indirectly in other related arts, is likewise intended to be covered.

Claims (7)

1. A method of batch data processing, the method comprising the steps of:
loading an original data file to be processed into a temporary database table;
decomposing the original data file loaded into the temporary database table into unit modules with different granularities according to the data content;
sending the unit modules, according to their granularity, to the corresponding grid nodes of a batch processor for processing;
monitoring the processing state of the unit modules; and
composing a target data file from the unit modules processed by the grid nodes and outputting the target data file;
wherein the step of decomposing the original data file loaded into the temporary database table into unit modules with different granularities according to the data content comprises:
subdividing the data content of the original data file loaded into the temporary database table layer by layer down to the smallest possible granularity to obtain unit modules with different granularities, so as to reduce the coupling among the decomposed unit modules as far as possible;
after the step of loading the original data file to be processed into the temporary database table, the method further comprises the following steps:
counting the total amount A of the original data files to be processed;
recording the number B of the original data files loaded into the temporary database table; and
calculating the file loading progress from the ratio of the number B to the total amount A;
wherein the processing state includes: processing exception, waiting for processing, in process, and processing ended;
the step of monitoring the processing state of the unit modules further includes:
monitoring whether the unit modules are processed by the grid nodes and recording their processing states, wherein the processing state of a unit module that has not entered a grid node for processing is recorded as waiting for processing, the processing state of a unit module being processed by a grid node is recorded as in process, the processing state of a unit module successfully processed by a grid node is recorded as processing ended, and the processing state of a unit module that a grid node cannot process or whose processing result is abnormal is recorded as processing exception;
setting the colors of the unit modules according to the processing states so as to display the processing states of the unit modules through different colors;
after the step of sending the unit modules to the corresponding grid nodes of the batch processor for processing, the method further comprises the following steps:
counting the total amount C of unit modules to be processed by the grid node;
recording the number D of unit modules processed by the grid node; and
calculating the processing progress from the ratio of the number D to the total amount C.
2. The batch data processing method of claim 1, wherein before the step of monitoring the processing state of the unit modules, the method further comprises:
reading the processing state of the unit modules from a batch log generated by the operation of the batch processor.
3. The batch data processing method according to claim 1, wherein after the step of calculating the processing progress from the ratio of the number D to the total amount C, the method further comprises:
counting the time the grid node spends processing unit modules.
4. The batch data processing method as claimed in claim 1, wherein after the step of composing the target data file from the unit modules processed by the grid nodes and outputting the target data file, the method further comprises:
counting the total amount E of the target data files;
calculating the total amount F of the corresponding original data files from the unit modules with processing exceptions; and
determining whether the sum of the total amount E and the total amount F is equal to the total amount A, so as to judge whether data was lost during the batch processing.
5. An electronic device, comprising:
a loading module, configured to load an original data file to be processed into a temporary database table;
a decomposition module, configured to decompose the original data file loaded into the temporary database table into unit modules with different granularities according to the data content;
a sending module, configured to send the unit modules, according to their granularity, to the corresponding grid nodes of a batch processor for processing;
a monitoring module, configured to monitor the processing state of the unit modules; and
an output module, configured to compose a target data file from the unit modules processed by the grid nodes and to output the target data file;
wherein the decomposition module is specifically configured to: subdivide the data content of the original data file loaded into the temporary database table layer by layer down to the smallest possible granularity to obtain unit modules with different granularities, so as to reduce the coupling among the decomposed unit modules as far as possible;
the electronic device further comprises a statistics module configured to: count the total amount A of the original data files to be processed; record the number B of the original data files loaded into the temporary database table; and calculate the file loading progress from the ratio of the number B to the total amount A;
wherein the processing state includes: processing exception, waiting for processing, in process, and processing ended;
the monitoring module is further configured to: monitor whether the unit modules are processed by the grid nodes and record their processing states, wherein the processing state of a unit module that has not entered a grid node for processing is recorded as waiting for processing, the processing state of a unit module being processed by a grid node is recorded as in process, the processing state of a unit module successfully processed by a grid node is recorded as processing ended, and the processing state of a unit module that a grid node cannot process or whose processing result is abnormal is recorded as processing exception; and set the colors of the unit modules according to the processing states so as to display the processing states of the unit modules through different colors;
the statistics module is further configured to: count the total amount C of unit modules to be processed by the grid node; record the number D of unit modules processed by the grid node; and calculate the processing progress from the ratio of the number D to the total amount C.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the batch data processing method of any one of claims 1 to 4 when the computer program is executed.
7. A computer-readable storage medium having stored thereon a computer program, characterized by: the computer program, when executed by a processor, implements the steps of the batch data processing method of any one of claims 1 to 4.
CN201910969164.8A 2019-10-12 2019-10-12 Batch data processing method, electronic device, computer equipment and storage medium Active CN110825771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910969164.8A CN110825771B (en) 2019-10-12 2019-10-12 Batch data processing method, electronic device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910969164.8A CN110825771B (en) 2019-10-12 2019-10-12 Batch data processing method, electronic device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110825771A CN110825771A (en) 2020-02-21
CN110825771B true CN110825771B (en) 2024-06-28

Family

ID=69549118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910969164.8A Active CN110825771B (en) 2019-10-12 2019-10-12 Batch data processing method, electronic device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110825771B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114398153A (en) * 2022-01-21 2022-04-26 平安科技(深圳)有限公司 Control method and system for virtual machine live migration, electronic device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101000562A (en) * 2006-12-30 2007-07-18 中国建设银行股份有限公司 Method and device for executing batch processing job
CN105740063A (en) * 2014-12-08 2016-07-06 杭州华为数字技术有限公司 Data processing method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050058374A (en) * 2002-08-20 2005-06-16 도쿄 일렉트론 가부시키가이샤 Method for processing data based on the data context
US9842000B2 (en) * 2015-09-18 2017-12-12 Salesforce.Com, Inc. Managing processing of long tail task sequences in a stream processing framework
US10691671B2 (en) * 2017-12-21 2020-06-23 Cisco Technology, Inc. Using persistent memory to enable consistent data for batch processing and streaming processing
CN108737170A (en) * 2018-05-09 2018-11-02 中国银行股份有限公司 A kind of batch daily record abnormal data alarm method and device
CN109669766A (en) * 2018-09-11 2019-04-23 深圳平安财富宝投资咨询有限公司 Processing method, device, equipment and the storage medium of batch processing job

Also Published As

Publication number Publication date
CN110825771A (en) 2020-02-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant