CN110825771B - Batch data processing method, electronic device, computer equipment and storage medium - Google Patents
- Publication number
- CN110825771B (application CN201910969164.8A)
- Authority
- CN
- China
- Prior art keywords
- processing
- unit modules
- processed
- unit
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24561—Intermediate data storage techniques for performance improvement
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Finance (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Accounting & Taxation (AREA)
- General Business, Economics & Management (AREA)
- Economics (AREA)
- Quality & Reliability (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- Development Economics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a batch data processing method, an electronic device, computer equipment and a storage medium. An original data file loaded into a database temporary table is decomposed into unit modules of different granularities according to its data content; the unit modules are sent, according to their granularity, to the corresponding grid nodes of a batch processor for processing; the processing state of the unit modules is monitored; and a target data file is formed from the unit modules processed by the grid nodes and is output. Because the original data file is decomposed into unit modules of different granularities with the coupling between them reduced as much as possible, the grid nodes of the batch processor can process their corresponding unit modules simultaneously and in parallel. Because the processing state of each unit module is monitored and displayed, a user can intuitively see the processing state of the data, quickly locate data whose processing is abnormal, and quickly locate and resolve abnormalities that occur during batch processing, which saves manpower.
Description
Technical Field
The present invention relates to the field of batch data processing technologies, and in particular, to a batch data processing method, an electronic device, a computer device, and a storage medium.
Background
As insurance business grows year by year, the core system generates more and more data, leaving a large amount of data to be processed. At present, this data is processed and synchronized in batch mode. In daily operation and maintenance, however, batch processing is invisible: information such as processing progress and operation abnormalities is not exposed, so very little information is available for locating problems. This creates a heavy workload for operation and maintenance and greatly increases the difficulty of the staff's work.
Disclosure of Invention
In view of this, the present invention provides a batch data processing method, an electronic device, a computer device, and a storage medium, which allow a user to intuitively see the processing state of data and to conveniently and quickly locate data whose processing is abnormal.
First, to achieve the above object, the present invention provides a batch data processing method comprising the steps of:
loading an original data file to be processed into a database temporary table;
decomposing the original data file loaded into the database temporary table into unit modules of different granularities according to its data content;
sending the unit modules, according to their granularity, to the corresponding grid nodes of a batch processor for processing;
monitoring the processing state of the unit modules; and
forming a target data file from the unit modules processed by the grid nodes, and outputting the target data file.
Further, the processing state is one of: processing exception, waiting for processing, in process, and processing ended;
the step of monitoring the processing state of the unit modules further includes:
monitoring whether each unit module has been processed by a grid node and recording its processing state, wherein a unit module that has not yet entered a grid node is marked as waiting for processing, a unit module currently being processed by a grid node is marked as in process, a unit module successfully processed by a grid node is marked as processing ended, and a unit module that the grid node cannot process or whose processing result is abnormal is marked as processing exception; and
setting the color of each unit module according to its processing state, so that the processing states of the unit modules are displayed in different colors.
Further, before the step of monitoring the processing state of the unit module, the method further includes:
reading the processing state of the unit modules from a batch log generated while the batch processor runs.
Further, after the step of loading the original data file to be processed into the temporary table of the database, the method further includes:
counting the total number A of original data files to be processed;
recording the number B of original data files loaded into the database temporary table; and
calculating the file loading progress from the ratio of the number B to the total number A.
Further, after the step of sending the unit modules to the grid nodes corresponding to the batch processor, the method further includes:
counting the total number C of unit modules to be processed by a grid node;
recording the number D of unit modules already processed by the grid node; and
calculating the processing progress from the ratio of the number D to the total number C.
Further, after the step of calculating the processing progress according to the proportional relation between the number D and the total amount C, the method further includes:
measuring the time the grid node takes to process the unit modules.
Further, after the step of forming and outputting the target data file according to the unit modules processed by the grid nodes, the method further comprises:
counting the total number E of target data files; determining, from the unit modules whose processing state is processing exception, the total number F of original data files they correspond to; and
checking whether the sum of the total number E and the total number F equals the total number A, so as to determine whether data was lost during batch processing.
In addition, to achieve the above object, the present invention also provides an electronic device, including:
a loading module for loading an original data file to be processed into a database temporary table;
a decomposition module for decomposing the original data file loaded into the database temporary table into unit modules of different granularities according to its data content;
a sending module for sending the unit modules, according to their granularity, to the corresponding grid nodes of a batch processor for processing;
a monitoring module for monitoring the processing state of the unit modules; and
an output module for forming a target data file from the unit modules processed by the grid nodes and outputting the target data file.
Further, the processing state is one of: processing exception, waiting for processing, in process, and processing ended. The monitoring module is further configured to monitor whether each unit module has been processed by a grid node and to record its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing, a unit module currently being processed by a grid node is recorded as in process, a unit module successfully processed by a grid node is recorded as processing ended, and a unit module that cannot be processed or whose processing result is abnormal is recorded as processing exception. The monitoring module also sets the color of each unit module according to its processing state, so that the processing states of the unit modules are displayed in different colors.
Further, the electronic device also comprises a reading module for reading the processing state of the unit module from a batch log generated by the operation of the batch processor.
Further, the electronic device further comprises a statistics module for counting the total amount A of the original data files to be processed; recording the number B of the original data files loaded to the temporary table of the database; and calculating the file loading progress according to the proportional relation between the quantity B and the total quantity A.
Further, the statistics module is further configured to count a total amount C of unit modules to be processed by the grid node; recording the number D of the unit modules processed by the grid node; and calculating the processing progress according to the proportional relation between the quantity D and the total quantity C.
Further, the electronic device further comprises a time module for measuring the time the grid node takes to process the unit modules.
Further, the statistics module is further configured to count the total number E of target data files; to determine, from the unit modules whose processing state is processing exception, the total number F of original data files they correspond to; and to check whether the sum of the total number E and the total number F equals the total number A, so as to determine whether data was lost during batch processing.
To achieve the above object, the present invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
Compared with the prior art, the batch data processing method, electronic device, computer equipment and storage medium provided by the invention decompose the original data file to be processed into unit modules of different granularities and minimize the coupling between the unit modules, so that the grid nodes of the batch processor can process their corresponding unit modules simultaneously and in parallel. By monitoring and displaying the processing state of each unit module, the invention lets a user intuitively see the processing state of the data, conveniently and quickly locate abnormal data, and quickly locate and resolve abnormalities that occur during batch processing, thereby saving manpower.
Drawings
FIG. 1 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 2 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 3 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 4 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 5 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 6 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 7 is a flow chart of a batch data processing method according to an exemplary embodiment of the present invention;
FIG. 8 is a schematic diagram of the program modules of an electronic device according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic diagram of the hardware architecture of an electronic device according to an exemplary embodiment of the present invention.
Reference numerals:
Electronic device | 20 |
Memory device | 21 |
Processor | 22 |
Network interface | 23 |
Memory | 24 |
Loading module | 201 |
Decomposition module | 202 |
Sending module | 203 |
Monitoring module | 204 |
Output module | 205 |
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that descriptions such as "first" and "second" in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, provided that the combination can be realized by those skilled in the art; where a combination is contradictory or cannot be realized, it should be considered not to exist and not to fall within the scope of protection claimed by the present invention.
Referring to FIG. 1, which shows a flow chart of a batch data processing method according to an exemplary embodiment of the invention, the method includes the following steps:
Step S110, loading an original data file to be processed into a database temporary table;
Step S120, decomposing the original data file loaded into the database temporary table into unit modules of different granularities according to its data content;
Step S130, sending the unit modules, according to their granularity, to the corresponding grid nodes of a batch processor for processing;
Step S140, monitoring the processing state of the unit modules; and
Step S150, forming a target data file from the unit modules processed by the grid nodes, and outputting the target data file.
As insurance business increases year by year, the core system generates more and more data. The data generated by the core system must be uploaded to an application server of the target system, which applies its processing logic in batch mode and then stores the results in a database.
In step S110, the original data file to be processed is loaded into the database temporary table. The temporary table is a table built in a temporary folder of the system, is a database object for temporarily storing temporary data (or intermediate data), and can perform various operations like a permanent table, but is automatically deleted when it is not used any more.
A portion of the data generated by the core system (the original data to be processed) is selected for storage in the database, and the original data file to be processed is generated from it. For example, if policy data in the vehicle insurance underwriting system needs to be stored in the database, the original data to be processed is that policy data.
In step S120, the original data file loaded into the database temporary table is decomposed into unit modules of different granularities according to its data content. Take an insurance policy as an example of an original data file. A policy must clearly and completely record the rights and obligations of both parties to the insurance; it mainly states the names of the insurer and the insured, the subject matter insured, the insured amount, the premium, the insurance period, the scope of liability for compensation or payment, and other agreed terms. The data content recorded in the policy is subdivided layer by layer to the smallest practical granularity, so as to reduce the coupling between the resulting unit modules as much as possible. The policy can therefore be decomposed into a number of small unit modules according to its recorded content. For example, for a policy with a 20-year payment term, the part concerning periodic payments can be identified by text recognition and subdivided into 20 unit modules, one each for the 1st year, the 2nd year, ..., and the 20th year. The invention does not limit the number of unit modules into which an original data file is decomposed. Instead, the coupling between the unit modules is reduced as far as the actual content of the original data file allows: the variables involved in one unit module should affect the other unit modules as little as possible, and should not be variables that cannot be separated out.
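The 20-year example can be sketched as follows. This is only an illustration under stated assumptions: the `UnitModule` class, the `decompose_policy` function, the policy identifier `P001` and the `<policy id>-year-<n>` naming scheme are all hypothetical, and the text-recognition step that finds the payment term is assumed to have already produced `payment_years`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UnitModule:
    """One independently processable slice of an original data file (hypothetical type)."""
    name: str       # hypothetical naming scheme: "<policy id>-year-<n>"
    policy_id: str
    year: int


def decompose_policy(policy_id: str, payment_years: int) -> list[UnitModule]:
    """Split the periodic-payment part of a policy into one unit module per year."""
    if payment_years < 1:
        raise ValueError("payment_years must be at least 1")
    return [
        UnitModule(name=f"{policy_id}-year-{year}", policy_id=policy_id, year=year)
        for year in range(1, payment_years + 1)
    ]


modules = decompose_policy("P001", 20)
print(len(modules), modules[0].name, modules[-1].name)  # 20 P001-year-1 P001-year-20
```

Each module carries only its own year, which is one way to keep the coupling between modules low: no module needs data held by another.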
In step S130, the unit modules are sent to the corresponding grid nodes of the batch processor for processing. Because the coupling between unit modules is low, the grid nodes can process their corresponding unit modules simultaneously and in parallel. The system automatically sets the number of grid nodes of the batch processor according to the computing capacity the batch processor has available for this batch of original data files and the urgency with which the processing results are needed. For example, if an original data file X is decomposed into 4 unit modules of different granularities, a, b, c and d, the batch processor may process unit modules a, b, c and d on grid nodes A, B, C and D respectively.
In step S140, the processing state of the unit modules is monitored. While the batch processor is processing unit modules, their processing state can be monitored and displayed. The processing state may be one of: processing exception, waiting for processing, in process, and processing ended.
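A minimal sketch of the parallel dispatch for file X, assuming that worker threads stand in for grid nodes. The patent does not say how grid nodes are implemented, so `process_unit_module` and the `assignments` mapping are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def process_unit_module(node: str, module: str) -> str:
    """Placeholder for the work a grid node performs on one unit module."""
    return f"{module} processed by grid node {node}"


# Original data file X decomposed into unit modules a, b, c and d,
# each routed to the grid node that handles its granularity.
assignments = {"A": "a", "B": "b", "C": "c", "D": "d"}

with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
    # Submit every module first so the four nodes run concurrently.
    futures = {
        node: pool.submit(process_unit_module, node, module)
        for node, module in assignments.items()
    }
    results = {node: future.result() for node, future in futures.items()}

print(results["A"])  # a processed by grid node A
```

Because the modules share no state, no locking is needed; that is the practical benefit of the low coupling described above.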
As shown in fig. 2, in an embodiment of the present invention, the step of monitoring the processing state of the unit module may include the steps of:
Step S201, monitoring whether each unit module has been processed by a grid node and recording its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing, a unit module currently being processed by a grid node is recorded as in process, a unit module successfully processed by a grid node is recorded as processing ended, and a unit module that the grid node cannot process or whose processing result is abnormal is recorded as processing exception.
Step S202, setting the color of the unit module according to the processing state so as to display the processing state of the unit module through different colors.
The processing state of each unit module can be recorded from information such as whether the unit module has been processed by a grid node and what the processing result was. So that users can see the current processing state of the unit modules at a glance, the unit modules can be shown in different colors according to their state. For example, a unit module that the grid node cannot process or whose processing result is abnormal is recorded as processing exception, which can be shown in red; a unit module that has not yet entered a grid node is recorded as waiting for processing, which can be shown in yellow; a unit module currently being processed by a grid node is recorded as in process, which can be shown in blue; and a unit module successfully processed by a grid node is recorded as processing ended, which can be shown in green. With different colors, a user can see the current processing state of the unit modules directly and can quickly pick out the unit modules in exception from among a large number of unit modules, which greatly speeds up troubleshooting when an abnormality occurs during batch processing.
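The state-to-color rule can be sketched as a small lookup. The four states and colors come from the text above; the `ProcessingState` enum, the `classify` function and its boolean inputs are hypothetical names chosen for illustration.

```python
from enum import Enum


class ProcessingState(Enum):
    WAITING = "waiting for processing"
    IN_PROCESS = "in process"
    ENDED = "processing ended"
    EXCEPTION = "processing exception"


# Colors as given in the example: red, yellow, blue, green.
STATE_COLORS = {
    ProcessingState.EXCEPTION: "red",
    ProcessingState.WAITING: "yellow",
    ProcessingState.IN_PROCESS: "blue",
    ProcessingState.ENDED: "green",
}


def classify(entered_node: bool, finished: bool, failed: bool) -> ProcessingState:
    """Derive a unit module's state from three observations (assumed inputs)."""
    if failed:                # cannot be processed, or the result is abnormal
        return ProcessingState.EXCEPTION
    if not entered_node:      # has not reached a grid node yet
        return ProcessingState.WAITING
    if not finished:          # a grid node is still working on it
        return ProcessingState.IN_PROCESS
    return ProcessingState.ENDED


state = classify(entered_node=True, finished=False, failed=False)
print(state.value, STATE_COLORS[state])  # in process blue
```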
As shown in fig. 3, in an embodiment of the present invention, before the step of monitoring the processing state of the unit module, the method may further include the steps of:
Step S301, reading the processing state of the unit modules from a batch log generated while the batch processor runs.
When the original data file is decomposed into unit modules of different granularities, a distinct name is generated for each unit module. The batch processor generates a batch log during batch processing; each log row records the unit module's name together with the date, time, user, and action of the related operation. The actions are: processing exception, waiting for processing, in process, and processing ended. The processing state of each unit module can therefore be read from the batch log.
The batch data processing method provided by the invention decomposes the original data file to be processed into unit modules of different granularities and reduces the coupling between the unit modules as much as possible, so that the grid nodes of the batch processor can process their corresponding unit modules simultaneously and in parallel. By monitoring and displaying the processing state of each unit module, the method lets a user intuitively see the processing state of the data, conveniently and quickly locate data whose processing is abnormal, and quickly locate and resolve abnormalities in batch processing, thereby saving manpower.
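Reading states from such a log might look like the sketch below. The patent names the fields of a log row but not its layout, so the comma-separated format, the `latest_states` function and every value in `sample_log` are assumptions made up for illustration.

```python
def latest_states(log_lines):
    """Return the most recent action logged for each unit module.

    Assumes one row per operation in the hypothetical layout
    "name,date,time,user,action", ordered oldest to newest.
    """
    states = {}
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        name, _date, _time, _user, action = [f.strip() for f in line.split(",", 4)]
        states[name] = action  # later rows overwrite earlier ones
    return states


# Made-up sample rows; not taken from any real batch log.
sample_log = [
    "a1,2019-10-12,09:00:00,batch,waiting for processing",
    "a1,2019-10-12,09:00:05,batch,in process",
    "a2,2019-10-12,09:00:05,batch,processing exception",
    "a1,2019-10-12,09:00:09,batch,processing ended",
]
print(latest_states(sample_log))
# {'a1': 'processing ended', 'a2': 'processing exception'}
```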
In order to facilitate the user to intuitively understand the progress of the batch process, as shown in fig. 4, in an embodiment of the present invention, after the step of loading the raw data file to be processed into the temporary table of the database, the method may further include the following steps:
Step S401, counting the total number A of original data files to be processed;
Step S402, recording the number B of original data files loaded into the database temporary table; and
Step S403, calculating the file loading progress from the ratio of the number B to the total number A.
Abnormal data has already been removed from the original data to be processed. Abnormal data comprises data that does not meet requirements and data that does not conform to the logic of the target system. An example of data that does not meet requirements: a field is 8 characters long in the source system but the corresponding field in the target database is 7 characters long; such data cannot be stored and is filtered out. An example of data that does not conform to the target system logic: if only validated policy data needs to be transferred to the target database, policy data that has not been validated is filtered out.
The total number A of original data files to be processed is counted, the files are loaded into the database temporary table, the number B of files loaded so far is recorded, and the file loading progress is calculated from the ratio of B to A. For example, when the total number A of original data files is 10000 and the number B of original data files loaded into the database temporary table is 1000, file loading is 10% complete and the remaining 90% has not been loaded.
By calculating and displaying the file loading progress, the method lets the user intuitively follow the progress of batch processing.
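The loading-progress calculation is a simple ratio. A minimal sketch, with `progress_percent` as a hypothetical helper name and input validation added as an assumption:

```python
def progress_percent(done: int, total: int) -> float:
    """Progress as a percentage, e.g. B loaded files out of A total files."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= done <= total:
        raise ValueError("done must be between 0 and total")
    return 100.0 * done / total


loaded = progress_percent(done=1000, total=10000)  # B = 1000, A = 10000
print(f"{loaded:.0f}% loaded, {100 - loaded:.0f}% not loaded")  # 10% loaded, 90% not loaded
```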
As shown in FIG. 5, in an embodiment of the present invention, after the step of sending the unit modules to the corresponding grid nodes of the batch processor, the method may further include the following steps:
Step S501, counting the total number C of unit modules to be processed by a grid node;
Step S502, recording the number D of unit modules already processed by the grid node; and
Step S503, calculating the processing progress from the ratio of the number D to the total number C.
The original data file is decomposed into a number of small unit modules of different granularities, and different unit modules are processed by their corresponding grid nodes, so the total number C of unit modules to be processed by each grid node can be counted. The number D of unit modules the grid node has already processed is recorded at the same time, and the processing progress is calculated from the ratio of D to C. For example, if the total number C of unit modules to be processed by grid node A is 10000 and the number D of processed unit modules is 1000, the processing progress is 10% and the remaining 90% is unprocessed. If the total number C of unit modules to be processed by grid node B is 10000 and the number D of processed unit modules is 1500, the processing progress is 15% and the remaining 85% is unprocessed.
By calculating and displaying the processing progress of the unit modules, the method lets the user intuitively follow the progress of batch processing.
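Per-node progress can be tracked with one counter pair per grid node. The sketch below reproduces the grid node A and B figures from the example; the `NodeProgress` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeProgress:
    """Counters for one grid node: total C and processed D."""
    total: int          # C: unit modules assigned to this node
    processed: int = 0  # D: unit modules processed so far

    def record_processed(self, count: int = 1) -> None:
        # Clamp so a double-counted module can never push D above C.
        self.processed = min(self.total, self.processed + count)

    @property
    def percent(self) -> float:
        return 100.0 * self.processed / self.total if self.total else 0.0


nodes = {"A": NodeProgress(total=10000), "B": NodeProgress(total=10000)}
nodes["A"].record_processed(1000)
nodes["B"].record_processed(1500)

for name, node in nodes.items():
    print(f"grid node {name}: {node.percent:.0f}% processed, "
          f"{100 - node.percent:.0f}% unprocessed")
```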
As shown in fig. 6, in an embodiment of the present invention, after the step of calculating the processing progress according to the proportional relationship between the number D and the total amount C, the method may further include the following steps:
Step S601, measuring the time the grid node takes to process the unit modules.
From the total number C of unit modules to be processed by the grid node, the number D of unit modules processed so far, and the time spent processing those D unit modules, it is possible to calculate, for example, the time the grid node will need for the remaining unit modules and the time it will need for all of its unit modules.
The original data file is decomposed into a number of small unit modules of different granularities, and different unit modules are processed by different grid nodes. By calculating and displaying the processing time of the unit modules, the method lets the user intuitively follow the progress of batch processing. Furthermore, when a grid node is taking a long time to process its unit modules, the user can spot this quickly and intervene in time to shorten the processing time.
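One way to derive the remaining and overall times from C, D and the elapsed time is linear extrapolation. The patent does not specify a formula, so the constant-rate assumption and the `estimate_times` helper below are illustrative only.

```python
def estimate_times(total: int, processed: int, elapsed_seconds: float):
    """Estimate (remaining, overall) seconds for one grid node.

    Assumes every unit module takes roughly the same time, so the
    average pace so far is extrapolated to the remaining modules.
    """
    if processed <= 0:
        raise ValueError("at least one unit module must be processed first")
    if processed > total:
        raise ValueError("processed cannot exceed total")
    remaining = elapsed_seconds * (total - processed) / processed
    return remaining, elapsed_seconds + remaining


# C = 10000, D = 1000, and a made-up elapsed time of 60 seconds.
remaining, overall = estimate_times(total=10000, processed=1000, elapsed_seconds=60.0)
print(remaining, overall)  # 540.0 600.0
```

If unit modules differ widely in size, a constant-rate estimate like this will drift, which is one reason to display the measured time per node rather than rely on the estimate alone.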
As shown in fig. 7, in an embodiment of the present invention, after the step of forming and outputting the target data file according to the unit modules processed by the grid node, the method may further include the following steps:
Step S701, counting the total number E of target data files;
Step S702, determining, from the unit modules whose processing state is processing exception, the total number F of original data files they correspond to; and
Step S703, checking whether the sum of the total number E and the total number F equals the total number A, so as to determine whether data was lost during batch processing.
The original data file is decomposed into a number of small unit modules of different granularities, and different unit modules are processed by different grid nodes. Once all unit modules of an original data file have been processed by the grid nodes, the target data file can be formed from them, and the total number E of target data files is counted. Some unit modules may end in a processing exception. Because several of the unit modules belonging to one original data file may be in exception, the number F is counted per original data file, from the unit modules in exception, rather than per unit module. Each unit module in exception also records detailed exception information, so that operation and maintenance staff can check whether manual intervention is needed. Under normal conditions, the total number A of original data files entering the database should equal the sum of the total number E of target data files and the total number F of original data files with exceptions. If E + F is not equal to A, it can be concluded that data was lost during batch processing.
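The reconciliation check can be sketched as follows. F is counted per original file, so two failed modules of the same file count once. The function names and the `(file id, module name)` tuple layout are hypothetical, and the sample values are made up.

```python
def count_failed_files(exception_modules) -> int:
    """F: number of distinct original files with at least one module in exception."""
    return len({file_id for file_id, _module in exception_modules})


def data_lost(total_original: int, total_target: int, exception_modules) -> bool:
    """True when E + F != A, i.e. some original files are unaccounted for."""
    return total_target + count_failed_files(exception_modules) != total_original


# (original file id, unit module name) pairs for modules in exception.
exceptions = [("file2", "a2"), ("file2", "c2"), ("file7", "b7")]

print(count_failed_files(exceptions))   # 2
print(data_lost(100, 98, exceptions))   # False: 98 + 2 == 100
print(data_lost(100, 97, exceptions))   # True: one file is missing
```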
To better illustrate the above-described batch data processing scheme, a specific explanation will be made below by way of one example.
Example:
100 original data files to be processed are loaded into the temporary database table, and the number of original data files already loaded is recorded in order to calculate the file loading progress. For example, when the number B of original data files loaded into the temporary database table is 10, the file loading progress is 10%, and the remaining 90% have not yet been loaded.
The original data files loaded into the temporary database table are decomposed into unit modules of different granularities. For convenience of description, each original data file is assumed below to be decomposed into 4 unit modules: the 1st original data file is decomposed into unit modules a1, b1, c1 and d1, the 2nd original data file is decomposed into unit modules a2, b2, c2 and d2, and so on.
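The naming scheme of this example can be written down as a short Python sketch. It only reproduces the labels a1, b1, c1, d1 and so on; how the data content is actually split is not specified here, and the names `GRANULARITIES` and `decompose` are hypothetical.

```python
from typing import Dict, List

# Assumption taken from the example: every file yields one unit module
# per granularity, labelled a, b, c and d.
GRANULARITIES = ("a", "b", "c", "d")


def decompose(file_count: int) -> Dict[int, List[str]]:
    """Map each original-file number (1-based) to its unit-module names."""
    return {
        number: [f"{granularity}{number}" for granularity in GRANULARITIES]
        for number in range(1, file_count + 1)
    }
```

For the 100 files of the example, this yields 400 unit modules in total, 100 per granularity.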
The 4 grid nodes in the batch processor process their corresponding unit modules in parallel: grid node A processes the 100 unit modules a1, a2 and so on; grid node B processes the 100 unit modules b1, b2 and so on; grid node C processes the 100 unit modules c1, c2 and so on; and grid node D processes the 100 unit modules d1, d2 and so on.
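A minimal Python sketch of this routing follows, assuming that the granularity is encoded in the first letter of the unit-module name and that each grid node can be modelled as one worker thread. The function `process_module` is a hypothetical placeholder, since the real per-module work is not described.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List


def route_by_granularity(modules: Iterable[str]) -> Dict[str, List[str]]:
    """Group unit modules by granularity letter: 'a1' goes to grid node 'A'."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for name in modules:
        queues[name[0].upper()].append(name)
    return dict(queues)


def process_module(name: str) -> str:
    """Hypothetical placeholder for the work a grid node does on one module."""
    return f"{name}:done"


def run_node(queue: List[str]) -> List[str]:
    """One grid node works through its own queue in order."""
    return [process_module(name) for name in queue]


def run_batch(modules: Iterable[str]) -> Dict[str, List[str]]:
    """Run every grid node in parallel, one worker thread per node."""
    queues = route_by_granularity(modules)
    with ThreadPoolExecutor(max_workers=max(1, len(queues))) as pool:
        futures = {node: pool.submit(run_node, q) for node, q in queues.items()}
        return {node: future.result() for node, future in futures.items()}
```

Threads are used here only to keep the sketch self-contained; a real grid would presumably place the nodes on separate processes or machines.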
The batch processor generates a batch log during batch processing. The processing state of each unit module can be obtained from the batch log, and the color of the unit module is set according to its processing state, so that the processing state is displayed through different colors. For example, for grid node A, the unit module a1, whose processing has ended, is set to green; the unit module a2, which has a processing exception, is set to red; the unit module a3, which is being processed, is set to blue; and the unit modules a4 to a100, which are waiting for processing, are set to yellow.
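The state-to-color mapping of this example can be sketched as follows. The colors are those given above; the log line format ("<module> <STATE>") is an assumption invented for the sketch, because the layout of the batch log is not specified, and the names `State`, `STATE_COLOR` and `colors_from_log` are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable


class State(Enum):
    WAITING = "waiting for processing"
    IN_PROCESS = "in process"
    FINISHED = "processing ended"
    EXCEPTION = "processing exception"


# Colors as used in the example for grid node A.
STATE_COLOR = {
    State.FINISHED: "green",
    State.EXCEPTION: "red",
    State.IN_PROCESS: "blue",
    State.WAITING: "yellow",
}


def colors_from_log(log_lines: Iterable[str],
                    all_modules: Iterable[str]) -> Dict[str, str]:
    """Derive a display color per unit module from batch-log lines.

    Assumed (hypothetical) log format: "<module> <STATE>", for example
    "a2 EXCEPTION". Modules not mentioned in the log stay WAITING.
    """
    states = {module: State.WAITING for module in all_modules}
    for line in log_lines:
        module, keyword = line.split()
        states[module] = State[keyword]  # a later line overrides an earlier one
    return {module: STATE_COLOR[state] for module, state in states.items()}
```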
The number of unit modules processed by each grid node is recorded in order to calculate the processing progress of each grid node. For example, if grid node A has processed 15 unit modules after 15 minutes, its processing progress is 15% and the remaining 85% are unprocessed; at the same rate, the remaining unit modules are expected to take a further 85 minutes.
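The arithmetic of this example can be checked with a small Python sketch. The progress is the ratio of the number D of processed unit modules to the total amount C; the remaining-time estimate assumes a constant processing rate, which is a simplification made for the sketch. The function names are hypothetical.

```python
def processing_progress(done_d: int, total_c: int) -> float:
    """Processing progress in percent, from the ratio D / C."""
    if total_c <= 0:
        raise ValueError("total_c must be positive")
    return 100.0 * done_d / total_c


def estimated_remaining_minutes(done_d: int, total_c: int,
                                elapsed_minutes: float) -> float:
    """Remaining time, assuming the rate observed so far stays constant."""
    if done_d <= 0:
        raise ValueError("no unit module processed yet, rate is unknown")
    minutes_per_module = elapsed_minutes / done_d
    return (total_c - done_d) * minutes_per_module
```

The same ratio applies to the file loading progress, with B loaded files out of a total of A.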
When every unit module decomposed from an original data file has been processed by its corresponding grid node, a target data file can be formed from the processed unit modules. For example, when the unit modules a1, b1, c1 and d1 of the 1st original data file have each been processed by the corresponding grid node, the 1st target data file can be formed from them. At the end of the batch, a total of 91 target data files have been generated.
The unit modules with processing exceptions are: a2, a5, a10, b2, b6, b11, c3, c6, c12, d4 and d13. They correspond to 9 original data files: the 2nd, 3rd, 4th, 5th, 6th, 10th, 11th, 12th and 13th original data files.
Since the 91 target data files and the 9 original data files with processing exceptions sum to 100, no data was lost during the batch processing.
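The figures in this example can be reproduced with a short Python sketch. It assumes, as the example's naming suggests, that the number in a unit-module name is the number of its original data file; the helper names `file_index` and `affected_files` are hypothetical.

```python
import re
from typing import Iterable, List

# Exception unit modules listed in the example.
EXCEPTION_MODULES = [
    "a2", "a5", "a10", "b2", "b6", "b11",
    "c3", "c6", "c12", "d4", "d13",
]


def file_index(module_name: str) -> int:
    """Extract the original-file number from a name such as 'b11'."""
    match = re.fullmatch(r"[a-d](\d+)", module_name)
    if match is None:
        raise ValueError(f"unexpected unit-module name: {module_name!r}")
    return int(match.group(1))


def affected_files(modules: Iterable[str]) -> List[int]:
    """Distinct original files that have at least one exception module."""
    return sorted({file_index(name) for name in modules})
```

The 11 exception unit modules collapse to 9 distinct files because files 2 and 6 each have two exception unit modules (a2 and b2, b6 and c6).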
The time each grid node spent processing all of its unit modules is obtained. For example, grid node A took 2 hours, grid node B took 2 hours and 10 minutes, grid node C took 2 hours and 30 minutes, and grid node D took 5 hours.
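One possible use of these timings is to compare the grid nodes with each other. The following Python sketch does this with the durations of the example; treating the longest-running node as the one to investigate is an assumption of the sketch, and the function names are hypothetical.

```python
from datetime import timedelta
from typing import Dict

# Durations taken from the example.
NODE_DURATIONS = {
    "A": timedelta(hours=2),
    "B": timedelta(hours=2, minutes=10),
    "C": timedelta(hours=2, minutes=30),
    "D": timedelta(hours=5),
}


def slowest_node(durations: Dict[str, timedelta]) -> str:
    """Name of the grid node with the longest processing time."""
    return max(durations, key=durations.get)


def slowdown_vs_fastest(durations: Dict[str, timedelta]) -> Dict[str, float]:
    """Each node's duration as a multiple of the fastest node's duration."""
    fastest = min(durations.values())
    return {node: duration / fastest for node, duration in durations.items()}
```

With the example values, grid node D takes 2.5 times as long as grid node A.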
The invention further provides an electronic device. Referring to fig. 8, a schematic diagram of a program module of the electronic device 20 according to an exemplary embodiment of the invention is shown.
The electronic device 20 includes:
the loading module 201, configured to load an original data file to be processed into a temporary database table;
the decomposition module 202, configured to decompose the original data file loaded into the temporary database table into unit modules of different granularities according to data content;
the sending module 203, configured to send the unit modules, according to their granularity, to the corresponding grid nodes of the batch processor for processing;
the monitoring module 204, configured to monitor the processing state of the unit modules; and
the output module 205, configured to form a target data file from the unit modules processed by the grid nodes and to output the target data file.
The electronic device 20 provided by the invention decomposes the original data file to be processed into unit modules of different granularities and reduces the coupling between the unit modules as far as possible, so that the grid nodes of the batch processor can process their corresponding unit modules simultaneously and in parallel. By displaying the processing state of each unit module, it lets a user see the processing state of the data intuitively, so that abnormal data can be located quickly. When an exception occurs during batch processing, it can therefore be located and resolved quickly, which saves manpower.
Further, the processing state includes: processing exception, waiting for processing, in process, and processing ended. The monitoring module 204 is further configured to monitor whether each unit module has been processed by a grid node and to record its processing state: a unit module that has not yet entered a grid node is recorded as waiting for processing; a unit module that is being processed by a grid node is recorded as in process; a unit module that a grid node has processed successfully is recorded as processing ended; and a unit module that cannot be processed, or whose processing result is abnormal, is recorded as processing exception. The color of each unit module is then set according to its processing state, so that the processing states are displayed through different colors.
Further, the electronic device 20 further includes a reading module for reading the processing status of the unit module from a batch log generated by the batch processor.
Further, the electronic device 20 further includes a statistics module for counting the total amount A of the original data files to be processed; recording the number B of the original data files loaded to the temporary table of the database; and calculating the file loading progress according to the proportional relation between the quantity B and the total quantity A.
Further, the statistics module is further configured to count a total amount C of unit modules to be processed by the grid node; recording the number D of the unit modules processed by the grid node; and calculating the processing progress according to the proportional relation between the quantity D and the total quantity C.
Further, the electronic device 20 further includes a time module for measuring the time the grid nodes take to process the unit modules.
Further, the statistics module is further configured to count a total amount E of the target data file; calculating the total quantity F of the corresponding original data files according to the unit modules for processing the exception; and calculating whether the sum of the total quantity E and the total quantity F is equal to the total quantity A so as to judge whether data are lost in the batch processing process.
To achieve the above object, the present invention also provides a computer device 20 comprising a memory 21, a processor 22 and a computer program stored on the memory 21 and executable on the processor 22, the processor 22 implementing the steps of the above method when executing the computer program. The computer program may be stored in the memory 21.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
The invention also provides a computer device, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack-mounted server, a blade server, a tower server or a cabinet server (comprising independent servers or a server cluster formed by a plurality of servers) and the like which can execute programs. The computer device of the present embodiment includes at least, but is not limited to: memory, processors, etc. that may be communicatively coupled to each other via a system bus.
The present embodiment also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by a processor, performs the corresponding functions. The computer readable storage medium of the present embodiment is used to store the electronic device 20, which when executed by the processor 22 implements the batch data processing method of the present invention.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.
Claims (7)
1. A method of batch data processing, the method comprising the steps of:
Loading an original data file to be processed into a temporary database table;
Decomposing the original data file loaded to the temporary table of the database into unit modules with different granularities according to the data content;
Sending the unit modules to grid nodes corresponding to a batch processor for processing according to the granularity of the unit modules;
monitoring the processing state of the unit module; and
Forming a target data file according to the unit modules processed by the grid nodes and outputting the target data file;
wherein the step of decomposing the original data file loaded to the temporary table of the database into unit modules with different granularities according to the data content comprises:
Subdividing the data content of the original data file loaded into the temporary table of the database layer by layer to the smallest granularity possible to obtain unit modules with different granularities so as to reduce the coupling degree among the decomposed unit modules as much as possible;
After the step of loading the original data file to be processed into the temporary database table, the method further comprises the following steps:
counting the total amount A of the original data files to be processed;
recording the number B of the original data files loaded to the temporary table of the database; and
Calculating file loading progress according to the proportional relation between the quantity B and the total quantity A;
wherein the processing state includes: processing exception, waiting for processing, in-process, ending the processing;
the step of monitoring the processing state of the unit module further includes:
Monitoring whether the unit modules are processed by the grid nodes and recording processing states, marking the processing states of the unit modules which do not enter the grid nodes for processing as waiting for processing, marking the processing states of the unit modules which are being processed by the grid nodes as processing, marking the processing states of the unit modules which are successfully processed by the grid nodes as processing ending, marking the processing states of the unit modules which cannot be processed by the grid nodes or are abnormal in processing results as processing abnormality;
setting the colors of the unit modules according to the processing states so as to display the processing states of the unit modules through different colors;
After the step of sending the unit modules to the grid node corresponding to the batch processor for processing, the method further comprises the following steps:
Counting the total quantity C of unit modules to be processed of the grid nodes;
recording the number D of the unit modules processed by the grid node; and
calculating the processing progress according to the proportional relation between the quantity D and the total quantity C.
2. The batch data processing method of claim 1, wherein before the step of monitoring the processing state of the unit modules, further comprising:
reading the processing state of the unit module from a batch log generated by the operation of the batch processor.
3. The batch data processing method according to claim 1, wherein after the step of calculating the processing progress according to the proportional relation of the number D and the total amount C, further comprising:
measuring the time the grid nodes take to process the unit modules.
4. The batch data processing method as claimed in claim 1, wherein after the step of composing and outputting the target data file according to the unit modules processed by the grid node, further comprising:
counting the total amount E of the target data files;
Calculating the total quantity F of the corresponding original data files according to the unit modules for processing the exception; and
calculating whether the sum of the total quantity E and the total quantity F is equal to the total quantity A, so as to judge whether data are lost in the batch processing process.
5. An electronic device, comprising:
the loading module is used for loading the original data file to be processed to the temporary database table;
The decomposition module is used for decomposing the original data file loaded to the temporary table of the database into unit modules with different granularities according to the data content;
The sending module is used for sending the unit modules to the grid nodes corresponding to the batch processor for processing according to the granularity of the unit modules;
the monitoring module is used for monitoring the processing state of the unit module; and
The output module is used for forming a target data file according to the unit modules processed by the grid nodes and outputting the target data file;
The decomposition module is specifically configured to: subdividing the data content of the original data file loaded into the temporary table of the database layer by layer to the smallest granularity possible to obtain unit modules with different granularities so as to reduce the coupling degree among the decomposed unit modules as much as possible;
The electronic device further comprises a statistics module for counting the total amount A of the original data files to be processed; recording the number B of the original data files loaded to the temporary table of the database; calculating file loading progress according to the proportional relation between the quantity B and the total quantity A;
wherein the processing state includes: processing exception, waiting for processing, in-process, ending the processing;
The monitoring module is also used for: monitoring whether the unit modules are processed by the grid nodes and recording processing states, marking the processing states of the unit modules which do not enter the grid nodes for processing as waiting for processing, marking the processing states of the unit modules which are being processed by the grid nodes as processing, marking the processing states of the unit modules which are successfully processed by the grid nodes as processing ending, marking the processing states of the unit modules which cannot be processed by the grid nodes or are abnormal in processing results as processing abnormality; setting the colors of the unit modules according to the processing states so as to display the processing states of the unit modules through different colors;
The statistics module is further configured to: counting the total quantity C of unit modules to be processed of the grid nodes; recording the number D of the unit modules processed by the grid node; and calculating the processing progress according to the proportional relation between the quantity D and the total quantity C.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the batch data processing method of any one of claims 1 to 4 when the computer program is executed.
7. A computer-readable storage medium having stored thereon a computer program, characterized by: the computer program, when executed by a processor, implements the steps of the batch data processing method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910969164.8A CN110825771B (en) | 2019-10-12 | 2019-10-12 | Batch data processing method, electronic device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110825771A (en) | 2020-02-21 |
CN110825771B (en) | 2024-06-28 |
Family
ID=69549118
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910969164.8A Active CN110825771B (en) | 2019-10-12 | 2019-10-12 | Batch data processing method, electronic device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110825771B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114398153A (en) * | 2022-01-21 | 2022-04-26 | 平安科技(深圳)有限公司 | Control method and system for virtual machine live migration, electronic device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101000562A (en) * | 2006-12-30 | 2007-07-18 | 中国建设银行股份有限公司 | Method and device for executing batch processing job |
CN105740063A (en) * | 2014-12-08 | 2016-07-06 | 杭州华为数字技术有限公司 | Data processing method and apparatus |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050058374A (en) * | 2002-08-20 | 2005-06-16 | 도쿄 일렉트론 가부시키가이샤 | Method for processing data based on the data context |
US9842000B2 (en) * | 2015-09-18 | 2017-12-12 | Salesforce.Com, Inc. | Managing processing of long tail task sequences in a stream processing framework |
US10691671B2 (en) * | 2017-12-21 | 2020-06-23 | Cisco Technology, Inc. | Using persistent memory to enable consistent data for batch processing and streaming processing |
CN108737170A (en) * | 2018-05-09 | 2018-11-02 | 中国银行股份有限公司 | A kind of batch daily record abnormal data alarm method and device |
CN109669766A (en) * | 2018-09-11 | 2019-04-23 | 深圳平安财富宝投资咨询有限公司 | Processing method, device, equipment and the storage medium of batch processing job |
Also Published As
Publication number | Publication date |
---|---|
CN110825771A (en) | 2020-02-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||