CN111026568B - Data and task relation construction method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN111026568B
Authority
CN
China
Prior art keywords
task
created
relationship
blood
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911229154.7A
Other languages
Chinese (zh)
Other versions
CN111026568A (en)
Inventor
孙朝和
申志彬
谢瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Original Assignee
Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co Ltd
Priority to CN201911229154.7A
Publication of CN111026568A
Application granted
Publication of CN111026568B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/004 Error avoidance
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method, an apparatus, a computer device, and a storage medium for constructing relationships between data and tasks. The method comprises: obtaining a task creation request to obtain a task to be created; generating the relevant attributes of a task model according to the task to be created, to obtain target attributes; determining, according to the target attributes, the dependency relationship between the task to be created and the tasks in the data model, to obtain a blood-edge (data lineage) relationship; parsing the task to be created to obtain the association relationships between the task and its input and output tables; and updating the blood-edge relationship and the association relationships between the task and the input and output tables in the metadata blood-edge management system, so that when a data table has a problem the terminal can locate the related tasks through that system. By constructing the association relationship between tasks and the data model, the application yields a more detailed and clearer blood-edge relationship between them, which makes it easier to find and locate related tasks quickly and improves problem-solving efficiency.

Description

Data and task relation construction method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computers, and more particularly to a data and task relationship construction method, apparatus, computer device, and storage medium.
Background
Big data technology refers to the ability to quickly obtain valuable information from a wide variety of types of data. Technologies applicable to big data include massively parallel processing databases, data mining systems, distributed file systems, distributed databases, cloud computing platforms, the internet, and scalable storage systems.
In the metadata blood-edge (data lineage) management system of an existing big data platform, only the blood-edge relationships of the data models of components such as Hive, Kafka, and HBase are recorded. Such a system shows the flow of data within these components well, but it still cannot show the blood-edge relationships among tasks, the data tables of each component associated with each task, or the tasks in which each component's data model is used. When a large number of data tables exist and some of them have problems, the related tasks therefore cannot be located quickly for troubleshooting, which reduces problem-solving efficiency.
Therefore, a new method is needed that constructs the association relationship between tasks and the data model, so that the blood-edge relationship between them is available, related tasks can be found and located quickly, and problem-solving efficiency improves.
Disclosure of Invention
The application aims to overcome the defects of the prior art and provide a data and task relation construction method, a device, computer equipment and a storage medium.
In order to achieve the above purpose, the present application adopts the following technical scheme: the data and task relation construction method comprises the following steps:
acquiring a task creation request to obtain a task to be created;
generating relevant attributes of a task model according to the task to be created so as to obtain target attributes;
determining the dependency relationship between the task to be created and the tasks in the data model according to the target attributes so as to obtain a blood-edge relationship;
parsing the task to be created to obtain the association relationships between the task and the input table and the output table;
and updating the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that related tasks can be located through the metadata blood-edge management system when a data table has a problem.
The further technical scheme is as follows: before the step of acquiring the task creation request to obtain the task to be created, the method further comprises:
defining an initial task model in the metadata blood-edge management system.
The further technical scheme is as follows: the generating the relevant attribute of the task model according to the task to be created to obtain the target attribute includes:
and according to the task to be created, retrieving a corresponding initial task model in the metadata blood-edge management system, and generating relevant attributes of the task model to obtain target attributes.
The further technical scheme is as follows: the target attributes include task name, task type, task identification, task creator, and task creation time.
The further technical scheme is as follows: the parsing the task to be created to obtain the association relationship between the task and the input table and the output table includes:
acquiring a script related to a task to be created in a storage process;
analyzing the task to be created by adopting a script analysis engine corresponding to the script to obtain an input source and an output target of the task to be created;
constructing an association relationship between a task and an input table and between the input table and the task according to an input source of the task to be created;
and constructing the association relation between the task and the output table and between the output table and the task according to the output target of the task to be created.
The application also provides a data and task relation construction device, which comprises:
the request acquisition unit is used for acquiring a task creation request to acquire a task to be created;
the attribute generation unit is used for generating relevant attributes of the task model according to the task to be created so as to obtain target attributes;
the blood relationship construction unit is used for determining the dependency relationship between the task to be created and the tasks in the data model according to the target attributes so as to obtain the blood-edge relationship;
the parsing unit is used for parsing the task to be created to obtain the association relationships between the task and the input table and the output table;
and the updating unit is used for updating the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that related tasks can be located through the metadata blood-edge management system when a data table has a problem.
The further technical scheme is as follows: the device further comprises:
a model definition unit, used for defining an initial task model in the metadata blood-edge management system.
The further technical scheme is as follows: the parsing unit further includes:
the script acquisition subunit is used for acquiring scripts related to the task to be created in the storage process;
the engine analysis subunit is used for analyzing the task to be created by adopting a script analysis engine corresponding to the script so as to obtain an input source and an output target of the task to be created;
the first relation construction subunit is used for constructing association relations between tasks and input tables and between the input tables and the tasks according to input sources of the tasks to be created;
and the second relation construction subunit is used for constructing the association relation between the task and the output table and between the output table and the task according to the output target of the task to be created.
The application also provides a computer device which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the method when executing the computer program.
The present application also provides a storage medium storing a computer program which, when executed by a processor, performs the above-described method.
Compared with the prior art, the application has the following beneficial effects: relevant attributes are generated while a task is being created, and the dependency relationships between tasks are determined from those attributes. The task to be created is then parsed to construct the association relationships from task to table and from table to task. Together these establish the association relationship between tasks and the data model, yielding a more detailed and clearer blood-edge relationship between them, which makes it easier to find and locate related tasks quickly and improves problem-solving efficiency.
The application is further described below with reference to the drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application scenario of a method for constructing a data and task relationship according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for constructing a data and task relationship according to an embodiment of the present application;
FIG. 3 is a schematic sub-flowchart of a method for constructing a data and task relationship according to an embodiment of the present application;
FIG. 4 is a schematic diagram of data and task relationships provided by an embodiment of the present application;
FIG. 5 is a flowchart of a method for constructing a data and task relationship according to another embodiment of the present application;
FIG. 6 is a schematic block diagram of a data and task relationship construction apparatus provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of a parsing unit of a data and task relationship construction device provided by an embodiment of the present application;
FIG. 8 is a schematic block diagram of a data and task relationship construction apparatus provided by another embodiment of the present application;
FIG. 9 is a schematic block diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to FIG. 1 and FIG. 2, FIG. 1 is a schematic diagram of an application scenario of the data and task relationship construction method according to an embodiment of the present application, and FIG. 2 is a schematic flow chart of the method. The method is applied to a server running a metadata blood-edge management system. The server may be a server in a distributed service platform. It exchanges data with a terminal and also with an executor, which carries out task execution. Driven by a request initiated by the terminal, the server generates the association relationship between the data model and the task and updates it into the metadata blood-edge management system. Once a problem occurs in a data table, the information in the metadata blood-edge management system can be retrieved to locate the related tasks, so that the problem can be solved quickly.
Fig. 2 is a flow chart of a method for constructing a data and task relationship according to an embodiment of the present application. As shown in fig. 2, the method includes the following steps S110 to S150.
S110, acquiring a task creation request to obtain a task to be created.
In this embodiment, the task to be created refers to information such as the type and the executor related to the task. For example, to create a task that automatically identifies the category of a commodity, the related type is a classification model and the executor is the one on which the classification model resides; that is, the task invokes that executor to identify the category automatically.
S120, generating relevant attributes of the task model according to the task to be created so as to obtain target attributes.
In this embodiment, the target attribute refers to an attribute that needs to be constructed in the process of generating the task model by the task to be created, and specifically, the target attribute includes a task name, a task type, a task identifier, a task creator, and a task creation time.
Specifically, according to the task to be created, the corresponding initial task model in the metadata blood-edge management system is called, and relevant attributes of the task model are generated to obtain target attributes.
A corresponding initial task model is defined in the metadata blood-edge management system. When a task is created by the task scheduling system, the information required by the task model is collected to generate data in JSON format, such as {"jobName": "stg_m_i_h_base_address", "projectName": "stg", "jobType": "sqoop", "uuid": "stg.stg_m_i_h_base_address"}.
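Step S120 can be illustrated with a minimal Python sketch. It assumes the creation request is a plain dictionary and that the identifier follows the "<projectName>.<jobName>" pattern seen in the sample; the function name `build_task_attributes` and the `creator` and `createTime` keys are hypothetical, since the patent names these attributes but not their JSON keys.

```python
import json
from datetime import datetime, timezone


def build_task_attributes(request: dict) -> dict:
    """Generate the target attributes of a task model from a creation request."""
    project = request["projectName"]
    job = request["jobName"]
    return {
        "jobName": job,
        "projectName": project,
        "jobType": request["jobType"],
        # Assumption: uuid is "<projectName>.<jobName>", as in the sample data.
        "uuid": f"{project}.{job}",
        # Hypothetical key names for the task creator and creation time.
        "creator": request.get("creator", "unknown"),
        "createTime": datetime.now(timezone.utc).isoformat(),
    }


attrs = build_task_attributes(
    {"jobName": "stg_m_i_h_base_address", "projectName": "stg", "jobType": "sqoop"}
)
print(json.dumps(attrs, indent=2))
```

The first four keys reproduce the sample record; the remaining two cover the creator and creation-time attributes that the text lists but does not show.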
S130, determining the dependency relationship between the task to be created and the tasks in the data model according to the target attributes so as to obtain a blood-edge relationship.
In this embodiment, the blood-edge relationship (data lineage) refers to the dependency relationship between the task to be created and the tasks within the data model.
The task scheduling system can construct the dependency relationship between two tasks and can obtain the project and task name of the current task and of the task it depends on, such as {"projectName": "bi", "jobName": "bi_base_address", "depprojectName": "stg", "depJobName": "stg_m_i_h_base_address"}. From this data the blood-edge relationship can be generated: {"lineName": "stg_m_i_h_base_address=>bi.bi_base_address", "input": {"uuid": "stg.stg_m_i_h_base_address"}, "output": {"uuid": "bi.bi_base_address"}}.
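The conversion from a dependency record to a blood-edge record in S130 might look like the following sketch. The function name `build_lineage` is hypothetical, the key names are copied from the sample, and the uuid is assumed to be "<project>.<job>"; the `lineName` format simply mirrors the sample string.

```python
def build_lineage(dep: dict) -> dict:
    """Turn a scheduler dependency record into a blood-edge (lineage) record."""
    # Assumption: a task uuid is "<project>.<job>".
    upstream = f'{dep["depprojectName"]}.{dep["depJobName"]}'
    downstream = f'{dep["projectName"]}.{dep["jobName"]}'
    return {
        # Mirrors the sample: upstream job name, "=>", downstream uuid.
        "lineName": f'{dep["depJobName"]}=>{downstream}',
        "input": {"uuid": upstream},
        "output": {"uuid": downstream},
    }


lineage = build_lineage(
    {
        "projectName": "bi",
        "jobName": "bi_base_address",
        "depprojectName": "stg",
        "depJobName": "stg_m_i_h_base_address",
    }
)
print(lineage)
```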
S140, parsing the task to be created to obtain the association relationships between the task and the input table and the output table.
In this embodiment, these association relationships cover both directions: from the task to its input and output tables, and from each input or output table back to the task. In FIG. 4, JOB A, JOB B, and JOB C denote tasks, TABLE A, TABLE B, and TABLE C denote input and output tables, and the arrows indicate the association relationships.
In one embodiment, referring to fig. 3, the step S140 may include steps S141 to S144.
S141, acquiring a script related to the task to be created in the storage process.
The task to be created stores its related scripts, such as SQL scripts and Sqoop scripts, which are the actual execution content of the task. The script type is determined by the task type; for example, if the task operates on data in a database, the script is an SQL script.
S142, analyzing the task to be created by adopting a script analysis engine corresponding to the script so as to obtain an input source and an output target of the task to be created.
A script parsing engine is selected according to the task type. It yields the input sources of the task to be created, such as MySQL tables, Hive tables, HDFS paths, and HBase tables, and its output targets, which may be of the same kinds.
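Selecting a parsing engine by task type can be sketched as a small registry. Everything here is an illustrative assumption rather than the patent's implementation: the names `ENGINES`, `parse_sql`, `parse_sqoop`, and `parse_task` are hypothetical, and the regular expressions are a rough stand-in for a real SQL or Sqoop parser (they ignore subqueries, CTEs, comments, and export jobs).

```python
import re
from typing import Callable, Dict, List, Tuple

ParseResult = Tuple[List[str], List[str]]  # (input sources, output targets)


def parse_sql(script: str) -> ParseResult:
    """Rough scan of a Hive/SQL script; a real engine would use a SQL parser."""
    outputs = re.findall(
        r"insert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", script, re.I
    )
    inputs = re.findall(r"(?:from|join)\s+([\w.]+)", script, re.I)
    return sorted(set(inputs) - set(outputs)), sorted(set(outputs))


def parse_sqoop(script: str) -> ParseResult:
    """Treat --table as the source and --hive-table as the target of an import."""
    sources = re.findall(r"--table\s+(\S+)", script)
    targets = re.findall(r"--hive-table\s+(\S+)", script)
    return sources, targets


# Registry mapping a task type to its script parsing engine.
ENGINES: Dict[str, Callable[[str], ParseResult]] = {
    "hive": parse_sql,
    "sql": parse_sql,
    "sqoop": parse_sqoop,
}


def parse_task(job_type: str, script: str) -> ParseResult:
    """Pick the engine for the task type and return (inputs, outputs)."""
    if job_type not in ENGINES:
        raise ValueError(f"no script parsing engine registered for {job_type!r}")
    return ENGINES[job_type](script)


inputs, outputs = parse_task(
    "hive",
    "INSERT OVERWRITE TABLE bi.base_address "
    "SELECT a.id FROM stg.address a JOIN stg.city c ON a.cid = c.id",
)
print(inputs, outputs)
```

Keeping the engines behind one lookup table means that supporting a new task type only requires registering another function.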
S143, constructing the association relationships between the task and the input table and between the input table and the task according to the input source of the task to be created.
S144, constructing the association relationships between the task and the output table and between the output table and the task according to the output target of the task to be created.
The relationships between task and table and between table and task are then updated into the metadata blood-edge management system. For example, the relationship of task and table:
{"type":"job","jobName":"stg_m_i_h_base_address","projectName":"stg","jobType":"sqoop","uuid":"stg.stg_m_i_h_base_address","references":[{"uuid":"ip:host:db1.tableA","name":"tableA"},{"uuid":"ip:host:db2.tableB","name":"tableB"}]};
Relationship of table and task:
{"type":"table","name":"tableA","uuid":"ip:host:db1.tableA","references":[{"uuid":"stg.stg_m_i_h_base_address","name":"stg_m_i_h_base_address"}]}.
Taking a Hive task as an example, the input sources and output targets referenced in the Hive SQL are parsed, and the association relationships between task and table and between table and task are constructed, where the tables include both input tables and output tables.
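Steps S143 and S144 can be sketched as one function that emits a task-to-table record plus a table-to-task record for each table. The names `table_ref` and `build_relations` are hypothetical, and deriving a table name from the text after the last dot of its uuid is an assumption based on the sample identifiers.

```python
from typing import Dict, List


def table_ref(uuid: str) -> Dict[str, str]:
    # Assumption: the table name is the part after the last dot of the uuid.
    return {"uuid": uuid, "name": uuid.rsplit(".", 1)[-1]}


def build_relations(task: dict, inputs: List[str], outputs: List[str]) -> List[dict]:
    """Return one task->tables record and one table->task record per table."""
    tables = list(dict.fromkeys(inputs + outputs))  # de-duplicate, keep order
    task_record = {"type": "job", **task, "references": [table_ref(u) for u in tables]}
    task_ref = {"uuid": task["uuid"], "name": task["jobName"]}
    table_records = [
        {"type": "table", **table_ref(u), "references": [task_ref]} for u in tables
    ]
    return [task_record] + table_records


records = build_relations(
    {
        "jobName": "stg_m_i_h_base_address",
        "projectName": "stg",
        "jobType": "sqoop",
        "uuid": "stg.stg_m_i_h_base_address",
    },
    inputs=["ip:host:db1.tableA"],
    outputs=["ip:host:db2.tableB"],
)
for record in records:
    print(record)
```

Like the sample records, this sketch does not mark whether a referenced table is an input or an output; a fuller model would probably carry that direction on each reference.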
S150, updating the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that related tasks can be located through the metadata blood-edge management system when a data table has a problem.
Updating the metadata blood-edge management system makes the blood-edge relationship between tasks and the data model more detailed and clearer, so that when certain data tables have problems, the related tasks can be located quickly for troubleshooting and problem-solving efficiency improves.
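The lookup that S150 enables can be illustrated with an in-memory stand-in for the metadata blood-edge management system. The class `LineageStore` and its methods are hypothetical. Following task dependencies to include downstream tasks is an assumption about what "related tasks" covers; the patent only states that related tasks can be located.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set


class LineageStore:
    """In-memory stand-in for the metadata blood-edge management system."""

    def __init__(self) -> None:
        self.table_to_tasks: Dict[str, Set[str]] = defaultdict(set)
        self.downstream: Dict[str, Set[str]] = defaultdict(set)

    def update(
        self, task_uuid: str, tables: Iterable[str], upstream_tasks: Iterable[str] = ()
    ) -> None:
        """Record table<->task associations and task->task dependencies."""
        for table in tables:
            self.table_to_tasks[table].add(task_uuid)
        for upstream in upstream_tasks:
            self.downstream[upstream].add(task_uuid)

    def locate_tasks(self, table_uuid: str) -> List[str]:
        """Tasks that touch the table, plus every task downstream of them."""
        seen = set(self.table_to_tasks.get(table_uuid, ()))
        queue = deque(sorted(seen))
        while queue:  # breadth-first walk along task dependencies
            for nxt in self.downstream.get(queue.popleft(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return sorted(seen)


store = LineageStore()
store.update("stg.stg_m_i_h_base_address", ["ip:host:db1.tableA"])
store.update(
    "bi.bi_base_address",
    ["ip:host:db2.tableB"],
    upstream_tasks=["stg.stg_m_i_h_base_address"],
)
affected = store.locate_tasks("ip:host:db1.tableA")
print(affected)
```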
By parsing the task information in the task scheduling system, the data model information of each big data component is obtained and the association relationship between tasks and the data model is constructed, yielding a more detailed and clearer blood-edge relationship between tasks and the data model.
In the above data and task relationship construction method, the relevant attributes are generated while the task is being created, and the dependency relationships between tasks are determined from those attributes. The task to be created is then parsed to construct the association relationships from task to table and from table to task. Together these establish the association relationship between tasks and the data model, yielding a more detailed and clearer blood-edge relationship between them, which makes it easier to find and locate related tasks quickly and improves problem-solving efficiency.
Fig. 5 is a flow chart of a method for constructing a data and task relationship according to another embodiment of the present application. As shown in fig. 5, the data and task relationship construction method of the present embodiment includes steps S210 to S260. Steps S220 to S260 are similar to steps S110 to S150 in the above embodiment, and are not described herein. Step S210 added in the present embodiment is described in detail below.
S210, defining an initial task model in the metadata blood-edge management system.
A model of the task entity is defined in the metadata blood-edge management system. It includes, but is not limited to, attributes such as task name, task type, task identification, task creator, and task creation time. The initial task model is a model for recording the attributes that different types of tasks involve.
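One way to picture the initial task model of S210 is as a per-type declaration of required attributes. The sketch below is an assumption-laden illustration: the class `InitialTaskModel`, the `MODELS` registry, and the `creator` and `createTime` key names are hypothetical, and treating `projectName` as a type-specific extra attribute is only an example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Attributes named in the text: task name, type, identification, creator,
# creation time. The key names for the last two are hypothetical.
BASE_ATTRIBUTES: Tuple[str, ...] = (
    "jobName", "jobType", "uuid", "creator", "createTime",
)


@dataclass(frozen=True)
class InitialTaskModel:
    """Declares which attributes a task of a given type must record."""

    job_type: str
    extra_attributes: Tuple[str, ...] = ()

    def required(self) -> Tuple[str, ...]:
        return BASE_ATTRIBUTES + self.extra_attributes

    def missing(self, attrs: Dict[str, str]) -> List[str]:
        """Names of required attributes that attrs does not provide."""
        return [name for name in self.required() if name not in attrs]


# Hypothetical registry of initial task models, keyed by task type.
MODELS: Dict[str, InitialTaskModel] = {
    model.job_type: model
    for model in (
        InitialTaskModel("sqoop", ("projectName",)),
        InitialTaskModel("hive", ("projectName",)),
    )
}

gaps = MODELS["sqoop"].missing(
    {
        "jobName": "stg_m_i_h_base_address",
        "projectName": "stg",
        "jobType": "sqoop",
        "uuid": "stg.stg_m_i_h_base_address",
    }
)
print(gaps)
```

Checking a record against its model before it is written would catch incomplete task metadata at creation time rather than when a table problem is being investigated.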
FIG. 6 is a schematic block diagram of a data and task relationship construction apparatus 300 provided by an embodiment of the present application. As shown in FIG. 6, the present application also provides a data and task relationship construction apparatus 300 corresponding to the above data and task relationship construction method. The apparatus 300 includes units for performing the above-described method and may be configured in a server. Specifically, referring to FIG. 6, the apparatus 300 includes a request acquisition unit 302, an attribute generation unit 303, a blood relationship construction unit 304, a parsing unit 305, and an updating unit 306.
The request acquisition unit 302 is configured to acquire a task creation request to obtain a task to be created; the attribute generation unit 303 is configured to generate relevant attributes of the task model according to the task to be created, so as to obtain target attributes; the blood relationship construction unit 304 is configured to determine the dependency relationship between the task to be created and the tasks in the data model according to the target attributes, so as to obtain a blood-edge relationship; the parsing unit 305 is configured to parse the task to be created to obtain the association relationships between the task and the input table and the output table; and the updating unit 306 is configured to update the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that related tasks can be located through the metadata blood-edge management system when a problem occurs in a data table.
In one embodiment, as shown in fig. 7, the parsing unit 305 includes a script acquisition subunit 3051, an engine parsing subunit 3052, a first relationship construction subunit 3053, and a second relationship construction subunit 3054.
A script acquisition subunit 3051, configured to acquire a script related to a task to be created in a storage process; an engine parsing sub-unit 3052, configured to parse the task to be created by using a script parsing engine corresponding to the script, so as to obtain an input source and an output target of the task to be created; a first relationship construction subunit 3053, configured to construct an association relationship between a task and an input table, and between the input table and the task according to an input source of the task to be created; and the second relationship construction subunit 3054 is configured to construct an association relationship between the task and the output table, and between the output table and the task according to the output target of the task to be created.
FIG. 8 is a schematic block diagram of a data and task relationship construction apparatus 300 provided in another embodiment of the present application. As shown in FIG. 8, the apparatus 300 of this embodiment adds a model definition unit 301 to the above-described embodiment.
The model definition unit 301 is configured to define an initial task model in the metadata blood-edge management system.
It should be noted that, as will be clearly understood by those skilled in the art, the specific implementation process of the data and task relationship construction apparatus 300 and each unit may refer to the corresponding description in the foregoing method embodiments, and for convenience and brevity of description, the description is omitted here.
The above-described data and task relationship construction apparatus 300 may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 9.
Referring to fig. 9, fig. 9 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server, where the server may be a stand-alone server or may be a server cluster formed by a plurality of servers.
With reference to FIG. 9, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a data and task relationship construction method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform a data and task relationship construction method.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the architecture shown in fig. 9 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting of the computer device 500 to which the present inventive arrangements may be implemented, as a particular computer device 500 may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
acquiring a task creation request to obtain a task to be created; generating relevant attributes of a task model according to the task to be created so as to obtain target attributes; determining the dependency relationship between the task to be created and the tasks in the data model according to the target attributes so as to obtain a blood-edge relationship; parsing the task to be created to obtain the association relationships between the task and the input table and the output table; and updating the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that related tasks can be located through the metadata blood-edge management system when a data table has a problem.
In an embodiment, before implementing the step of obtaining the task creation request to obtain the task to be created, the processor 502 further implements the following steps:
defining an initial task model in the metadata blood-edge management system.
In an embodiment, when the step of generating the relevant attribute of the task model according to the task to be created to obtain the target attribute is implemented by the processor 502, the following steps are specifically implemented:
and according to the task to be created, retrieving a corresponding initial task model in the metadata blood-edge management system, and generating relevant attributes of the task model to obtain target attributes.
The target attributes comprise a task name, a task type, a task identifier, a task creator and task creation time.
In an embodiment, when implementing the step of parsing the task to be created to obtain the association relationships between the task and the input table and the output table, the processor 502 specifically implements the following steps:
acquiring, from a stored procedure, a script related to the task to be created; parsing the task to be created with a script analysis engine corresponding to the script, so as to obtain an input source and an output target of the task to be created; constructing, according to the input source of the task to be created, the association relationship from the task to the input table and from the input table to the task; and constructing, according to the output target of the task to be created, the association relationship from the task to the output table and from the output table to the task.
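The parsing and association steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented engine: it assumes the script is plain SQL, uses two simple regular expressions in place of a real script analysis engine, and the function names and the edge labels (`reads`, `read_by`, `writes`, `written_by`) are hypothetical.

```python
import re

# Simplified patterns; a production engine would use a real SQL parser.
_OUTPUT_RE = re.compile(r"\binsert\s+(?:into|overwrite\s+table)\s+([\w.]+)", re.IGNORECASE)
_INPUT_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def parse_sql_script(script: str) -> tuple[set[str], set[str]]:
    """Return (input tables, output tables) referenced by the script."""
    inputs = set(_INPUT_RE.findall(script))
    outputs = set(_OUTPUT_RE.findall(script))
    return inputs, outputs


def build_associations(task_id: str, inputs: set[str], outputs: set[str]) -> list[tuple[str, str, str]]:
    """Build the association edges in both directions, as the text describes."""
    edges = []
    for table in sorted(inputs):
        edges.append((task_id, "reads", table))       # task -> input table
        edges.append((table, "read_by", task_id))     # input table -> task
    for table in sorted(outputs):
        edges.append((task_id, "writes", table))      # task -> output table
        edges.append((table, "written_by", task_id))  # output table -> task
    return edges
```

Storing each relationship in both directions means a lookup that starts from a table reaches the task as directly as a lookup that starts from the task reaches the table.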
It should be understood that, in an embodiment of the application, the processor 502 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
Those skilled in the art will appreciate that all or part of the flow in a method embodying the above described embodiments may be accomplished by computer programs instructing the relevant hardware. The computer program comprises program instructions, and the computer program can be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present application also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring a task creation request to obtain a task to be created; generating relevant attributes of a task model according to the task to be created, so as to obtain target attributes; determining, according to the target attributes, the dependency relationship between the task to be created and the tasks in the data model, so as to obtain a blood-edge (data lineage) relationship; parsing the task to be created to obtain the association relationships between the task and its input tables and output tables; and updating the blood-edge relationship and the association relationships between the task and the input and output tables in the metadata blood-edge management system, so that, when a data table has a problem, the related tasks can be located through the metadata blood-edge management system.
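The final update step, and the way related tasks might then be located from a problematic table, can be sketched with a small in-memory store. The `LineageStore` class and its method names are hypothetical stand-ins for the metadata blood-edge management system. The choice to treat "related tasks" as the tasks that write the table plus every task downstream of it is also an assumption; the source does not define the search scope.

```python
from collections import defaultdict, deque


class LineageStore:
    """Hypothetical in-memory stand-in for the metadata blood-edge management system."""

    def __init__(self):
        self.readers = defaultdict(set)  # table -> tasks that read it
        self.writers = defaultdict(set)  # table -> tasks that write it
        self.outputs = defaultdict(set)  # task  -> tables it writes

    def update(self, task_id, inputs, outputs):
        """Record the task's associations with its input and output tables."""
        for table in inputs:
            self.readers[table].add(task_id)
        for table in outputs:
            self.writers[table].add(task_id)
            self.outputs[task_id].add(table)

    def related_tasks(self, table):
        """Tasks that write the table, plus all tasks downstream of it."""
        related = set(self.writers.get(table, ()))
        seen = {table}
        queue = deque([table])
        while queue:  # breadth-first walk over table -> reader -> output table
            current = queue.popleft()
            for task in self.readers.get(current, ()):
                related.add(task)
                for out in self.outputs.get(task, ()):
                    if out not in seen:
                        seen.add(out)
                        queue.append(out)
        return related
```

With three chained tasks registered, asking for the tasks related to the middle table returns its producer and both consumers further down the chain, while unrelated tasks are left out.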
In an embodiment, before executing the computer program to implement the step of acquiring a task creation request to obtain a task to be created, the processor further implements the following step:
defining an initial task model in the metadata blood-edge management system.
In an embodiment, when the processor executes the computer program to implement the step of generating relevant attributes of the task model according to the task to be created so as to obtain target attributes, the following step is specifically implemented:
retrieving, according to the task to be created, the corresponding initial task model in the metadata blood-edge management system, and generating the relevant attributes of the task model to obtain the target attributes.
The target attributes comprise a task name, a task type, a task identifier, a task creator and task creation time.
In an embodiment, when the processor executes the computer program to implement the step of parsing the task to be created to obtain the association relationships between the task and the input table and the output table, the following steps are specifically implemented:
acquiring, from a stored procedure, a script related to the task to be created; parsing the task to be created with a script analysis engine corresponding to the script, so as to obtain an input source and an output target of the task to be created; constructing, according to the input source of the task to be created, the association relationship from the task to the input table and from the input table to the task; and constructing, according to the output target of the task to be created, the association relationship from the task to the output table and from the output table to the task.
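The phrase "a script analysis engine corresponding to the script" suggests that the engine is chosen per script type. One way this selection might look is a small registry keyed by file extension, sketched below. The registry, the two toy engines, and the JSON keys `sources` and `targets` are all hypothetical; the source does not say how the engine is chosen or which script types are supported.

```python
import json
import re
from pathlib import PurePosixPath

_SQL_OUT = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
_SQL_IN = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def analyze_sql(text):
    """Toy SQL engine: returns (input sources, output targets)."""
    return set(_SQL_IN.findall(text)), set(_SQL_OUT.findall(text))


def analyze_json(text):
    """Toy engine for a hypothetical JSON job description."""
    config = json.loads(text)
    return set(config.get("sources", [])), set(config.get("targets", []))


# Hypothetical registry: script file extension -> analysis engine.
ENGINES = {".sql": analyze_sql, ".json": analyze_json}


def analyze_script(script_name, text):
    """Pick the engine that corresponds to the script and run it."""
    suffix = PurePosixPath(script_name).suffix.lower()
    engine = ENGINES.get(suffix)
    if engine is None:
        raise ValueError(f"no analysis engine registered for {suffix!r} scripts")
    return engine(text)
```

Because every engine returns the same pair of sets, the association-building steps that follow do not need to know which script type produced them.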
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.
The steps in the method of the embodiments of the application may be reordered, combined, and deleted according to actual needs. The units in the device of the embodiments of the application may be combined, divided, and deleted according to actual needs. In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application.
While the application has been described with reference to certain preferred embodiments, those skilled in the art will understand that various equivalent changes and substitutions can readily be made without departing from the scope of the application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (5)

1. A data and task relationship construction method, characterized by comprising the following steps:
acquiring a task creation request to obtain a task to be created;
generating relevant attributes of a task model according to a task to be created so as to obtain target attributes;
determining the dependency relationship between the task to be created and the task in the data model according to the target attribute so as to obtain a blood-edge relationship;
analyzing the task to be created to obtain the association relationship between the task and the input table and the output table;
updating the blood-edge relationship and the association relationships between the task and the input table and the output table in the metadata blood-edge management system, so that, when a data table has a problem, the related tasks can be located through the metadata blood-edge management system;
the parsing the task to be created to obtain the association relationship between the task and the input table and the output table includes:
acquiring, from a stored procedure, a script related to the task to be created;
analyzing the task to be created by adopting a script analysis engine corresponding to the script to obtain an input source and an output target of the task to be created;
constructing an association relationship between a task and an input table and between the input table and the task according to an input source of the task to be created;
constructing association relations between tasks and output tables and between the output tables and the tasks according to output targets of the tasks to be created;
before the acquiring a task creation request to obtain a task to be created, the method further comprises:
defining an initial task model in the metadata blood-edge management system;
the generating the relevant attribute of the task model according to the task to be created to obtain the target attribute includes:
retrieving, according to the task to be created, the corresponding initial task model in the metadata blood-edge management system, and generating relevant attributes of the task model to obtain target attributes.
2. The data and task relationship construction method according to claim 1, wherein the target attributes include task name, task type, task identification, task creator, and task creation time.
3. A data and task relationship construction device, characterized by comprising:
the request acquisition unit is used for acquiring a task creation request to acquire a task to be created;
the attribute generation unit is used for generating relevant attributes of the task model according to the task to be created so as to obtain target attributes; specifically, according to the task to be created, the corresponding initial task model in the metadata blood-edge management system is called, and relevant attributes of the task model are generated to obtain target attributes;
the blood-edge relationship construction unit is used for determining the dependency relationship between the task to be created and the task in the data model according to the target attribute so as to obtain the blood-edge relationship;
the analysis unit is used for analyzing the task to be created to obtain the association relationship between the task and the input table and the output table;
the updating unit is used for updating the blood-edge relationship and the association relationships among the task, the input table and the output table in the metadata blood-edge management system, so that, when a data table has a problem, the related tasks can be located through the metadata blood-edge management system;
the parsing unit further includes:
the script acquisition subunit is used for acquiring, from a stored procedure, a script related to the task to be created;
the engine analysis subunit is used for analyzing the task to be created by adopting a script analysis engine corresponding to the script so as to obtain an input source and an output target of the task to be created;
the first relation construction subunit is used for constructing association relations between tasks and input tables and between the input tables and the tasks according to input sources of the tasks to be created;
the second relation construction subunit is used for constructing association relations between tasks and output tables and between the output tables and the tasks according to output targets of the tasks to be created;
the device further comprises:
the model definition unit is used for defining an initial task model in the metadata blood-edge management system.
4. A computer device, characterized in that it comprises a memory on which a computer program is stored and a processor which, when executing the computer program, implements the method according to any of claims 1-2.
5. A storage medium storing a computer program which, when executed by a processor, performs the method of any one of claims 1 to 2.
CN201911229154.7A 2019-12-04 2019-12-04 Data and task relation construction method and device, computer equipment and storage medium Active CN111026568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911229154.7A CN111026568B (en) 2019-12-04 2019-12-04 Data and task relation construction method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111026568A CN111026568A (en) 2020-04-17
CN111026568B true CN111026568B (en) 2023-09-29

Family

ID=70204243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911229154.7A Active CN111026568B (en) 2019-12-04 2019-12-04 Data and task relation construction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111026568B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100661B (en) * 2020-09-16 2024-03-12 深圳集智数字科技有限公司 Data processing method and device
CN112506911A (en) * 2020-12-18 2021-03-16 杭州数澜科技有限公司 Data quality monitoring method and device, electronic equipment and storage medium
CN112506957A (en) * 2020-12-18 2021-03-16 杭州数梦工场科技有限公司 Method and device for determining workflow dependency relationship
CN112860662B (en) * 2021-01-22 2023-10-17 平安科技(深圳)有限公司 Automatic production data blood relationship establishment method, device, computer equipment and storage medium
CN113191879B (en) * 2021-05-21 2024-12-24 中国工商银行股份有限公司 Data reporting method, device, system and medium based on complex network
CN113590386B (en) * 2021-07-30 2023-03-03 深圳前海微众银行股份有限公司 Disaster recovery method, system, terminal device and computer storage medium for data
CN114416856A (en) * 2021-12-17 2022-04-29 北京红山信息科技研究院有限公司 A metadata-based data lineage management method and system
CN114564253B (en) * 2022-03-02 2023-06-09 重庆紫光华山智安科技有限公司 Task creation method, system, electronic device and readable storage medium
CN115827226A (en) * 2022-11-25 2023-03-21 四川新网银行股份有限公司 Method, system, device and medium for task scheduling optimization based on lineage
CN116894035B (en) * 2023-07-11 2025-09-05 中电工业互联网有限公司 Method, system, device and medium for constructing multi-source heterogeneous data lineage relationship

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709024A (en) * 2016-12-28 2017-05-24 深圳市华傲数据技术有限公司 Data table source-tracing method and device based on consanguinity analysis
CN109446274A (en) * 2017-08-31 2019-03-08 北京京东尚科信息技术有限公司 The method and apparatus of big data platform BI metadata management
CN109614400A (en) * 2018-11-30 2019-04-12 深圳前海微众银行股份有限公司 Influence and traceability analysis method, device, equipment and storage medium of failed tasks
CN110019384A (en) * 2017-08-15 2019-07-16 阿里巴巴集团控股有限公司 A kind of acquisition methods of blood relationship data provide the method and device of blood relationship data
CN110232056A (en) * 2019-05-21 2019-09-13 苏宁云计算有限公司 A kind of the blood relationship analytic method and its tool of structured query language
CN110532261A (en) * 2019-07-24 2019-12-03 苏州浪潮智能科技有限公司 A method and device for visual monitoring of Hive data warehouse

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9619514B2 (en) * 2014-06-17 2017-04-11 Sap Se Integration of optimization and execution of relational calculation models into SQL layer
US10824403B2 (en) * 2015-10-23 2020-11-03 Oracle International Corporation Application builder with automated data objects creation

Also Published As

Publication number Publication date
CN111026568A (en) 2020-04-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant