CN114492844A - Method and device for constructing machine learning workflow, electronic equipment and storage medium
- Publication number
- CN114492844A CN202210146721.8A
- Authority
- CN
- China
- Prior art keywords
- target
- workflow
- node
- hash value
- target node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Strategic Management (AREA)
- Human Resources & Organizations (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Development Economics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present disclosure provides a method, apparatus, electronic device, and storage medium for constructing a machine learning workflow. The method includes: acquiring a target workflow template; generating a target hash calculation graph corresponding to the target workflow template according to organizational structure information of the target workflow template; calculating, using a defined calculation method for hash calculation graphs and according to parameter information corresponding to the target workflow template, hash values of the target nodes included in the target hash calculation graph; and verifying the hash values of the target nodes to determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then constructing a target workflow from those two sets of steps. The method addresses the waste of time and resources caused by repeatedly executing identical workflows, improves the running efficiency and performance of workflows, and thereby improves the user experience.
Description
Technical Field
The present disclosure relates to the field of logistics technology, and in particular to a method, apparatus, electronic device, and storage medium for constructing a machine learning workflow.
Background
Machine learning workflow technology can address the low efficiency and uneven quality of application development. It evolved mainly from process-management workflows, and current workflow execution can be divided into data-dependency-driven execution and task-state-dependency-driven execution. In real application development, a machine learning workflow may be modified and run repeatedly, and its directed acyclic graph structure is complex; developers need to use the results of individual steps to locate the cause of poor overall results. When running a workflow, however, the prior art attends only to top-down computational correctness, data-exchange correctness, and running time. As a result, identical workflows are executed repeatedly, which wastes time and resources, degrades workflow performance, and gives a poor user experience.
It should be noted that the information disclosed in the Background section above is provided only to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art already known to a person of ordinary skill in the art.
Summary of the Invention
An object of the present disclosure is to provide a method, apparatus, electronic device, and storage medium for constructing a machine learning workflow. The method addresses the waste of time and resources caused by repeatedly executing identical workflows and improves the running efficiency and performance of workflows, thereby improving the user experience.
Other features and advantages of the present disclosure will become apparent from the following detailed description, or may be learned in part through practice of the present disclosure.
According to one aspect of the present disclosure, a method for constructing a machine learning workflow is provided, including: acquiring a target workflow template; generating a target hash calculation graph corresponding to the target workflow template according to organizational structure information of the target workflow template; calculating, using a defined calculation method for hash calculation graphs and according to parameter information corresponding to the target workflow template, hash values of the target nodes included in the target hash calculation graph; and verifying the hash values of the target nodes to determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then constructing a target workflow from the steps that need to be run and the steps that have already run successfully.
In some exemplary embodiments of the present disclosure, the calculation method of the hash calculation graph is defined as follows. The hash value of a node is calculated by summing the hash value of the node's input parameters and the hash value of the node's configuration parameters, and then applying a hash algorithm to the sum. The hash value of an output parameter of a node is calculated by summing the hash value of the node and the hash value of the name of that output parameter.
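The two formulas can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the source does not specify the hash algorithm or what "summing" hash values means, so SHA-256 and integer addition modulo 2^256 over hex digests are assumptions, as are the helper names `h`, `add_hashes`, `node_hash`, and `output_param_hash` and the example parameter values.

```python
import hashlib

MODULUS = 1 << 256  # keeps a sum of SHA-256 digests at 256 bits (assumption)


def h(text: str) -> str:
    """Return the hex SHA-256 digest of a string (algorithm choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def add_hashes(*digests: str) -> str:
    """'Sum' hex digests by integer addition modulo 2**256 (one possible reading)."""
    total = sum(int(d, 16) for d in digests) % MODULUS
    return format(total, "064x")


def node_hash(input_hash: str, config_hash: str) -> str:
    """Node hash: hash algorithm applied to (input hash + configuration hash)."""
    return h(add_hashes(input_hash, config_hash))


def output_param_hash(node_digest: str, output_name: str) -> str:
    """Output-parameter hash: node hash + hash of the output parameter's name."""
    return add_hashes(node_digest, h(output_name))


# Hypothetical parameter values, for illustration only.
digest = node_hash(h("input: data.csv"), h("test_size=0.2"))
train_out = output_param_hash(digest, "train")
test_out = output_param_hash(digest, "test")
```

Because a node's hash depends on both its inputs and its configuration, changing a configuration value such as `test_size` changes the node's hash and, through the output-parameter hashes, the hashes of every downstream node.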
In some exemplary embodiments of the present disclosure, generating the target hash calculation graph corresponding to the target workflow template according to the organizational structure information of the target workflow template includes: converting the components included in the target workflow template into target nodes of the target hash calculation graph, and converting the connection relationships between the components into connection relationships between the target nodes, thereby generating the target hash calculation graph.
In some exemplary embodiments of the present disclosure, calculating the hash values of the target nodes included in the target hash calculation graph, using the defined calculation method and according to the parameter information corresponding to the target workflow template, includes: determining the input parameters and the configuration parameters corresponding to a target node according to the connection relationships between the target nodes and the parameters of the components; calculating the hash value of the input parameters corresponding to the target node according to the parameter information of those input parameters; querying the parameter information of the configuration parameters corresponding to the target node and calculating the hash value of those configuration parameters from the queried parameter information; and substituting the hash value of the input parameters and the hash value of the configuration parameters into the node hash value formula to calculate the hash value of the target node.
In some exemplary embodiments of the present disclosure, the input parameters corresponding to the target node include the input parameters of the component corresponding to the target node and the output parameters corresponding to the upstream nodes of the target node; and the configuration parameters corresponding to the target node include the configuration parameters of the component corresponding to the target node.
In some exemplary embodiments of the present disclosure, calculating the hash value of the input parameters corresponding to the target node according to their parameter information includes: acquiring the parameter information of the input parameters of the component corresponding to the target node and applying a hash algorithm to it to obtain the hash value of the component's input parameters; substituting the hash value of each upstream node of the target node and the name of that upstream node's output parameter into the output-parameter hash value formula to calculate the hash value of the output parameter corresponding to the upstream node; and summing the hash value of the component's input parameters and the hash values of the upstream nodes' output parameters to obtain the hash value of the input parameters corresponding to the target node.
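The input-parameter calculation can be sketched as below. This is an illustrative sketch under the same assumptions as before (SHA-256, and "summing" read as integer addition modulo 2^256); the JSON serialization of parameter information, the function name `input_params_hash`, and the example node and parameter names are hypothetical.

```python
import hashlib
import json

MODULUS = 1 << 256  # assumption: sums of digests are reduced to 256 bits


def h(text: str) -> str:
    """Hex SHA-256 digest of a string (algorithm choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def add_hashes(*digests: str) -> str:
    """'Sum' hex digests by integer addition modulo 2**256 (one possible reading)."""
    total = sum(int(d, 16) for d in digests) % MODULUS
    return format(total, "064x")


def input_params_hash(own_inputs: dict, upstream_outputs: list) -> str:
    """Combine a node's own input parameters with its upstream outputs.

    own_inputs: parameter info of the component's own input parameters.
    upstream_outputs: (upstream node hash, output parameter name) pairs.
    """
    # Hash the component's own input parameter info (canonical JSON is an assumption).
    own = h(json.dumps(own_inputs, sort_keys=True))
    # Output-parameter hash of each upstream node: node hash + hash(name).
    upstream = [add_hashes(node_digest, h(name))
                for node_digest, name in upstream_outputs]
    return add_hashes(own, *upstream)


# Hypothetical upstream node hashes and output names, for illustration only.
node_c, node_d = h("node C"), h("node D")
e_inputs = input_params_hash({}, [(node_c, "merged"), (node_d, "merged")])
```

Under this reading, addition makes the result independent of the order in which upstream nodes are listed, while any change to an upstream node's hash propagates into the downstream input hash.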
In some exemplary embodiments of the present disclosure, calculating the hash value of the configuration parameters corresponding to the target node includes: determining whether the parameter type of a configuration parameter corresponding to the target node is an index; and if so, calculating, by means of an interface request, the hash value of the file located by the queried parameter information, and taking the hash value of that file as the hash value of the configuration parameter.
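A minimal sketch of this branch is shown below. It is an assumption-laden illustration: a local file read stands in for the interface request, the type string `"index"` and the function names `file_hash` and `config_param_hash` are hypothetical, and SHA-256 is an assumed hash algorithm. The handling of non-index parameters (hashing the serialized value) is also an assumption, since this passage only describes the index case.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks (local read stands in for an interface request)."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def config_param_hash(param_type: str, value) -> str:
    """Hash a configuration parameter.

    For an index-type parameter the value only points at a file, so the
    file's contents are hashed; otherwise the value itself is hashed.
    """
    if param_type == "index":
        return file_hash(value)
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Hashing the file contents rather than the index means that replacing the file under an unchanged path still changes the node's hash, so the affected step is not mistaken for one that has already run.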
In some exemplary embodiments of the present disclosure, verifying the hash values of the target nodes to determine the workflow steps that need to be run and the workflow steps that have already run successfully, and constructing the target workflow from them, includes: querying whether the hash value of the target node exists in a step information database; if so, marking the workflow step corresponding to the target node as a step that has already run successfully, and reading the running information of that step from the step information database; if not, marking the workflow step corresponding to the target node as a step that needs to be run; and generating a target workflow configuration file according to the running information of the steps that have already run successfully and the steps that need to be run.
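The verification step amounts to a cache lookup keyed by node hash. The sketch below illustrates it with an in-memory dictionary standing in for the step information database; the function name `build_workflow_config`, the status strings, the layout of the resulting configuration, and the example hashes and records are all hypothetical.

```python
def build_workflow_config(node_hashes: dict, step_db: dict) -> dict:
    """Split steps into already-succeeded and to-run by looking up each node hash.

    node_hashes: step name -> node hash value.
    step_db: stand-in for the step information database,
             node hash value -> stored running information.
    """
    steps = []
    for step, digest in node_hashes.items():
        run_info = step_db.get(digest)
        if run_info is not None:
            # Hash found: the step already ran successfully, reuse its record.
            steps.append({"step": step, "hash": digest,
                          "status": "succeeded", "run_info": run_info})
        else:
            # Hash not found: the step must be run.
            steps.append({"step": step, "hash": digest, "status": "to_run"})
    return {"steps": steps}


# Hypothetical hashes and stored records, for illustration only.
step_db = {
    "hash-a": {"output_path": "/runs/1/a"},
    "hash-b": {"output_path": "/runs/1/b"},
}
config = build_workflow_config(
    {"A": "hash-a", "B": "hash-b", "C": "hash-c-new"}, step_db
)
to_run = [s["step"] for s in config["steps"] if s["status"] == "to_run"]
```

In this example only step C would be executed; steps A and B are carried into the configuration with the running information read from the database.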
In some exemplary embodiments of the present disclosure, the method further includes: calling an interface to run the target workflow configuration file; and monitoring the running status of each step in the target workflow configuration file and, when a step runs successfully, updating the running status of that step to successful.
According to one aspect of the present disclosure, an apparatus for constructing a machine learning workflow is provided, including: an acquisition module configured to acquire a target workflow template; a generation module configured to generate a target hash calculation graph corresponding to the target workflow template according to organizational structure information of the target workflow template; a calculation module configured to calculate, using a defined calculation method for hash calculation graphs and according to parameter information corresponding to the target workflow template, hash values of the target nodes included in the target hash calculation graph; and a construction module configured to verify the hash values of the target nodes, determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then construct a target workflow from the steps that need to be run and the steps that have already run successfully.
According to one aspect of the present disclosure, an electronic device is provided, including: at least one processor; and a storage device configured to store at least one program which, when executed by the at least one processor, causes the at least one processor to implement any of the above methods for constructing a machine learning workflow.
According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements any of the above methods for constructing a machine learning workflow.
The method for constructing a machine learning workflow provided by the embodiments of the present disclosure first generates the target hash calculation graph corresponding to the target workflow template, then uses the defined calculation method to calculate the hash values of the target nodes included in the graph, and then verifies the calculated hash values to obtain the workflow steps that need to be run and the workflow steps that have already run successfully. The steps that have already run successfully can then be pruned and the target workflow constructed. This addresses the waste of time and resources caused in the prior art by repeatedly executing identical workflows, improves the running efficiency and performance of workflows, and improves the user experience.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its principles. The drawings described below are merely some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows a schematic diagram of an exemplary system architecture to which the method for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure can be applied;
FIG. 2 shows a flowchart of a method for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure;
FIG. 3 shows a schematic structural diagram of a workflow template according to an exemplary embodiment of the present disclosure;
FIG. 4 shows a schematic structural diagram of a hash calculation graph according to an exemplary embodiment of the present disclosure;
FIG. 5 shows a flowchart of calculating the hash values of the target nodes included in the target hash calculation graph according to an exemplary embodiment of the present disclosure;
FIG. 6 shows an overall flowchart of machine learning workflow construction according to an exemplary embodiment of the present disclosure;
FIG. 7 shows a flowchart of calculating the hash value of each target node included in the target hash calculation graph according to an exemplary embodiment of the present disclosure;
FIG. 8 shows a schematic structural diagram of an apparatus for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure;
FIG. 9 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks, processor devices, and/or microcontroller devices.
FIG. 1 shows a schematic diagram of an exemplary system architecture to which the method for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure can be applied.
As shown in FIG. 1, the system architecture may include a server 101, a network 102, and a client 103. The network 102 is the medium that provides a communication link between the client 103 and the server 101. The network 102 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The server 101 may be a server that provides various services, for example a back-end management server that supports the apparatus a user operates through the client. The back-end management server can analyze and otherwise process received data such as requests, and return the processing results to the client.
The client 103 may be a mobile terminal such as a mobile phone, game console, tablet computer, e-book reader, smart glasses, smart home device, AR (Augmented Reality) device, or VR (Virtual Reality) device; alternatively, the client 103 may be a personal computer such as a laptop or desktop computer.
In an exemplary embodiment of the present disclosure, a user may send a workflow processing instruction to the server through the client. The server may, for example, receive the workflow processing instruction and acquire a target workflow template according to it; generate a target hash calculation graph corresponding to the target workflow template according to the organizational structure information of the target workflow template; calculate, using the defined calculation method for hash calculation graphs and according to the parameter information corresponding to the target workflow template, the hash values of the target nodes included in the target hash calculation graph; and verify the hash values of the target nodes to determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then construct the target workflow from those steps.
It should be understood that the numbers of clients, networks, and servers in FIG. 1 are merely illustrative. The server 101 may be a single physical server, a server cluster composed of multiple servers, or a cloud server, and there may be any number of terminal devices, networks, and servers according to actual needs.
The steps of the method for constructing a machine learning workflow (Machine Learning Pipelines) in the example embodiments of the present disclosure are described in more detail below with reference to the drawings and embodiments.
FIG. 2 shows a flowchart of a method for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure. The method provided by the exemplary embodiment may be executed by the server shown in FIG. 1, but the present disclosure is not limited thereto.
As shown in FIG. 2, the method for constructing a machine learning workflow provided by an exemplary embodiment of the present disclosure may include the following steps.
Step S201: acquire a target workflow template.
The target workflow template is the workflow template that needs to be run, and it can be acquired from the unique identifier of the target workflow template provided by the user. For example, the user enters the unique identifier of the workflow template to be run in the interface of the workflow system; once the unique identifier is determined, the target workflow template can be looked up in the workflow template database by that identifier. The unique identifier of a workflow template can be set according to the actual situation, for example the workflow name or the name of the workflow's business scenario. The workflow system can be regarded as a system that processes workflows and provides a system interface through which the user enters information.
It should be noted that, in the method for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure, workflow templates may be created in advance as required and stored in the workflow template database, so that a required workflow template can be looked up by its unique identifier. The organizational structure information of a workflow template may also be stored in the workflow template database. The organizational structure information of a workflow template may include: the components included in the template, the connection relationships between the components, and the input parameters, output parameters, common configuration parameters, and resource configuration parameters of each component. FIG. 3 shows a schematic structural diagram of a workflow template according to an exemplary embodiment of the present disclosure. The workflow template shown in FIG. 3 includes the components Read Local File, Data Split, File Merge (by Column), File Merge (by Row), and Linear Regression. The Data Split component divides a single dataset into a training set and a test set. Its common configuration parameters are {parameter name: test_size, parameter type: Float} and {parameter name: target_column, parameter type: String}, and its resource configuration parameters are {CPU: 0.5 core, memory: 1024M, GPU: 0}. The workflow template shown in FIG. 3 also shows the connection relationships between the components, specifically the connections between the output parameters of upstream components and the input parameters of downstream components.
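The organizational structure information described above could be represented roughly as follows. This is a hypothetical sketch: the class names `ConfigParam` and `Component`, the field names, and the input and output parameter names are assumptions made for illustration; only the Data Split component's configuration and resource values come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigParam:
    """A common configuration parameter: name and declared type."""
    name: str
    type: str


@dataclass
class Component:
    """One component of a workflow template (field names are assumptions)."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    config_params: list = field(default_factory=list)
    resources: dict = field(default_factory=dict)


data_split = Component(
    name="data_split",
    inputs=["dataset"],          # hypothetical input parameter name
    outputs=["train", "test"],   # hypothetical output parameter names
    config_params=[
        ConfigParam("test_size", "Float"),
        ConfigParam("target_column", "String"),
    ],
    resources={"cpu_cores": 0.5, "memory": "1024M", "gpus": 0},
)
```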
Therefore, once the unique identifier of the target workflow template is determined, the target workflow template can be acquired by that identifier, together with the components it includes, the connection relationships between the components, and the parameters of the components. The parameters of a component may include input parameters, output parameters, common configuration parameters, and resource configuration parameters.
Step S202: generate a target hash calculation graph corresponding to the target workflow template according to the organizational structure information of the target workflow template.
As described in step S201 above, once the target workflow template is acquired, its organizational structure information can also be acquired, namely the components included in the target workflow template, the connection relationships between the components, and the input parameters, output parameters, common configuration parameters, and resource configuration parameters of each component. The target hash calculation graph corresponding to the target workflow template can then be generated from this organizational structure information. A hash calculation graph is a calculation graph in which a hash function is attached to each node.
In an exemplary embodiment of the present disclosure, generating the target hash calculation graph corresponding to the target workflow template according to the organizational structure information of the target workflow template may include: converting the components contained in the target workflow template into target nodes of the target hash calculation graph, and converting the connection relationships between the components into connection relationships between the target nodes, thereby generating the target hash calculation graph.
When generating the target hash calculation graph, the components contained in the target workflow template are converted into target nodes of the graph, and the connection relationships between the components are converted into connection relationships between the target nodes. Specifically, each component in the target workflow template can be regarded as one target node of the target hash calculation graph, and the connections between components correspond to the connections between target nodes. The target workflow template contains one or more components; since each component is regarded as a node, the target hash calculation graph likewise contains one or more target nodes. The term "target node" is used for the nodes of the target hash calculation graph in order to distinguish them from the generic "node" that appears in the node hash value calculation formula.
FIG. 4 shows a schematic structural diagram of a hash calculation graph according to an exemplary embodiment of the present disclosure; the hash calculation graph shown in FIG. 4 is generated from the workflow template shown in FIG. 3. As can be seen from FIG. 4, the read local file, data division, file merge (by column), file merge (by row), and linear regression components in FIG. 3 are converted into nodes A, B, C, D, and E, respectively. According to the connection relationships between the components, the relationships between the nodes are set as follows: node A is the upstream node of node B, node B is the upstream node of node C and node D, and node C and node D are the upstream nodes of node E.
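The conversion from components to target nodes can be illustrated with a minimal Python sketch. The template layout (the `components` and `connections` fields) and the function name `build_hash_graph` are assumptions made for illustration only; the disclosure does not specify a storage format. The node labels and output parameter names follow the FIG. 4 example.

```python
# Hypothetical template layout; the field names are assumptions, not from the source.
TEMPLATE = {
    "components": {
        "A": "read local file",
        "B": "data division",
        "C": "file merge (by column)",
        "D": "file merge (by row)",
        "E": "linear regression",
    },
    # Each connection links an output parameter of an upstream component
    # to a downstream component: (upstream, output parameter, downstream).
    "connections": [
        ("A", "a1", "B"),
        ("B", "b1", "C"),
        ("B", "b2", "D"),
        ("C", "c1", "E"),
        ("D", "d1", "E"),
    ],
}


def build_hash_graph(template):
    """Map each target node to the (upstream node, output parameter) pairs feeding it."""
    graph = {node: [] for node in template["components"]}
    for upstream, output_name, downstream in template["connections"]:
        graph[downstream].append((upstream, output_name))
    return graph


GRAPH = build_hash_graph(TEMPLATE)
```

Storing the upstream output parameter on each edge keeps the information needed later, when the hash value of an upstream output parameter is derived from the upstream node's hash value and the parameter name.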
Step S203: Using the defined calculation method of the hash calculation graph, calculate the hash values of the target nodes contained in the target hash calculation graph according to the parameter information corresponding to the target workflow template.
Unlike a hash chain, a hash calculation graph connects the nodes in the form of a graph: the values of all directly connected upstream nodes are concatenated, and the value obtained by applying the hash algorithm to the result is the value of the current node. In other words, the hash value of a node is calculated using the defined calculation method of the hash calculation graph. This calculation method can be regarded as a workflow pruning algorithm based on the hash calculation graph. Pruning means using some test to avoid unnecessary traversal, that is, cutting off certain "branches" of the search tree. The core problem in applying pruning is designing the pruning test, that is, the method for deciding which branches to discard and which to keep. The workflow pruning algorithm based on the hash calculation graph uses the hash calculation graph to prune: each node of the graph is analyzed to decide whether the workflow step corresponding to the node should be discarded or kept. Workflow steps that would otherwise be run again can thus be pruned; when the workflow runs, the pruned steps are skipped, which avoids re-executing an identical workflow, improves the running efficiency of the workflow, and improves the user experience.
In an exemplary embodiment of the present disclosure, the calculation method of the hash calculation graph may include: (1) hash value of a node = Hash(hash value of the input parameters corresponding to the node + hash value of the configuration parameters corresponding to the node), where Hash() denotes applying the hash algorithm to the content in parentheses; (2) hash value of an output parameter corresponding to a node = hash value of the node + hash value of the name of the output parameter. A hash algorithm guarantees that the same input always yields the same output, and makes it highly unlikely that different inputs yield the same output. Several hash algorithms are currently available, such as md5, sha1, sha256, and sha512; taking computational efficiency, storage requirements, and collision probability into account, the md5 algorithm is preferred.
In the node hash value calculation formula, the hash value of the input parameters corresponding to the node = the hash values of all input parameters of the node + the hash values of the parameter names of all input parameters of the node, and the hash value of the configuration parameters corresponding to the node = the hash values of all configuration parameters of the node + the hash values of the parameter names of all configuration parameters of the node. Because the different components contained in the same workflow template share the same resource configuration parameters, the configuration parameters here may be the common configuration parameters only. In addition, since a node may have multiple output parameters, the hash value of the parameter name is introduced into the formula to distinguish between them.
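The two formulas can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the "+" in the formulas is interpreted here as concatenation of hexadecimal digests, parameters are combined in name order so that the result is deterministic, and all function names are hypothetical. The md5 algorithm is used because the text names it as preferred.

```python
import hashlib


def h(text: str) -> str:
    """Hash() from the formulas, using md5 as preferred in the text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def params_hash(params: dict) -> str:
    """Per parameter: hash of the value + hash of the name.

    Assumption: "+" means concatenation of hex digests, in name order.
    """
    return "".join(h(str(value)) + h(name) for name, value in sorted(params.items()))


def node_hash(input_params_hash: str, config_params_hash: str) -> str:
    """Formula (1): Hash(input parameter hash + configuration parameter hash)."""
    return h(input_params_hash + config_params_hash)


def output_param_hash(node_hash_value: str, output_name: str) -> str:
    """Formula (2): node hash + hash of the output parameter name."""
    return node_hash_value + h(output_name)


# Node A of FIG. 4 has no upstream node, so only its own parameters contribute.
# The parameter values below are made up for illustration.
HASH_A = node_hash(params_hash({"a": "/data/train.csv"}), params_hash({"encoding": "utf-8"}))
HASH_A1 = output_param_hash(HASH_A, "a1")
```

Because the output parameter hash embeds the node hash, any change to a node's inputs or configuration propagates to every downstream node that consumes one of its outputs.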
FIG. 5 shows a flowchart of calculating the hash value of a target node contained in the target hash calculation graph according to an exemplary embodiment of the present disclosure. As shown in FIG. 5, using the defined calculation method of the hash calculation graph to calculate the hash value of a target node contained in the target hash calculation graph according to the parameter information corresponding to the target workflow template may include:
Step S501: determining, according to the connection relationships between the target nodes and the parameters of the components, the input parameters corresponding to the target node and the configuration parameters corresponding to the target node;
Step S502: calculating the hash value of the input parameters corresponding to the target node according to the parameter information of those input parameters;
Step S503: querying the parameter information of the configuration parameters corresponding to the target node, and calculating the hash value of the configuration parameters corresponding to the target node according to the queried parameter information;
Step S504: substituting the hash value of the input parameters corresponding to the target node and the hash value of the configuration parameters corresponding to the target node into the node hash value calculation formula to calculate the hash value of the target node.
Here, step S501 determines the input parameters and configuration parameters corresponding to the target node; step S502 calculates the hash value of the input parameters corresponding to the target node from the input parameter information; step S503 calculates the hash value of the configuration parameters corresponding to the target node from the configuration parameter information; and step S504 substitutes the input parameter hash value and the configuration parameter hash value obtained in the preceding steps into the defined node hash value calculation formula, which yields the hash value of the target node. Note that, as explained above, the target hash calculation graph contains one or more target nodes, so the method described in steps S501 to S504 must be applied to calculate the hash value of every target node. Steps S501 to S504 are described in detail below.
First, in step S501, the input parameters corresponding to a target node may include the input parameters of the component corresponding to the target node and the output parameters corresponding to the upstream nodes of the target node; the configuration parameters corresponding to a target node may include the configuration parameters of the component corresponding to the target node. If a node has no upstream node, the input parameters corresponding to that node are simply the input parameters of its component.
For ease of understanding, nodes A, B, C, D, and E in FIG. 4 are taken as an example. Node A corresponds to the read local file component and has no upstream node, so the input parameter corresponding to node A is input parameter a of the read local file component, and the configuration parameters corresponding to node A are the configuration parameters of the read local file component. Node B corresponds to the data division component and its upstream node is node A, so the input parameters corresponding to node B are input parameter b of the data division component and output parameter a1 corresponding to node A, and the configuration parameters corresponding to node B are the configuration parameters of the data division component. Node C corresponds to the file merge (by column) component and its upstream node is node B, so the input parameters corresponding to node C are input parameter c of the file merge (by column) component and output parameter b1 corresponding to node B, and the configuration parameters corresponding to node C are the configuration parameters of the file merge (by column) component. Node D corresponds to the file merge (by row) component and its upstream node is node B, so the input parameters corresponding to node D are input parameter d of the file merge (by row) component and output parameter b2 corresponding to node B, and the configuration parameters corresponding to node D are the configuration parameters of the file merge (by row) component. Node E corresponds to the linear regression component and its upstream nodes are node C and node D, so the input parameters corresponding to node E are input parameter e of the linear regression component, output parameter c1 corresponding to node C, and output parameter d1 corresponding to node D, and the configuration parameters corresponding to node E are the configuration parameters of the linear regression component.
In step S502, the hash value of the input parameters corresponding to the target node is calculated according to the parameter information of those input parameters. A specific implementation may be as follows:
(1) Obtain the parameter information of the input parameters of the component corresponding to the target node, and apply the hash algorithm to the obtained parameter information to obtain the hash value of the input parameters of the component corresponding to the target node.
The user provides the parameter information for the input parameters of the component corresponding to the target node. For example, the user enters the actual run-time values of the input parameters of each component of the target workflow template in the interface of the workflow system, and the hash values are calculated from this user-provided information. For instance, the concrete value m1 of input parameter b of the data division component corresponding to node B is obtained; the hash algorithm is applied to m1 and to the parameter name of b, and the two results are summed to give the hash value of input parameter b of the data division component corresponding to node B.
Note that if the component corresponding to the target node has multiple input parameters, the hash value of each input parameter must be calculated, and the results are then summed to give the hash value of the input parameters of the component corresponding to the target node.
(2) Substitute the hash value of the upstream node of the target node and the parameter name of the output parameter corresponding to that upstream node into the formula for the hash value of an output parameter corresponding to a node, to calculate the hash value of the output parameter corresponding to the upstream node of the target node.
As explained above, the input parameters corresponding to a target node include the output parameters corresponding to its upstream nodes, so the formula for the hash value of an output parameter corresponding to a node can be used to calculate the hash value of the output parameter corresponding to the upstream node. For example, the input parameters corresponding to node B include output parameter a1 corresponding to node A, so the hash value of node A and the hash value of the parameter name of a1 are summed to give the hash value of output parameter a1 corresponding to node A, the upstream node of node B. If the upstream nodes of the target node have multiple corresponding output parameters, the hash value of each output parameter must be calculated, and the results are then summed to give the hash value of the output parameters corresponding to the upstream nodes of the target node. Note also that an output parameter corresponding to the upstream node of a target node means a parameter that is output from the upstream node and input to the target node. For example, in FIG. 4, output parameter b1 is the output parameter corresponding to node B as the upstream node of node C, and it is an input parameter of node C; parameter b2 is also an output parameter of node B, but it cannot be regarded as an input parameter of node C.
(3) Sum the hash value of the input parameters of the component corresponding to the target node and the hash value of the output parameters corresponding to the upstream nodes of the target node to obtain the hash value of the input parameters corresponding to the target node. For example, the hash value of the input parameters corresponding to node B = the hash value of input parameter b + the hash value of node A + the hash value of the parameter name of a1.
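Sub-steps (1) to (3) can be sketched for node B of FIG. 4 as follows. The sketch assumes that "summing" hash values means concatenating hexadecimal digests in a fixed order; the function names are hypothetical, and the hash value of node A is a placeholder rather than a value computed from real parameters.

```python
import hashlib


def h(text: str) -> str:
    """md5 digest of a string, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def own_inputs_hash(inputs: dict) -> str:
    """Sub-step (1): for each component input, hash(value) + hash(name)."""
    return "".join(h(str(value)) + h(name) for name, value in sorted(inputs.items()))


def upstream_outputs_hash(upstream: list) -> str:
    """Sub-step (2): for each (upstream node hash, output name), node hash + hash(name)."""
    return "".join(node_hash + h(output_name) for node_hash, output_name in sorted(upstream))


def input_params_hash(inputs: dict, upstream: list) -> str:
    """Sub-step (3): combine the component inputs with the upstream outputs."""
    return own_inputs_hash(inputs) + upstream_outputs_hash(upstream)


# Placeholder for the already computed hash value of upstream node A.
HASH_A = h("node-A-placeholder")

# Node B: own input parameter b with value m1, plus output a1 of node A.
INPUT_HASH_B = input_params_hash({"b": "m1"}, [(HASH_A, "a1")])
```

Only the outputs that actually feed the target node are passed in `upstream`, which matches the remark that b2 is not an input parameter of node C.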
Step S502 calculates the hash value of the input parameters corresponding to the target node; step S503 queries the parameter information of the configuration parameters corresponding to the target node and then calculates the hash value of those configuration parameters from the queried parameter information.
In step S503, the parameter configuration database is queried to obtain the actual run-time information for the configuration parameters of each component of the target workflow template, mainly the actual values of the components' common configuration parameters and the running resource configuration parameters. The type of a common configuration parameter is either a value or an index, such as a file path; the running resource configuration parameters include the number of CPUs, the number of GPU cores, and the memory size. If a common configuration parameter of a component is of the index type, a request must first be made through an interface to calculate the hash value of the corresponding file, and the hash value obtained is used as the value of the parameter. Accordingly, in an exemplary embodiment of the present disclosure, calculating the hash value of the configuration parameters corresponding to the target node may include: determining whether the parameter type of the configuration parameter corresponding to the target node is index; and if so, calculating, by means of an interface request, the hash value of the file referenced by the queried parameter information, and taking the hash value of the file as the hash value of the configuration parameter corresponding to the target node.
Step S502 yields the hash value of the input parameters corresponding to the target node, and step S503 yields the hash value of the configuration parameters corresponding to the target node. In step S504, these two hash values are substituted into the node hash value calculation formula to obtain the hash value of the target node. Note that, in an exemplary embodiment of the present disclosure, the hash values of the target nodes in the target hash calculation graph must be calculated in order from top to bottom, starting from the target nodes that have no upstream node.
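The distinction between value-type and index-type configuration parameters might look like the following Python sketch. The interface that the text says is requested to compute the file hash is not specified, so `request_file_hash` below is a stand-in that simply reads a local file; the parameter dictionary layout (`type`, `value`) is likewise an assumption. Only the value part of the parameter is hashed here; combining it with the parameter-name hash is omitted for brevity.

```python
import hashlib
import os
import tempfile


def md5_hex(data: bytes) -> str:
    """md5 digest of raw bytes, as a hex string."""
    return hashlib.md5(data).hexdigest()


def request_file_hash(path: str) -> str:
    """Stand-in for the interface request: hash the file content locally."""
    with open(path, "rb") as handle:
        return md5_hex(handle.read())


def config_value_hash(param: dict) -> str:
    """Hash the referenced file for index-type parameters, else hash the value itself."""
    if param["type"] == "index":
        return request_file_hash(param["value"])
    return md5_hex(str(param["value"]).encode("utf-8"))


# Demonstration with a temporary file standing in for an indexed data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"col1,col2\n1,2\n")
    DATA_PATH = tmp.name

INDEX_HASH = config_value_hash({"type": "index", "value": DATA_PATH})
VALUE_HASH = config_value_hash({"type": "value", "value": 0.8})
os.remove(DATA_PATH)
```

Hashing the file content rather than the path means that a step is treated as changed when the data behind an unchanged path changes, and as unchanged when identical data is referenced again.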
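The top-to-bottom order can be obtained with a topological sort, as in the sketch below for the FIG. 4 graph. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The node hash here is deliberately simplified to Hash(upstream node hashes + hash of a configuration string), and the configuration strings are invented, so that the propagation of a change is easy to see; it is not the full formula from the text.

```python
import hashlib
from graphlib import TopologicalSorter


def h(text: str) -> str:
    """md5 digest of a string, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Node -> upstream nodes, following FIG. 4.
UPSTREAM = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}


def compute_hashes(upstream: dict, configs: dict) -> dict:
    """Visit nodes so that every upstream hash exists before it is needed."""
    hashes = {}
    for node in TopologicalSorter(upstream).static_order():
        upstream_part = "".join(hashes[u] for u in sorted(upstream[node]))
        # Simplified node hash (assumption): upstream hashes + hash of the config.
        hashes[node] = h(upstream_part + h(configs[node]))
    return hashes


BASE_CONFIGS = {node: node + "-v1" for node in UPSTREAM}
FIRST = compute_hashes(UPSTREAM, BASE_CONFIGS)

# Change only the configuration of node C and recompute.
CHANGED = compute_hashes(UPSTREAM, {**BASE_CONFIGS, "C": "C-v2"})
ORDER = list(TopologicalSorter(UPSTREAM).static_order())
```

In this example the hash values of A, B, and D are unaffected by the change to C, while C and its downstream node E receive new hash values, which is the property the pruning relies on.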
Step S204: Verify the hash values of the target nodes to determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then construct the target workflow from the workflow steps that need to be run and the workflow steps that have already run successfully.
Step S203 yields the hash value of each target node. In step S204, the calculated hash values must be verified. The verification process may be: querying whether the hash value of the target node exists in the step information database; if so, marking the workflow step corresponding to the target node as a workflow step that has already run successfully, and reading the running information of that workflow step from the step information database; if not, marking the workflow step corresponding to the target node as a workflow step that needs to be run; and generating the target workflow configuration file according to the running information of the workflow steps that have already run successfully and the workflow steps that need to be run.
After the hash value of a target node has been calculated, the step information database is queried to check whether that hash value already exists. If the calculated hash value already exists in the step information database, a step with the same parameter configuration and the same definition (that is, an identical step) has already run successfully at some earlier time. It is therefore sufficient to look up the earlier running status and information in the step information database, assign them to the workflow step corresponding to the target node, and mark that workflow step as one that has already run successfully. If the hash value of the target node is not in the step information database, no step identical to the workflow step corresponding to the target node has been run, so that workflow step must be marked as one that needs to be run.
After the hash values of all target nodes in the target hash calculation graph have been analyzed, the workflow steps in the target workflow template that have already run successfully and those that need to be run are known, and the running information of the steps that have already run successfully is available, so the target workflow configuration file can be generated. In other words, once the hash values of all target nodes in the target hash calculation graph have been calculated, the workflow steps that have already run successfully (that is, the steps that do not need to be run again) are skipped, which means they have been pruned, and the workflow steps that need to be run are obtained. At this point, the target workflow configuration file can be rebuilt from the workflow steps that need to be run and the running information of the workflow steps that have already run successfully.
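The verification and pruning described above can be sketched with an in-memory dictionary standing in for the step information database. The dictionary layout, the shortened hash strings, and the name `classify_steps` are illustrative assumptions; the disclosure does not describe the database schema or the format of the target workflow configuration file.

```python
def classify_steps(node_hashes: dict, step_db: dict):
    """Split steps into those whose hash is already recorded and those that must run.

    node_hashes: node label -> calculated hash value
    step_db:     hash value -> running information of a step that ran successfully
    """
    already_run = {}
    to_run = []
    for node, digest in node_hashes.items():
        if digest in step_db:
            # Identical step ran before: reuse its recorded running information.
            already_run[node] = step_db[digest]
        else:
            to_run.append(node)
    return already_run, to_run


# Stand-in database: nodes A and B ran successfully in an earlier workflow.
STEP_DB = {
    "hash-a": {"status": "succeeded", "workflow": "wf-001"},
    "hash-b": {"status": "succeeded", "workflow": "wf-001"},
}
NODE_HASHES = {"A": "hash-a", "B": "hash-b", "C": "hash-c", "D": "hash-d", "E": "hash-e"}

ALREADY_RUN, TO_RUN = classify_steps(NODE_HASHES, STEP_DB)
```

The two results together carry what the text says is needed to rebuild the target workflow configuration file: the running information of the reused steps and the list of steps that still have to be executed.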
In addition, in an exemplary embodiment of the present disclosure, the workflow processing method may further include: calling an interface to run the target workflow configuration file; and monitoring the running status of each step in the target workflow configuration file and, when a step runs successfully, updating the running status of the step to "run successfully".
An application programming interface is established between the workflow system and the underlying workflow executor, and the workflow system calls this interface to run the workflow configuration file. The workflow system continuously monitors the running status of every step of the workflow. When a step of the workflow runs successfully, the running information of that step is updated in the step information database. Overall, the step information database records every step, the workflow to which each step belongs, and the running information of each step, so that a step and the workflow to which it belongs can be looked up from the running information of the step.
In the workflow processing method provided by the embodiments of the present disclosure, the target hash calculation graph corresponding to the target workflow template is generated first; the hash values of the target nodes contained in the target hash calculation graph are then calculated using the defined calculation method of the hash calculation graph; and the calculated hash values are verified to obtain the workflow steps that need to be run and the workflow steps that have already run successfully. The workflow steps that have already run successfully can then be pruned and the target workflow constructed. This addresses the time and resource cost in the prior art of re-executing an identical workflow, improves the running efficiency and performance of the workflow, and improves the user experience. Moreover, because the step information database records every step, the workflow to which each step belongs, and the running information of each step, the workflow to which a step belongs can be found from the step, and the running result of the step and its relationship to other steps can be queried, which makes the results of workflow runs traceable.
FIG. 6 shows an overall flowchart of machine learning workflow construction according to an exemplary embodiment of the present disclosure. As shown in FIG. 6, the overall flow of workflow construction is: reading the target workflow template from the workflow template database according to the unique identifier of the workflow template provided by the user; converting the components contained in the target workflow template into target nodes of the target hash calculation graph and converting the connection relationships between the components into connection relationships between the target nodes, thereby generating the target hash calculation graph; starting from the target nodes that have no upstream node, calculating the hash value of each target node in the target hash calculation graph in order from top to bottom, and then querying whether the hash value of the target node exists in the step information database; if so, marking the workflow step corresponding to the target node as a workflow step that has already run successfully and reading the running information of that workflow step from the step information database; if not, marking the workflow step corresponding to the target node as a workflow step that needs to be run; generating the target workflow configuration file according to the running information of the workflow steps that have already run successfully and the workflow steps that need to be run; calling the interface to run the target workflow configuration file; and monitoring the running status of each step in the target workflow configuration file and, when a step runs successfully, updating the running status of the step to "run successfully" and storing the running result of the step in the step information database.
图7示出了根据本公开示例性实施例的计算目标哈希计算图包含的每个目标节点的哈希值的流程图。如图7所示,计算每个目标节点的哈希值的具体流程可以包括:FIG. 7 shows a flowchart of calculating the hash value of each target node included in the target hash calculation graph according to an exemplary embodiment of the present disclosure. As shown in Figure 7, the specific process of calculating the hash value of each target node may include:
步骤S701,根据目标节点之间的连接关系和组件的参数,确定该目标节点对应的输入参数和该目标节点对应的配置参数;Step S701, according to the connection relationship between the target nodes and the parameters of the components, determine the input parameters corresponding to the target node and the configuration parameters corresponding to the target node;
步骤S702,获取该目标节点对应的组件的输入参数的参数信息,利用哈希算法对获取的参数信息进行计算,获得该目标节点对应的组件的输入参数的哈希值;Step S702, obtaining parameter information of the input parameter of the component corresponding to the target node, using a hash algorithm to calculate the obtained parameter information, and obtaining the hash value of the input parameter of the component corresponding to the target node;
步骤S703,将该目标节点的上游节点的哈希值和该目标节点的上游节点对应的输出参数的参数名称,代入到节点对应的输出参数的哈希值计算公式中,计算该目标节点的上游节点对应的输出参数的哈希值;Step S703: Substitute the hash value of the upstream node of the target node and the parameter name of the output parameter corresponding to the upstream node of the target node into the hash value calculation formula of the output parameter corresponding to the node, and calculate the upstream node of the target node. The hash value of the output parameter corresponding to the node;
Step S704: sum the hash value of the input parameters of the component corresponding to the target node and the hash value of the output parameter corresponding to the upstream node of the target node to obtain the hash value of the input parameters corresponding to the target node;
Step S705: query the parameter information of the configuration parameter corresponding to the target node and, according to the queried parameter information, determine whether the parameter type of the configuration parameter is an index; if so, execute step S706; if not, execute step S707;
Step S706: by means of an interface request, calculate the hash value of the file located under the queried parameter information, and take the hash value of the file as the hash value of the configuration parameter corresponding to the target node;
Step S707: calculate the hash value of the queried parameter information using a hash algorithm;
Step S708: substitute the hash value of the input parameters corresponding to the target node and the hash value of the configuration parameter corresponding to the target node into the formula for the hash value of a node, and calculate the hash value of the target node.
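Steps S702 to S708 can be sketched in Python as follows. This is only an illustration under stated assumptions: the disclosure does not name a hash algorithm or say how hash values are summed, so the sketch assumes SHA-256, treats each digest as an integer, and wraps sums modulo 2^256. The parameter layout (`type`, `value`) and the `read_file` callable, which stands in for the interface request of step S706, are hypothetical.

```python
import hashlib

MOD = 2 ** 256  # assumption: sums of digests wrap at the SHA-256 width


def h(data: bytes) -> int:
    """SHA-256 digest as an integer (the choice of SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def output_hash(upstream_node_hash: int, output_name: str) -> int:
    # S703: upstream node hash plus the hash of the output parameter name
    return (upstream_node_hash + h(output_name.encode())) % MOD


def config_hash(param: dict, read_file) -> int:
    # S705-S707: an index-type parameter is hashed by the content of the file
    # it points to; any other parameter is hashed by its parameter information
    if param["type"] == "index":
        return h(read_file(param["value"]))
    return h(repr(param["value"]).encode())


def node_hash(component_inputs, upstream_outputs, config_params, read_file) -> int:
    # S702: hash the component's own input parameter information
    inputs = sum(h(repr(v).encode()) for v in component_inputs)
    # S703-S704: add the hash of every upstream output feeding this node
    inputs += sum(output_hash(nh, name) for nh, name in upstream_outputs)
    configs = sum(config_hash(p, read_file) for p in config_params)
    # S708: hash the sum of the input hash and the configuration hash
    total = (inputs + configs) % MOD
    return h(total.to_bytes(32, "big"))


# Hypothetical two-node example: a data node feeding a training node.
files = {"data/train.csv": b"a,b\n1,2\n"}
root = node_hash(["raw_table"], [],
                 [{"type": "index", "value": "data/train.csv"}], files.__getitem__)
child = node_hash([], [(root, "dataset")],
                  [{"type": "value", "value": 0.01}], files.__getitem__)
```

Because the hash of an upstream node enters the input hash of every downstream node, changing the file behind `data/train.csv` changes both `root` and `child`, whereas changing only the training node's configuration value changes `child` alone.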
The following are apparatus embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present disclosure.
FIG. 8 shows a schematic structural diagram of an apparatus for constructing a machine learning workflow according to an exemplary embodiment of the present disclosure.
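As shown in FIG. 8, the apparatus 800 for constructing a machine learning workflow may include: an obtaining module 801, a generating module 802, a calculation module 803, and a building module 804.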
The obtaining module 801 may be used to obtain the target workflow template. The generating module 802 may be used to generate the target hash calculation graph corresponding to the target workflow template according to the organizational structure information of the target workflow template. The calculation module 803 may be used to calculate the hash values of the target nodes in the target hash calculation graph, using the defined calculation method of the hash calculation graph and the parameter information corresponding to the target workflow template. The building module 804 may be used to verify the hash values of the target nodes, determine the workflow steps that need to be run and the workflow steps that have already run successfully, and then construct the target workflow from those two sets of steps.
The calculation method of the hash calculation graph may include: (1) the formula for the hash value of a node: sum the hash value of the input parameters corresponding to the node and the hash value of the configuration parameters corresponding to the node, and then apply a hash algorithm to the sum; and (2) the formula for the hash value of an output parameter of a node: sum the hash value of the node and the hash value of the name of the output parameter.
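Written out in symbols, the two formulas and the input-hash summation of step S704 read as follows. The notation is introduced here for illustration only and does not appear in the disclosure: $\operatorname{hash}$ is the chosen hash algorithm, $n$ is a node, $o$ is one of its output parameters, and $U(n)$ is the set of (upstream node, output parameter) pairs connected to $n$.

```latex
\begin{align}
  H_{\text{node}}(n) &= \operatorname{hash}\bigl(H_{\text{in}}(n) + H_{\text{cfg}}(n)\bigr) \\
  H_{\text{out}}(n, o) &= H_{\text{node}}(n) + \operatorname{hash}\bigl(\operatorname{name}(o)\bigr) \\
  H_{\text{in}}(n) &= H_{\text{comp-in}}(n) + \sum_{(u,\,o) \in U(n)} H_{\text{out}}(u, o)
\end{align}
```

Here $H_{\text{comp-in}}(n)$ is the hash value of the input parameters of the component itself and $H_{\text{cfg}}(n)$ is the hash value of its configuration parameters. The definitions are recursive: the hash value of a node depends on the hash values of all of its upstream nodes, which is why the graph is evaluated starting from the nodes without upstream nodes.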
In an exemplary embodiment of the present disclosure, the generating module 802 may be further used to: convert the components contained in the target workflow template into target nodes of the target hash calculation graph, convert the connection relationships between the components into connection relationships between the target nodes, and generate the target hash calculation graph.
In an exemplary embodiment of the present disclosure, the calculation module 803 may be further used to: determine the input parameters and the configuration parameters corresponding to the target node according to the connection relationships between the target nodes and the parameters of the components; calculate the hash value of the input parameters corresponding to the target node according to the parameter information of those input parameters; query the parameter information of the configuration parameters corresponding to the target node and calculate the hash value of the configuration parameters according to the queried parameter information; and substitute the hash value of the input parameters and the hash value of the configuration parameters into the formula for the hash value of a node to calculate the hash value of the target node.
The input parameters corresponding to the target node may include the input parameters of the component corresponding to the target node and the output parameters corresponding to the upstream nodes of the target node. The configuration parameters corresponding to the target node may include the configuration parameters of the component corresponding to the target node.
In an exemplary embodiment of the present disclosure, the calculation module 803 may be further used to: obtain the parameter information of the input parameters of the component corresponding to the target node and apply a hash algorithm to it to obtain the hash value of the input parameters of the component; substitute the hash value of the upstream node of the target node and the parameter name of the output parameter corresponding to that upstream node into the formula for the hash value of a node's output parameter, and calculate the hash value of the output parameter corresponding to the upstream node; and sum the hash value of the input parameters of the component and the hash value of the output parameter corresponding to the upstream node to obtain the hash value of the input parameters corresponding to the target node.
In an exemplary embodiment of the present disclosure, the calculation module 803 may be further used to: determine whether the parameter type of the configuration parameter corresponding to the target node is an index; and, if so, calculate, by means of an interface request, the hash value of the file located under the queried parameter information and take the hash value of the file as the hash value of the configuration parameter corresponding to the target node.
In an exemplary embodiment of the present disclosure, the building module 804 may be further used to: query whether the hash value of the target node exists in the step information database; if so, mark the workflow step corresponding to the target node as a workflow step that has already run successfully, and read the running information of that step from the step information database; if not, mark the workflow step corresponding to the target node as a workflow step that needs to be run; and generate the target workflow configuration file according to the running information of the workflow steps that have already run successfully and the workflow steps that need to be run.
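The lookup performed by the building module can be sketched in Python as follows. The step information database is modeled as a plain dictionary keyed by node hash, and the layout of the generated configuration is hypothetical; the disclosure does not define either format.

```python
def build_workflow_config(order, node_hashes, step_db):
    """Classify each step by looking up its node hash in the step database.

    `order` lists the nodes upstream-first, `node_hashes` maps node name to
    hash value, and `step_db` maps hash value to stored running information.
    All three structures are assumptions made for this sketch.
    """
    steps = []
    for node in order:
        digest = node_hashes[node]
        record = step_db.get(digest)
        if record is not None:
            # Hash found: the step already ran successfully, reuse its record
            steps.append({"step": node, "hash": digest,
                          "status": "succeeded", "run_info": record})
        else:
            # Hash not found: the step needs to be run
            steps.append({"step": node, "hash": digest, "status": "pending"})
    return {"steps": steps}


order = ["load_data", "preprocess", "train"]
node_hashes = {"load_data": "a1", "preprocess": "b2", "train": "c3"}
step_db = {"a1": {"output": "/runs/1/data"}, "b2": {"output": "/runs/1/features"}}
config = build_workflow_config(order, node_hashes, step_db)
```

In this example only `train` is marked as pending, so a rerun after changing the training configuration would reuse the stored results of the two earlier steps.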
In an exemplary embodiment of the present disclosure, the above apparatus may further include a running module. The running module is used to: call an interface to run the target workflow configuration file; and monitor the running status of each step in the target workflow configuration file and, when a step runs successfully, update the running status of the step to successful and store the running result of the step in the step information database.
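A minimal Python sketch of the running module's behavior is given next. The `execute` callable is a hypothetical stand-in for the interface that runs a step, the synchronous loop stands in for status monitoring, and stopping at the first failure is an assumption, since the disclosure describes only the success case.

```python
def run_workflow(config, execute, step_db):
    """Run pending steps in order and persist results of successful steps."""
    for step in config["steps"]:
        if step["status"] == "succeeded":
            continue  # already ran successfully in an earlier workflow
        try:
            result = execute(step["step"])
        except Exception:
            step["status"] = "failed"
            break  # assumption: stop at the first failed step
        # Success: update the status and store the result under the node hash
        step["status"] = "succeeded"
        step_db[step["hash"]] = result
    return config


step_db = {"a1": {"output": "features-v1"}}
config = {"steps": [
    {"step": "preprocess", "hash": "a1", "status": "succeeded"},
    {"step": "train", "hash": "c3", "status": "pending"},
]}
executed = []


def fake_execute(name):
    # Hypothetical executor that records which steps were actually run
    executed.append(name)
    return {"output": name + "-done"}


run_workflow(config, fake_execute, step_db)
```

Storing the result under the node hash is what allows a later workflow with an identical node to find the step in the database and skip it.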
It should be noted that the block diagrams shown in the above figures are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
FIG. 9 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure. It should be noted that the electronic device shown in FIG. 9 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
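As shown in FIG. 9, the electronic device 900 includes a central processing unit (CPU) 901, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the system 900. The CPU 901, the ROM 902, and the RAM 903 are connected to one another via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.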
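The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a cathode ray tube (CRT) or liquid crystal display (LCD) and the like, as well as a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read from it can be installed into the storage section 908 as needed.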
In particular, according to embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 909, and/or installed from the removable medium 911. When the computer program is executed by the central processing unit (CPU) 901, the above functions defined in the system of the present invention are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, in the present invention, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a sending unit, an obtaining unit, a determining unit, and a first processing unit. The names of these units do not, in some cases, limit the units themselves; for example, the sending unit may also be described as "a unit that sends a picture acquisition request to the connected server".
In another aspect, the present disclosure also provides a computer-readable storage medium. The computer-readable storage medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable storage medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the following embodiments. For example, the electronic device can implement the steps shown in FIG. 2.
According to one aspect of the present disclosure, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various optional implementations of the above embodiments.
It should be understood that any number of elements in the drawings of the present disclosure is given by way of example rather than limitation, and that any naming is used only for distinction and carries no limiting meaning.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (12)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210146721.8A CN114492844B (en) | 2022-02-17 | 2022-02-17 | Method, device, electronic device and storage medium for constructing machine learning workflow |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210146721.8A CN114492844B (en) | 2022-02-17 | 2022-02-17 | Method, device, electronic device and storage medium for constructing machine learning workflow |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114492844A (en) | 2022-05-13 |
| CN114492844B (en) | 2025-06-20 |
Family
ID=81483256
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210146721.8A Active CN114492844B (en) | 2022-02-17 | 2022-02-17 | Method, device, electronic device and storage medium for constructing machine learning workflow |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114492844B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115982518A (en) * | 2023-01-19 | 2023-04-18 | 蚂蚁区块链科技(上海)有限公司 | Component operation method and related equipment |
| WO2024119573A1 (en) * | 2022-12-07 | 2024-06-13 | 奇安信科技集团股份有限公司 | Workflow detection method and device |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110928534A (en) * | 2019-10-14 | 2020-03-27 | 上海唯链信息科技有限公司 | Workflow node authentication method and device based on block chain |
| CN112231711A (en) * | 2020-10-20 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Vulnerability detection method and device, computer equipment and storage medium |
| WO2021086362A1 (en) * | 2019-10-31 | 2021-05-06 | Hewlett-Packard Development Company, L.P. | Workflow management |
| CN112862449A (en) * | 2021-03-02 | 2021-05-28 | 岭东核电有限公司 | Structural chemical industry bill generation method and device, computer equipment and storage medium |
| CN112862455A (en) * | 2021-03-02 | 2021-05-28 | 岭东核电有限公司 | Test execution work order generation method and device, computer equipment and storage medium |
| CN112949276A (en) * | 2021-03-31 | 2021-06-11 | 中国建设银行股份有限公司 | Report generation method and device, electronic equipment and storage medium |
| CN113312630A (en) * | 2021-05-31 | 2021-08-27 | 支付宝(杭州)信息技术有限公司 | Method and device for realizing trusted scheduling |
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110928534A (en) * | 2019-10-14 | 2020-03-27 | 上海唯链信息科技有限公司 | Workflow node authentication method and device based on block chain |
| WO2021086362A1 (en) * | 2019-10-31 | 2021-05-06 | Hewlett-Packard Development Company, L.P. | Workflow management |
| US20220405426A1 (en) * | 2019-10-31 | 2022-12-22 | Hewlett-Packard Development Company, L.P. | Workflow management |
| CN112231711A (en) * | 2020-10-20 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Vulnerability detection method and device, computer equipment and storage medium |
| CN112862449A (en) * | 2021-03-02 | 2021-05-28 | 岭东核电有限公司 | Structural chemical industry bill generation method and device, computer equipment and storage medium |
| CN112862455A (en) * | 2021-03-02 | 2021-05-28 | 岭东核电有限公司 | Test execution work order generation method and device, computer equipment and storage medium |
| CN112949276A (en) * | 2021-03-31 | 2021-06-11 | 中国建设银行股份有限公司 | Report generation method and device, electronic equipment and storage medium |
| CN113312630A (en) * | 2021-05-31 | 2021-08-27 | 支付宝(杭州)信息技术有限公司 | Method and device for realizing trusted scheduling |
Non-Patent Citations (1)
| Title |
|---|
| 王竹荣 等: "哈希桶Variety-B树的数据流处理方法", 西安理工大学学报, vol. 33, no. 01, 30 March 2017 (2017-03-30), pages 13 - 17 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114492844B (en) | 2025-06-20 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US20240202028A1 (en) | System and method for collaborative algorithm development and deployment, with smart contract payment for contributors | |
| US20240250996A1 (en) | System and method for algorithm crowdsourcing, monetization, and exchange | |
| CN110263938B (en) | Method and apparatus for generating information | |
| US10901961B2 (en) | Systems and methods for generating schemas that represent multiple data sources | |
| US9569288B2 (en) | Application pattern discovery | |
| CN107330522A (en) | Method, apparatus and system for updating deep learning model | |
| CN113220907B (en) | Construction method and device of business knowledge graph, medium and electronic equipment | |
| CN114491536B (en) | Code analysis method and device based on knowledge graph | |
| CN114036248A (en) | High-precision map data processing method, device and electronic device | |
| CN113282489B (en) | Interface testing method and device | |
| CN111435367A (en) | Knowledge graph construction method, system, equipment and storage medium | |
| CN113485763B (en) | Data processing method, device, electronic device and computer readable medium | |
| CN111427971A (en) | Business modeling method, device, system and medium for computer system | |
| CN108933695B (en) | Method and apparatus for processing information | |
| CN113760728A (en) | Method and apparatus for application testing | |
| CN114492844A (en) | Method and device for constructing machine learning workflow, electronic equipment and storage medium | |
| CN114238055A (en) | Task data processing method, device, electronic device and storage medium | |
| CN116756245A (en) | Achievement protection methods, equipment and media based on the full life cycle of blockchain | |
| CN114398678A (en) | Registration verification method and device for preventing electronic file from being tampered, electronic equipment and medium | |
| CN110928594A (en) | Service Development Methodology and Platform | |
| CN110795424B (en) | Characteristic engineering variable data request processing method and device and electronic equipment | |
| CN119106750A (en) | Task processing method based on large model, device, equipment and medium | |
| CN115829053B (en) | Model operation strategy determination method and device, electronic equipment and storage medium | |
| CN111951112A (en) | Intelligent contract execution method based on block chain, terminal equipment and storage medium | |
| US11074508B2 (en) | Constraint tracking and inference generation |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |