
CN113326116A - Data processing method and system - Google Patents

Data processing method and system

Info

Publication number
CN113326116A
Authority
CN
China
Prior art keywords
distributed architecture
data processing
node
environment information
resource configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110739528.0A
Other languages
Chinese (zh)
Inventor
路明奎
方磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zetyun Tech Co ltd
Original Assignee
Beijing Zetyun Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zetyun Tech Co ltd
Priority to CN202110739528.0A
Publication of CN113326116A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides a data processing method and system, relating to the field of big data processing. The data processing method comprises the following steps: acquiring a resource configuration and a framework type; and constructing a distributed architecture based on the acquired resource configuration and the framework type. According to the embodiment of the invention, decoupling the distributed architecture from the framework allows resources to be reused and effectively improves the efficiency of distributed training.

Description

Data processing method and system
Technical Field
The present invention relates to the field of big data processing, and in particular, to a data processing method and system.
Background
With the advance of informatization and intelligent technology in society, training a business model on a big data system and using the trained model to process big data services intelligently has gradually become a common practice in the big data industry. When an existing big data system is used for big data analysis, either a distributed mode or a single-machine mode can be selected to process data and train a model. However, when an existing big data system processes data in a distributed manner, the distributed architecture and the framework are treated as a whole and the user must manually set complex adaptation parameters, which results in low resource utilization and low data processing efficiency.
Disclosure of Invention
The embodiment of the invention provides a data processing method and a data processing system, which solve the following problem: when an existing big data system processes data in a distributed manner, the distributed architecture and the framework are treated as a whole and the user must manually set adaptation parameters, so that resource utilization and data processing efficiency are low.
In order to solve the above technical problem, the present invention provides a data processing method, including:
acquiring a resource configuration and a framework type;
and constructing a distributed architecture based on the acquired resource configuration and the framework type.
Optionally, in the foregoing method, the resource configuration includes a number of nodes and a node running resource configuration, where the nodes are obtained by logically abstracting a distributed architecture.
Optionally, in the above method, the step of constructing a distributed architecture based on the acquired resource configuration and the framework type includes:
determining a target node based on the resource configuration;
and constructing a distributed architecture based on the target node and the framework type.
Optionally, in the foregoing method, the step of determining the target node based on the resource configuration includes:
determining the number of target nodes based on the resource configuration;
and determining available target nodes from the nodes based on the running state of each node.
Optionally, in the above method, the step of constructing a distributed architecture based on the target node and the framework type includes:
acquiring environment information and running code corresponding to the framework type;
and each target node constructs a distributed architecture based on the environment information and the running code.
Optionally, in the method, the step of acquiring the environment information and the running code corresponding to the framework type includes:
recommending at least one piece of environment information according to the framework type;
and determining, based on a user operation, the environment information corresponding to the framework type from the at least one piece of recommended environment information.
Optionally, in the method, the step of acquiring the environment information and the running code corresponding to the framework type includes:
displaying an environment information configuration template corresponding to the framework type;
and generating the environment information corresponding to the framework type according to the user's input operation.
Optionally, in the method, the step of constructing the distributed architecture by each target node based on the environment information and the running code includes:
the master node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
Optionally, in the above method, the step of determining available target nodes from the nodes based on the running state of each node specifically includes:
acquiring started candidate nodes according to the running state of each node;
and selecting, from the candidate nodes, available nodes that satisfy the node running resource configuration in the resource configuration as target nodes.
Optionally, in the above method, the step of constructing a distributed architecture based on the target node and the framework type includes:
determining whether the distributed architecture corresponding to a started group of nodes supports the framework type;
if so, distributing the running code to the started group of nodes.
An embodiment of the present invention further provides a data processing system, where the data processing system includes:
an acquisition module, configured to acquire a resource configuration and a framework type;
and a building module, configured to construct a distributed architecture based on the acquired resource configuration and the framework type.
Optionally, in the data processing system, the resource configuration includes a number of nodes and a node running resource configuration, where the nodes are obtained by logically abstracting a distributed architecture.
Optionally, in the data processing system, the building module includes:
a determining submodule, configured to determine a target node based on the resource configuration;
and a building submodule, configured to construct a distributed architecture based on the target node and the framework type.
Optionally, in the data processing system, the determining submodule is specifically configured to:
determine the number of target nodes based on the resource configuration;
and determine available target nodes from the nodes based on the running state of each node.
Optionally, in the data processing system, the building submodule includes:
an acquisition unit, configured to acquire environment information and running code corresponding to the framework type;
and a building unit, configured so that each target node constructs a distributed architecture based on the environment information and the running code.
Optionally, in the data processing system, the acquisition unit is specifically configured to:
recommend at least one piece of environment information according to the framework type;
and determine, based on a user operation, the environment information corresponding to the framework type from the at least one piece of recommended environment information.
Optionally, in the data processing system, the acquisition unit is specifically configured to:
display an environment information configuration template corresponding to the framework type;
and generate the environment information corresponding to the framework type according to the user's input operation.
Optionally, in the data processing system, the building unit is specifically configured so that:
the master node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
Optionally, in the data processing system, the determining submodule performing the step of determining available target nodes from the nodes based on the running state of each node specifically includes:
acquiring started candidate nodes according to the running state of each node;
and selecting, from the candidate nodes, available nodes that satisfy the node running resource configuration in the resource configuration as target nodes.
Optionally, in the data processing system, the building submodule is further specifically configured to:
determine whether the distributed architecture corresponding to a started group of nodes supports the framework type;
if so, distribute the running code to the started group of nodes.
The embodiment of the present invention further provides a data processing system, which includes a processor, a memory, and a computer program stored on the memory and capable of running on the processor, and when the computer program is executed by the processor, the steps of the data processing method are implemented.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the data processing method.
According to the embodiment of the invention, by decoupling the distributed architecture and the framework, the resource can be repeatedly utilized, and the efficiency of distributed training is effectively improved.
The embodiment of the invention obtains resource objects by logically abstracting the distributed architecture, and the scheduling configuration of these resource objects makes the distributed architecture easy to extend. Because the distributed architecture is abstracted into resource objects, the user does not need to set complex adaptation parameters when adapting the architecture to a training framework; a simple configuration is enough for automatic adaptation.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a data processing system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, which is a flowchart of a data processing method according to an embodiment of the present invention, the method may be applied to a data processing system and comprises the following steps:
Step 101: acquiring a resource configuration and a framework type.
Step 102: constructing a distributed architecture based on the acquired resource configuration and the framework type.
Specifically, a user may set a resource configuration and a framework type in the data processing system. The resource configuration comprises the number of nodes and the node running resource configuration, where the nodes are obtained by logically abstracting the distributed architecture. The node running resource configuration comprises a CPU specification, a GPU specification and the like.
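The two inputs of step 101 can be pictured as a small record. The following is only a minimal sketch: the class and field names (`NodeSpec`, `ResourceConfig`, `JobRequest`, `memory_gb`, and so on) are hypothetical and do not come from the patent, which only states that the resource configuration holds a number of nodes and per-node running resources such as CPU and GPU specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Running resources of a single node (field names are assumptions)."""
    cpu_cores: int
    memory_gb: int
    gpus: int = 0


@dataclass(frozen=True)
class ResourceConfig:
    """Resource configuration: how many nodes, and what each node needs."""
    node_count: int
    node_spec: NodeSpec


@dataclass(frozen=True)
class JobRequest:
    """What the user sets in step 101: resource configuration plus framework type."""
    resources: ResourceConfig
    framework_type: str


# Example: three nodes with 1 CPU core and 2 GB of memory each, for TensorFlow.
request = JobRequest(
    resources=ResourceConfig(node_count=3, node_spec=NodeSpec(cpu_cores=1, memory_gb=2)),
    framework_type="tensorflow",
)
```

Keeping the framework type outside the resource configuration mirrors the decoupling the description argues for: the same resources can later serve a different framework.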
The architecture is that of a cluster and includes stand-alone and distributed architectures. Distributed architectures include the PS (Parameter Server) architecture and the ring-allreduce architecture.
The framework is a component library that provides a plurality of components for a developer to call in order to construct a model; each type of framework provides its own series of components. Frameworks include, for example, frameworks that support deep learning and/or frameworks that support machine learning. Specifically, frameworks that support deep learning include but are not limited to TensorFlow, PyTorch, MXNet, SWIFT and Caffe; frameworks that support machine learning include but are not limited to TensorFlow and PyTorch.
In an embodiment of the invention, the abstraction of the distributed architecture over resources is represented as a plurality of nodes in the compute engine of the data analysis system, and these nodes are integrated into one resource object. The compute engine is responsible for managing all nodes in the resource object, runs on the master node of the resource object, and constructs a distributed architecture through node scheduling. Because the distributed architecture is logically abstracted into nodes in the cluster, the distributed architecture and the framework are decoupled and can be adapted to each other dynamically, and the scheduling configuration of the resource objects makes the distributed architecture easy to extend.
Optionally, step 102 of constructing a distributed architecture based on the acquired resource configuration and the framework type includes:
determining a target node based on the resource configuration;
and constructing a distributed architecture based on the target node and the framework type.
Specifically, the data processing system receives resource configuration set by a user, determines the number of target nodes needing to be started based on the resource configuration, and completes construction of the distributed architecture based on the target nodes and the framework type. The data processing system can multiplex the started nodes as target nodes to construct a distributed architecture, and can also determine the target nodes from the nodes to be started to construct the distributed architecture. The embodiment of the present invention is not limited thereto.
Optionally, the step of determining a target node based on the resource configuration includes:
determining the number of the target nodes based on the resource configuration;
and determining available target nodes from the nodes based on the operation states of the nodes.
In the embodiment of the invention, the current idle nodes are selected by analyzing the running state of each node, and then the target node is selected from the idle nodes to construct the distributed architecture, so that the waiting time can be avoided, and the efficiency of constructing the distributed architecture can be improved.
Optionally, the determining an available target node from the nodes based on the operating states of the nodes includes:
acquiring started candidate nodes according to the running state of each node;
and according to the running resource configuration of the nodes in the resource configuration, selecting available nodes meeting the running resource configuration from the candidate nodes as target nodes.
Specifically, the data processing system determines which of its nodes have been started based on the running state of each node. It takes the started nodes as candidate nodes and, according to the node running resource configuration in the resource configuration, selects from the candidates the available nodes that satisfy that configuration as target nodes, so that a distributed architecture can be constructed directly on nodes that are already started. Illustratively, suppose the data processing system has started 2 nodes, each with running resources of a 2-core CPU and 2 GB of memory, and the user requests 3 nodes for the TensorFlow framework, each with a 1-core CPU and 2 GB of memory. The 2 running nodes satisfy the node running resource configuration requested for the TensorFlow framework, so the data processing system can reuse the 2 started nodes and only needs to start 1 more node.
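The selection logic in this paragraph can be sketched as a filter over node states. This is a minimal illustration, not the patent's implementation: the `Node` class, the `busy` flag and the function name `select_target_nodes` are hypothetical, and "2G" is read as 2 GB of memory. The sample data reproduces the example of two started nodes and a request for three.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_cores: int
    memory_gb: int
    started: bool
    busy: bool = False  # assumption: a started node may already be occupied


def select_target_nodes(nodes, wanted_count, min_cpu_cores, min_memory_gb):
    """Return (nodes to reuse, number of extra nodes that must be started)."""
    # Candidate nodes: already started and currently idle.
    candidates = [n for n in nodes if n.started and not n.busy]
    # Available nodes: candidates whose resources meet the requested per-node spec.
    available = [
        n for n in candidates
        if n.cpu_cores >= min_cpu_cores and n.memory_gb >= min_memory_gb
    ]
    reused = available[:wanted_count]
    return reused, wanted_count - len(reused)


# Two started nodes with 2 cores / 2 GB; the user asks for 3 nodes with 1 core / 2 GB.
nodes = [
    Node("node-0", cpu_cores=2, memory_gb=2, started=True),
    Node("node-1", cpu_cores=2, memory_gb=2, started=True),
    Node("node-2", cpu_cores=2, memory_gb=2, started=False),
]
reused, to_start = select_target_nodes(nodes, wanted_count=3, min_cpu_cores=1, min_memory_gb=2)
print([n.name for n in reused], to_start)
```

With this data the two started nodes are reused and one additional node has to be started, matching the example in the text.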
In the embodiment of the invention, the started resources can be directly reused if being adapted, so that the utilization rate of the resources is improved; and the started nodes are multiplexed, so that the node starting time is saved, and the working efficiency can be further improved.
Optionally, the step of constructing a distributed architecture based on the target node and the framework type includes:
acquiring environment information and running code corresponding to the framework type;
and each target node constructs a distributed architecture based on the environment information and the running code.
Specifically, a compute engine in the data processing system receives the framework type set by the user and acquires the environment information corresponding to that framework type. The environment information includes environment variable parameters, supported framework types, and the like.
The running code may be code pre-stored in the data processing system or code imported by the user, which is not limited in the present invention.
Optionally, the step of acquiring the environment information and the running code corresponding to the framework type comprises:
recommending at least one piece of environment information according to the framework type;
and determining, based on a user operation, the environment information corresponding to the framework type from the at least one piece of recommended environment information.
Specifically, the compute engine pre-stores environment information, and after receiving the framework type set by the user the data processing system queries the environment information corresponding to that framework type. Alternatively, after receiving the framework type set by the user, the data processing system acquires the corresponding environment information from a specified location in which environment information is stored.
The data processing system displays a plurality of pieces of recommended environment information based on the framework type set by the user. Based on the user's selection operation, the compute engine selects one of the displayed pieces of recommended environment information as the environment information for the framework type.
Further, the user may modify the recommended environment information in order to define a new framework type.
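A minimal sketch of the recommend-then-select flow described above follows. Everything concrete in it is hypothetical: the `ENVIRONMENTS` catalogue, the environment names, the `EXAMPLE_THREADS` variable and both function names are invented for illustration, since the patent does not specify how environment information is stored or presented.

```python
# Hypothetical catalogue of pre-stored environment information.
ENVIRONMENTS = [
    {"name": "env-a", "frameworks": ["tensorflow"], "variables": {"EXAMPLE_THREADS": "2"}},
    {"name": "env-b", "frameworks": ["tensorflow", "pytorch"], "variables": {"EXAMPLE_THREADS": "4"}},
    {"name": "env-c", "frameworks": ["pytorch"], "variables": {}},
]


def recommend_environments(framework_type):
    """Recommend every stored environment that supports the framework type."""
    return [env for env in ENVIRONMENTS if framework_type in env["frameworks"]]


def choose_environment(recommended, index, overrides=None):
    """Pick one recommendation (the user's selection) and optionally modify it.

    A copy is returned so the stored catalogue is left unchanged.
    """
    chosen = recommended[index]
    return {
        "name": chosen["name"],
        "frameworks": list(chosen["frameworks"]),
        "variables": {**chosen["variables"], **(overrides or {})},
    }


recommended = recommend_environments("pytorch")
environment = choose_environment(recommended, 0, overrides={"EXAMPLE_THREADS": "8"})
```

Here the user picks the first recommendation for PyTorch and overrides one variable, which corresponds to modifying a recommended environment rather than defining one from scratch.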
Optionally, the step of acquiring the environment information and the running code corresponding to the framework type includes:
displaying an environment information configuration template corresponding to the framework type;
and generating the environment information corresponding to the framework type according to the user's input operation.
The data processing system displays an environment information definition interface, which contains the environment information configuration template corresponding to the framework type, and the user defines a new environment supporting the framework type in this interface. Specifically, the user sets the environment parameters based on the environment information configuration template corresponding to the framework type, thereby completing the custom definition of the environment.
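The template-based alternative can be sketched as a template with required fields and defaults that the user's input fills in. This is an assumption-laden illustration: the `TEMPLATES` structure, the `name` and `variables` keys, the `EXAMPLE_THREADS` variable and the function `build_environment` are hypothetical, because the patent does not describe the template's contents.

```python
# Hypothetical configuration templates, one per framework type.
TEMPLATES = {
    "pytorch": {"required": ["name"], "defaults": {"EXAMPLE_THREADS": "1"}},
    "tensorflow": {"required": ["name"], "defaults": {"EXAMPLE_THREADS": "1"}},
}


def build_environment(framework_type, user_input):
    """Generate environment information from a template and the user's input."""
    template = TEMPLATES[framework_type]
    missing = [key for key in template["required"] if key not in user_input]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Values entered by the user take precedence over the template defaults.
    variables = {**template["defaults"], **user_input.get("variables", {})}
    return {
        "name": user_input["name"],
        "frameworks": [framework_type],
        "variables": variables,
    }


custom_env = build_environment(
    "pytorch", {"name": "my-env", "variables": {"EXAMPLE_THREADS": "4"}}
)
```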
Optionally, the step of constructing a distributed architecture by each target node based on the environment information and the running code includes:
the master node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
Specifically, before code distribution, the compute engine acquires the running code and reads it into the memory of the master node on which the compute engine runs. The compute engine of the data processing system sends a start instruction to the target nodes and receives the start feedback that each target node sends after it has started. Once the running code is in the master node's memory, the master node sends the environment information and distributes the code to each target node to complete the construction of the distributed architecture. The code is the running code corresponding to the framework type set by the user. The master node is one of the target nodes.
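The ordering in this paragraph (load the code on the master, start the targets, wait for feedback, then distribute) can be sketched with in-memory stand-ins for nodes. The `TargetNode` class, its methods, and the choice of the first target as master are assumptions made for illustration; real nodes would be remote processes or containers and distribution would go over the network.

```python
class TargetNode:
    """In-memory stand-in for a target node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.started = False
        self.environment = None
        self.code = None

    def start(self):
        # Returns start feedback to the compute engine.
        self.started = True
        return True

    def receive(self, environment, code):
        self.environment = environment
        self.code = code


def build_architecture(targets, environment, load_code):
    """Distribute environment information and running code from the master node."""
    if not targets:
        raise ValueError("at least one target node is required")
    master = targets[0]  # assumption: the first target node acts as master
    code_in_memory = load_code()  # running code is read into master memory first
    feedback = {node.name: node.start() for node in targets}
    failed = [name for name, ok in feedback.items() if not ok]
    if failed:
        raise RuntimeError(f"nodes failed to start: {failed}")
    for node in targets:
        node.receive(dict(environment), code_in_memory)
    return master


targets = [TargetNode(f"node-{i}") for i in range(3)]
master = build_architecture(
    targets, {"EXAMPLE_THREADS": "2"}, load_code=lambda: "print('training step')"
)
```

Because the master is itself one of the targets, it receives the same environment information and code as the other nodes.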
Optionally, the step of constructing a distributed architecture based on the target node and the framework type includes:
determining whether the distributed architecture corresponding to a started group of nodes supports the framework type;
if so, distributing the running code to the started group of nodes.
Specifically, before a started node is reused, it must also be determined whether the distributed architecture corresponding to the started node supports the framework type set by the user. If so, the compute engine directly acquires the code corresponding to the framework type and distributes it to the started nodes.
Illustratively, a group of resource containers is started that supports the TensorFlow and PyTorch frameworks and uses a ring-allreduce architecture. The first task uses the TensorFlow framework, and the container group is not shut down after the task completes. When a second task is run, if it uses the same architecture and the same resource sizes, the existing container group can be used directly even if the new task requires the PyTorch framework; this avoids restarting a group of containers and effectively improves the efficiency with which the data processing system runs distributed training tasks. For example, if 3 nodes are currently started in the data processing system, which matches the number of nodes configured by the user, and the running resources of these 3 nodes also satisfy the running resources set by the user, the data processing system still needs to determine whether the distributed architecture corresponding to the 3 started nodes supports the framework type set by the user. If so, the data processing system determines that the existing container group can be used directly to run the code corresponding to the framework set by the user.
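The reuse decision in this example can be sketched as a check on a started container group. The `ContainerGroup` class, its fields and `try_reuse` are hypothetical names, and the architecture is treated as an arbitrary string label; the script names are placeholders for the running code.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerGroup:
    """A started group of nodes (hypothetical representation)."""
    architecture: str
    frameworks: frozenset
    node_count: int
    cpu_cores: int
    memory_gb: int
    dispatched: list = field(default_factory=list)


def try_reuse(group, architecture, framework_type, node_count, cpu_cores, memory_gb, code):
    """Distribute code to the started group if it fits; return whether it was reused."""
    fits = (
        group.architecture == architecture
        and group.node_count >= node_count
        and group.cpu_cores >= cpu_cores
        and group.memory_gb >= memory_gb
    )
    if fits and framework_type in group.frameworks:
        group.dispatched.append((framework_type, code))
        return True
    return False  # caller would have to start a new group


group = ContainerGroup("ring-allreduce", frozenset({"tensorflow", "pytorch"}), 3, 1, 2)
first = try_reuse(group, "ring-allreduce", "tensorflow", 3, 1, 2, "train_tf.py")
second = try_reuse(group, "ring-allreduce", "pytorch", 3, 1, 2, "train_torch.py")
third = try_reuse(group, "ring-allreduce", "mxnet", 3, 1, 2, "train_mx.py")
```

The first two tasks reuse the same group even though their frameworks differ; the third is rejected because the group does not support that framework type.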
Optionally, the method further includes:
the compute engine monitors the designated nodes that have been started.
Specifically, the compute engine may also monitor the started nodes, collect their running logs and running states, and display them. The running state is one of running, success, or failure.
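A minimal sketch of such monitoring is shown below. The three states come from the text; the `RunStatus` enum, the report format, and in particular the rule for combining node states into an overall state are assumptions added for illustration.

```python
from enum import Enum


class RunStatus(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"


def collect_reports(nodes):
    """Format one display line per node from its status and latest log line.

    `nodes` maps a node name to a (RunStatus, list of log lines) pair.
    """
    lines = []
    for name, (status, logs) in sorted(nodes.items()):
        last = logs[-1] if logs else "(no log output)"
        lines.append(f"{name}: {status.value} | {last}")
    return lines


def overall_status(nodes):
    """Assumed aggregation rule: any failure wins, then any running node."""
    statuses = [status for status, _ in nodes.values()]
    if RunStatus.FAILURE in statuses:
        return RunStatus.FAILURE
    if RunStatus.RUNNING in statuses:
        return RunStatus.RUNNING
    return RunStatus.SUCCESS


observed = {
    "node-1": (RunStatus.RUNNING, ["epoch 1 started"]),
    "node-0": (RunStatus.SUCCESS, ["epoch 1 started", "done"]),
}
report = collect_reports(observed)
```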
According to the embodiment of the invention, by decoupling the distributed architecture and the framework, the resource can be repeatedly utilized, and the efficiency of distributed training is effectively improved.
The embodiment of the invention abstracts the distributed architecture into the resource objects, and facilitates the expansion of the distributed architecture through the scheduling configuration of the resource objects. The distributed architecture is abstracted into resource objects, and when the architecture corresponding to the training framework is adapted, a user does not need to set complex adaptation parameters, and only simple configuration needs to be set, so that automatic adaptation can be realized.
Based on the data processing method provided in the above embodiment, an embodiment of the present invention further provides a data processing system for implementing the above method, and referring to fig. 2, a data processing system 200 provided in an embodiment of the present invention includes:
an obtaining module 201, configured to obtain a resource configuration and a framework type;
a building module 202, configured to build a distributed architecture based on the obtained resource configuration and the framework type.
Optionally, the resource configuration includes a node number and a node operation resource configuration, where the node is obtained by performing logical abstraction processing on a distributed architecture.
Optionally, the building module 202 includes:
a determining submodule, configured to determine a target node based on the resource configuration;
and a building submodule, configured to construct a distributed architecture based on the target node and the framework type.
Optionally, the determining submodule is specifically configured to:
determine the number of target nodes based on the resource configuration;
and determine available target nodes from the nodes based on the running state of each node.
Optionally, the building submodule includes:
an acquisition unit, configured to acquire environment information and running code corresponding to the framework type;
and a building unit, configured so that each target node constructs a distributed architecture based on the environment information and the running code.
Optionally, the acquisition unit is specifically configured to:
recommend at least one piece of environment information according to the framework type;
and determine, based on a user operation, the environment information corresponding to the framework type from the at least one piece of recommended environment information.
Optionally, the acquisition unit is specifically configured to:
display an environment information configuration template corresponding to the framework type;
and generate the environment information corresponding to the framework type according to the user's input operation.
Optionally, the building unit is specifically configured so that:
the master node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
Optionally, the step, performed by the determining submodule, of determining available target nodes from the nodes based on the running state of each node includes:
acquiring started candidate nodes according to the running state of each node;
and selecting, from the candidate nodes, available nodes that satisfy the node running resource configuration in the resource configuration as target nodes.
Optionally, the building submodule is further specifically configured to:
determine whether the distributed architecture corresponding to a started group of nodes supports the framework type;
if so, distribute the running code to the started group of nodes.
Embodiments of the present invention provide a data processing system, which includes a processor, a memory, and a computer program stored on the memory and capable of running on the processor, and when the computer program is executed by the processor, the steps of the data processing method as described above are implemented.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the data processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A method of data processing, the method comprising:
acquiring resource allocation and a frame type;
and constructing a distributed architecture based on the acquired resource configuration and the framework type.
2. The method according to claim 1, wherein the resource configuration comprises a node number and a node running resource configuration, wherein the node is obtained by a distributed architecture after logical abstraction processing.
3. The method of claim 1, wherein the step of constructing a distributed architecture based on the acquired resource configuration and the framework type comprises:
determining a target node based on the resource configuration;
and constructing a distributed architecture based on the target node and the framework type.
4. The method of claim 3, wherein the step of constructing a distributed architecture based on the target node and the framework type comprises:
acquiring environment information and running code corresponding to the framework type;
and each target node constructs a distributed architecture based on the environment information and the running code.
5. The method of claim 4, wherein the step in which each target node constructs a distributed architecture based on the environment information and the running code comprises:
a main node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
6. A data processing system, characterized in that the data processing system comprises:
an acquisition module, configured to acquire a resource configuration and a framework type;
and a construction module, configured to construct a distributed architecture based on the acquired resource configuration and the framework type.
7. The data processing system of claim 6, wherein the resource configuration comprises a number of nodes and a per-node running resource configuration, and wherein each node is obtained by logical abstraction of the distributed architecture.
8. The data processing system of claim 6, wherein the construction module comprises:
a determining submodule, configured to determine a target node based on the resource configuration;
and a construction submodule, configured to construct a distributed architecture based on the target node and the framework type.
9. The data processing system of claim 8, wherein the construction submodule comprises:
an acquisition unit, configured to acquire environment information and running code corresponding to the framework type;
and a construction unit, configured such that each target node constructs a distributed architecture based on the environment information and the running code.
10. The data processing system of claim 9, wherein the construction unit is specifically configured such that:
a main node distributes the environment information and the running code to each target node;
and each target node constructs a distributed architecture according to the environment information and the running code.
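
For orientation, the flow recited in claims 1 to 5 (acquire a resource configuration and a framework type, determine target nodes, acquire environment information and running code, distribute them from a main node, and have each target node construct the architecture) can be sketched in Python. This is only an illustrative sketch under assumptions: the field names of `ResourceConfig`, the `FRAMEWORK_CATALOG` lookup, the `framework_a` entry, and every function name are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ResourceConfig:
    node_count: int            # number of nodes
    cpus_per_node: int         # per-node running resources (assumed fields)
    memory_gb_per_node: int


@dataclass
class TargetNode:
    name: str
    environment: Dict[str, str] = field(default_factory=dict)
    running_code: str = ""
    ready: bool = False

    def construct(self, environment: Dict[str, str], running_code: str) -> None:
        """Set up this node's part of the architecture from what it received."""
        self.environment = dict(environment)
        self.running_code = running_code
        self.ready = True


# Hypothetical catalog: environment information and running code per framework type.
FRAMEWORK_CATALOG: Dict[str, Tuple[Dict[str, str], str]] = {
    "framework_a": ({"RUNTIME": "framework-a"}, "start_framework_a.py"),
}


def determine_target_nodes(config: ResourceConfig) -> List[TargetNode]:
    """Determine target nodes from the resource configuration (claim 3)."""
    return [TargetNode(f"node-{i}") for i in range(config.node_count)]


def build_distributed_architecture(config: ResourceConfig,
                                   framework_type: str) -> List[TargetNode]:
    nodes = determine_target_nodes(config)
    # Acquire environment information and running code for the framework type (claim 4).
    environment, running_code = FRAMEWORK_CATALOG[framework_type]
    # This loop plays the role of the main node distributing to each target node (claim 5).
    for node in nodes:
        node.construct(environment, running_code)
    return nodes
```

For example, `build_distributed_architecture(ResourceConfig(3, 4, 16), "framework_a")` returns three nodes, each marked ready and holding its own copy of the environment information. The split into a determining function and a building function loosely mirrors the determining and construction submodules of claims 8 to 10.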
CN202110739528.0A 2021-06-30 2021-06-30 Data processing method and system Pending CN113326116A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110739528.0A CN113326116A (en) 2021-06-30 2021-06-30 Data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110739528.0A CN113326116A (en) 2021-06-30 2021-06-30 Data processing method and system

Publications (1)

Publication Number Publication Date
CN113326116A true CN113326116A (en) 2021-08-31

Family

ID=77423613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110739528.0A Pending CN113326116A (en) 2021-06-30 2021-06-30 Data processing method and system

Country Status (1)

Country Link
CN (1) CN113326116A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114385502A (en) * 2022-01-12 2022-04-22 神州数码系统集成服务有限公司 Offline prediction method, system, storage medium and terminal based on K8s cluster

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106293933A (en) * 2015-12-29 2017-01-04 北京典赞科技有限公司 Cluster resource configuration and scheduling method supporting multiple big data computing frameworks
US20170220949A1 (en) * 2016-01-29 2017-08-03 Yahoo! Inc. Method and system for distributed deep machine learning
EP3267310A1 (en) * 2015-04-29 2018-01-10 Huawei Technologies Co. Ltd. Data processing method and device
CN111435315A (en) * 2019-01-14 2020-07-21 北京沃东天骏信息技术有限公司 Method, apparatus, device and computer readable medium for allocating resources
CN112000473A (en) * 2020-08-12 2020-11-27 中国银联股份有限公司 Distributed training method and device for deep learning model
CN112418438A (en) * 2020-11-24 2021-02-26 国电南瑞科技股份有限公司 Container-based machine learning procedural training task execution method and system
CN112667393A (en) * 2020-12-19 2021-04-16 前海飞算科技(深圳)有限公司 Method and device for building distributed task computing scheduling framework and computer equipment
CN112667594A (en) * 2021-01-14 2021-04-16 北京智源人工智能研究院 Heterogeneous computing platform based on hybrid cloud resources and model training method
CN112862098A (en) * 2021-02-10 2021-05-28 杭州幻方人工智能基础研究有限公司 Method and system for processing cluster training task

Similar Documents

Publication Publication Date Title
US20190163524A1 (en) Method and apparatus for processing task in smart device
CN110019651A (en) A kind of streaming regulation engine and business data processing method
CN111369011A (en) Method and device for applying machine learning model, computer equipment and storage medium
CN109117252B (en) Method and system for task processing based on container and container cluster management system
CN114546387B (en) A service arrangement script execution system and method
CN112363913B (en) Parallel test task scheduling optimizing method, device and computing equipment
CN110737653A (en) micro-service-based enterprise data processing system and method
CN109597810B (en) Task segmentation method, device, medium and electronic equipment
CN108021369B (en) Data integration processing method and related device
CN112306452A (en) Method, device and system for processing service data by merging and sorting algorithm
CN113326116A (en) Data processing method and system
CN120371530A (en) Numerical model operation method and device, storage medium and electronic equipment
CN120258094A (en) A model training method, data processing method, device and program product
CN114091688A (en) Computing resource obtaining method and device, electronic equipment and storage medium
CN107766156A (en) Task processing method and device
CN113835706B (en) Artificial intelligence-based skeleton screen generation method, device, electronic device and medium
CN113722341B (en) Operation data processing method and related device
CN106648895A (en) Data processing method and device, and terminal
CN115129481B (en) Computing resource allocation method and device and electronic equipment
CN111241159A (en) Method and device for determining task execution time
CN115222269A (en) Rule judging method and related equipment
CN116932147A (en) Streaming job processing method and device, electronic equipment and medium
CN111208980B (en) Data analysis processing method and system
CN111459576B (en) Data analysis processing system and model operation method
CN113515355A (en) Resource scheduling method, device, server and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210831