
WO2022120979A1 - Resource management method and device for distributed machine learning tasks - Google Patents

Resource management method and device for distributed machine learning tasks

Info

Publication number
WO2022120979A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
machine learning
size
resource management
cache mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2020/139264
Other languages
English (en)
Chinese (zh)
Inventor
罗树添
叶可江
须成忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Publication of WO2022120979A1 publication Critical patent/WO2022120979A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016 Allocation of resources to service a request, the resource being the memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F 12/023 Free address space management
    • G06F 12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the field of machine learning tasks, in particular, to a resource management method and device for distributed machine learning tasks.
  • the Quasar system, designed by Christina Delimitrou et al. of the Department of Computer Science at Stanford University, runs tasks on small data in advance to collect runtime result data, and uses that data to predict the resources a task requires while still meeting its performance expectations.
  • the disadvantage of the prior art is that it does not consider the impact of different cache modes on memory allocation and performance, and it additionally has to run tasks to collect result data before memory usage can be predicted.
  • Embodiments of the present invention provide a resource management method and device for distributed machine learning tasks, so as to at least solve the technical problem that the existing resource management system does not consider the impact of different cache modes on memory allocation and performance.
  • a resource management method for distributed machine learning tasks including the following steps:
  • the user submits a machine learning task, which carries two pieces of information: the size of the data set and the number of containers;
  • the prediction model calculates the memory allocation size according to the data set size and the number of containers, and selects the corresponding cache mode at the same time;
  • the memory allocation is divided into two cases according to the selected cache mode.
  • when memory is sufficient, the optimal performance model is used; when memory is insufficient, the optimal resource utilization model is used.
  • the optimal performance model is that both data caching and computing use memory resources;
  • the optimal resource utilization model is that data caching uses disk resources, and computing uses memory resources.
  • the method first calculates the size of the data to be cached in each container according to the size of the data set and the number of containers, and then calculates the size of the memory that needs to be allocated according to the storage ratio assigned to the cache.
  • the method reserves a portion of memory for garbage collection, and this additional garbage-collection memory is added to the calculated memory size.
  • the machine learning task runs on the Java virtual machine, where the memory used for garbage collection is in a fixed proportional relationship with the total memory, so the size of the additional garbage-collection memory is obtained directly by calculation.
  • when memory is insufficient, the selection of the cache mode is switched to the disk cache.
  • when the disk cache is selected, all data is cached on disk, and only a small amount of memory needs to be reserved to meet the computation's memory requirements.
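  • A minimal sketch of this prediction and selection logic is given below. It assumes a simple proportional model; the function name, the cache and garbage-collection ratios, and the compute reserve are illustrative defaults, not values taken from the patent.

```python
def plan_resources(dataset_size_gb: float,
                   num_containers: int,
                   available_mem_per_container_gb: float,
                   cache_fraction: float = 0.6,
                   gc_fraction: float = 0.1,
                   compute_reserve_gb: float = 1.0) -> dict:
    """Predict per-container memory and choose a cache mode (illustrative)."""
    # Step 1: data each container has to cache.
    cache_per_container = dataset_size_gb / num_containers

    # Step 2: memory needed so the cached data fits in the share of a
    # container's memory reserved for storage, plus a fixed proportional
    # allowance for JVM garbage collection.
    mem_needed = (cache_per_container / cache_fraction) * (1.0 + gc_fraction)

    # Step 3: cache-mode selection. If the predicted memory fits, cache in
    # memory (optimal performance); otherwise cache on disk and keep only
    # the memory the computation itself needs (optimal resource utilization).
    if mem_needed <= available_mem_per_container_gb:
        return {"cache_mode": "memory", "memory_per_container_gb": mem_needed}
    return {"cache_mode": "disk",
            "memory_per_container_gb": compute_reserve_gb * (1.0 + gc_fraction)}
```

  • For example, under these illustrative defaults a 100 GB data set split over 10 containers asks for roughly (100 / 10) / 0.6 × 1.1 ≈ 18.3 GB per container, and the in-memory cache is selected only if that much memory is available per container.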
  • a resource management apparatus for distributed machine learning tasks including:
  • the submission unit is used for users to submit machine learning tasks.
  • the task carries two pieces of information: the size of the data set and the number of containers;
  • the cache mode selection unit is used for the prediction model to calculate the memory allocation size according to the data set size and the number of containers, and to select the corresponding cache mode at the same time;
  • the memory allocation unit is used to divide the memory allocation into two cases according to the selection of the cache mode. When the memory is sufficient, the optimal performance model is selected; when the memory is insufficient, the optimal resource utilization model is selected.
  • a storage medium storing program files capable of implementing any one of the above-mentioned resource management methods for distributed machine learning tasks.
  • a processor is used for running a program, wherein the program, when running, executes any one of the above-mentioned resource management methods for distributed machine learning tasks.
  • the resource management method and device for distributed machine learning tasks in the embodiments of the present invention save resources and improve task performance through memory prediction and selection of cache modes, and are used to guide users to run distributed machine learning tasks.
  • the invention mainly analyzes the characteristics of distributed machine learning and the resource management of the computing framework, builds a model for memory prediction and cache mode selection based on these analyses, and directly allocates memory and selects a cache mode for new machine learning tasks without requiring additional application portraits.
  • Fig. 1 is a flowchart of the resource management method for distributed machine learning tasks of the present invention;
  • Fig. 2 is a logic diagram of a machine learning task of the present invention at runtime;
  • Fig. 3 is a logic diagram of the resource management system for distributed machine learning tasks targeted by the present invention;
  • FIG. 4 is a block diagram of a resource management apparatus for distributed machine learning tasks according to the present invention.
  • a resource management method for distributed machine learning tasks includes the following steps:
  • S101: The user submits a machine learning task, which includes two pieces of information: the size of the dataset and the number of containers;
  • S102: The prediction model calculates the memory allocation size according to the dataset size and the number of containers, and selects a corresponding cache mode at the same time;
  • S103: Memory allocation is divided into two cases according to the selected cache mode: when memory is sufficient, the optimal performance model is selected; when memory is insufficient, the optimal resource utilization model is selected.
  • the resource management method for distributed machine learning tasks in the embodiments of the present invention saves resources and improves task performance through memory prediction and selection of cache modes, and is used to guide users to run distributed machine learning tasks.
  • the invention mainly analyzes the characteristics of distributed machine learning and the resource management of the computing framework, builds a model for memory prediction and cache mode selection based on these analyses, and directly allocates memory and selects a cache mode for new machine learning tasks without requiring additional application portraits.
  • the optimal performance model is that both data caching and computing use memory resources;
  • the optimal resource utilization model is that data caching uses disk resources, and computing uses memory resources.
  • the method first calculates the size of the data that needs to be cached in each container according to the size of the data set and the number of containers, and then calculates the size of the memory that needs to be allocated according to the storage ratio assigned to the cache.
  • the method reserves a portion of memory for garbage collection, and this additional garbage-collection memory is added on top of the calculated memory size.
  • the machine learning task runs on the Java virtual machine, and the memory used for garbage collection is in a fixed proportional relationship with the total memory; it therefore does not need to be predicted from historical data, and the size of the additional garbage-collection memory can be computed directly.
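  • As a concrete analogue of such a fixed-proportion rule (an analogy, not the patent's own formula): Spark on YARN by default requests an off-heap overhead of max(384 MiB, 0.10 × executor memory) on top of each executor's heap, so the extra allowance can be computed directly rather than learned from history.

```python
def with_jvm_overhead(executor_mem_mib: int, overhead_factor: float = 0.10) -> int:
    """Mirror Spark's default rule: overhead = max(384 MiB, factor * executor memory)."""
    overhead_mib = max(384, int(overhead_factor * executor_mem_mib))
    return executor_mem_mib + overhead_mib

# e.g. a 10 GiB executor is requested from the cluster manager as
# 10240 + 1024 = 11264 MiB.
print(with_jvm_overhead(10240))
```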
  • when memory is insufficient, the selection of the cache mode is switched to the disk cache.
  • when the disk cache is selected, all data is cached on disk, and only a small amount of memory needs to be reserved to meet the computation's memory requirements.
  • the present invention mainly designs a set of resource management methods for distributed machine learning tasks, saves resources and improves task performance through memory prediction and selection of cache modes, and is used to guide users to run distributed machine learning tasks.
  • the purpose of the present invention is to consider the impact of the cache mode on performance, and at the same time, additional application portrait data, such as data collected by additional running tasks, is not required when performing resource prediction.
  • memory is allocated and the cache mode is selected directly from the size of the dataset and the number of allocated containers, making full use of the advantages of cache and reducing the extra overhead of application portraits.
  • the technical solution of the present invention is mainly to analyze the characteristics of distributed machine learning and the resource management of the computing framework, build a model for memory prediction and cache mode selection based on these analyses, and directly allocate memory and select a cache mode for new machine learning tasks without additional application portraits.
  • Figure 2 is a logical diagram of the runtime of a machine learning task. Because the task runs on each worker node, resources need to be allocated to the worker node before the task runs.
  • phase 1 reads data from the distributed file system and caches it; each subsequent algorithm iteration, such as algorithm iteration 1 and algorithm iteration 2 in Figure 2, can be split into two phases, gradient calculation and gradient collection/update, with parameter exchange required between these two stages.
  • because the machine learning training process requires hundreds of iterations, the length of one iteration directly determines performance. Within an iteration, the gradient parameters exchanged during collection and update are small, so the overhead of exchanging, collecting and updating them is small; the key overhead is the data that must be read for the gradient calculation.
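  • The iteration structure described above can be illustrated with a toy, single-process simulation (logistic-regression gradients on random data; the dimensions, iteration count, and learning rate are arbitrary and only the structure matters):

```python
import numpy as np

rng = np.random.default_rng(0)
num_containers, num_iterations = 4, 100

# Phase 1: "read from the distributed file system and cache" -- simulated here by
# splitting a random data set into one cached partition per container.
X = rng.normal(size=(1000, 10))
y = rng.integers(0, 2, size=1000).astype(float)
cached = list(zip(np.array_split(X, num_containers), np.array_split(y, num_containers)))

w = np.zeros(X.shape[1])
for _ in range(num_iterations):            # real training runs hundreds of iterations
    # Gradient calculation: every worker re-reads its cached partition each
    # iteration, which is why the cache medium dominates per-iteration cost.
    grads = [Xi.T @ (1.0 / (1.0 + np.exp(-Xi @ w)) - yi) / len(yi)
             for Xi, yi in cached]
    # Gradient collection and update: the exchanged gradients are small vectors,
    # so their communication overhead is minor.
    w -= 0.1 * np.mean(grads, axis=0)
```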
  • FIG. 3 is a logic diagram of a resource management system for distributed machine learning tasks targeted by the present invention.
  • a user submits a machine learning task.
  • the task includes two pieces of information, one is the data set size; the other is the number of containers.
  • the prediction model calculates the memory allocation size based on the dataset size and the number of containers, and selects the corresponding cache mode.
  • the analysis of the performance overhead of machine learning tasks above shows that data reading in the gradient calculation is the performance bottleneck; therefore, when choosing the cache mode, the focus is on the amount of memory available.
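  • One way to make this bottleneck argument explicit (the notation below is our own formalization, not the patent's):

```latex
T_{\mathrm{iter}} \approx T_{\mathrm{read}} + T_{\mathrm{grad}} + T_{\mathrm{exch}},
\qquad
T_{\mathrm{read}} \approx \frac{S_{\mathrm{dataset}} / N_{\mathrm{containers}}}{B_{\mathrm{cache}}}
```

  • Since the exchange term moves only small gradient vectors, the bandwidth B_cache of the chosen cache medium (memory versus disk) dominates T_iter, and over hundreds of iterations this is what makes the cache mode decision hinge on how much memory is available.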
  • the invention mainly divides memory allocation into two situations according to the selection of the cache mode, one is optimal performance; the other is optimal resource utilization.
  • the former uses memory resources for data caching and computing to obtain optimal performance; the latter uses disk resources for data caching, and memory resources are used for computing to obtain optimal resource utilization.
  • the memory allocation and cache mode selection of the two are different.
  • For optimal-performance memory allocation and cache mode selection, the present invention first calculates the size of the data to be cached in each container according to the size of the data set and the number of containers, and then calculates the size of the memory to be allocated according to the storage ratio assigned to the cache.
  • current computing frameworks basically run on the Java virtual machine and need to reserve a part of the memory for garbage collection, so this part of the memory is added to the calculated memory size to keep memory garbage collection working normally.
  • this is the memory allocation and cache selection model with the best performance: all data is cached in memory to obtain the best performance.
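  • In Spark terms (the framework used in the experiments described later), the optimal-performance mode corresponds roughly to sizing executors from the predicted memory and keeping the whole cache in memory. The sketch below is illustrative: the executor count, memory figure, and input path are assumptions, and the two memory fractions shown are simply Spark's defaults.

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

spark = (SparkSession.builder
         .appName("optimal-performance-mode")
         .config("spark.executor.instances", "10")       # number of containers
         .config("spark.executor.memory", "18g")         # predicted per-container memory
         .config("spark.memory.fraction", "0.6")         # unified execution + storage pool
         .config("spark.memory.storageFraction", "0.5")  # share protected for the cache
         .getOrCreate())

# Optimal performance: keep the entire training set cached in memory.
train = spark.read.parquet("hdfs:///path/to/training_data")  # hypothetical path
train.persist(StorageLevel.MEMORY_ONLY)
```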
  • a resource management apparatus for distributed machine learning tasks including:
  • the submitting unit 201 is used for a user to submit a machine learning task, and the task carries two pieces of information: the size of the data set and the number of containers;
  • the cache mode selection unit 202 is used for the prediction model to calculate the allocation size of the memory according to the data set size and the number of containers, and select the corresponding cache mode at the same time;
  • the memory allocation unit 203 is configured to divide the memory allocation into two situations according to the selection of the cache mode. When the memory is sufficient, the optimal performance model is selected; when the memory is insufficient, the optimal resource utilization model is selected.
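  • A skeletal rendering of these three units as code (structure only; the class and method names are our own, and the cache-mode unit reuses the plan_resources sketch from earlier):

```python
class SubmissionUnit:
    """Unit 201: accepts a task described by data set size and container count."""
    def submit(self, dataset_size_gb: float, num_containers: int) -> dict:
        return {"dataset_size_gb": dataset_size_gb, "num_containers": num_containers}


class CacheModeSelectionUnit:
    """Unit 202: predicts memory and picks a cache mode for the task."""
    def __init__(self, available_mem_per_container_gb: float):
        self.available = available_mem_per_container_gb

    def select(self, task: dict) -> dict:
        return plan_resources(task["dataset_size_gb"], task["num_containers"],
                              self.available)


class MemoryAllocationUnit:
    """Unit 203: applies the plan -- optimal performance when memory suffices,
    optimal resource utilization otherwise."""
    def allocate(self, plan: dict) -> str:
        return (f"{plan['memory_per_container_gb']:.1f} GB per container, "
                f"cache on {plan['cache_mode']}")
```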
  • the resource management device for distributed machine learning tasks in the embodiments of the present invention saves resources and improves task performance through memory prediction and selection of cache modes, and is used to guide users to run distributed machine learning tasks.
  • the invention mainly analyzes the characteristics of distributed machine learning and the resource management of the computing framework, builds a model for memory prediction and cache mode selection based on these analyses, and directly allocates memory and selects a cache mode for new machine learning tasks without requiring additional application portraits.
  • the invention mainly designs a resource management apparatus for distributed machine learning tasks, which saves resources and improves task performance through memory prediction and selection of the cache mode, and is used to guide users in running distributed machine learning tasks.
  • the purpose of the present invention is to consider the impact of the cache mode on performance, and at the same time, additional application portrait data, such as data collected by additional running tasks, is not required when performing resource prediction.
  • memory is allocated and the cache mode is selected directly from the size of the dataset and the number of allocated containers, making full use of the advantages of cache and reducing the extra overhead of application portraits.
  • the technical solution of the present invention is mainly to analyze the characteristics of distributed machine learning and the resource management of the computing framework, build a model for memory prediction and cache mode selection based on these analyses, and directly allocate memory and select a cache mode for new machine learning tasks without additional application portraits.
  • Figure 2 is a logical diagram of the runtime of a machine learning task. Because the task runs on each worker node, resources need to be allocated to the worker node before the task runs.
  • phase 1 reads data from the distributed file system and caches it; each subsequent algorithm iteration, such as algorithm iteration 1 and algorithm iteration 2 in Figure 2, can be split into two phases, gradient calculation and gradient collection/update, with parameter exchange required between these two stages.
  • because the machine learning training process requires hundreds of iterations, the length of one iteration directly determines performance. Within an iteration, the gradient parameters exchanged during collection and update are small, so the overhead of exchanging, collecting and updating them is small; the key overhead is the data that must be read for the gradient calculation.
  • FIG. 3 is a logic diagram of a resource management system for distributed machine learning tasks targeted by the present invention.
  • Submission unit 201: first, a user submits a machine learning task; the task carries two pieces of information, one being the size of the data set and the other the number of containers.
  • Cache mode selection unit 202: the prediction model calculates the memory allocation size according to the data set size and the number of containers, and selects a corresponding cache mode at the same time. The analysis of the performance overhead of machine learning tasks above shows that data reading in the gradient calculation is the performance bottleneck; therefore, when choosing the cache mode, the focus is on the amount of memory available.
  • Memory allocation unit 203: the present invention mainly divides memory allocation into two situations according to the selection of the cache mode: one is optimal performance; the other is optimal resource utilization.
  • the former uses memory resources for data caching and computing to obtain optimal performance; the latter uses disk resources for data caching, and memory resources are used for computing to obtain optimal resource utilization.
  • the memory allocation and cache mode selection of the two are different.
  • For optimal-performance memory allocation and cache mode selection, the present invention first calculates the size of the data to be cached in each container according to the size of the data set and the number of containers, and then calculates the size of the memory to be allocated according to the storage ratio assigned to the cache.
  • current computing frameworks basically run on the Java virtual machine and need to reserve a part of the memory for garbage collection, so this part of the memory is added to the calculated memory size to keep memory garbage collection working normally.
  • this is the memory allocation and cache selection model with the best performance: all data is cached in memory to obtain the best performance.
  • a storage medium storing program files capable of implementing any one of the above-mentioned resource management methods for distributed machine learning tasks.
  • a processor is used for running a program, wherein the program, when running, executes any one of the above resource management methods for distributed machine learning tasks.
  • the present invention does not need to make additional application portraits for the running machine learning tasks, reducing the overhead of this part;
  • the present invention fully considers the influence of different cache modes.
  • the present invention has been verified by experiments; the computing framework used is Spark, and three machine learning applications are selected: linear regression, logistic regression, and support vector machine.
  • the running results show that the accuracy of the memory allocation and cache selection scheme of the present invention is as high as 95%, and the optimal performance and optimal resource utilization can be achieved.
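  • A hedged sketch of the kind of workload used in such an experiment (Spark MLlib logistic regression over a cached training set; the input path and hyperparameters are illustrative, and the storage level is whichever one the resource manager selected for the run):

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark import StorageLevel

spark = SparkSession.builder.appName("lr-cache-experiment").getOrCreate()

# Load a labeled data set in libsvm format and cache it with the selected mode.
train = spark.read.format("libsvm").load("hdfs:///data/train.libsvm")  # hypothetical path
train.persist(StorageLevel.MEMORY_ONLY)   # or DISK_ONLY, per the selected cache mode

# Iterative training: each pass re-reads the cached data, so the cache mode
# shows up directly in end-to-end runtime.
model = LogisticRegression(maxIter=100, regParam=0.01).fit(train)
print(model.summary.accuracy)
```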
  • the choice of cache mode can be considered in combination with memory cache and hard disk cache.
  • data is cached in memory first until the memory is full, and the remaining data is then cached to the hard disk.
  • the advantage of this is that it makes full use of the fast reads and writes of memory while also covering the case where the memory is not large enough to cache all of the data.
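  • In Spark this combined mode maps naturally onto the MEMORY_AND_DISK storage level, which keeps as many cached partitions in memory as fit and spills the rest to disk; this mapping is an assumption about how the idea would be realized, not the patent's stated implementation.

```python
from pyspark import StorageLevel

# Hybrid cache mode: partitions that fit stay in memory, the remainder spills to disk.
# (`train` is the cached DataFrame from the sketches above.)
train.unpersist()
train.persist(StorageLevel.MEMORY_AND_DISK)
```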
  • Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which can be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present invention.
  • the aforementioned storage medium includes: a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of machine learning tasks, and more particularly to a resource management method and device for distributed machine learning tasks. The method and device comprise: a user submits a machine learning task, the task carrying two pieces of information, the first being the size of a data set and the second being the number of containers; a prediction model calculates the memory allocation size according to the data set size and the number of containers, and at the same time selects a corresponding cache mode; according to the selected cache mode, memory allocation is divided into two cases: an optimal performance model is selected when memory is sufficient, and an optimal resource utilization model is selected when memory is insufficient. The present invention mainly analyzes the characteristics of distributed machine learning and the resource management of the computing framework and, on the basis of this analysis, builds a model for memory prediction and cache mode selection, which directly allocates memory and selects a cache mode for a new machine learning task without requiring additional application portraits.
PCT/CN2020/139264 2020-12-10 2020-12-25 Resource management method and device for distributed machine learning tasks Ceased WO2022120979A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011435550.8A CN112463389B (zh) 2020-12-10 2020-12-10 Resource management method and apparatus for distributed machine learning tasks
CN202011435550.8 2020-12-10

Publications (1)

Publication Number Publication Date
WO2022120979A1 true WO2022120979A1 (fr) 2022-06-16

Family

ID=74801211

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/139264 Ceased WO2022120979A1 (fr) 2020-12-10 2020-12-25 Resource management method and device for distributed machine learning tasks

Country Status (2)

Country Link
CN (1) CN112463389B (fr)
WO (1) WO2022120979A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119512764A (zh) * 2025-01-14 2025-02-25 Qingdao Guoshi Technology Group Co., Ltd. Machine-learning-based memory pool configuration method and system
CN120743560A (zh) * 2025-09-01 2025-10-03 Kylin Software Co., Ltd. Memory management method for the iterative solution method of CAE software

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117076555B (zh) * 2023-05-08 2024-03-22 Shenzhen Youyou Network Technology Co., Ltd. Computation-based distributed task management system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014188052A1 (fr) * 2013-05-24 2014-11-27 Nokia Corporation Method for optimising the use of a memory
CN105824737A (zh) * 2016-03-31 2016-08-03 Huazhong University of Science and Technology In-memory data set replacement system and replacement method for a big data processing system
CN109961151A (zh) * 2017-12-21 2019-07-02 Nuctech Jiangsu Technology Co., Ltd. System of computing services for machine learning and method for machine learning
CN110427263A (zh) * 2018-04-28 2019-11-08 Shenzhen Institute of Advanced Technology Docker-container-oriented performance modeling method, device and storage device for Spark big data applications

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9311236B2 (en) * 2012-11-20 2016-04-12 International Business Machines Corporation Out-of-memory avoidance in dynamic virtual machine memory adjustment
US9672064B2 (en) * 2015-07-13 2017-06-06 Palo Alto Research Center Incorporated Dynamically adaptive, resource aware system and method for scheduling
CN111831699B (zh) * 2020-09-21 2021-01-08 Beijing Xintang Sichuang Education Technology Co., Ltd. Data caching method, electronic device and computer-readable medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014188052A1 (fr) * 2013-05-24 2014-11-27 Nokia Corporation Method for optimising the use of a memory
CN105824737A (zh) * 2016-03-31 2016-08-03 Huazhong University of Science and Technology In-memory data set replacement system and replacement method for a big data processing system
CN109961151A (zh) * 2017-12-21 2019-07-02 Nuctech Jiangsu Technology Co., Ltd. System of computing services for machine learning and method for machine learning
CN110427263A (zh) * 2018-04-28 2019-11-08 Shenzhen Institute of Advanced Technology Docker-container-oriented performance modeling method, device and storage device for Spark big data applications

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG SUFANG; ZHAI JUNHAI; WANG CONG; SHEN CHU: "Big Data and Big Data Machine Learning", Journal of Hebei University (Natural Science Edition), vol. 38, no. 3, 25 May 2018 (2018-05-25), pages 299-308, 336, XP009537475, ISSN: 1000-5854 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119512764A (zh) * 2025-01-14 2025-02-25 Qingdao Guoshi Technology Group Co., Ltd. Machine-learning-based memory pool configuration method and system
CN120743560A (zh) * 2025-09-01 2025-10-03 Kylin Software Co., Ltd. Memory management method for the iterative solution method of CAE software

Also Published As

Publication number Publication date
CN112463389A (zh) 2021-03-09
CN112463389B (zh) 2024-06-18

Similar Documents

Publication Publication Date Title
US20250307015A1 (en) Method for static scheduling of artificial neural networks for a processor
KR101361945B1 (ko) Mapping of computer threads onto heterogeneous resources
Bicer et al. Time and cost sensitive data-intensive computing on hybrid clouds
WO2022120979A1 (fr) Resource management method and device for distributed machine learning tasks
JP2004199561A (ja) Computer resource allocation method, and resource management server and computer system for executing it
CN111722908B (zh) Virtual machine creation method, system, device, and medium
CN112597076B (zh) Spark-oriented, data-aware cache replacement method and system
JP2021524100A (ja) Task scheduling method, apparatus, program, and device based on graph data
Han et al. Marble: A multi-gpu aware job scheduler for deep learning on hpc systems
WO2020125396A1 (fr) Processing method and device for shared data, and server
CN109710372B (zh) Computation-intensive cloud workflow scheduling method based on the owl search algorithm
WO2021232769A1 (fr) Data storage method and data processing apparatus
CN108073457B (zh) Hierarchical resource management method, apparatus and system for a hyper-converged infrastructure
JP2009528649A (ja) Improvements relating to distributed computing
CN113037800A (zh) Job scheduling method and job scheduling apparatus
CN118708533A (zh) K8s-oriented optimal communication scheduling method and system for multi-machine, multi-GPU setups
CN102750364A (zh) Method, compiler and system for allocating memory address space for multiple image files
CN118427501A (zh) Tensor-fusion-based data flow optimization method, apparatus, device and medium
CN107172222A (zh) Data storage method and apparatus based on a distributed storage system
KR20230058621A (ko) Memory-bound scheduling
CN119902900B (zh) Language model retrieval and inference system and method based on high-speed interconnects
Zhao et al. Resource-aware cache management for in-memory data analytics frameworks
JP7624056B2 (ja) Instruction generation method and apparatus for a neural network accelerator, and electronic device
CN103403698A (zh) Distributed computing method and distributed computing system
Tong et al. DAG-aware harmonizing job scheduling and data caching for disaggregated analytics frameworks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20964908

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20964908

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 150524)

122 Ep: pct application non-entry in european phase

Ref document number: 20964908

Country of ref document: EP

Kind code of ref document: A1