Zhu et al., 2022 - Google Patents
QoS-aware co-scheduling for distributed long-running applications on shared clustersZhu et al., 2022
View PDF- Document ID
- 8716287784576044383
- Author
- Zhu J
- Yang R
- Sun X
- Wo T
- Hu C
- Peng H
- Xiao J
- Zomaya A
- Xu J
- Publication year
- Publication venue
- IEEE Transactions on Parallel and Distributed Systems
External Links
Snippet
To achieve a high degree of resource utilization, production clusters need to co-schedule diverse workloads–including both batch analytic jobs with short-lived tasks and long-running applications (LRAs) that execute for a long time frame from hours to months–onto the shared …
- 238000004458 analytical method 0 abstract description 12
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogramme communication; Intertask communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Carrión | Kubernetes scheduling: Taxonomy, ongoing issues and challenges | |
Zhu et al. | QoS-aware co-scheduling for distributed long-running applications on shared clusters | |
Abdulhamid et al. | Fault tolerance aware scheduling technique for cloud computing environment using dynamic clustering algorithm | |
Singh et al. | QoS-aware autonomic resource management in cloud computing: a systematic review | |
Gill et al. | Resource provisioning based scheduling framework for execution of heterogeneous and clustered workloads in clouds: from fundamental to autonomic offering | |
Kumar et al. | ARPS: An autonomic resource provisioning and scheduling framework for cloud platforms | |
Jennings et al. | Resource management in clouds: Survey and research challenges | |
Coutinho et al. | Elasticity in cloud computing: a survey | |
Calzarossa et al. | Workloads in the Clouds | |
Gill et al. | RADAR: Self‐configuring and self‐healing in resource management for enhancing quality of cloud services | |
Calheiros et al. | Cloudsim: A novel framework for modeling and simulation of cloud computing infrastructures and services | |
US20220027249A1 (en) | Automated methods and systems for troubleshooting problems in a distributed computing system | |
Saxena et al. | OFP-TM: an online VM failure prediction and tolerance model towards high availability of cloud computing environments | |
Saxena et al. | A high availability management model based on VM significance ranking and resource estimation for cloud applications | |
Niehorster et al. | Autonomic resource management with support vector machines | |
US20220391279A1 (en) | Machine learning methods and systems for discovering problem incidents in a distributed computer system | |
Meng et al. | Service-oriented reliability modeling and autonomous optimization of reliability for public cloud computing systems | |
RahimiZadeh et al. | Workload-Aware Placement of Multi-Tier Applications in Virtualized Datacenters | |
Hu et al. | Toposch: Latency-aware scheduling based on critical path analysis on shared yarn clusters | |
Femminella et al. | Comparison of Reinforcement Learning Algorithms for Edge Computing Applications Deployed by Serverless Technologies. | |
Datt et al. | Analysis of infrastructure monitoring requirements for OpenStack Nova | |
Caglar et al. | iTune: Engineering the performance of Xen hypervisor via autonomous and dynamic scheduler reconfiguration | |
Sujaudeen et al. | TARNN: Task‐aware autonomic resource management using neural networks in cloud environment | |
Handaoui et al. | Salamander: a holistic scheduling of mapreduce jobs on ephemeral cloud resources | |
Ru et al. | Software engineering for multi-tenancy computing challenges and implications |