Fu et al., 2024 - Google Patents
Improving data locality of tasks by executor allocation in Spark computing environmentFu et al., 2024
- Document ID
- 14719705551490982756
- Author
- Fu Z
- He M
- Yi Y
- Tang Z
- Publication year
- Publication venue
- IEEE Transactions on Cloud Computing
External Links
Snippet
The concept of data locality is crucial for distributed systems (eg, Spark and Hadoop) to process Big Data. Most of the existing research optimized the data locality from the aspect of task scheduling. However, as the execution container of Spark's tasks, the executor …
- 230000006854 communication 0 abstract description 33
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network-specific arrangements or communication protocols supporting networked applications
- H04L67/10—Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Fu et al. | An optimal locality-aware task scheduling algorithm based on bipartite graph modelling for spark applications | |
Pakize | A comprehensive view of Hadoop MapReduce scheduling algorithms | |
Fu et al. | Improving data locality of tasks by executor allocation in Spark computing environment | |
Gandomi et al. | HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework | |
Maleki et al. | MapReduce: an infrastructure review and research insights | |
Senthilkumar et al. | A survey on job scheduling in big data | |
Javanmardi et al. | A unit-based, cost-efficient scheduler for heterogeneous Hadoop systems. | |
Li et al. | MapReduce delay scheduling with deadline constraint | |
Maleki et al. | TMaR: a two-stage MapReduce scheduler for heterogeneous environments | |
Ji et al. | Adaptive workflow scheduling for diverse objectives in cloud environments | |
Hadjar et al. | A new approach for scheduling tasks and/or jobs in big data cluster | |
Aarthee et al. | Energy-aware heuristic scheduling using bin packing mapreduce scheduler for heterogeneous workloads performance in big data | |
Idris et al. | Context‐aware scheduling in MapReduce: a compact review | |
Ghazali et al. | CLQLMRS: improving cache locality in MapReduce job scheduling using Q-learning | |
Perwej | The ambient scrutinize of scheduling algorithms in big data territory | |
SM et al. | Priority based resource allocation and demand based pricing model in peer-to-peer clouds | |
Fu et al. | Load balancing algorithms for hadoop cluster in unbalanced environment | |
Lin et al. | Impact of MapReduce policies on job completion reliability and job energy consumption | |
Li et al. | Migration-based online CPSCN big data analysis in data centers | |
Fu et al. | Optimizing data locality by executor allocation in spark computing environment | |
Xu et al. | Multi resource scheduling with task cloning in heterogeneous clusters | |
Pandey et al. | An energy-efficient greedy MapReduce scheduler for heterogeneous Hadoop YARN cluster | |
Chen et al. | A real-time scheduling strategy based on processing framework of Hadoop | |
He et al. | Design and implementation of fully distributed heterogeneous resource management system | |
Dadamis et al. | IgNITE: Scheduling pipeline-parallel DNN training jobs on heterogeneous infrastructures |