[go: up one dir, main page]

Fu et al., 2024 - Google Patents

Improving data locality of tasks by executor allocation in Spark computing environment

Fu et al., 2024

Document ID
14719705551490982756
Author
Fu Z
He M
Yi Y
Tang Z
Publication year
Publication venue
IEEE Transactions on Cloud Computing

External Links

Snippet

The concept of data locality is crucial for distributed systems (eg, Spark and Hadoop) to process Big Data. Most of the existing research optimized the data locality from the aspect of task scheduling. However, as the execution container of Spark's tasks, the executor …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Programme initiating; Programme switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network

Similar Documents

Publication Publication Date Title
Fu et al. An optimal locality-aware task scheduling algorithm based on bipartite graph modelling for spark applications
Pakize A comprehensive view of Hadoop MapReduce scheduling algorithms
Fu et al. Improving data locality of tasks by executor allocation in Spark computing environment
Gandomi et al. HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework
Maleki et al. MapReduce: an infrastructure review and research insights
Senthilkumar et al. A survey on job scheduling in big data
Javanmardi et al. A unit-based, cost-efficient scheduler for heterogeneous Hadoop systems.
Li et al. MapReduce delay scheduling with deadline constraint
Maleki et al. TMaR: a two-stage MapReduce scheduler for heterogeneous environments
Ji et al. Adaptive workflow scheduling for diverse objectives in cloud environments
Hadjar et al. A new approach for scheduling tasks and/or jobs in big data cluster
Aarthee et al. Energy-aware heuristic scheduling using bin packing mapreduce scheduler for heterogeneous workloads performance in big data
Idris et al. Context‐aware scheduling in MapReduce: a compact review
Ghazali et al. CLQLMRS: improving cache locality in MapReduce job scheduling using Q-learning
Perwej The ambient scrutinize of scheduling algorithms in big data territory
SM et al. Priority based resource allocation and demand based pricing model in peer-to-peer clouds
Fu et al. Load balancing algorithms for hadoop cluster in unbalanced environment
Lin et al. Impact of MapReduce policies on job completion reliability and job energy consumption
Li et al. Migration-based online CPSCN big data analysis in data centers
Fu et al. Optimizing data locality by executor allocation in spark computing environment
Xu et al. Multi resource scheduling with task cloning in heterogeneous clusters
Pandey et al. An energy-efficient greedy MapReduce scheduler for heterogeneous Hadoop YARN cluster
Chen et al. A real-time scheduling strategy based on processing framework of Hadoop
He et al. Design and implementation of fully distributed heterogeneous resource management system
Dadamis et al. IgNITE: Scheduling pipeline-parallel DNN training jobs on heterogeneous infrastructures