Jog et al., 2016 - Google Patents
Exploiting core criticality for enhanced GPU performanceJog et al., 2016
View PDF- Document ID
- 5933080122041335085
- Author
- Jog A
- Kayiran O
- Pattnaik A
- Kandemir M
- Mutlu O
- Iyer R
- Das C
- Publication year
- Publication venue
- Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science
External Links
Snippet
Modern memory access schedulers employed in GPUs typically optimize for memory throughput. They implicitly assume that all requests from different cores are equally important. However, we show that during the execution of a subset of CUDA applications …
- 230000015654 memory 0 abstract description 159
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F9/5088—Techniques for rebalancing the load in a distributed system involving task migration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5094—Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/885—Monitoring specific for caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02B—INDEXING SCHEME RELATING TO CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO BUILDINGS, e.g. INCLUDING HOUSING AND APPLIANCES OR RELATED END-USER APPLICATIONS
- Y02B60/00—Information and communication technologies [ICT] aiming at the reduction of own energy use
- Y02B60/10—Energy efficient computing
- Y02B60/14—Reducing energy-consumption by means of multiprocessor or multiprocessing based techniques, other than acting upon the power supply
- Y02B60/142—Resource allocation
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Jog et al. | Exploiting core criticality for enhanced GPU performance | |
| Kayıran et al. | Neither more nor less: Optimizing thread-level parallelism for GPGPUs | |
| Kasture et al. | Ubik: Efficient cache sharing with strict QoS for latency-critical workloads | |
| Kayiran et al. | Managing GPU concurrency in heterogeneous architectures | |
| Mutlu et al. | Stall-time fair memory access scheduling for chip multiprocessors | |
| Jog et al. | Orchestrated scheduling and prefetching for GPGPUs | |
| Iyer et al. | QoS policies and architecture for cache/memory in CMP platforms | |
| Usui et al. | DASH: Deadline-aware high-performance memory scheduler for heterogeneous systems with hardware accelerators | |
| Li et al. | Priority-based cache allocation in throughput processors | |
| Ausavarungnirun et al. | Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems | |
| US8924690B2 (en) | Apparatus and method for heterogeneous chip multiprocessors via resource allocation and restriction | |
| Jog et al. | OWL: Cooperative thread array aware scheduling techniques for improving GPGPU performance | |
| Kayiran et al. | μC-States: Fine-grained GPU datapath power management | |
| US20130081039A1 (en) | Resource allocation using entitlements | |
| Kanduri et al. | Approximation knob: Power capping meets energy efficiency | |
| CN112230757B (en) | Method and system for reducing power by vacating a subset of CPU and memory | |
| Zhou et al. | Writeback-aware bandwidth partitioning for multi-core systems with pcm | |
| Gupta et al. | Timecube: A manycore embedded processor with interference-agnostic progress tracking | |
| Kas | Toward on-chip datacenters: a perspective on general trends and on-chip particulars | |
| Chiang et al. | Kernel mechanisms with dynamic task-aware scheduling to reduce resource contention in NUMA multi-core systems | |
| Modi et al. | CABARRE: Request response arbitration for shared cache management | |
| Tiwari et al. | REAL: REquest arbitration in last level caches | |
| Kim et al. | Using DVFS and task scheduling algorithms for a hard real-time heterogeneous multicore processor environment | |
| Jahre et al. | A light-weight fairness mechanism for chip multiprocessor memory systems | |
| Nath et al. | Temperature aware thread block scheduling in GPGPUs |