Hinrichs, 1995 - Google Patents
Compiler directed architecture-dependent communication optimizations
Hinrichs, 1995
- Document ID
- 12789126447294772138
- Author
- Hinrichs S
- Publication year
- 1995
Snippet
Abstract: Communication required for distributed data structures is one of the major overheads
of parallelization. Poor communication …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17356—Indirect interconnection networks
- G06F15/17368—Indirect interconnection networks non hierarchical topologies
- G06F15/17381—Two dimensional, e.g. mesh, torus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17337—Direct connection machines, e.g. completely connected computers, point to point communication networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Ramanujam et al. | Tiling multidimensional iteration spaces for multicomputers | |
US6106575A (en) | Nested parallel language preprocessor for converting parallel language programs into sequential code | |
US6088511A (en) | Nested parallel 2D Delaunay triangulation method | |
US6292822B1 (en) | Dynamic load balancing among processors in a parallel computer | |
US6212617B1 (en) | Parallel processing method and system using a lazy parallel data type to reduce inter-processor communication | |
Cowan et al. | Mscclang: Microsoft collective communication language | |
Peir | Program Partitioning and Synchronization on Multiprocessor Systems (Parallel, Computer Architecture, Compiler) | |
Feautrier | Compiling for massively parallel architectures: a perspective | |
Cowan et al. | Gc3: An optimizing compiler for gpu collective communication | |
Hinrichs | Compiler directed architecture-dependent communication optimizations | |
Su et al. | Efficient DOACROSS execution on distributed shared-memory multiprocessors | |
Miller | Two approaches to architecture-independent parallel computation | |
Gaudiot | Data-driven multicomputers in digital signal processing | |
Ramanujam | Compile time techniques for parallel execution of loops on distributed memory multiprocessors | |
Agrawal et al. | Efficient runtime support for parallelizing block structured applications | |
Fimmel | Generation of scheduling functions supporting LSGP-partitioning | |
Ebcioglu et al. | Highly Parallel Multi-FPGA System Compilation from Sequential C/C++ Code in the AWS Cloud | |
Mauney et al. | Computational models and resource allocation for supercomputers | |
Ramanujam et al. | Iteration space tiling for distributed memory machines | |
Tongsima et al. | Architecture-dependent loop scheduling via communication-sensitive remapping | |
Ngai | Runtime resource management in concurrent systems | |
Curtis | A special instruction set multiple chip computer for DSP: architecture and compiler design | |
Richardson | Evaluation of a parallel Chaos router simulator | |
Sussman | Model-driven mapping of computation onto distributed memory parallel computers | |
Kandemir | 2D data locality: definition, abstraction, and application |