Donaldson et al., 2010 - Google Patents
Automatic offloading of C++ for the Cell BE processor: a case study using OffloadDonaldson et al., 2010
View PDF- Document ID
- 943924982738058752
- Author
- Donaldson A
- Dolinsky U
- Richards A
- Russell G
- Publication year
- Publication venue
- 2010 International Conference on Complex, Intelligent and Software Intensive Systems
External Links
Snippet
Offload C++ is an extended version of the C++ language, together with a compiler and runtime system, for automatically offloading general-purpose C++ code to run on the Synergistic Processor Elements (SPEs) of the Cell Broadband Engine (BE) processor. We …
- 238000000034 method 0 abstract description 6
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by a single instruction, e.g. SIMD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3877—Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/451—Code distribution
- G06F8/452—Loops
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/4421—Execution paradigms
- G06F9/4428—Object-oriented
- G06F9/443—Object-oriented method invocation or resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | TVM: end-to-end optimization stack for deep learning | |
Ragan-Kelley et al. | Halide: Decoupling algorithms from schedules for high-performance image processing | |
Yang et al. | A GPGPU compiler for memory optimization and parallelism management | |
Ragan-Kelley et al. | Decoupling algorithms from schedules for easy optimization of image processing pipelines | |
Zhang et al. | On-the-fly elimination of dynamic irregularities for GPU computing | |
Jia et al. | Optimizing N-dimensional, winograd-based convolution for manycore CPUs | |
CN101268444B (en) | Data conversion for streaming applications on multiprocessors | |
Leißa et al. | Extending a C-like language for portable SIMD programming | |
Hagiescu et al. | Automated architecture-aware mapping of streaming applications onto GPUs | |
Talbot et al. | Riposte: a trace-driven compiler and parallel VM for vector code in R | |
WO2022126885A1 (en) | Method and apparatus for retaining optimal width vector operations in arbitrary/flexible vector width architecture | |
Diamos et al. | Translating GPU binaries to tiered SIMD architectures with Ocelot | |
Lueh et al. | C-for-metal: High performance simd programming on intel gpus | |
Michell et al. | Tasklettes–a fine grained parallelism for Ada on multicores | |
Ching et al. | Automatic parallelization of array-oriented programs for a multi-core machine | |
Parashar et al. | Hardware abstractions for targeting EDDO architectures with the polyhedral model | |
Donaldson et al. | Automatic offloading of C++ for the Cell BE processor: a case study using Offload | |
Kazi et al. | Coarse-grained thread pipelining: A speculative parallel execution model for shared-memory multiprocessors | |
Goossens | Dataflow management, dynamic load balancing, and concurrent processing for real‐time embedded vision applications using Quasar | |
Ravishankar et al. | Automatic acceleration of Numpy applications on GPUs and multicore CPUs | |
Gay et al. | Yada: Straightforward parallel programming | |
Rapaport et al. | Streamlining whole function vectorization in C using higher order vector semantics | |
Donaldson et al. | Auto-parallelisation of Sieve C++ programs | |
Chimeh et al. | Compiling vector pascal to the xeonphi | |
Mastoras et al. | Ariadne—Directive-based parallelism extraction from recursive functions |