Supercut: Communication-aware partitioning for near-memory graph processing
Zhao et al., 2023
- Document ID
- 1029521103381206034
- Author
- Zhao C
- Chamberlain R
- Zhang X
- Publication year
- 2023
- Publication venue
- Proceedings of the 20th ACM International Conference on Computing Frontiers
Snippet
The parallel execution of many graph algorithms is frequently dominated by data communication overheads between compute nodes. This bottleneck becomes even more pronounced in Near-Memory Processing (NMP) architectures with multiple memory cubes …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17356—Indirect interconnection networks
- G06F15/17368—Indirect interconnection networks non hierarchical topologies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Zhang et al. | GraphP: Reducing communication for PIM-based graph processing with efficient data partition | |
Serafini et al. | Scalable graph neural network training: The case for sampling | |
US9465632B2 (en) | Parallel hardware hypervisor for virtualizing application-specific supercomputers | |
Besta et al. | Substream-centric maximum matchings on FPGA | |
Ax et al. | CoreVA-MPSoC: A many-core architecture with tightly coupled shared and local data memories | |
Xiao et al. | Plasticity-on-chip design: Exploiting self-similarity for data communications | |
JP7595587B2 (en) | Compilation flow for heterogeneous multicore architectures | |
Yao et al. | Scalagraph: A scalable accelerator for massively parallel graph processing | |
US20250258794A1 (en) | Sorting and Placing Nodes of an Operation Unit Graph onto a Reconfigurable Processor | |
Liu et al. | OBFS: OpenCL based BFS optimizations on software programmable FPGAs | |
US20230162032A1 (en) | Estimating Throughput for Placement Graphs for a Reconfigurable Dataflow Computing System | |
Bach et al. | Building the 4 processor SB-PRAM prototype | |
Ye | On-chip multiprocessor communication network design and analysis | |
US12332836B2 (en) | Estimating a scaled cost of implementing an operation unit graph on a reconfigurable processor | |
US12242403B2 (en) | Direct access to reconfigurable processor memory | |
US12386602B2 (en) | Operation fusion in nested meta-pipeline loops | |
US20240427727A1 (en) | Handling dynamic tensor lengths in a reconfigurable processor that includes multiple memory units | |
US12135990B2 (en) | Modeling and compiling tensor processing applications for a computing platform using multi-layer adaptive data flow graphs | |
Mattheakis et al. | Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems | |
Yang et al. | LMC: Automatic resource-aware program-optimized memory partitioning | |
Ax et al. | System-level analysis of network interfaces for hierarchical MPSoCs | |
Min | Fine-grained memory access over I/O interconnect for efficient remote sparse data access | |
Kocoloski | Scalability in the Presence of Variability | |
Sanaullah | Towards hardware as a reconfigurable, elastic, and specialized service |