Zhao et al., 2023 - Google Patents

Supercut: Communication-aware partitioning for near-memory graph processing

Document ID: 1029521103381206034
Author: Zhao C; Chamberlain R; Zhang X
Publication year: 2023
Publication venue: Proceedings of the 20th ACM International Conference on Computing Frontiers

Snippet

The parallel execution of many graph algorithms is frequently dominated by data communication overheads between compute nodes. This bottleneck becomes even more pronounced in Near-Memory Processing (NMP) architectures with multiple memory cubes …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17356—Indirect interconnection networks
- G06F15/17368—Indirect interconnection networks non hierarchical topologies
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL

Similar Documents

Publication	Publication Date	Title
Zhang et al.	2018	GraphP: Reducing communication for PIM-based graph processing with efficient data partition
Serafini et al.	2021	Scalable graph neural network training: The case for sampling
US9465632B2 (en)	2016-10-11	Parallel hardware hypervisor for virtualizing application-specific supercomputers
Besta et al.	2020	Substream-centric maximum matchings on FPGA
Ax et al.	2017	CoreVA-MPSoC: A many-core architecture with tightly coupled shared and local data memories
Xiao et al.	2021	Plasticity-on-chip design: Exploiting self-similarity for data communications
JP7595587B2 (en)	2024-12-06	Compilation flow for heterogeneous multicore architectures
Yao et al.	2022	Scalagraph: A scalable accelerator for massively parallel graph processing
US20250258794A1 (en)	2025-08-14	Sorting and Placing Nodes of an Operation Unit Graph onto a Reconfigurable Processor
Liu et al.	2019	OBFS: OpenCL based BFS optimizations on software programmable FPGAs
US20230162032A1 (en)	2023-05-25	Estimating Throughput for Placement Graphs for a Reconfigurable Dataflow Computing System
Zhao et al.	2023	Supercut: Communication-aware partitioning for near-memory graph processing
Bach et al.	1997	Building the 4 processor SB-PRAM prototype
Ye	2004	On-chip multiprocessor communication network design and analysis
US12332836B2 (en)	2025-06-17	Estimating a scaled cost of implementing an operation unit graph on a reconfigurable processor
US12242403B2 (en)	2025-03-04	Direct access to reconfigurable processor memory
US12386602B2 (en)	2025-08-12	Operation fusion in nested meta-pipeline loops
US20240427727A1 (en)	2024-12-26	Handling dynamic tensor lengths in a reconfigurable processor that includes multiple memory units
US12135990B2 (en)	2024-11-05	Modeling and compiling tensor processing applications for a computing platform using multi-layer adaptive data flow graphs
Mattheakis et al.	2013	Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems
Yang et al.	2016	LMC: Automatic resource-aware program-optimized memory partitioning
Ax et al.	2015	System-level analysis of network interfaces for hierarchical MPSoCs
Min	2022	Fine-grained memory access over I/O interconnect for efficient remote sparse data access
Kocoloski	2018	Scalability in the Presence of Variability
Sanaullah	2019	Towards hardware as a reconfigurable, elastic, and specialized service