[go: up one dir, main page]

Girondi et al., 2024 - Google Patents

Toward GPU-centric Networking on Commodity Hardware

Girondi et al., 2024

View PDF
Document ID
15208686095084488382
Author
Girondi M
Scazzariello M
Maguire Jr G
Kostić D
Publication year
Publication venue
Proceedings of the 7th International Workshop on Edge Systems, Analytics and Networking

External Links

Snippet

GPUs are emerging as the most popular accelerator for many applications, powering the core of machine learning applications. In networked GPU-accelerated applications input & output data typically traverse the CPU and the OS network stack multiple times, getting …
Continue reading at dl.acm.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Programme initiating; Programme switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogramme communication; Intertask communication
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44Arrangements for executing specific programmes
    • G06F9/455Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/78Architectures of general purpose stored programme computers comprising a single central processing unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Application independent communication protocol aspects or techniques in packet data networks
    • H04L69/30Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32High level architectural aspects of 7-layer open systems interconnection [OSI] type protocol stacks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Application independent communication protocol aspects or techniques in packet data networks
    • H04L69/12Protocol engines, e.g. VLSIs or transputers

Similar Documents

Publication Publication Date Title
US11734179B2 (en) Efficient work unit processing in a multicore system
US10841245B2 (en) Work unit stack data structures in multiple core processor system for stream data processing
US11237880B1 (en) Dataflow all-reduce for reconfigurable processor systems
US11392740B2 (en) Dataflow function offload to reconfigurable processors
US10929175B2 (en) Service chaining hardware accelerators within a data stream processing integrated circuit
CN110915173B (en) Data processing unit for computing nodes and storage nodes
US8438578B2 (en) Network on chip with an I/O accelerator
US8462369B2 (en) Hybrid image processing system for a single field of view having a plurality of inspection threads
Daoud et al. GPUrdma: GPU-side library for high performance networking from GPU kernels
US8675219B2 (en) High bandwidth image processing with run time library function offload via task distribution to special purpose engines
US11956156B2 (en) Dynamic offline end-to-end packet processing based on traffic class
US20220229677A1 (en) Performance modeling of graph processing computing architectures
WO2020163327A1 (en) System-based ai processing interface framework
US20230100935A1 (en) Microservice deployments using accelerators
KR20120019711A (en) Network architecture and method for processing packet data using the same
CN104820582A (en) Realization method of multicore embedded DSP (Digital Signal Processor) parallel programming model based on Navigator
Girondi et al. Toward GPU-centric Networking on Commodity Hardware
EP4607350A1 (en) Unloading card provided with accelerator
EP4475001A1 (en) Network device-based data processing method and network device
Cardellini et al. Overlapping communication with computation in MPI applications
Zhang et al. DFabric: Scaling out data parallel applications with CXL-Ethernet hybrid interconnects
CN119493661A (en) Telemetry and load balancing in CXL systems
US20250315319A1 (en) Asynchronous post-send
Nan et al. Plug & Offload: Transparently Offloading TCP Stack onto Off-path SmartNIC with PnO-TCP
Brasilino et al. In-network processing for edge computing with InLocus