Keller et al., 2000 - Google Patents
Anatomy of a resource management system for HPC clustersKeller et al., 2000
View PDF- Document ID
- 4193742046305128334
- Author
- Keller A
- Reinefeld A
- Publication year
External Links
Snippet
Workstation clusters are often not only used for high-throughput computing in time-sharing mode but also for running complex parallel jobs in space-sharing mode. This poses several difficulties to the resource management system, which must be able to reserve computing …
- 210000003484 anatomy 0 title abstract description 6
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogramme communication; Intertask communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2023—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant details of failing over
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Application independent communication protocol aspects or techniques in packet data networks
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—High level architectural aspects of 7-layer open systems interconnection [OSI] type protocol stacks
- H04L69/322—Aspects of intra-layer communication protocols among peer entities or protocol data unit [PDU] definitions
- H04L69/329—Aspects of intra-layer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer, i.e. layer seven
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L29/00—Arrangements, apparatus, circuits or systems, not covered by a single one of groups H04L1/00 - H04L27/00 contains provisionally no documents
- H04L29/02—Communication control; Communication processing contains provisionally no documents
- H04L29/06—Communication control; Communication processing contains provisionally no documents characterised by a protocol
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network-specific arrangements or communication protocols supporting networked applications
- H04L67/10—Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Keller et al. | Anatomy of a resource management system for HPC clusters | |
US6223202B1 (en) | Virtual machine pooling | |
Caron et al. | Diet: A scalable toolbox to build network enabled servers on the grid | |
US7171654B2 (en) | System specification language for resource management architecture and corresponding programs therefore | |
Ghormley et al. | GLUix: a global layer unix for a network of workstations | |
US20060085785A1 (en) | Method and apparatus for configuring, monitoring and/or managing resource groups including a virtual machine | |
US20170364844A1 (en) | Automated-application-release-management subsystem that supports insertion of advice-based crosscutting functionality into pipelines | |
US20060085668A1 (en) | Method and apparatus for configuring, monitoring and/or managing resource groups | |
EP1649368A1 (en) | Grid browser component | |
Keller et al. | CCS resource management in networked HPC systems | |
Brune et al. | Managing clusters of geographically distributed high‐performance computers | |
Baliś et al. | A solution for adapting legacy code as web services | |
Vogels et al. | An overview of the galaxy management framework for scalable enterprise cluster computing | |
Uthayopas et al. | Sce: A fully integrated software tool for beowulf cluster system | |
Klous et al. | Transparent access to grid resources for user software | |
Miller et al. | The tool daemon protocol (TDP) | |
Schmidt | A domain analysis of network daemon design dimensions | |
Brune et al. | Specifying resources and services in metacomputing environments | |
Geist | Advanced programming in PVM | |
Brune et al. | 26. Multi-User System Management on SCI Clusters | |
Wu et al. | Creation of web-based user interface for supercomputing environment | |
Margaris | Local Area Multicomputer (LAM-MPI) | |
Clarke et al. | Cluster Operating Systems | |
Rathore et al. | Implementing Checkpointing Algorithm in Alchemi. NET | |
Walsh et al. | TCD-CS-2000-13 Title: Review of mobility systems |