gflow - A lightweight, single-node job scheduler
gflow is a lightweight, single-node job scheduler written in Rust, inspired by Slurm. It is designed for efficiently managing and scheduling tasks, especially on machines with GPU resources.
Core Features
- Daemon-based Scheduling: A persistent daemon (gflowd) manages the job queue and resource allocation.
- Rich Job Submission: Supports dependencies, priorities, job arrays, and time limits via the gbatch command.
- Time Limits: Set maximum runtime for jobs (similar to Slurm's --time) to prevent runaway processes.
- Service and Job Control: Provides clear commands to inspect the scheduler state (ginfo), query the job queue (gqueue), and control job states (gcancel).
- tmux Integration: Uses tmux for robust, background task execution and session management.
- Output Logging: Automatic capture of job output to log files via tmux pipe-pane.
- Simple Command-Line Interface: Offers a user-friendly and powerful set of command-line tools.
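To make the interplay of dependencies and priorities concrete, here is a minimal Rust sketch of how a scheduler might pick the next job to start. It is not gflow's actual implementation: the `QueuedJob` type, the `next_runnable` function, the rule that a larger number means higher priority, and the tie-break on the lowest job ID are all assumptions made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::HashSet;

/// Hypothetical queue entry; field names are illustrative, not gflow's.
struct QueuedJob {
    id: u32,
    /// Assumption: a larger value means the job should run sooner.
    priority: u8,
    /// Assumption: a job waits on at most one other job.
    depends_on: Option<u32>,
}

/// Return the job that should start next, or None if every job is blocked.
fn next_runnable<'a>(
    queue: &'a [QueuedJob],
    finished: &HashSet<u32>,
) -> Option<&'a QueuedJob> {
    queue
        .iter()
        // A job is eligible if it has no dependency or its dependency finished.
        .filter(|j| j.depends_on.map_or(true, |dep| finished.contains(&dep)))
        // Highest priority wins; among equal priorities the lowest ID wins.
        .max_by_key(|j| (j.priority, Reverse(j.id)))
}
```

With this rule, a high-priority job that depends on an unfinished job is skipped until its dependency completes, so it cannot block lower-priority work that is ready to run.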
Component Overview
The gflow suite consists of several command-line tools:
- gflowd: The scheduler daemon that runs in the background, managing jobs and resources.
- ginfo: Displays scheduler and GPU information.
- gbatch: Submits jobs to the scheduler, similar to Slurm's sbatch.
- gqueue: Lists and filters jobs in the queue, similar to Slurm's squeue.
- gcancel: Cancels jobs and manages job states (internal use).
Installation
Quick Install (Linux x86_64)
Install gflow with a single command:
|
This will download and install the latest release binaries to ~/.cargo/bin. You can customize the installation directory by setting the GFLOW_INSTALL_DIR environment variable:
| GFLOW_INSTALL_DIR=/usr/local/bin
Install via cargo
This will install all the necessary binaries (gflowd, ginfo, gbatch, gqueue, gcancel, gjob).
Install via Conda
You can install gflow using Conda from the conda-forge channel:
Build Manually
- Clone the repository:
- Build the project:
  The executables will be available in the target/release/ directory.
Quick Start
- Start the scheduler daemon:
  Run this in a dedicated terminal or tmux session and leave it running. You can check its health at any time with gflowd status and inspect resources with ginfo.
- Submit a job: Create a script my_job.sh:
  #!/bin/bash
  Submit it using gbatch:
- Check the job queue:
  You can also watch the queue update live: watch gqueue.
- Stop the scheduler:
  This shuts down the daemon and cleans up the tmux session.
Usage Guide
Submitting Jobs with gbatch
gbatch provides flexible options for job submission.
- Submit a command directly:
- Set a job name and priority:
- Create a job that depends on another:
  # First job
  # Get job ID from gqueue, e.g., 123
  # Second job depends on the first
- Set a time limit for a job:
  # 30-minute limit
  # 2-hour limit (HH:MM:SS format)
  # 5 minutes 30 seconds
  See docs/TIME_LIMITS.md for detailed documentation on time limits.
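The three time-limit examples above (30 minutes, 2 hours in HH:MM:SS, and 5 minutes 30 seconds) suggest Slurm-style time strings. The Rust sketch below shows one way such strings could be turned into a duration. It assumes the Slurm conventions "MM", "MM:SS", and "HH:MM:SS"; the function name `parse_time_limit` is hypothetical, and gflow's real parser may accept other forms, so docs/TIME_LIMITS.md remains the authority.

```rust
use std::time::Duration;

/// Parse a Slurm-style time limit (assumed formats, not gflow's confirmed ones):
///   "MM"        -> minutes
///   "MM:SS"     -> minutes and seconds
///   "HH:MM:SS"  -> hours, minutes, and seconds
/// Returns None for anything else, including non-numeric fields.
fn parse_time_limit(spec: &str) -> Option<Duration> {
    // Split on ':' and require every field to parse as an unsigned integer.
    let fields = spec
        .trim()
        .split(':')
        .map(|f| f.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;

    // Checked arithmetic turns overflow into None instead of a panic.
    let secs = match fields.as_slice() {
        [m] => m.checked_mul(60)?,
        [m, s] => m.checked_mul(60)?.checked_add(*s)?,
        [h, m, s] => h
            .checked_mul(3600)?
            .checked_add(m.checked_mul(60)?)?
            .checked_add(*s)?,
        _ => return None,
    };
    Some(Duration::from_secs(secs))
}
```

Under these assumptions, `parse_time_limit("30")` yields 1800 seconds, `parse_time_limit("2:00:00")` yields 7200 seconds, and `parse_time_limit("5:30")` yields 330 seconds.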
Querying Jobs with gqueue
gqueue allows you to filter and format the job list.
- Filter by job state:
- Filter by job ID or name:
- Customize output format:
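As an illustration of how filtering by state, ID, and name can combine, here is a small Rust sketch. The `Job`, `JobState`, and `JobFilter` types and the state names are hypothetical and are not taken from gflow's source. The sketch assumes that an unset criterion matches every job and that all set criteria must hold at once.

```rust
/// Hypothetical job states; gflow's real state names may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState {
    Queued,
    Running,
    Finished,
    Failed,
}

/// Hypothetical job record with only the fields needed for filtering.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Job {
    id: u32,
    name: String,
    state: JobState,
}

/// Filter criteria; an empty list or None means "do not filter on this".
#[derive(Default)]
struct JobFilter {
    states: Vec<JobState>,
    ids: Vec<u32>,
    name: Option<String>,
}

impl JobFilter {
    /// A job matches only if it satisfies every criterion that is set.
    fn matches(&self, job: &Job) -> bool {
        (self.states.is_empty() || self.states.contains(&job.state))
            && (self.ids.is_empty() || self.ids.contains(&job.id))
            && self.name.as_deref().map_or(true, |n| n == job.name)
    }
}

/// Return references to the matching jobs, preserving queue order.
fn filter_jobs<'a>(jobs: &'a [Job], filter: &JobFilter) -> Vec<&'a Job> {
    jobs.iter().filter(|j| filter.matches(j)).collect()
}
```

Treating empty criteria as "match all" means a default filter lists the whole queue, and each added criterion only narrows the result.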
Configuration
gflowd can be customized through a configuration file. By default, this file is located at ~/.config/gflow/gflowd.toml.
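For tooling that needs to locate this file, the default path can be built from the user's home directory. The Rust sketch below shows one way to do so. The function names and the optional explicit override are assumptions for illustration; they do not describe how gflowd itself resolves its configuration.

```rust
use std::path::{Path, PathBuf};

/// Build the documented default location, ~/.config/gflow/gflowd.toml,
/// relative to the given home directory.
fn default_config_path(home: &Path) -> PathBuf {
    home.join(".config").join("gflow").join("gflowd.toml")
}

/// Hypothetical helper: prefer an explicitly supplied path,
/// otherwise fall back to the default location.
fn resolve_config_path(explicit: Option<&Path>, home: &Path) -> PathBuf {
    match explicit {
        Some(path) => path.to_path_buf(),
        None => default_config_path(home),
    }
}
```

Passing the home directory as a parameter, rather than reading an environment variable inside the function, keeps the helper easy to test.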
Contributing
If you find a bug or have a feature request, feel free to open an issue or submit a pull request.
License
gflow is licensed under the MIT License. See LICENSE for more details.