Glossary
Here we define Flux-specific and general HPC and workload management terms used in our documentation that may not be familiar to all readers.
- clique
A group of tasks belonging to a parallel program that are co-located on a node. Tasks within a clique may communicate with each other more efficiently than with tasks on different nodes.
- critical ranks
The Flux broker ranks that must remain online for Flux to continue operating.
- enclosing instance
The Flux instance that a process naturally interacts with. It is the instance referred to by the FLUX_URI environment variable, or if that is not set, it is the system instance.
- evolving
An evolving job is similar to a malleable job, except that the application, rather than the system, may initiate resource grow and shrink at runtime [1].
- expedited
A job is said to be expedited if its urgency is set to the maximum value of 31. An expedited job's priority is always set to the maximum value.
- FSD
A common string representation of time duration, defined by 23/Flux Standard Duration. Example: 2.5h.
- guest
A Flux user that is not the instance owner. Guests are only allowed to run in a Flux instance configured for multi-user support, normally a system instance.
- held
A job is said to be held if its urgency is set to zero. This prevents it from being considered for scheduling until the urgency is raised.
- hostlist
A compact string representation of a list of hostnames, defined by 29/Hostlist Format. Example: fluke[0-127,130].
- idset
A compact string representation of a set of non-negative integers, defined by 22/Idset String Representation. Example: 2,4,6,1-100.
- IMP
The Independent Minister of Privilege. The simple setuid root component of Flux, from the flux-security project, that allows an instance owner to perform a limited set of tasks on behalf of a guest user in a multi-user Flux instance.
- initial program
A user-defined program, such as a batch script, launched on the first node of a Flux instance. Its purpose is to launch and monitor a workload. Once it is complete, the instance exits.
- instance owner
The user that started the Flux instance. The instance owner has control over all aspects of the Flux instance's operation.
- job
The smallest unit of work that can be allocated resources and run by Flux. A job is typically a parallel program, but may consist of one or more singletons. A job can itself be a Flux instance, which in turn can run more jobs.
- jobspec
The JSON or YAML object representing a Flux job request, defined by 14/Canonical Job Specification (the general specification) and 25/Job Specification Version 1 (the current version). It includes the abstract resource requirements of the job and instructions for job execution.
- malleable
A malleable job requests a variable, bounded quantity of resources that the system may grow or shrink (within bounds) at runtime [1].
- moldable
A moldable job requests a variable, bounded quantity of resources that, once allocated by the system, is fixed at runtime [1].
- parallel program
A ranked group of tasks, often the same executable, launched in parallel and working together to solve a problem.
- PMI
The Process Management Interface is a quasi-standard interface for bootstrapping MPI programs. Version 1 is described in 13/Simple Process Manager Interface v1.
- priority
The order in which the scheduler considers jobs. By default, priority is derived from the urgency and submit time, but a priority plugin can be used to override this calculation.
- R
The JSON object used by Flux to represent a concrete resource set. See 20/Resource Set Specification Version 1.
- resource inventory
The concrete set of resources managed by a given Flux instance.
- rigid
A rigid job requests a fixed quantity of resources that remains fixed at runtime [1].
- scheduler
The Flux component that fulfills resource allocation requests from the resource inventory. Abstract resource requirements are extracted from the user-provided jobspec, and fulfilled with a resource set expressed as R. In addition to fitting concrete resources to abstract requests, the scheduler must balance goals such as fairness and resource utilization when it decides upon a schedule for fulfilling competing requests.
- singleton
A degenerate parallel program with only one task.
- slot
The abstract resource requirements of one task.
- step
In other workload managers, a job step is a unit of work within a job. Flux, which has a robust recursive definition of a job, does not use this term.
- system instance
A multi-user Flux instance running as the primary resource manager on a cluster. The system instance typically runs as an unprivileged system user like flux, is started by systemd(1), and allows guest users to run jobs.
- task
A process at the operating system level. A task may represent one rank of a parallel program.
- taskmap
A compact mapping between job task ranks and node IDs, defined by 34/Flux Task Map.
- TBON
Tree-based overlay network. Flux brokers are interconnected by a TBON.
- urgency
A job attribute that the user sets to indicate how urgent the work is. The range is 0 to 31, with a default value of 16. Urgency is defined by 30/Job Urgency.
- workflow
A set of related jobs that are orchestrated to accomplish a goal. In Flux, orchestration naturally maps to the initial program of a Flux instance. An example of a simple workflow is a batch job whose batch script submits a set of interdependent jobs and then waits for them to complete.
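
The compact string formats defined in this glossary (FSD, idset, and hostlist) are easier to picture once expanded. The following is a minimal Python sketch of that expansion, not the Flux implementation. The helper names parse_fsd, expand_idset, and expand_hostlist are hypothetical, the set of duration suffixes is an assumption based on a reading of 23/Flux Standard Duration, and the hostlist helper handles only a single host expression with one bracketed range, ignoring zero padding and ordering rules that the full specification may define.

```python
import re

# Seconds per unit. The suffix set is an assumption, not taken from this glossary.
_FSD_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}


def parse_fsd(text):
    """Convert a duration string such as '2.5h' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h|d)?", text.strip())
    if match is None:
        raise ValueError(f"not a recognized duration: {text!r}")
    number, unit = match.groups()
    # A bare number is treated as seconds in this sketch.
    return float(number) * _FSD_UNITS[unit or "s"]


def expand_idset(text):
    """Expand an idset such as '2,4,6,1-100' into a sorted list of integers."""
    ids = set()
    for part in text.strip("[]").split(","):
        if not part:
            continue  # tolerate an empty idset
        low, _, high = part.partition("-")
        # A single id like '4' is the range 4-4.
        ids.update(range(int(low), int(high or low) + 1))
    return sorted(ids)


def expand_hostlist(text):
    """Expand one host expression with at most one bracketed range.

    Simplification: comma-separated lists of expressions and zero-padded
    indices are not handled here.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", text)
    if match is None:
        return [text]  # a plain hostname with no bracketed range
    prefix, ranges, suffix = match.groups()
    return [f"{prefix}{i}{suffix}" for i in expand_idset(ranges)]
```

With these helpers, parse_fsd("2.5h") yields 9000.0 seconds, expand_idset("2,4,6,1-100") yields the 100 integers from 1 to 100 (the ids 2, 4, and 6 are already covered by the range), and expand_hostlist("fluke[0-127,130]") yields 129 hostnames from fluke0 through fluke127 plus fluke130.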