Limits
Overview
The multi-factor priority plugin in flux-accounting not only performs validation for associations and priority calculation for jobs; it can also administer and enforce limits on a per-association and per-queue basis. Limits are another way to ensure fair behavior of systems that see a lot of simultaneous activity from a large number of users.
Hard Limits vs. Soft Limits
There are two different kinds of limits in flux-accounting: hard limits and soft limits. Hard limits will prevent an association from submitting a job altogether and will report a message as to why the job cannot proceed past validation. Soft limits will allow a job to be submitted but will prevent it from running until a prerequisite has been met.
There is just one hard limit in flux-accounting:
- (per-association) max active jobs
The max number of active jobs an association can have at any given time.
The soft limits in flux-accounting are composed of:
- (per-association) max running jobs
The max number of running jobs an association can have at any given time.
- (per-association) max resources
The max number of resources (total cores + total nodes) an association can have across their running jobs at any given time.
- (per-queue) max running jobs
The max number of running jobs an association can have in a given queue at any given time.
- (per-queue) max nodes
The max number of nodes an association can have across their running jobs in a given queue at any given time.
Note
For more details on the difference between an active job and a running job, see the virtual states section of RFC 21.
An example
The difference between hard and soft limits might be best described by example. Let's configure an association to have a limit configuration of at most 1 running job and 2 active jobs:
$ flux account add-user --username=buster --bank=giants --max-running-jobs=1 --max-active-jobs=2
buster can submit a job and the priority plugin will generate a priority
for this job and pass it on to the scheduler to begin running. If buster
submits a second job while the first one is running, this second job will be
held until the first job completes. Specifically, a dependency will be added
to the job to hold it in the DEPEND state until the first job has
transitioned to the INACTIVE state. The name of the dependency can be found
in the job's eventlog:
$ flux job eventlog JOBID
dependency-add description="max-running-jobs-user-limit"
After the first job has transitioned to INACTIVE, the dependency will be
removed and the job can proceed to have its priority calculated and move on to
the scheduler to be run:
dependency-remove description="max-running-jobs-user-limit"
depend
priority priority=50000
alloc annotations={"sched":{"resource_summary":"rank0/core[0-1]"}}
If buster submits a third job while the first job is still running and
the second job is waiting in DEPEND, it will be rejected due to the
association's max active jobs limit:
$ flux submit my_job
flux-job: user has max active jobs
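The sequence above can be modeled in a few lines of Python. This is a minimal sketch, not the plugin's implementation: the Association class and its submit and finish methods are hypothetical names invented for illustration, and only the limit semantics (reject at max active jobs, hold at max running jobs) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    """Hypothetical model of one association's limits and job states."""
    max_running_jobs: int
    max_active_jobs: int
    running: list = field(default_factory=list)  # jobs in RUN
    held: list = field(default_factory=list)     # jobs held in DEPEND

    def submit(self, job_id):
        # Active jobs are modeled here as running jobs plus held jobs.
        active = len(self.running) + len(self.held)
        if active >= self.max_active_jobs:
            # Hard limit: the job is rejected at validation.
            return "rejected"
        if len(self.running) >= self.max_running_jobs:
            # Soft limit: the job is accepted but held.
            self.held.append(job_id)
            return "held"
        self.running.append(job_id)
        return "running"

    def finish(self, job_id):
        # A running job became INACTIVE; release held jobs that now fit.
        self.running.remove(job_id)
        while self.held and len(self.running) < self.max_running_jobs:
            self.running.append(self.held.pop(0))


buster = Association(max_running_jobs=1, max_active_jobs=2)
results = [buster.submit(j) for j in ("job1", "job2", "job3")]
print(results)          # outcome of the three submissions
buster.finish("job1")
print(buster.running)   # jobs running after job1 completes
```

The third submission is refused because two jobs are already active, and the second job starts only once the first one finishes.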
These limits can be configured and modified after an association or a queue
has been created in flux-accounting with the edit-user and edit-queue
commands. After modifying limits for either an association or a queue, be sure
to update the priority plugin with the new data written to the flux-accounting
database:
$ flux account-priority-update
How flux-accounting dependencies are removed from a job
When one of an association's running jobs finishes and transitions to the
INACTIVE state, the association's held jobs (if any) are checked one at a
time to see whether each now meets the requirements to have its dependencies
removed and transition to RUN. For each held job, the plugin grabs the job,
its attributes, and its dependencies, and ensures that the job would not:
- Put the association over the max running jobs limit for the queue the job is submitted in.
- Put the association over the max nodes limit for the queue the job is submitted in.
- Put the association over their max running jobs limit regardless of queue.
- Put the association over their max resources limit regardless of queue.
The associated dependency is removed from the job as each requirement is met. In other words, a job can have a dependency removed from it while still possessing one or more other dependencies.
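The per-requirement behavior can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the function name, the dictionary keys, and the dependency labels are all hypothetical, and the max resources check is modeled as separate node and core limits.

```python
def release_satisfied(job, usage, limits):
    """Drop each dependency whose limit the job would no longer exceed.

    All keys and dependency labels are hypothetical, chosen for this sketch.
    Returns True when no dependencies remain and the job may proceed.
    """
    checks = {
        "queue-max-running-jobs":
            usage["queue_running_jobs"] + 1 <= limits["queue_max_running_jobs"],
        "queue-max-nodes":
            usage["queue_nodes"] + job["nodes"] <= limits["queue_max_nodes"],
        "assoc-max-running-jobs":
            usage["running_jobs"] + 1 <= limits["max_running_jobs"],
        "assoc-max-resources":
            usage["nodes"] + job["nodes"] <= limits["max_nodes"]
            and usage["cores"] + job["cores"] <= limits["max_cores"],
    }
    for name, satisfied in checks.items():
        if satisfied:
            # Removing a dependency the job does not hold is a no-op.
            job["dependencies"].discard(name)
    return not job["dependencies"]


job = {"nodes": 1, "cores": 2,
       "dependencies": {"queue-max-running-jobs", "assoc-max-running-jobs"}}
limits = {"queue_max_running_jobs": 100, "queue_max_nodes": 1,
          "max_running_jobs": 1, "max_nodes": 10, "max_cores": 10}
usage = {"queue_running_jobs": 0, "queue_nodes": 0,
         "running_jobs": 1, "nodes": 1, "cores": 2}

first = release_satisfied(job, usage, limits)
remaining = set(job["dependencies"])   # one dependency is still in place
usage["running_jobs"] = 0              # the association's running job ends
second = release_satisfied(job, usage, limits)
```

After the first call the queue-level dependency is gone but the association-level one remains, so the job stays held until the second call.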
FAQ
My job is held with a flux-accounting dependency. How can I figure out why it's held?
The first thing to check is the properties associated with the dependency
placed on the job. For example, if the job has a max-run-jobs-queue
dependency, look to see:
how many jobs the association already has running in that queue:
$ flux jobs --queue=debug --filter=RUN --user=buster
what the max_running_jobs limit is for that queue:
$ flux account view-queue debug --parsable
queue | min_nodes_per_job | max_nodes_per_job | max_time_per_job | priority | max_running_jobs | max_nodes_per_assoc
-------+-------------------+-------------------+------------------+----------+------------------+--------------------
debug | 1 | 1 | 60 | 0 | 100 | 1
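If you want to compare these values in a script, the table can be split into fields. The sketch below assumes the output has exactly the pipe-delimited shape shown above, with a header row and a dashed separator row. parse_parsable_table is a hypothetical helper, not part of flux-accounting.

```python
def parse_parsable_table(text):
    """Parse a pipe-delimited table with a dashed separator row into dicts."""
    rows = []
    header = None
    for line in text.strip().splitlines():
        # Skip the separator row made only of dashes and plus signs.
        if set(line.strip()) <= set("-+"):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Sample text copied from the view-queue output shown above.
sample = """\
queue | min_nodes_per_job | max_nodes_per_job | max_time_per_job | priority | max_running_jobs | max_nodes_per_assoc
-------+-------------------+-------------------+------------------+----------+------------------+--------------------
debug | 1 | 1 | 60 | 0 | 100 | 1
"""
queue = parse_parsable_table(sample)[0]
print(queue["queue"], queue["max_running_jobs"])
```

All values are returned as strings, so convert them with int() before comparing against a job count.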
If more information is required, you may also want to check the limits configured for the association:
$ flux account view-user buster
username | userid | max_running_jobs | max_active_jobs | max_nodes | max_cores | queues
---------+--------+------------------+-----------------+-----------+-----------+--------
buster | 28 | 5 | 7 | unlimited | unlimited | debug
Administrators may also be interested in looking at what information the
priority plugin currently has for each association, queue, and project with
flux jobtap query mf_priority.so. This will return what limits and held
jobs each association has according to the priority plugin, organized by
user ID.
Do attribute changes to a queue or association take effect immediately?
No. If the limits configured for a particular queue or association do not
fit your needs, you can change them, but the new limits must be pushed to
the priority plugin with flux account-priority-update before they take
effect. When the
plugin is updated with the new limits, the held jobs for every association are
reanalyzed to see if they now fit the requirements to be released.
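The separation between the database and the plugin can be pictured with a toy model. This is only a sketch: the two dictionaries and the push_update function are hypothetical stand-ins for the flux-accounting database, the plugin's in-memory data, and the update step, and only the max running jobs limit is modeled.

```python
# Hypothetical stand-ins: limits stored in the database, and the copy of
# those limits (plus job state) that the priority plugin works from.
database = {"buster": {"max_running_jobs": 1}}
plugin = {"buster": {"max_running_jobs": 1,
                     "running": ["job1"], "held": ["job2"]}}


def push_update(database, plugin):
    """Copy limits into the plugin, then recheck every association's held jobs."""
    released = []
    for name, row in database.items():
        assoc = plugin[name]
        assoc["max_running_jobs"] = row["max_running_jobs"]
        while assoc["held"] and len(assoc["running"]) < assoc["max_running_jobs"]:
            job = assoc["held"].pop(0)
            assoc["running"].append(job)
            released.append(job)
    return released


# Editing the database alone changes nothing the plugin can see.
database["buster"]["max_running_jobs"] = 2
held_before_push = list(plugin["buster"]["held"])

# Pushing the update applies the new limit and releases the held job.
released = push_update(database, plugin)
print(held_before_push, released)
```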