flux-shell-options(7)

DESCRIPTION

On startup, flux shell examines the jobspec for shell-specific options under the attributes.system.shell.options key. These options control shell behavior and features including I/O redirection, CPU/GPU affinity, signal handling, plugin configuration, and more.

Shell options may be set with the -o, --setopt=OPT option of flux submit and the other submission commands, or added explicitly to the jobspec by other means.

Options may be simple boolean switches (e.g., verbose) or may take arguments. Since jobspec is a JSON document, shell options can accept complex JSON objects as arguments, enabling flexible runtime configuration.

Option Format

Options specified without a value get the default value of 1:

$ flux run -o verbose myapp
$ flux run -o pty myapp

To specify a boolean value use true or false explicitly:

$ flux run -o bool-option=true myapp

Options with simple scalar arguments use = syntax:

$ flux run -o verbose=2 myapp
$ flux run -o exit-timeout=5m myapp

Options with object arguments specify JSON:

$ flux run -o 'cpu-affinity={"verbose":true}' myapp

Note: Most options documented here use simplified syntax. See individual option descriptions for object-based configuration when available.
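The mapping from -o arguments to the jobspec can be pictured with a short Python sketch. It is illustrative only: the helper names apply_setopt and build_shell_options are hypothetical, and the parsing rules (a value is used as JSON if it parses, otherwise kept as a plain string; dotted names become nested objects) are assumptions inferred from the examples in this manual, not the actual implementation in the submission commands.

```python
import json


def apply_setopt(options, opt):
    """Apply one KEY[=VALUE] string to a nested options dict (hypothetical helper)."""
    key, sep, raw = opt.partition("=")
    if not sep:
        value = 1  # no value given: the option gets the default value of 1
    else:
        try:
            value = json.loads(raw)  # true, 2, {"verbose": true}, ...
        except json.JSONDecodeError:
            value = raw  # assumption: anything else is kept as a string, e.g. "5m"
    *parents, leaf = key.split(".")
    node = options
    for name in parents:
        child = node.setdefault(name, {})
        if not isinstance(child, dict):
            raise ValueError(f"{name} is already set to a non-object value")
        node = child
    node[leaf] = value
    return options


def build_shell_options(opts):
    """Return a jobspec fragment holding the given options."""
    options = {}
    for opt in opts:
        apply_setopt(options, opt)
    return {"attributes": {"system": {"shell": {"options": options}}}}


if __name__ == "__main__":
    fragment = build_shell_options(
        ["verbose", "exit-timeout=5m", 'cpu-affinity={"verbose":true}']
    )
    print(json.dumps(fragment, indent=2))
```

The sketch rejects a dotted option whose parent was already set to a scalar; how the real tools treat that case is not covered here.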

Command-Line Convenience Options

Many shell options have corresponding command-line options in the Flux submission commands (flux-run(1), flux-submit(1), flux-batch(1), flux-alloc(1)) that provide convenient syntax for common use cases. When available, these command-line options should be preferred over setting shell options directly with -o.

For example, use:

$ flux submit --signal=SIGUSR1@60s myapp

rather than:

$ flux submit -o signal.signum=10 -o signal.timeleft=60 myapp

Throughout this manual, cross-references to command-line options indicate when a more convenient interface is available.

CORE OPTIONS

Shell Configuration

verbose[=INT]

Set the shell verbosity to INT. Higher values increase verbosity:

  • 0: Errors only (default)

  • 1: Informational messages

  • 2: Debug messages including periodic resource monitoring

$ flux run -o verbose=2 myapp

initrc=FILE

Load flux shell initrc.lua file from FILE instead of the default system initrc path ($sysconfdir/flux/shell/initrc.lua).

Warning: This completely replaces the system initrc, potentially bypassing default plugin loading. Use userrc instead to extend the system configuration.

$ flux run -o initrc=/custom/initrc.lua myapp

userrc=FILE

Load an additional initrc.lua file after the system initrc. This is the recommended way to customize shell behavior without bypassing system defaults.

For details of the initrc file format and available functions, see flux-shell-initrc(5).

$ flux run -o userrc=$HOME/.flux/shell-initrc.lua myapp

PROCESS MANAGEMENT

nosetpgrp

Disable the use of setpgrp(2) to launch each job task in its own process group.

By default, the shell places each task in its own process group to ensure signals can be delivered independently. With nosetpgrp, tasks remain in the shell's process group, meaning signals will only be delivered to direct children of the shell.

This option is rarely needed but may be useful for debugging or when working with tools that expect a specific process group structure.

stop-tasks-in-exec

Stop tasks in exec() using PTRACE_TRACEME. This causes each task to stop immediately before execve(2), allowing a debugger to attach.

Used internally by debugging tools. Users should not need to set this option directly.

oom.adjust=VALUE

Adjust each task's OOM (Out Of Memory) score to influence Linux's OOM killer behavior when system memory is critically low.

  • Value range: -1000 to 1000

  • 1000: Maximize probability of being killed (prefer killing this job)

  • 0: Default system behavior

  • -1000: Minimize probability of being killed (requires privilege)

Setting negative values typically requires CAP_SYS_RESOURCE capability.

For more information, refer to oom_score_adj in proc(5).

$ flux run -o oom.adjust=500 myapp

rlimit

A dictionary of soft process resource limits to apply before starting tasks. Resource limits are specified by lowercase name without the RLIMIT_ prefix, with integer values.

Common limits include: core, cpu, data, fsize, nofile, nproc, stack.

See setrlimit(2) for available limits and flux-submit --rlimit for command-line syntax.
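As a rough illustration, the rlimit dictionary could be assembled and checked in Python before being placed in a jobspec. The helper name make_rlimit_option is hypothetical. The check relies on Python's standard resource module, which exposes the RLIMIT_ constants available on the local platform, so the set of accepted names may differ from what the shell accepts.

```python
import resource


def make_rlimit_option(limits):
    """Validate limit names and return a dict suitable for the rlimit option.

    Hypothetical helper: names are lowercase without the RLIMIT_ prefix,
    values are integers.
    """
    checked = {}
    for name, value in limits.items():
        if name != name.lower() or not hasattr(resource, "RLIMIT_" + name.upper()):
            raise ValueError(f"unknown resource limit: {name}")
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"limit {name} must be an integer")
        checked[name] = value
    return checked


if __name__ == "__main__":
    # Disable core dumps and cap open files at 4096 (soft limits).
    print(make_rlimit_option({"core": 0, "nofile": 4096}))
```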

EXIT BEHAVIOR

exit-timeout=VALUE

When the first task in a job exits, a timer starts with duration specified by VALUE (in Flux Standard Duration format). If the timer expires before all tasks complete, a fatal exception is raised.

Valid VALUE formats:

  • FSD string: 30s, 5m, 1h

  • none: Disable timeout completely

This timeout helps detect hung jobs where some tasks fail to exit properly.
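A minimal Python sketch of how a VALUE such as 5m might be converted to seconds. It assumes the Flux Standard Duration suffixes ms, s, m, h, and d, with a bare number meaning seconds, and treats none as "no timeout". The function name parse_exit_timeout is hypothetical, and the shell's own parser may accept more forms than this.

```python
import re

# Assumed FSD suffixes and their length in seconds.
_FSD_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}
_FSD_RE = re.compile(r"^(\d+(?:\.\d*)?|\.\d+)(ms|s|m|h|d)?$")


def parse_exit_timeout(value):
    """Return the timeout in seconds, or None when the timeout is disabled."""
    if value == "none":
        return None
    match = _FSD_RE.match(value)
    if match is None:
        raise ValueError(f"invalid duration: {value!r}")
    number, unit = match.groups()
    return float(number) * _FSD_UNITS[unit or "s"]
```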

$ flux run -o exit-timeout=5m myapp
$ flux run -o exit-timeout=none myapp

exit-on-error

Raise a fatal job exception immediately if the first task to exit was signaled or exited with a nonzero status.

Without this option, the shell waits for exit-timeout to expire or all tasks to exit before exiting. With this option, the job fails fast on the first error.

Useful for parallel workflows where one task failure invalidates the entire computation.

$ flux run -o exit-on-error myapp

SIGNAL HANDLING

signal=OPTION

Configure delivery of warning signal before job time limit expiration.

This option is most easily set via flux-submit --signal, which provides a convenient syntax for common use cases. The --signal option should be preferred over setting this shell option directly.

signal.signum=NUMBER

Signal number to send before time limit expiration. Must be used with signal.timeleft.

signal.timeleft=TIME

Send signal.signum this amount of time before job expiration. TIME may be an integer number of seconds or a string in Flux Standard Duration format.
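For illustration, the object form of the signal option can be built in Python. The function name signal_option is hypothetical, and the sketch does not reproduce the parsing done by the submission commands; it only shows the two keys described above. Signal numbers are looked up on the local platform, so SIGUSR1 is 10 on typical Linux systems but may differ elsewhere.

```python
import signal


def signal_option(name, timeleft):
    """Build the value of the signal shell option (hypothetical helper).

    name:     a signal name such as "SIGUSR1"
    timeleft: integer seconds or an FSD string such as "60s"
    """
    signum = signal.Signals[name].value  # raises KeyError for unknown names
    return {"signum": signum, "timeleft": timeleft}


if __name__ == "__main__":
    print(signal_option("SIGUSR1", 60))
```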

CPU AFFINITY

cpu-affinity=OPT

Control CPU affinity binding for tasks. If unspecified, defaults to on (bind each task to all allocated cores).

OPT may be:

on

Bind each task to the full set of cores allocated to the job. This is the default and prevents tasks from being scheduled on cores outside the job allocation.

off

Disable CPU affinity completely. Tasks may float across all system cores. Useful when using external affinity tools like mpibind.

$ flux run -o cpu-affinity=off myapp

per-task

Divide allocated cores evenly among local tasks. Each task is bound to its own subset of cores. If there are more tasks than cores, tasks share cores as evenly as possible.

$ flux run -n 4 -o cpu-affinity=per-task myapp

For a 4-task job on a node with 8 cores, each task receives 2 cores.
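The even division can be sketched in Python as follows. This is one plausible way to split cores, not necessarily the exact assignment the shell computes: the function name per_task_cores is hypothetical, and both the contiguous-block layout and the sharing rule for the case of more tasks than cores are assumptions.

```python
def per_task_cores(cores, ntasks):
    """Split a list of core ids among ntasks local tasks (illustrative)."""
    if ntasks < 1 or not cores:
        raise ValueError("need at least one task and one core")
    cores = sorted(cores)
    ncores = len(cores)
    if ntasks > ncores:
        # More tasks than cores: each task gets one core, shared evenly.
        return [[cores[i * ncores // ntasks]] for i in range(ntasks)]
    base, extra = divmod(ncores, ntasks)
    result, start = [], 0
    for i in range(ntasks):
        count = base + (1 if i < extra else 0)  # first tasks absorb the remainder
        result.append(cores[start:start + count])
        start += count
    return result
```

With cores 0-7 and four tasks, the sketch assigns two cores to each task, consistent with the 8-core example above.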

map:LIST

Explicitly specify CPU binding for each task using a semicolon-delimited list. Each entry can use hwloc(7) list, bitmask, or taskset format.

See hwlocality_bitmap(3) for format details.

$ flux run -n 3 -o 'cpu-affinity=map:0-1;2-3;4-5' myapp

Task 0 binds to cores 0-1, task 1 to cores 2-3, task 2 to cores 4-5.
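A small Python sketch of how a map:LIST value in list format could be expanded into per-task core sets. It handles only the list form (comma-separated ids and ranges); bitmask and taskset forms are not covered, and the function name parse_affinity_map is hypothetical.

```python
def parse_affinity_map(opt):
    """Expand 'map:0-1;2-3;4-5' into a list of core-id lists, one per task."""
    prefix = "map:"
    if not opt.startswith(prefix):
        raise ValueError("expected a value of the form map:LIST")
    tasks = []
    for entry in opt[len(prefix):].split(";"):
        ids = []
        for part in entry.split(","):
            lo, sep, hi = part.partition("-")
            # A bare id such as "4" is a range of one element.
            ids.extend(range(int(lo), int(hi if sep else lo) + 1))
        tasks.append(ids)
    return tasks
```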

verbose

Log the cpuset assigned to the shell and each task. Must be combined with other options using comma separation, and must appear first.

$ flux run -o cpu-affinity=verbose,per-task myapp

dry-run

Print cpusets without actually applying affinity bindings. Implies verbose. Useful for testing affinity configurations. Must appear first when combined with other options.

$ flux run -o cpu-affinity=dry-run,per-task myapp

GPU AFFINITY

gpu-affinity=OPT

Control GPU device visibility via CUDA_VISIBLE_DEVICES. If unspecified, defaults to on (each task sees all GPUs allocated to the job).

OPT may be:

on

Set CUDA_VISIBLE_DEVICES to include all GPUs allocated to the job. All tasks see the same GPU set.

$ flux run -o gpu-affinity=on myapp

off

Disable the gpu-affinity plugin. CUDA_VISIBLE_DEVICES will not be set by the shell.

$ flux run -o gpu-affinity=off myapp

per-task

Divide allocated GPUs evenly among local tasks. Each task's CUDA_VISIBLE_DEVICES includes only its assigned GPUs. If there are more tasks than GPUs, tasks share GPUs as evenly as possible.

$ flux run -n 4 -o gpu-affinity=per-task myapp

map:LIST

Explicitly specify GPU assignment for each task using a semicolon-delimited list. Format is the same as cpu-affinity=map:LIST.

$ flux run -n 2 -o 'gpu-affinity=map:0;1' myapp

Task 0 sees GPU 0, task 1 sees GPU 1.

INPUT/OUTPUT

pty

Allocate a pseudo-terminal (pty) to all task ranks. Output is captured to the same location as stdout.

Equivalent to: pty.ranks=all pty.capture

$ flux run -o pty myapp

pty.ranks=OPT

Specify which task ranks should have a pty allocated. OPT may be:

  • An RFC 22 IDset (e.g., 0-3,5)

  • A single integer rank

  • The string all for all ranks

$ flux run -o pty.ranks=0-3 myapp

pty.capture

Capture pty output to the KVS alongside stdout. This is the default unless pty.interactive is set.

pty.interactive

Enable an interactive pty on rank 0, suitable for use with flux job attach(7).

By default, only rank 0 gets a pty and output is not captured. Override these defaults with additional pty options:

$ flux run -o pty.interactive -o pty.capture myapp

output.{stdout,stderr}.type=TYPE

Set output destination for stdout/stderr. TYPE may be:

kvs

Store output in the KVS (default). Retrieved via flux job attach(7).

term

Write directly to terminal (bypasses KVS).

file

Write to a file. Requires output.<stream>.path to be set.

If only output.stdout.type is set, it applies to both streams.

See also: flux-submit --output, flux-submit --error.

output.{stdout,stderr}.path=PATH

Set file path for stdout/stderr when output.<stream>.type=file.

Supports mustache templates for dynamic paths. See MUSTACHE TEMPLATES in flux-submit(1) for full documentation.

output.limit=SIZE

Limit KVS output to SIZE bytes per stream. Once exceeded, output is truncated.

  • SIZE format: number with optional SI suffix (k, K, M, G)

  • Maximum: 1G

  • Default: 10M (multi-user instance), 1G (single-user instance)

  • Ignored for file output

$ flux run -o output.limit=50M myapp

output.mode=MODE

Set file opening mode when writing output to files. MODE may be:

truncate

Overwrite existing files (default).

append

Append to existing files.

output.{stdout,stderr}.buffer.type=[none|line]

Set buffer type for stdout or stderr to line buffered or unbuffered (none). The default is line-buffered for stdout and unbuffered for stderr. See also the flux-submit --unbuffered option.

output.client.{lwm,hwm}=N

Configure flow control for output aggregation on the leader shell.

Flow control prevents unbounded memory growth when tasks produce output faster than it can be consumed. The shell uses a credit-based protocol:

  • Each shell starts with credits equal to hwm

  • When credits drop to lwm, request more from leader

  • At zero credits, output handling stops (tasks may block)

  • When credits arrive, output handling resumes

Default values: lwm=100, hwm=1000

These defaults are suitable for most cases. Adjust if experiencing output stalls or excessive memory use.
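The credit scheme described above can be modeled with a toy Python class. This is a simplified, single-process model under stated assumptions, not the shell's implementation: the class and method names are hypothetical, one credit is assumed to cover one unit of output, and the leader is assumed to top credits back up to hwm.

```python
class OutputClient:
    """Toy model of a shell sending output to the leader under credit control."""

    def __init__(self, lwm=100, hwm=1000):
        if not 0 <= lwm < hwm:
            raise ValueError("require 0 <= lwm < hwm")
        self.lwm = lwm
        self.hwm = hwm
        self.credits = hwm          # each shell starts with hwm credits
        self.request_pending = False

    def send(self):
        """Try to send one unit of output; return False if stalled."""
        if self.credits == 0:
            return False            # no credits: output handling stops
        self.credits -= 1
        if self.credits <= self.lwm and not self.request_pending:
            self.request_pending = True   # ask the leader for more credits
        return True

    def grant(self):
        """Leader response: refill to hwm (assumed) and resume output."""
        self.credits = self.hwm
        self.request_pending = False


if __name__ == "__main__":
    client = OutputClient(lwm=2, hwm=5)
    sent = 0
    while client.send():
        sent += 1
    print(sent, client.request_pending)
    client.grant()
    print(client.send())
```

In this model a larger gap between lwm and hwm gives the leader more time to respond before a shell stalls, while a larger hwm allows more output to be in flight at once.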

$ flux run -o output.client.lwm=50 -o output.client.hwm=500 myapp

input.stdin.type=TYPE

Set input source for stdin. TYPE may be:

service

Read stdin from the shell's stdin service (default for interactive jobs).

file

Read stdin from a file.

See also: flux-submit --input.

$ flux run --input=/tmp/input.data myapp

input.limit=SIZE

Limit KVS input to SIZE bytes. If exceeded, a fatal job exception is raised.

  • SIZE format: number with optional SI suffix (k, K, M, G)

  • Maximum: 32M

  • Default: 10M

  • Ignored for file input

$ flux run -o input.limit=1M myapp

output.batch-timeout=FSD

Set the KVS output batch-timeout to a time period in Flux Standard Duration. This is the period over which the leader shell collects entries destined for the output eventlog before committing them to the KVS. A longer period puts less load on the KVS, while a shorter period makes output appear in flux job attach sooner after tasks write it. The default is 0.5s.

$ flux run -o output.batch-timeout=0.01 myapp

TASK MAPPING

taskmap

Request custom task-to-node mapping. This option is an object with required key scheme and optional key value.

The shell invokes taskmap.scheme plugin callbacks to generate the mapping. If value is set, it's passed to the plugin.

Built-in schemes: block, cyclic, hostfile, manual

See flux-submit --taskmap for command-line syntax and flux-shell-plugins(7) for implementing custom taskmap plugins.

$ flux submit --taskmap=cyclic myapp
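To illustrate the difference between two of the built-in schemes, here is a Python sketch that assumes the conventional meanings of block (consecutive task ranks fill each node in turn) and cyclic (task ranks are dealt to nodes round-robin). The function names are hypothetical, and the sketch assumes the same number of tasks on every node.

```python
def block_map(nnodes, tasks_per_node):
    """Return, per node, the task ranks assigned under a block mapping."""
    return [
        list(range(node * tasks_per_node, (node + 1) * tasks_per_node))
        for node in range(nnodes)
    ]


def cyclic_map(nnodes, tasks_per_node):
    """Return, per node, the task ranks assigned under a cyclic mapping."""
    ntasks = nnodes * tasks_per_node
    return [list(range(node, ntasks, nnodes)) for node in range(nnodes)]
```

For two nodes with two tasks each, the block sketch places ranks 0 and 1 on the first node, while the cyclic sketch places ranks 0 and 2 there.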

PMI CONFIGURATION

pmi=LIST

Specify comma-separated list of PMI implementations to enable. Default: simple

Available implementations:

simple (aliases: pmi1, pmi2)

The simple PMI-1 wire protocol. Passes an open file descriptor via PMI_FD. Required for Flux's libpmi.so and libpmi2.so. Preferred when Flux launches Flux (e.g., flux-batch(1)).

off

Disable PMI completely.

cray-pals

External plugin from flux-coral2.

pmix

Provided via an external plugin from the flux-pmix project.

pmi-simple.nomap

Skip pre-populating flux.taskmap and PMI_process_mapping keys in the simple PMI implementation.

Reduces PMI setup overhead when these keys are not needed.

pmi-simple.exchange.k=N

Configure PMI key exchange to use a virtual tree with fanout N. Default: 2

Higher fanout reduces exchange time but increases message size and leader load.
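The effect of the fanout can be seen with a Python sketch of a k-ary tree over shell ranks. The rank layout used here (rank 0 as root, children of rank r at k*r+1 through k*r+k) is an assumption for illustration and may not match the layout the shell uses; the function names are hypothetical.

```python
def tree_children(rank, k, size):
    """Ranks of the children of 'rank' in a k-ary tree of 'size' ranks."""
    first = k * rank + 1
    return list(range(first, min(first + k, size)))


def tree_parent(rank, k):
    """Rank of the parent, or None for the root."""
    return None if rank == 0 else (rank - 1) // k


def tree_depth(k, size):
    """Number of hops from the deepest rank up to the root."""
    depth, rank = 0, size - 1
    while rank > 0:
        rank = (rank - 1) // k
        depth += 1
    return depth
```

Under this layout, 1024 shells form a tree of depth 10 with a fanout of 2 and depth 3 with a fanout of 16, at the cost of each parent handling up to 16 children.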

STAGE-IN

The stage-in feature copies files from archived content into the job's temporary directory before task execution. Files must be previously archived using flux-archive(1).

stage-in

Enable stage-in. Copy files previously archived with flux-archive(1) to the directory referenced by FLUX_JOB_TMPDIR.

$ flux run -o stage-in myapp

stage-in.names=LIST

Comma-separated list of archive names to extract. If no names are specified, main is assumed.

$ flux run -o stage-in.names=main,data myapp

stage-in.pattern=PATTERN

Filter extracted files using a glob(7) pattern.

$ flux run -o 'stage-in.pattern=*.dat' myapp

stage-in.destination=[SCOPE:]PATH

Extract to PATH instead of FLUX_JOB_TMPDIR.

SCOPE may be:

local

Local file system (default). Extraction occurs on all nodes.

global

Global file system. Extraction occurs only on first node.

Warning

When using custom destinations, you must handle cleanup yourself. FLUX_JOB_TMPDIR is automatically cleaned up and guaranteed to be unique.

$ flux run -o 'stage-in.destination=global:/scratch/job-data' myapp
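A short Python sketch of how a [SCOPE:]PATH value could be split. The function name parse_destination is hypothetical, and treating only the literal prefixes local: and global: as scopes is an assumption.

```python
def parse_destination(value):
    """Split '[SCOPE:]PATH' into (scope, path); scope defaults to 'local'."""
    scope, sep, path = value.partition(":")
    if sep and scope in ("local", "global"):
        return scope, path
    return "local", value
```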

HWLOC CONFIGURATION

hwloc.xmlfile

Export the job shell's hwloc XML topology to a file and set HWLOC_XMLFILE for all tasks.

This allows tasks to query the hardware topology without scanning the system directly. Also unsets HWLOC_COMPONENTS which may interfere with HWLOC_XMLFILE.

$ flux run -o hwloc.xmlfile myapp

hwloc.restrict

Restrict the exported hwloc XML to only resources allocated to the job.

Must be used with hwloc.xmlfile. Without this option, the full node topology is exported.

$ flux run -o hwloc.xmlfile -o hwloc.restrict myapp

MONITORING

sysmon

Enable system resource monitoring. Logs peak memory usage and CPU load average for each shell at job completion.

With verbose=2, also traces memory and CPU usage periodically during execution.

$ flux run -o sysmon myapp

sysmon.period=FSD

Set monitoring sample period in Flux Standard Duration format. Default: follows Flux heartbeat (typically 2 seconds)

$ flux run -o sysmon -o sysmon.period=5s myapp

REXEC OPTIONS

rexec-shutdown-timeout=FSD

Set timeout for processes launched via the rexec server to exit gracefully after all tasks complete.

When all tasks exit, rexec processes receive SIGTERM. If they don't exit within this timeout, they receive SIGKILL.

Default: 60s
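The SIGTERM-then-SIGKILL sequence is a common pattern. The Python sketch below shows the general idea with the standard subprocess module; it is not the rexec server's code, and the function name shutdown is hypothetical.

```python
import subprocess
import sys


def shutdown(proc, timeout):
    """Send SIGTERM, wait up to 'timeout' seconds, then fall back to SIGKILL."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(shutdown(child, timeout=5))
```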

$ flux run -o rexec-shutdown-timeout=30s myapp

rexec.sign-required

Require RFC 42 request signatures on all requests to the rexec subprocess server. By default, signatures are only required when the job owner differs from the instance owner (i.e., the shell is running as a guest user). This option forces signature verification in all cases, which is primarily useful for testing the signing code paths in a single-user Flux instance.

$ flux submit -o rexec.sign-required myapp

EXTERNAL PLUGIN OPTIONS

External job shell plugins may define additional options. Refer to plugin-specific documentation for details. Plugin options typically use a namespace prefix matching the plugin name (e.g., myplugin.enabled).

See flux-shell-plugins(7) for plugin development and configuration details.

RESOURCES

Flux: http://flux-framework.org

Flux RFC: https://flux-framework.readthedocs.io/projects/flux-rfc

Issue Tracker: https://github.com/flux-framework/flux-core/issues

SEE ALSO

flux-shell(1), flux-shell-initrc(5), flux-shell-plugins(7), flux-run(1), flux-submit(1), flux-batch(1)