======================
flux-jobtap-plugins(7)
======================


DESCRIPTION
===========

The *jobtap* interface supports loading of builtin and external
plugins into the job manager broker module. These plugins can be used
to assign job priorities using algorithms other than the default,
assign job dependencies, aid in debugging of the flow of job states,
or generically extend the functionality of the job manager.

Jobtap plugins are defined using the Flux standard plugin format. Therefore
a jobtap plugin should export the single symbol: ``flux_plugin_init()``,
from which calls to ``flux_plugin_add_handler(3)`` should be used to
register functions which will be called for the callback topic strings
described in the :ref:`callback_topics` section below.

Each callback function uses the Flux standard plugin callback form, e.g.::

   int callback (flux_plugin_t *p,
                 const char *topic,
                 flux_plugin_arg_t *args,
                 void *arg);

where ``p`` is the handle for the current *jobtap* plugin, ``topic`` is
the *topic string* for the currently invoked callback, ``args`` contains
a set of plugin arguments which may be unpacked with the
``flux_plugin_arg_unpack(3)`` call, and ``arg`` is any opaque argument
passed along when registering the handler.

Multiple plugins may be loaded in the job-manager simultaneously. In this
case, all matching handlers are called in all loaded plugins in the order
in which they were loaded.

For more information about loading plugins see the
:man5:`flux-conf-job-manager` or :man1:`flux-jobtap` manpage.


JOBTAP PLUGIN NAMES
===================

Jobtap plugins are loaded into the job-manager and referenced in the output
of ``flux jobtap list`` by file name. If a plugin is loaded by a fully
qualified path, the plugin name is shortened to the basename, such that
all dynamically loaded plugins have names such as ``plugin-name.so``.

Builtin plugins, on the other hand, are named with a leading ``.``. Like
hidden files in a filesystem, they are hidden in ``flux jobtap list`` output
and do not match the :linux:man7:`glob` ``*`` or the "all" keyword. To list
builtin plugins, use the ``-a, --all`` option to ``flux jobtap list``, and
to remove them use the name explicitly or include the leading ``.`` in any
pattern.

A plugin may optionally assign a name with ``flux_plugin_set_name(3)``;
however, this name is not displayed in ``flux jobtap list`` or used in
matching. The internal plugin name is only used as part of the service name
generated by ``flux_jobtap_service_register()``, i.e. the service name will
be ``job-manager.<name>.<method>``. If a plugin does not set a name with
``flux_plugin_set_name(3)``, then the basename of the plugin file will be
used with the trailing ``.so`` removed.

.. _arguments:

JOBTAP PLUGIN ARGUMENTS
=======================

For job-specific callbacks, all job data is passed to the plugin via
the ``flux_plugin_arg_t *args``, and return data is sent back to the
job manager via the same ``args``. Incoming arguments may be unpacked
using ``flux_plugin_arg_unpack(3)``, e.g.::

   rc = flux_plugin_arg_unpack (args,
                                FLUX_PLUGIN_ARG_IN,
                                "{s:{s:o}, s:I}",
                                "jobspec", "resources", &resources,
                                "id", &id);

will unpack the ``resources`` section of jobspec and the jobid into
``resources`` and ``id`` respectively.

The full list of available args includes the following:

==========  ====  ==============================================================
name        type  description
==========  ====  ==============================================================
jobspec     o     jobspec with environment redacted
R           o     R with scheduling key redacted (RUN state or later)
id          I     jobid
state       i     current job state
prev_state  i     previous state (``job.state.*`` callbacks)
userid      i     userid
urgency     i     current urgency
priority    I     current priority
t_submit    f     submit timestamp in floating point seconds
entry       o     posted eventlog entry, including context
end_event   o     copy of event that caused transition to CLEANUP, if available
==========  ====  ==============================================================

Return arguments can be packed using the ``FLUX_PLUGIN_ARG_OUT`` and
optionally ``FLUX_PLUGIN_ARG_REPLACE`` flags. For example to return
a priority::

   rc = flux_plugin_arg_pack (args,
                              FLUX_PLUGIN_ARG_OUT,
                              "{s:I}",
                              "priority", (int64_t) priority);

While a job is pending, *jobtap* plugin callbacks may also add job
annotations by returning a value for the ``annotations`` key::

   flux_plugin_arg_pack (args,
                         FLUX_PLUGIN_ARG_OUT,
                         "{s:{s:s}}",
                         "annotations", "test", value);

.. _callback_topics:

JOB CALLBACK TOPICS
===================

The following job callback "topic strings" are currently provided by the
*jobtap* interface:

job.create
  The ``job.create`` topic notifies a jobtap plugin about a newly introduced
  job. This call may be made in three different situations:

  1. on job submission
  2. when the job manager is restarted and has reloaded a job from the KVS
  3. when a new jobtap plugin is loaded

  In case 1 above, the job state will always be ``FLUX_JOB_STATE_NEW``, while
  jobs in cases 2 and 3 can be in any state except ``FLUX_JOB_STATE_INACTIVE``.

  In case 1, the job is not yet validated. If necessary, ``job.create`` may
  reject the job in the same manner as ``job.validate`` using
  :man3:`flux_jobtap_reject_job` and a negative return code from the callback.

  In cases 2 and 3, fatal errors may be handled by raising a fatal job
  exception, as usual.

  It is safe to post events from a ``job.create`` handler in all cases.

  .. note::
     In case 3 ``job.create`` is called for active jobs in unspecified
     order. If a plugin requires an ordering guarantee, the plugin should
     call ``flux_jobtap_set_load_sort_order(3)`` from the
     ``flux_plugin_init()`` callback. This function takes a ``mode``
     parameter of either ``state``, to sort jobs by state (then jobid),
     or ``-state`` to sort by reverse state (then jobid). For example ::

        flux_jobtap_set_load_sort_order (p, "state");

     will ensure that ``job.create`` and ``job.new`` are called on jobs in
     PRIORITY first, then DEPEND, then SCHED, and so on.

job.destroy
  The ``job.destroy`` topic is called after a job is rejected or becomes
  inactive.

job.validate
  The ``job.validate`` topic allows a plugin to reject a job before
  it is introduced to the job manager. A rejected job will result in a
  job submission error in the submitting client, and any job data in the
  KVS will be purged. No further callbacks except ``job.destroy`` will be
  made for rejected jobs.

  Note: If a job is not rejected, then the ``job.new`` callback will be
  invoked immediately after ``job.validate``. This allows limits or other
  checks to be implemented in the ``job.validate`` callback, but accounting
  for those limits should be confined to the ``job.new`` callback, since
  ``job.new`` may also be called during job-manager restart or plugin reload.

job.dependency.*
  The ``job.dependency.*`` topic allows a dependency plugin to notify the
  job-manager that it handles a given dependency *scheme*. The job-manager
  will scan the ``attributes.system.dependencies`` array, if provided, and
  issue a ``job.dependency.SCHEME`` callback for each listed dependency.
  If no plugin has registered for ``SCHEME``, then the job is rejected.
  The plugin should then call ``flux_jobtap_dependency_add(3)`` to add a
  new named dependency to the job (if necessary). Jobs with dependencies
  will remain in the ``DEPEND`` state until all dependencies are removed
  with a corresponding call to ``flux_jobtap_dependency_remove(3)``. See
  ``job.state.depend`` below for more information about dependencies.

  If there is an error in the dependency specification, the job may be
  rejected with :man3:`flux_jobtap_reject_job` and a negative return code
  from the callback.

job.new
  The ``job.new`` topic announces a new valid job. It may be called in
  the same three situations listed for ``job.create``.

job.state.*
  The ``job.state.*`` callbacks are made just after a job state transition.
  The callback is made after the state has been published to the job's
  eventlog, but before any action has been taken on that state (since the
  action could involve immediately transitioning to a new state).

job.event.*
  The ``job.event.*`` callbacks are only made for plugins that have
  explicitly subscribed to a job with ``flux_jobtap_job_subscribe()``.
  In this case, all job events result in this callback being invoked
  on all subscribed plugins. This may be useful for plugins to get
  notification of events that do not necessarily result in a state
  transition, e.g. the ``start`` event or a non-fatal ``exception``.

job.state.depend
  The callback for ``FLUX_JOB_STATE_DEPEND`` is the final place from which
  a plugin may add dependencies to a job. Dependencies are added via the
  ``flux_jobtap_dependency_add()`` function. This function allows a named
  dependency to be attached to a job. Jobs with dependencies will remain
  in the ``DEPEND`` state until all dependencies are removed with a
  corresponding call to ``flux_jobtap_dependency_remove()``. A dependency
  may only be used once. A second call to ``flux_jobtap_dependency_add()``
  with the same dependency description will return ``EEXIST``, even if the
  dependency was subsequently removed. (This allows idempotent operation
  of plugin-managed dependencies for job-manager or plugin restart).

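
The "used only once" rule for dependencies can be illustrated with a small
standalone model. This is a toy sketch only: it does not call flux-core, and
every name in it (``struct model_job``, ``model_dependency_add()`` and so on)
is hypothetical. The ``ENOENT`` and ``ENOSPC`` errors are assumptions made
for the model; only the ``EEXIST`` behavior comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_DEPS 16

/* Hypothetical bookkeeping for one job; not a flux-core type. */
struct model_dep {
    const char *description;    /* caller keeps this string alive */
    int active;                 /* 1 until removed, then 0 */
};

struct model_job {
    struct model_dep deps[MODEL_MAX_DEPS];
    size_t count;               /* descriptions ever added, active or not */
};

static struct model_dep *model_dep_find (struct model_job *job,
                                         const char *description)
{
    for (size_t i = 0; i < job->count; i++) {
        if (strcmp (job->deps[i].description, description) == 0)
            return &job->deps[i];
    }
    return NULL;
}

/* A description that was ever added is refused with EEXIST,
 * even if it has since been removed. */
int model_dependency_add (struct model_job *job, const char *description)
{
    assert (job != NULL && description != NULL);
    if (model_dep_find (job, description)) {
        errno = EEXIST;
        return -1;
    }
    if (job->count == MODEL_MAX_DEPS) {
        errno = ENOSPC;         /* this limit exists only in the model */
        return -1;
    }
    job->deps[job->count].description = description;
    job->deps[job->count].active = 1;
    job->count++;
    return 0;
}

/* Removal clears the active flag but keeps the description on record.
 * ENOENT for an unknown or already removed name is an assumption. */
int model_dependency_remove (struct model_job *job, const char *description)
{
    assert (job != NULL && description != NULL);
    struct model_dep *dep = model_dep_find (job, description);
    if (!dep || !dep->active) {
        errno = ENOENT;
        return -1;
    }
    dep->active = 0;
    return 0;
}

/* The modeled job may leave DEPEND only when this returns 0. */
size_t model_dependencies_pending (const struct model_job *job)
{
    size_t pending = 0;
    for (size_t i = 0; i < job->count; i++)
        pending += (size_t) job->deps[i].active;
    return pending;
}
```

In this model, a plugin that replays its dependency setup after a restart
sees ``EEXIST`` for a dependency that was already added (and possibly already
removed), so it can treat that error as "nothing to do" instead of blocking
the job a second time.
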

job.state.priority
  The callback for ``FLUX_JOB_STATE_PRIORITY`` is special, in that a plugin
  must return a priority at the end of the callback (if the plugin is a
  priority-managing plugin). If the job priority is not available, the
  plugin should use ``flux_jobtap_priority_unavail()`` to indicate that
  the priority cannot be set. Jobs that do not have a priority due to
  unavailable priority or when no current priority plugin is loaded will
  remain in the PRIORITY state until a priority is assigned. Therefore,
  a plugin should arrange for the priority to be set asynchronously using
  ``flux_jobtap_reprioritize_job()``. See the :ref:`priority` section for
  more detailed information about plugin management of job priority.

job.state.sched
  In the callback for ``FLUX_JOB_STATE_SCHED`` a plugin may set ``R`` in
  output args. In this case, if an ``R`` is not already assigned, then
  this will force ``R`` for the current job and bypass the scheduler.

job.priority.get
  The job manager calls the ``job.priority.get`` topic whenever it wants
  to update the job priority of a single job. The plugin should return a
  priority immediately, but if one is not available when a job is in the
  PRIORITY state, the plugin may use ``flux_jobtap_priority_unavail()``
  to indicate the priority is not available. Returning an unavailable
  priority in the SCHED state is an error and it will be logged, but
  otherwise ignored.

  A call of ``job.priority.get`` can be requested for all jobs by calling
  ``flux_jobtap_reprioritize_all()``. See the :ref:`priority` section for
  more information about plugin management of job priority.

job.inactive-add
  The job has transitioned to INACTIVE state and has been added to the
  inactive hash.

job.inactive-remove
  The job has been purged from the inactive hash.

job.update
  The job has been updated with an RFC 21 ``jobspec-update`` event.


CONFIGURATION CALLBACK TOPIC
============================

Jobtap plugins may register a ``conf.update`` callback.
The current/proposed configuration object is present in the input arguments
under the ``conf`` key. The callback is invoked in the following
circumstances:

- When the plugin is first loaded. If the callback returns failure,
  the plugin load fails.
- Each time the configuration changes. If the callback returns failure,
  ``flux config reload`` fails.

The callback should return 0 on success, and -1 on failure. On failure,
it may optionally set a human readable error string in the ``errstr``
output argument. The ``flux_jobtap_error()`` convenience function may be
useful here.


JOB UPDATE CALLBACKS
====================

The job manager allows updates of select job attributes through a
plugin-based scheme. Plugins may register a callback topic matching
``job.update.KEY``, where ``KEY`` is a period-delimited jobspec attribute,
e.g. ``job.update.attributes.system.duration``. The requested updates are
passed as an additional argument to the plugin in the ``updates`` key.

The purpose of ``job.update.*`` callbacks is to enable plugins to allow or
deny the update of specific job attributes. Updates are denied by default
unless a callback exists for the updated attribute and the plugin returns
0 from the callback. Plugins deny an attribute update by returning -1 from
the callback, and may optionally set an error message to return to the user
with ``flux_jobtap_error(3)``.

After all updates in a request are allowed by plugins, the updated jobspec
is passed through the ``job.validate`` plugin stack to ensure the result is
valid. Plugins can note that an update is already validated by setting a
``validated`` flag in the ``FLUX_PLUGIN_ARG_OUT`` arguments. If all updated
attributes have this flag, then this validation step is skipped. This can
be useful, for example, to allow an instance owner to update a job attribute
beyond limits.

Some updates may benefit from a job feasibility check before the updates
are applied.
This prevents a user from inadvertently causing a job that was feasible at
the time of submission to become infeasible through an update. Because the
update plugin is in the best position to determine if a feasibility check
should be completed for an update, feasibility checks are only done if a
``feasibility`` flag is set in the ``FLUX_PLUGIN_ARG_OUT`` arguments. If any
plugin for a set of updates requires a feasibility check, then feasibility
of the updated jobspec as a whole will be checked. If the updated job is
determined to be infeasible, then the update is aborted and an error
returned to the user.

The update of one attribute may require modification of other attributes.
For example, an update of ``attributes.system.queue`` may require
modification of ``attributes.system.constraints`` to apply the constraints
of the new queue. To support this use case, plugins may additionally push
an ``updates`` object onto the ``FLUX_PLUGIN_ARG_OUT`` arguments. This
object has the same form as the ``jobspec-update`` context defined in
RFC 21. For example, if a plugin wishes to update ``attributes.system.foo``
to 1, it can set ::

   {"updates": {"attributes.system.foo": 1}}

in the ``FLUX_PLUGIN_ARG_OUT`` arguments before returning. Plugin-supplied
updates are merged into the set of requested updates, so this method can
overwrite other user-requested updates and caution is advised.


PLUGIN CALLBACK TOPICS
======================

plugin.query
  The job manager calls the ``plugin.query`` callback topic to give a plugin
  the opportunity to provide extra data in response to a ``jobtap-query``
  request (as used by the ``flux jobtap query PLUGIN`` command). This can be
  used by a plugin to export internal plugin state for inspection by an
  admin or user by placing the data in the output arguments of the callback,
  e.g.::

    flux_plugin_arg_pack (args,
                          FLUX_PLUGIN_ARG_OUT,
                          "{s:O}",
                          "data", internal_data);

.. _priority:

PRIORITY
========

Custom assignment of job priority values is one of the core features
supported by the jobtap plugin interface.
A builtin ``.priority-default`` plugin is always loaded in the job-manager
to ensure that jobs move past the PRIORITY state when no other priority
plugin is loaded. The default plugin simply assigns the priority to the
same value as the current job urgency.

When loading a new jobtap plugin that assigns priority, it is important to
be cognizant of the fact that the ``.priority-default`` plugin may still be
loaded. As a result, the ``priority`` in the return arguments is always
initialized to the job urgency. However, since plugin ``job.state.priority``
and ``job.priority.get`` callbacks are run in order, any subsequently loaded
plugin that assigns a priority will overwrite the returned default
``priority``, and thus the last loaded priority plugin will be active.

To ensure the default priority is always overridden, priority plugins
should make sure to always set a priority, or use
``flux_jobtap_priority_unavail()`` if the priority is not available, in any
callback in which a priority is expected to be returned, i.e.
``job.state.priority`` and ``job.priority.get``.

To fully ensure priority plugins do not conflict, the builtin priority
plugin may explicitly be removed with ::

   flux jobtap remove .priority-default

or via configuration (See :man5:`flux-conf-job-manager`) ::

   [job-manager]
   plugins = [
     { remove = ".priority-default", load = "complex-priority.so" },
   ]

.. _perilogs:

PROLOG AND EPILOG ACTIONS
=========================

Plugins that need to perform asynchronous tasks for jobs after an ``alloc``
event but before the job is running, or after a ``finish`` event but before
resources are freed to the scheduler can make use of job manager prolog or
epilog actions.

Prolog and epilog actions are delineated by the following functions:

::

   int flux_jobtap_prolog_start (flux_plugin_t *p,
                                 const char *description);

   int flux_jobtap_prolog_finish (flux_plugin_t *p,
                                  flux_jobid_t id,
                                  const char *description,
                                  int status);

   int flux_jobtap_epilog_start (flux_plugin_t *p,
                                 const char *description);

   int flux_jobtap_epilog_finish (flux_plugin_t *p,
                                  flux_jobid_t id,
                                  const char *description,
                                  int status);

To initiate a prolog action, a plugin should call the function
``flux_jobtap_prolog_start()``. This will block the job from starting, even
after resources have been assigned, until a corresponding call to
``flux_jobtap_prolog_finish()`` is made. While the status of the prolog
action is passed to ``flux_jobtap_prolog_finish()`` so it can be captured
in the eventlog, the action itself is responsible for raising a job
exception or taking other action on failure. That is, a non-zero prolog
finish status does not cause any automated behavior on the part of the job
manager. Likewise, the prolog ``description`` is used for informational
purposes only, so that multiple actions in an eventlog may be
differentiated.

Similarly, an epilog action is initiated with
``flux_jobtap_epilog_start()``, and prevents resources from being released
to the scheduler until a corresponding call to
``flux_jobtap_epilog_finish()``. The same caveats described for prolog
actions regarding description and completion status apply to epilog
actions.

The ``flux_jobtap_prolog_start()`` function may be called anytime before
the ``start`` request is made to the execution system, though most often
from the ``job.state.run`` or ``job.event.alloc`` callbacks, since this is
the point at which a job has been allocated resources. (Note: plugins will
only receive the ``job.event.*`` callbacks for jobs to which they have
subscribed with a call to ``flux_jobtap_job_subscribe()``). A prolog action
cannot be started after a job enters the CLEANUP state.

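
The blocking behavior of prolog actions can be pictured as a counter of
outstanding actions. The following standalone sketch is a toy model under
stated assumptions: it does not use flux-core, all names (``struct
model_job``, ``model_prolog_start()``, etc.) are hypothetical, and failing a
finish call with no outstanding action with ``EINVAL`` is an assumption of
the model rather than documented behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-job bookkeeping; not a flux-core type. */
struct model_job {
    bool has_resources;     /* an alloc event has been posted */
    bool in_cleanup;        /* the job has entered CLEANUP */
    int prolog_active;      /* prolog actions started but not finished */
    int last_status;        /* most recent finish status, recorded only */
};

/* Model of starting a prolog action: refused once the job is in CLEANUP. */
int model_prolog_start (struct model_job *job)
{
    assert (job != NULL);
    if (job->in_cleanup) {
        errno = EINVAL;
        return -1;
    }
    job->prolog_active++;
    return 0;
}

/* Model of finishing a prolog action. The status is stored but never
 * acted upon, mirroring "no automated behavior" for non-zero status. */
int model_prolog_finish (struct model_job *job, int status)
{
    assert (job != NULL);
    if (job->prolog_active == 0) {
        errno = EINVAL;     /* assumption: nothing to finish */
        return -1;
    }
    job->prolog_active--;
    job->last_status = status;
    return 0;
}

/* The modeled job may start only once every prolog action has finished. */
bool model_can_start (const struct model_job *job)
{
    assert (job != NULL);
    return job->has_resources
        && !job->in_cleanup
        && job->prolog_active == 0;
}
```

Note that a non-zero status passed to ``model_prolog_finish()`` still
unblocks the modeled job. A plugin that wants a failed prolog to stop the
job has to take that action itself, for example by raising a job exception.
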
The ``flux_jobtap_epilog_start()`` function may only be called after a job
is in the CLEANUP state, but before the ``free`` request has been sent to
the scheduler, for example from the ``job.state.cleanup`` or
``job.event.finish`` callbacks.

If ``flux_jobtap_prolog_start()``, ``flux_jobtap_prolog_finish()``,
``flux_jobtap_epilog_start()`` or ``flux_jobtap_epilog_finish()`` is called
for a job in an invalid state, the function returns -1 with ``errno`` set
to ``EINVAL``.

Multiple prolog or epilog actions can be active at the same time.


CALLING OTHER PLUGINS
=====================

Plugins may invoke custom callbacks in other plugins using
``flux_jobtap_call()``. Note that topic strings starting with ``job.`` are
reserved for use by the job-manager and will cause this function to fail
immediately with ``errno`` set to ``EINVAL``::

   int flux_jobtap_call (flux_plugin_t *p,
                         flux_jobid_t id,
                         const char *topic,
                         flux_plugin_arg_t *args);

Much of the jobtap API assumes a current job, so a job ``id`` argument is
required. Note that ``args`` is passed unmodified when invoking callbacks
for ``topic``, so expected data listed in :ref:`arguments` for job ``id``
may not be present in newly created ``args`` unless manually added by the
caller. However, when invoked from another jobtap callback, the existing
``args`` object along with ``FLUX_JOBTAP_CURRENT_JOB`` may be used in
``flux_jobtap_call()``, in which case ``args`` will still contain the
expected job arguments.

For example, the following will call all plugins registered for the topic
``custom.topic`` when ``callback`` is called ::

   int callback (flux_plugin_t *p,
                 const char *topic,
                 flux_plugin_arg_t *args,
                 void *arg)
   {
       return flux_jobtap_call (p,
                                FLUX_JOBTAP_CURRENT_JOB,
                                "custom.topic",
                                args);
   }


RESOURCES
=========

.. include:: common/resources.rst


SEE ALSO
========

:man1:`flux-jobtap`, :man5:`flux-conf-job-manager`