19/Flux Locally Unique ID (FLUID)
This specification describes a scheme for a distributed, uncoordinated flux locally unique ID service that generates 64 bit k-ordered, unique identifiers that are a combination of timestamp since some epoch, generator id, and sequence number. The scheme is used to generate Flux job IDs.
Editor: Mark Grondona <firstname.lastname@example.org>
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
Very low probability of collision: Like 128 bit UUIDs on a global scale, FLUIDs should provide a reasonable guarantee against collisions on a smaller scale: within a Flux instance.
Distributed: Flux job ingest should be distributed to achieve a high job ingest rate, therefore the generation of ID’s should also be capable of being distributed to promote scalability.
Loosely ordered: Job ID’s generated by legacy resource managers are typically monotonically increasing integers that reflect job submission order. This property is of debatable utility, but following the principle of least astonishment, FLUIDs should retain it if possible.
The design of FLUIDs is patterned after Twitter Snowflake, and the derived implementation Boundary Flake. The basic scheme is to couple a timestamp, machine or generator id, and sequence number into a number of bits.
Placing the timestamp into the most significant bits of the ID allows independently generated IDs to be loosely sorted (k-ordered). Use of a separate machine ID per generator ensures uniqueness without coordination, and the sequence number ensures each generator can create a certain number of IDs per timestamp unit.
FLUIDs are composed of
[ timestamp | id | sequence ] similar to Snowflake,
to allow distributed, uncoordinated ID generation across a Flux instance,
with the allocation of bits customized for the unique use case in Flux:
40 bits for timestamp since epoch in milliseconds, good for a 35 year long runtime with custom epoch set to job start time.
14 bits for generator ID (up to 16K generators). By default, the generator ID could be set to the rank. For sessions greater than 16K ranks, some generators could be idled and forward requests up the tree to keep the max generators to 16K.
10 bits for sequence number (1024 IDs per ms)
With this scheme it is theoretically possible to create a max of about 16B FLUIDs per second for 34 years.
This type of generator guarantees unique IDs, with probability of collision equal to zero, so no collision detection is required.
A FLUID is a 64-bit integer, e.g.
Representations other than decimal MAY be used where appropriate, for instance for compactness or ease of transcription over the phone.
The following sections describe the set of supported alternate representations for FLUIDs.
FLUID base58 (F58) Encoding
In order to create a compact, human readable representation of a FLUID, the main alternate encoding of a FLUID SHALL be Base58, using the alphabet
To disambiguate this Base58 representation from decimal or other
representations, FLUIDs with this encoding SHALL be prefixed with the
Unicode character U+0192
ƒ, and the final result SHALL be termed
“FLUID base58 Encoding” or F58 for short. For example, the FLUID
6731191091817518 from above SHALL be represented in F58 as
ƒuZZybuNNy. As a fallback mechanism, a FLUID prefixed with an
f will also be decoded as F58.
FLUID Hexadecimal (hex) Encoding
A hexadecimal encoding SHALL represent a FLUID in base16, including
0x prefix to unambiguously differentiate the representation from
other FLUID standard encodings.
FLUID Dotted-Hexadecimal (dothex) Encoding
In order to support indexing of FLUIDs in a hierarchical KVS namespace,
a dotted-hexadecimal encoding SHALL represent a FLUID in base16,
with each 4 hexadecimal digits separated by dots (
FLUID Mnemonic (words) Encoding
In order to ease transferring of FLUIDs via human interaction, a mnemonic representation of FLUIDS SHALL be supported by a conformant implementation.
The mnemonicode implementation converts integers to strings of pronounceable words and back again. This encoding MAY be used when a FLUID must be conveyed by speaking, e.g. over the phone.
Decoding Alternate FLUID Representations
The standard FLUID representations described in this RFC are unambiguous by design. That is, the type of FLUID encoding can be definitively determined by the string representation.
Implementations that take an encoded FLUID as a string argument SHALL use the following rules to decode the argument:
If a string contains
., then decode as “dothex”
Else if the string contains
-, then decode as “words”
Else if the string starts with
f, decode as F58
Else if the string starts with
0xdecode as “hex”
Otherwise, decode as decimal
An implementation decoding FLUID string representations SHALL ignore leading and trailing whitespace.