/spec

SCRU160: Sortable, Clock and Random number-based Unique identifier

Apache License 2.0Apache-2.0

SCRU160: Sortable, Clock and Random number-based Unique identifier

SCRU160 ID is yet another attempt to supersede UUID in the use cases that need decentralized, globally unique time-ordered identifiers. SCRU160 is inspired by ULID and KSUID and has the following features:

  • 160-bit feature-rich worry-free design suitable for general purposes
  • Sortable by generation time (in binary and in text)
  • Case-insensitive, highly portable encodings: 32-char base32hex and 40-char hex
  • More than 32,000 unique, time-ordered but unpredictable IDs per millisecond
  • Nearly 111-bit randomness for collision resistance (default configuration)

Examples in the default base32hex encoding:

05TTUP1HNCPNH30VEK64KDQT9BSNU4C4
05TTUP1HNCPNIB63R8IN5V2L3VFGNFET

Examples in the alternative hex encoding:

017bdf6431bb33750751eb63beb3c3f8c5969d86
017bdf6431bb337662412e6a5758890735c33c2b

Implementations

Specification (v0.1.0 - 2021-09-24)

Binary Layout and Byte Order

A SCRU160 ID is a 160-bit object that consists of the following four fields:

Bit Numbers Field Name Data Type
Msb 0 - 47 timestamp 48-bit unsigned integer, big-endian
Msb 48 - 63 counter 16-bit unsigned integer, big-endian
Msb 64 - 79 random16 16-bit unsigned integer, big-endian
Msb 80 - 159 random80 80-bit unsigned integer, big-endian

timestamp

timestamp MUST be filled with the unix time in milliseconds (milliseconds elapsed since 1970-01-01 00:00:00.000+00:00, excluding leap seconds).

counter

counter MUST be utilized in either of the following ways: the default counter usage or the alternative counter usage. A generator SHOULD follow the default counter usage, but it MAY follow the alternative counter usage if the unix time is reliably available in microseconds.

Default counter usage

The default counter usage employs extra randomness while guaranteeing the generation of more than 32,768 monotonically ordered IDs per millisecond.

  • Initialize counter at a 15-bit random value when timestamp has changed since the last generation.
  • Increment counter by one for each new ID generated within the same timestamp.
  • A generator MAY initialize counter at a 14-bit or less random value (including zero) if the application needs more IDs per millisecond.
Alternative counter usage

The alternative counter usage utilizes the unix time in microseconds and shorter counter to order the multiple IDs that share the same timestamp.

  • Fill in the 10 most significant bits of counter with the microsecond part of unix time represented as an unsigned integer ranging from 0 to 999.
  • Utilize the 6 least significant bits similarly to the default counter usage to guarantee the generation of more than 32 IDs per microsecond.

random16

random16 SHOULD be filled with a random value.

Alternatively, a generator MAY use some or all bits of random16 to extend counter to accommodate more IDs per millisecond and/or to allocate implementation-dependent node identifiers that guarantee the local uniqueness of IDs generated by multiple nodes within an application. The remaining bits not used in either way MUST be filled with random values.

When a part or all of random16 is used to extend counter, the counter bits MUST be placed at the most significant bits. The counter bits in counter and random16 SHOULD be combined into a single unsigned integer and used similarly to the default counter usage so that the counter bits are efficiently utilized.

random80

random80 MUST be filled with a random value to guarantee the minimum entropy of 80 bits at the specification level.

Encodings

A generator MUST support the base32hex and hex (a.k.a. base16) encodings as defined in RFC 4648 to produce textual representations. The base32hex encoding maps each 5-bit group of a 160-bit binary object to [0-9A-V] to produce a 32-character string representation. The hex encoding maps each 4-bit group to [0-9a-f] to produce a 40-character string representation.

A generator SHOULD use uppercase letters and lowercase letters in the base32hex encoding and the hex encoding, respectively. A decoder MUST ignore cases when interpreting or lexicographically sorting encoded IDs.

The base32hex and hex representations cannot be mixed when encoded IDs are sorted lexicographically.

Other considerations

Quality of random values

When a random value is used, implementations SHOULD use a cryptographically strong random or pseudorandom number generator to generate unpredictable IDs.

Sorting

At least 80 most significant bits (16 leading characters in the base32hex encoding and 20 leading characters in the hex encoding) MUST be compared when IDs are sorted by the order of generation.

Reserved IDs

A generator MUST NOT generate IDs with timestamp set at zero or 2^48 - 1 because these are reserved for special purposes.

Scope of guaranteed monotonicity

An implementation SHOULD ensure the monotonic order of generated IDs by the generation time as much as reasonably possible. This specification does not require a strict guarantee of monotonicity because it is quite often impossible or unreasonable to guarantee the monotonic order of generated IDs. Typical situations where the guarantee of monotonic order is not reasonably possible include:

  • No synchronized clock or internal state is readily available because IDs are generated at multiple distributed nodes.
  • Available unix timestamps are imprecise or do not increase monotonically.