CyberAP/nanoid-dictionary

Cookie-safe alphabet?

tigt commented

“This seems overly specific?”

Kinda, yes. But HTTP cookies have an interesting combination of properties:

  • They often hold identifiers that nanoid is very good at generating: short-term session IDs, long-term sync IDs, “forever” persistent IDs, etc.
  • Their decompressed length is important: if the cookie header grows beyond server/proxy maximums, an entire URL origin can become unusable for a user
  • They’re frequently repeated per request, so minor overheads are worth addressing: exceeding the space available in one packet (IPv6 minimum MTU of 1280 − 60 bytes of IPv6 and TCP headers = 1220 bytes) can turn a 1-packet GET request into 2, greatly increasing request time variability
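  • The set of characters allowed in cookie values is not very intuitive (RFC 6265 permits US-ASCII except control characters, whitespace, double quotes, commas, semicolons, and backslashes), so a preset alphabet would be useful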
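
The packet arithmetic in the third bullet can be sketched roughly as below. This is only an illustration under stated assumptions: an uncompressed HTTP/1.1 request head, no TCP options, and no TLS framing. The names `PAYLOAD_BUDGET`, `requestHeadBytes`, and `packetsNeeded` are hypothetical helpers, not part of nanoid or nanoid-dictionary.

```javascript
// Figures taken from the bullet above: IPv6 minimum MTU minus
// 60 bytes of overhead (40-byte IPv6 header + 20-byte TCP header).
const IPV6_MIN_MTU = 1280;
const TCP_IP_OVERHEAD = 60;
const PAYLOAD_BUDGET = IPV6_MIN_MTU - TCP_IP_OVERHEAD;

// Hypothetical helper: byte length of an uncompressed HTTP/1.1 request head.
function requestHeadBytes(method, path, headers) {
  const lines = [`${method} ${path} HTTP/1.1`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  // Header lines end with CRLF, and an empty line terminates the head.
  const head = lines.join('\r\n') + '\r\n\r\n';
  return new TextEncoder().encode(head).length;
}

// Hypothetical helper: packets needed for a payload of the given size.
function packetsNeeded(bytes) {
  return Math.ceil(bytes / PAYLOAD_BUDGET);
}

console.log(PAYLOAD_BUDGET); // 1220
console.log(packetsNeeded(requestHeadBytes('GET', '/', {
  Host: 'example.com',
  Cookie: 'sid=' + 'x'.repeat(21),
})));
```

Under these assumptions, every byte shaved off a cookie value moves the request head a little further from the point where `packetsNeeded` rolls over from 1 to 2.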

I was picturing something like this:

import { numbers } from './numbers';
import { lowercase } from './lowercase';
import { uppercase } from './uppercase';

export const cookieValues = numbers + lowercase + uppercase +
  "`'|/_-~=+*^@#$%&()[]{}<>.?!:"; // could go in ASCII order if you prefer, but this is easier to scan for me

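As a sanity check on the proposal, the alphabet can be tested against the `cookie-octet` grammar from RFC 6265, section 4.1.1. This is a minimal sketch: the three character-class strings are inlined stand-ins for the `./numbers`, `./lowercase`, and `./uppercase` modules (their contents are assumed here), and `COOKIE_OCTETS` is a hypothetical name.

```javascript
// Inlined stand-ins (assumed contents) for the imported modules.
const numbers = '0123456789';
const lowercase = 'abcdefghijklmnopqrstuvwxyz';
const uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';

const cookieValues = numbers + lowercase + uppercase +
  "`'|/_-~=+*^@#$%&()[]{}<>.?!:";

// cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
// (US-ASCII minus controls, whitespace, DQUOTE, comma, semicolon, backslash)
const COOKIE_OCTETS = /^[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]+$/;

const isValid = COOKIE_OCTETS.test(cookieValues);   // every character allowed?
const uniqueCount = new Set(cookieValues).size;     // any duplicates?

console.log(isValid, cookieValues.length, uniqueCount); // true 90 90
```

The grammar admits 90 characters in total, so the proposed string covers the whole allowed set rather than a subset of it.
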
This would have a total alphabet size of 90: 28 more than the 62 alphanumerics, and 26 more than nanoid’s default 64-character alphabet. That means that to get a collision probability similar to nanoid’s default 21-character IDs, the output string would only need to be 19 (~59 billion years) or 20 (~564 billion years) characters long.
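
One rough way to compare lengths is by bits of entropy, length × log2(alphabet size). The sketch below assumes uniformly random characters and nanoid's defaults of 64 symbols and 21 characters; `entropyBits` and `minLength` are hypothetical helpers, and the code does not attempt to reproduce the year estimates quoted above.

```javascript
// Bits of entropy in a uniformly random ID.
function entropyBits(alphabetSize, length) {
  return length * Math.log2(alphabetSize);
}

// Shortest length whose entropy is at least targetBits.
function minLength(alphabetSize, targetBits) {
  return Math.ceil(targetBits / Math.log2(alphabetSize));
}

// nanoid defaults: 64-symbol alphabet, 21 characters.
const defaultBits = entropyBits(64, 21);

console.log(defaultBits.toFixed(1));          // 126.0
console.log(entropyBits(90, 19).toFixed(1));  // 123.3
console.log(entropyBits(90, 20).toFixed(1));  // 129.8
console.log(minLength(90, defaultBits));      // 20
```

By this measure 19 characters land slightly below the default's entropy and 20 land slightly above it.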

2 characters isn’t much, but my day job’s website accrues ≈41 cookies because of third parties, gateway sessions, etc. About half are opaque random identifiers, so an alphabet like this would save about 40 bytes per request.