gritzko/ron

String ⊂ Atom ⊂ UUID?

Opened this issue · 7 comments

cblp commented

Can a UUID be an atom? Can an atom be a string? How does a UTF-8 string fit in 16 bytes?

An atom is one of: (1) a UUID, (2) an int, (3) a string, or (4) a float.

The internal Cursor representation uses 16 bytes for any of them. For strings, those 16 bytes hold a range in the buffer. Only UUIDs must be parsed fully, because UUIDs are read and interpreted by reducers. Values typically aren't.
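To illustrate how a long string can still "fit" in 16 bytes, here is a minimal Python sketch. It is not the actual Cursor code: it assumes a hypothetical 16-byte slot made of two 64-bit words, where a string atom stores only an (offset, length) range into the buffer, while a UUID is stored in full. The names `SLOT`, `pack_uuid`, `pack_string_range`, and `read_string` are invented for illustration.

```python
import struct

# Assumed layout: two unsigned 64-bit words, 16 bytes in total.
SLOT = struct.Struct("<QQ")


def pack_uuid(value: int, origin: int) -> bytes:
    """Store a fully parsed UUID as its two 64-bit halves."""
    return SLOT.pack(value, origin)


def pack_string_range(offset: int, length: int) -> bytes:
    """Store a string atom as a range into the buffer, not the text itself."""
    return SLOT.pack(offset, length)


def read_string(buffer: bytes, slot: bytes) -> str:
    """Resolve a string slot against the buffer it points into."""
    offset, length = SLOT.unpack(slot)
    return buffer[offset:offset + length].decode("utf-8")


# The text is far longer than 16 bytes, yet its slot is exactly 16 bytes.
buffer = "a UTF-8 string much longer than sixteen bytes".encode("utf-8")
slot = pack_string_range(0, len(buffer))
assert len(slot) == 16
assert read_string(buffer, slot) == buffer.decode("utf-8")
```

In this sketch the string is never copied or parsed until someone calls `read_string`, which matches the point that values typically aren't interpreted.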

See the README:

An op is a tuple of four "key" UUIDs and zero or more "value" atoms.
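The formal definition quoted above can be sketched as a plain data type. This is a hedged Python illustration, not code from the RON repository: the class names, the two-halves UUID layout, and the four field names (`type_id`, `object_id`, `event_id`, `ref_id`) are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class UUID:
    """A 128-bit identifier, kept here as two 64-bit halves (illustrative)."""
    value: int
    origin: int


# An atom is one of: UUID, int, string, float.
Atom = Union[UUID, int, str, float]


@dataclass(frozen=True)
class Op:
    """Four "key" UUIDs plus zero or more "value" atoms."""
    type_id: UUID
    object_id: UUID
    event_id: UUID
    ref_id: UUID
    values: Tuple[Atom, ...] = ()


def atom_kind(atom: Atom) -> str:
    """Name the kind of an atom; every UUID is an atom, not the reverse."""
    if isinstance(atom, UUID):
        return "uuid"
    if isinstance(atom, bool):
        # bool is a subclass of int in Python; this sketch rejects it.
        raise TypeError("bool is not an atom in this sketch")
    if isinstance(atom, int):
        return "int"
    if isinstance(atom, float):
        return "float"
    if isinstance(atom, str):
        return "string"
    raise TypeError(f"not an atom: {atom!r}")


key = UUID(1, 2)
op = Op(key, key, key, key, ("hello", 42, 3.14, key))
assert [atom_kind(a) for a in op.values] == ["string", "int", "float", "uuid"]
```

Read this way, the containment in the issue title runs the other way: UUIDs and strings are both kinds of atom.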

You are mixing the formal model and implementation details here.

cblp commented

I'm reading uuid.md only. The document says:

  1. uuid is 16 bytes long
  2. if bits 31-30 of the uuid are "01", then it is internal: a RON atom (int, float, string)

What does that mean, then?

Do the formal model and the implementation details contradict each other?

I should probably remove that.

That's how cursors keep atoms internally. Shouldn't be part of uuid.md.

cblp commented

Looks like it is an atom specification. A bigger thing is described inside a smaller thing. That's why it causes misunderstanding.

So, we need a separate document describing atoms and how strings are encoded inside them.

I should make a single document explaining atoms, then ops, then frames, then po logs, then mappers, I guess. With pictures. Formal texts are heavy to read.

Started something a little bit here: https://github.com/lambdafu/swarm-doc