vlcn-io/cr-sqlite

investigate sequence / text crdts

tantaman opened this issue · 37 comments

Is it possible to stream patches (keystrokes even) to a column behind which is the sequence crdt? https://docs.rs/diamond-types/latest/diamond_types/

Perhaps with column wrapper functions like we discussed a while back for counter crdts?

UPDATE tbl SET doc = text_crdt_update(text, position) WHERE id = 123;
SELECT text_crdt_value(doc) FROM tbl WHERE id = 123;

I like it. That should work well -- diamond types can operate without modification and the two functions act as the bridge between worlds.

cc @josephg in case he's interested in the above proposed interface for diamond types over SQL.

Note that the doc column probably wouldn't actually store anything other than a handle to/identifier of the diamond types CRDT.

Yeah, I suspect the actual doc would be multiple text columns spread across multiple rows in its own table?

My excitement is through the roof to see text-crdts in sqlite!

If we don't target WASM I think adding diamond types will be easy.

Targeting WASM makes everything harder since WASM binaries currently can not load other binaries (AFAIK) so we have to compile Rust code with C code. Maybe I'm worried about nothing and that is actually an easy thing to do?

Yeah, I suspect the actual doc would be multiple text columns spread across multiple rows in its own table?

I have no idea at the moment. I know diamond types handles its own persistence. Maybe in some future iteration that persistence layer would be exposed as a virtual table or vfs?

Hi! Diamond types author here.

UPDATE tbl SET doc = text_crdt_update(text, position) WHERE id = 123;

You'd need to specify whether your change is an insert or delete. And for deletes, storing the deleted content is optional.
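
Something like the following, perhaps -- the function names here are hypothetical, just to sketch what a split interface could look like (only text_crdt_value comes from the proposal above):

-- insert 'hello' at character position 5
UPDATE tbl SET doc = text_crdt_insert(doc, 5, 'hello') WHERE id = 123;
-- delete 3 characters starting at position 5
UPDATE tbl SET doc = text_crdt_delete(doc, 5, 3) WHERE id = 123;
-- materialize the current text
SELECT text_crdt_value(doc) FROM tbl WHERE id = 123;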

You'll probably also need some methods to sync changes with other peers. How does this work at the moment? Is there a set of methods for querying changes since some version? Is it table-specific?

Targeting WASM makes everything harder since WASM binaries currently can not load other binaries (AFAIK) so we have to compile Rust code with C code. Maybe I'm worried about nothing and that is actually an easy thing to do?

Compiling Rust code to a C library (either static or dynamic) is really easy. We just need to write a simple C wrapper API around diamond types with #[no_mangle] and friends, and it shouldn't take more than an hour or two to get something simple working. This is the WASM API at the moment, which I'd base the API on. Here's some documentation on how to make a Rust project into a C library. And there are even projects like cbindgen which can automatically generate C headers. It'd probably take me a couple hours to get something like that working if I do it, but if someone else wants to take a stab at it, go for it!

Adding this stuff will also mean you need a Rust compiler to be able to compile the library. That's one nice advantage of wasm -- you can ship a compiled wasm file and it works everywhere.

I have no idea at the moment. I know diamond types handles its own persistence. Maybe in some future iteration that persistence layer would be exposed as a virtual table or vfs?

Right now (v1.x) diamond types has its own encode function which packs the operations into a highly compact byte array. (And corresponding decode methods).

Re-saving the entire oplog with each change isn't very efficient, and it's something I'm fixing for diamond types 2.0. You can also encode partial subsections of the oplog into chunks on disk and join them back together again later if you want.

But the changes are internally incredibly table-like either way. You could absolutely store them in a SQL table if you wanted - though size on disk would probably increase a bit. Or use diamond types but expose them via a virtual table.

The log of text operations is one big append-only table with a handful of columns.

A lot of the performance of diamond types comes from run-length encoding those changes everywhere, including in memory while we're working on things. On disk every column gets individually run-length encoded based on what will let us store it with the fewest bytes.


This description got longer than I expected, but I just kept typing anyway because I'm in a documenting mood, and in case it's interesting to people. I should put this write-up somewhere.

But if you're interested, the operations look like this:

$ dt log benchmark_data/node_nodecc.dt --json
...
{"kind":"Ins","start":2360,"end":2361,"fwd":true,"content":"="}
{"kind":"Ins","start":2362,"end":2371,"fwd":true,"content":"args[0];\n"}
{"kind":"Ins","start":2373,"end":2376,"fwd":true,"content":"Str"}
{"kind":"Ins","start":2377,"end":2383,"fwd":true,"content":"ng::Ut"}
{"kind":"Ins","start":2384,"end":2390,"fwd":true,"content":"8Value"}
{"kind":"Ins","start":2391,"end":2396,"fwd":true,"content":"value"}
{"kind":"Ins","start":2397,"end":2406,"fwd":true,"content":"arg);\n\n  "}
{"kind":"Del","start":2411,"end":2413,"fwd":true}
{"kind":"Ins","start":2411,"end":2426,"fwd":true,"content":"f(\"%s\\n\", *valu"}
{"kind":"Ins","start":2427,"end":2436,"fwd":true,"content":");\n  fflu"}
{"kind":"Ins","start":2437,"end":2443,"fwd":true,"content":"h(stdo"}
{"kind":"Del","start":2444,"end":2445,"fwd":true}
{"kind":"Ins","start":2445,"end":2449,"fwd":true,"content":");\n\n"}
{"kind":"Del","start":2450,"end":2452,"fwd":true}
{"kind":"Del","start":2451,"end":2452,"fwd":true}
{"kind":"Del","start":2453,"end":2454,"fwd":true}
  • kind = Delete or Insert
  • start/end describe where in the document the characters were inserted / deleted. Half-open range.
  • fwd is an optimization trick. If the user backspaces a run of characters, we mark those changes as fwd: false and we can still RLE the edits. Don't worry about it.
  • content is the inserted content. (Deleted content can also be saved if you want to)

You can see the run-length encoding in the example above. Without it, every inserted or deleted character would need its own row in the table. On disk this is way more compact, but that's another story.

There's an implied (not shown) auto-incrementing row ID in the above data. It counts from 0 at the first locally known change and goes up by 1 for each inserted or deleted character. (This ID is called a "local version" internally in diamond types.)

There are two more columns missing from the data above: Parents and Agent Assignment.

The parents information describes when each change happened relative to other changes. E.g.:

$ dt log benchmark_data/node_nodecc.dt --json --history
{"span":[0,63597],"parents":[]}
{"span":[63597,68772],"parents":[63276]}
{"span":[68772,68916],"parents":[63596,68771]}
{"span":[68916,79737],"parents":[68771]}
{"span":[79737,79868],"parents":[68915,79736]}
{"span":[79868,79999],"parents":[79736]}
{"span":[79999,80299],"parents":[79867,79998]}
...

(span here is the missing ID column from above).

Parents work the same as the parents of a git commit. The only difference is that I'm just using (local) autoincrementing integers instead of git SHAs to name each change, so I can run-length encode better. You can turn this sort of data into a DAG and for super complex data sets you can end up with something like this.

And then the third piece of data is agent assignments. Each user agent generates for itself a unique agent ID (a string). Over the network, each change is identified by a unique (agentId, sequence number) pair, e.g. (seph, 0). The last piece of data associates local versions with agent+seq pairs. It's used when peers communicate so they can figure out what changes they need to send, to deduplicate received changes, and for tie-breaking when users concurrently insert at the same place in the document at the same time.

So anyway, if you want to actually store all this stuff in a SQL table (or a couple of tables) or expose it via a virtual table let me know. There's probably some missing API methods to query all this stuff back out of diamond types.
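
For a rough picture, the three kinds of data described above could map onto tables shaped something like this (purely illustrative -- this is not diamond types' actual format, and the column names are made up):

-- one row per run-length-encoded span of inserts or deletes
CREATE TABLE dt_ops (
    lv_start INTEGER,   -- first local version in the span (the implied row ID above)
    lv_end INTEGER,     -- one past the last local version (half-open)
    kind TEXT,          -- 'Ins' or 'Del'
    pos_start INTEGER,  -- document positions the span touches (half-open)
    pos_end INTEGER,
    fwd INTEGER,        -- 0 when the run was entered right-to-left (e.g. backspacing)
    content TEXT        -- inserted text; optional for deletes
);

-- which earlier local versions each span builds on (like git commit parents)
CREATE TABLE dt_parents (
    lv_start INTEGER,
    lv_end INTEGER,
    parent INTEGER      -- one row per parent of the span
);

-- maps local versions to network-stable (agentId, seq) pairs
CREATE TABLE dt_agents (
    lv_start INTEGER,
    lv_end INTEGER,
    agent_id TEXT,
    seq_start INTEGER   -- seq of the first local version in the span
);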

So anyway, if you want to actually store all this stuff in a SQL table (or a couple of tables) or expose it via a virtual table let me know. There's probably some missing API methods to query all this stuff back out of diamond types.

Super interesting. With it being structured data already, I'm really fond of this idea. I'm not sure if a full vtab is needed for it, or just a set of functions at the column level (behind the functions, operations on the underlying tables would be executed). I guess a full vtab could allow the user greater control of the underlying type...

There is already a vtab for changesets on the LWW-registers, so (perhaps) ideally the diamond types (and other future CRDT) changesets should be accessible and able to be applied through the same interface. What do you think @tantaman?

We (iver & I) have been discussing this a bunch offline, and here is a summary of where we're at so far:

  1. Queries that order by a recursive relationship in SQLite are actually rather performant (some data: https://vlcn.io/blog/recursive-ordering-in-sqlite)
  2. Because of (1), we can represent any tree-based CRDT in a table
  3. A proposed table structure is something like:
CREATE TABLE ytext (
    doc_alias,
    clock, client_alias, -- id
    origin_clock, origin_alias, -- origin
    right_origin_clock, right_origin_alias, -- rightOrigin
    content,
    length,
    -- generated column: the clock of the last character covered by this row.
    -- SQLite doesn't allow generated columns in a PRIMARY KEY, so uniqueness
    -- is enforced by the unique index below instead.
    tail_clock AS (clock + length - 1)
  );
CREATE UNIQUE INDEX ytext_tailclock ON ytext (client_alias, tail_clock);
CREATE INDEX ytext_origin ON ytext (origin_alias, origin_clock);

Note that the above structure is for yjs given I'm a bit more familiar with yjs at this current time. I've not run any of this by the yjs folks yet so the proposed table structure could be wrong in certain ways. Tagging in case they are curious @Horusiath, @dmonad.

In the above table:

  • *_alias columns are local integer values that are mapped to uuids in a separate table. Mainly so we don't need to write the actual uuid of a site or document for each row (see the sketch after this list).
  • doc_alias is the id of the document you want to store, allowing us to save any number of documents in the ytext table
  • tail_clock is a computed column so we know the clock value of the last character in the row. This, in addition to the index on tail_clock, enables efficient traversal of the tree via SQL.
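
To make the alias idea concrete, the lookup tables could be as simple as this (table and column names are just illustrative):

CREATE TABLE crdt_site_aliases (
    alias INTEGER PRIMARY KEY,    -- small local integer used in ytext rows
    site_id BLOB NOT NULL UNIQUE  -- the real uuid of the peer
);
CREATE TABLE crdt_doc_aliases (
    alias INTEGER PRIMARY KEY,
    doc_id BLOB NOT NULL UNIQUE
);

-- resolving a ytext row back to real ids is just a pair of joins
SELECT d.doc_id, s.site_id, y.clock, y.content
FROM ytext y
JOIN crdt_doc_aliases d ON d.alias = y.doc_alias
JOIN crdt_site_aliases s ON s.alias = y.client_alias;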

Given all the state updates are rows in a table, incremental updates to and merging of the doc become pretty efficient: just an insert with an ON CONFLICT clause to handle content splitting.

Next tricks are to:

  • ensure we can keep the yjs/diamond types/other crdt integrations' clocks in sync with or mapped to the crsqlite's db clock
  • return change sets from integrations with the rest of the changes from the db when calls are made to SELECT * FROM crsql_changes.

I have some ideas on these but I need to iterate a bit more.


Of course the other option for a yjs integration is to just save raw update blobs as the indexeddb integration does. Downsides of that are:

  • can't retrieve a patch set for a state vector from the db directly
  • can't reconstruct the document as a string in the db itself

Although we could timestamp the received updates with the db's clock and integrate that. The only missing piece then is reconstructing the doc as a string in the db.

This latter approach looks to be the route taken by the leveldb integration -- https://github.com/yjs/y-leveldb

The state vector (describing the state of the persisted document - see Yjs docs) is maintained in a separate field and constantly updated.

This allows you to sync changes without actually creating a Yjs document.

We don't need yjs's state vector, however, given we track all of those details via crsqlite.

Another interesting thing -- SQLite supports incremental I/O against blob columns: https://www.sqlite.org/capi3ref.html#sqlite3_blob_open

If you're interested, one alternative is to use a "CRDT-quality" version of fractional indices. Specifically, strings such that:

  • They are lexicographically ordered
  • You can generate a string between any two existing strings (dense)
  • Concurrently-generated strings are unique (no ties) and do not interleave
  • String length grows logarithmically for in-order insertions

These will never be as storage-efficient as a dedicated list CRDT, but you can generate the strings in <100 LOC and put them in a "position" column that you ORDER BY.
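
As a rough sketch of what the SQL side could look like (table and column names are made up; the position strings themselves would be generated application-side by the kind of "CRDT-quality" fractional index described above):

CREATE TABLE todo_items (
    id TEXT PRIMARY KEY,
    position TEXT NOT NULL,  -- lexicographically ordered fractional index
    content TEXT
);
CREATE INDEX todo_items_position ON todo_items (position);

-- reading the list back in document order is a plain, index-friendly sort
SELECT content FROM todo_items ORDER BY position;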

More info, log-growth prototype

PS: Nice talk at lfw.dev!

@mweidner037 I still haven't seen any fractional indexing string which didn't have interleaving problems.. Has that been solved?

Yeah, it was also my impression that you couldn't fix the interleaving problem with fractional indexing.

btw -- we have implemented fractional indexing in cr-sqlite. An overview of the feature here: https://www.youtube.com/watch?v=BghFgK6VJIE

I'll take a look at the links you provided to see if there's a significant difference between the approaches.

@mweidner037 I still haven't seen any fractional indexing string which didn't have interleaving problems.. Has that been solved?

It should be non-interleaving in both directions. The total order is equivalent to a tree walk over a binary-ish tree; concurrently-inserted sequences end up in different subtrees, which don't interleave.

You do pay for this with longer strings, though: "averaging" two neighboring strings adds a UID (~13 chars) instead of a single bit, except when the in-order optimization kicks in.

You do pay for this with longer strings, though: "averaging" two neighboring strings adds a UID (~13 chars) instead of a single bit, except when the in-order optimization kicks in.

We've found that you could use a RECURSIVE ORDER BY (https://www.sqlite.org/lang_with.html#hierarchical_query_examples) so each index only needs a reference to its parent instead of containing the full traversal of the tree. So length of the index becomes deterministic and not dependent on the depth of the tree.

Basically what's going on here #65 (comment)

@mweidner037 - reading through https://mattweidner.com/2022/10/21/basic-list-crdt.html#semantics

If a is not a prefix of b, append a globally unique new string to a (see next paragraph) and return that plus 'R'. This will be greater than a because it's longer, and less than b for the same reason that a < b.

Any idea how long these strings end up getting in practice?

where replicaID is a reasonably short ID for the local replica (device) -- say 10 random characters -- and counter is a local counter incremented for each causal dot.

We could do something for client-server setups that give out short IDs for clients. E.g., some auto-increment number assigned to clients by the server.

Another optimization may be to assign replicas short ids to use for that specific document when they join the document.
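
One way the server-assigned short IDs could be kept (table and parameter names are only for illustration):

CREATE TABLE replica_ids (
    short_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- compact id handed out by the server
    replica_uuid TEXT NOT NULL UNIQUE            -- the client's full replica id
);

-- a client registers once (idempotently) and uses the returned short id from then on
INSERT INTO replica_ids (replica_uuid) VALUES (:replica_uuid)
    ON CONFLICT (replica_uuid) DO NOTHING;
SELECT short_id FROM replica_ids WHERE replica_uuid = :replica_uuid;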

You can even sort them in CRDT-unaware contexts

I really like this property. It makes it easy to represent a rich text doc to other parts of a system and also make integrations with rich text editors (e.g., lexical, prosemirror) easier.

Any idea how long these strings end up getting in practice?

Stats for the long text trace used in crdt-benchmarks are here (edit: see newer numbers below).

The average position length is 423 chars, which is awkwardly long :( For the first 1,000 ops, the average is 97 chars.

Prefix compression would help a lot, but I don't think SQLite does that. For sending positions over the network, normal compression does help - the average compressed length is 148 bytes. Short IDs would indeed help too - currently, IDs are 8 random chars, and you pay that cost once per "node" (an average position has 29 nodes).

I'll see if I can optimize these further.

You can even sort them in CRDT-unaware contexts

I really like this property. It makes it easy to represent a rich text doc to other parts of a system and also make integrations with rich text editors (e.g., lexical, prosemirror) easier.

One alternative would be to have database-internal positions that use RECURSIVE ORDER BY like above, plus functions to map those positions to lexicographically-sorted strings. The function would basically find the position's path in the underlying tree, then make a string out of the node IDs on that path. That way you don't pay the length cost for positions that are "at rest".

Any idea how long these strings end up getting in practice?

Stats for the long text trace used in crdt-benchmarks are here (edit: see newer numbers below).

The average position length is 423 chars, which is awkwardly long :( For the first 1,000 ops, the average is 97 chars.

I added some optimizations that make this much better (details):

For the complete trace (182k positions, 260k total edits) typed by a single PositionSource, the average position length is 33 characters, and the max length is 55.

For a more realistic scenario with 260 PositionSources (a new one every 1,000 edits), the average position length is 111 characters, and the max length is 237. "Rotating" PositionSources in this way simulates the effect of multiple users, or a single user who occasionally reloads the page. (The extra length comes from referencing multiple IDs per position: an average of 8 IDs/position x 8 chars/ID = 64 chars/position.)

If you assign short replica IDs as @tantaman suggested, the "rotating" stats should improve noticeably.

Otherwise, I don't think I can improve this much more; does it seem good enough?

I'll have a chance to get back to this at the end of the week. Will let you know then.

btw, @mweidner037 - I got the impression during one reading that tombstones were not required in this approach but then, on another reading, that they are required.

Can you help clear that up for me? Is there an approach with fractional indices that does not require tombstones for character removals?

btw, @mweidner037 - I got the impression during one reading that tombstones were not required in this approach but then, on another reading, that they are required.

Can you help clear that up for me? Is there an approach with fractional indices that does not require tombstones for character removals?

You don't need tombstones - just the positions of present characters. It's okay to call d = source.createBetween(a, b) even if there used to be some other a < c < b where c was deleted. You can even "restore" c later and it will compare to d in a reasonable way (though I'm not sure about non-interleaving).

Each PositionSource does keep a bit of state about past positions to ensure d != c in that scenario, but it's small and forgotten at the end of the session (when PositionSource falls out of scope).

Fractional indexing in general also shouldn't need tombstones, except for the usual uniqueness issues.

Makes sense and matches what I originally thought.

The slide that confused me: https://docs.google.com/presentation/d/1u8bcvfEcJ2wseH3u4P8QAMabq5VZrPR-FX8VaIIkbFQ/edit#slide=id.g11737e0938d_0_338

I probably just didn't read the entire deck carefully enough.

The slide that confused me: https://docs.google.com/presentation/d/1u8bcvfEcJ2wseH3u4P8QAMabq5VZrPR-FX8VaIIkbFQ/edit#slide=id.g11737e0938d_0_338

Ah yes, that's describing a different version of Plain Tree that does have tombstones. position-strings is more like this slide, except it does have the LtR optimizations (in a new way that avoids interleaving).

The slides are a bit older than the blog post, so they omit the string implementation that I meant to link to. Sorry for the confusion!

In general, for any tree-based list CRDT, you can choose if you want "no-tombstone mode" with longer positions (= whole paths in the tree), or "tombstone mode" with shorter positions (= pointers to nodes in the tree). The former lets you compare two positions directly, without consulting some external tree state; so position-strings (and fractional indexing) uses this mode. But tombstone mode is usually more memory-efficient, so that's what the slides recommend, and what libraries like Yjs do.

I wonder if it's possible to store an entire string of characters per position instead of just a single one @mweidner037?

So inserting something within an existing node would split that node into two nodes and insert a new text node between them. Split nodes would have to be sorted at the same position as they were before the split, while still making it possible to insert before, after, or between them.

This would also require extra CRDT logic at merge for concurrent splits, as well as some functionality for addressing a substring within a node. The advantage is a lot less metadata per character.

I wonder if it's possible to store an entire string of characters per position instead of just a single one @mweidner037?

My first impression is that it would be simpler to wrap a CRDT that does so internally, like the beginning of this thread discusses for diamond-types.

I think it would be possible to re-implement that optimization within a database table (e.g. columns "position" and "substring", instead of "position" and "char"), but it could be hard to query.

This is an interesting integration of yjs + sqlite -- https://github.com/samwillis/yjs-sqlite-test by @samwillis

There look to be a few issues with the linked project, however:

  1. It'd require syncing yjs docs out of band rather than syncing through the normal database sync protocol
  2. Saving updates requires re-writing the whole column rather than making use of SQLite's incremental blob update API
  3. It is all JS rather than using Yrs so we can't cross compile to other platforms

exciting though.

I think I'll either go with:

Screenshot 2023-06-13 at 12 52 02 PM

or

Screenshot 2023-06-13 at 12 52 16 PM

unless @mweidner037 has come up with something else in the meantime.

Elaborating on #65 (comment), I think we can implement a variant of Fugue [1] that stores multiple chars per row. (Untested)

Terminology

  • An item is a run of text inserted contiguously, left-to-right, by a single user.
  • A sub-item is a slice of an item. A sub-item's index is the last index that it strictly contains.
    • E.g., we could split the item (id="Alice724", content="Hey there") into sub-items (itemId="Alice724", index=3, content="Hey ") and (itemId="Alice724", index=8, content="there").

We store one sub-item per row:

CREATE TABLE fugue(
    itemId TEXT,
    "index" INT,  -- quoted because INDEX is a keyword in SQLite
    content TEXT,
    parentItemId TEXT,
    parentIndex INT,
    PRIMARY KEY(itemId, "index"),
    FOREIGN KEY(parentItemId, parentIndex) REFERENCES fugue
);

Tree Order

The parentItemId and parentIndex columns reference an existing sub-item, turning the rows into a tree. This tree is sorted in depth-first order, with siblings sorted by itemId, then index.

In other words, you sort items arbitrarily by their itemIds, then sort sub-items within an item by index.

To create parents, you often need to "split" a sub-item into two sub-items. E.g., starting from the tree

Root
...(itemId="Alice724", index=8, content="Hey there")

if someone types "you " between "Hey " and "there", then we split the "Hey there" sub-item into "Hey " and "there", and we use "Hey " as the new parent:

Root
...(itemId="Alice724", index=3, content="Hey ")
......(itemId="Bob639", index=3, content="you ")
...(itemId="Alice724", index=8, content="there")

Selecting the sub-items in sorted order should look something like this (cf.):

WITH RECURSIVE under_node(content, level, itemId, "index") AS (
    VALUES('Root', 0, ?, ?)
    UNION ALL
    SELECT fugue.content, under_node.level+1, fugue.itemId, fugue."index"
        FROM fugue JOIN under_node
          ON fugue.parentItemId = under_node.itemId AND fugue.parentIndex = under_node."index"
        ORDER BY 2 DESC, 3, 4   -- deepest first (depth-first), then siblings by itemId, "index"
)
SELECT content, itemId, "index" FROM under_node WHERE "index" != -1;  -- the fake index -1 is explained below

Insertions

To insert a new character, assume you know the (itemId, index) of its left and right neighbors. You need to insert it somewhere that will be between these neighbors.

There are a few cases:

  1. If allowed, extend the left neighbor's item (see the SQL sketch after this list).
    • Allowed if the local device created that item, the left neighbor is the last char in that item, and the sub-item (leftItemId, leftIndex) doesn't have any children.
    • In this case, append to (leftItemId, leftIndex)'s content and increment its index.
  2. Else, if the sub-item (leftItemId, leftIndex) does not already have children, create a new child of that sub-item.
    • If the sub-item doesn't exist, create it by splitting (example above).
  3. Else, create a new sub-item (rightItemId, -1) with a "fake" index of -1 and NULL content, then create a child of that sub-item.
    • -1 makes the new sub-item go to the left of the right neighbor. It corresponds to Fugue's left-side children.
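
Cases 1 and 2 above might look roughly like this in SQL (the case selection itself, and case 3, stay in application code; :ch, :leftItemId, :leftIndex, :newItemId are bound parameters):

-- Case 1: extend our own item in place: append the char and bump the tail index.
-- Safe because the preconditions guarantee this sub-item has no children.
UPDATE fugue
SET content = content || :ch, "index" = "index" + 1
WHERE itemId = :leftItemId AND "index" = :leftIndex;

-- Case 2: start a new item whose parent is the left neighbor's sub-item.
-- (If (leftItemId, leftIndex) doesn't exist yet, first create it by splitting,
-- as in the example above.) A one-character item's last contained index is 0.
INSERT INTO fugue (itemId, "index", content, parentItemId, parentIndex)
VALUES (:newItemId, 0, :ch, :leftItemId, :leftIndex);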

Merging

To merge two tables (or process an update), you mostly need to take the union of their rows. However, you also need to "clean up" sub-items that got split.

E.g., to process the "you " update above, a remote user needs to:

  • Add the two new sub-items to its table:
(itemId="Alice724", index=3, content="Hey ")
(itemId="Bob639", index=3, content="you ")
  • Trim the old sub-item (itemId="Alice724", index=8, content="Hey there") so that it doesn't overlap with (itemId="Alice724", index=3, content="Hey "):
(itemId="Alice724", index=8, content="there")

Deletions

These should involve tombstones; otherwise I haven't thought it through.


[1] https://arxiv.org/abs/2305.00583 . The alg described here is more similar to Braid's Sync9 list CRDT, which is apparently order-equivalent to Fugue.

@mweidner037 - The rules for extending an item make sense to me.

I'm a bit unsure about item splitting. Wouldn't splitting an item require re-parenting its children? If we have the string Hey there bob where Hey there is a single item and bob is a child of it, what is Bob's parentId after splitting Hey there into Hey & there?

@tantaman If we start with sub-item (itemId="Alice724", index=8, content="Hey there") for Hey there, then bob will have (parentItemId="Alice724", parentIndex=8). After splitting, Hey there becomes two sub-items:

(itemId="Alice724", index=3, content="Hey ")  <-- New row
(itemId="Alice724", index=8, content="there")  <-- Old row with truncated content

Then bob still has (parentItemId="Alice724", parentIndex=8), i.e., its parent is the same row as before.

Here we used the facts that:

  • Sub-items of the same item share the same itemId.
  • We label sub-items by their last index, not their first. Indeed, children really attach to the last character, which is identified by (itemId, char's index) regardless of how the parent item splits into sub-items.

@mweidner037 - Got it.

The last case I'm still fuzzy on is an item being split concurrently by many peers. Is this an accurate description of the cleanup algorithm?

Peer 1:

(itemId="Alice724", index=3, content="Hey ")
(itemId="Alice724", index=8, content="there")

Peer 2:

(itemId="Alice724", index=2, content="Hey")
(itemId="Alice724", index=8, content=" there")

Peer 3:

(itemId="Alice724", index=6, content="Hey the")
(itemId="Alice724", index=8, content="re")

On merge we union:

(itemId="Alice724", index=3, content="Hey ")
(itemId="Alice724", index=8, content="there")
(itemId="Alice724", index=2, content="Hey")
(itemId="Alice724", index=8, content=" there")
(itemId="Alice724", index=6, content="Hey the")
(itemId="Alice724", index=8, content="re")

Then "clean up" would be:

  1. Take the item with the smallest index
  2. Find items (with same itemId) that overlap this item by:
    1. finding items whose startIndex is <= the index from step 1, where startIndex = index - content.length + 1
  3. Substring these overlapping items from [the index found in step 1] + 1 to the end of the string
  4. Repeat with the next smallest index

Example:

Pass 1 of cleanup:

(itemId="Alice724", index=2, content="Hey") -- smallest index

-- overlapping items (startIndex = index - content.length + 1 <= 2) and thus trimmed
`Hey ` -substring(2+1)-> (itemId="Alice724", index=3, content=" ")
`Hey the` -> (itemId="Alice724", index=6, content=" the")

-- non-overlapping
(itemId="Alice724", index=8, content="there")
(itemId="Alice724", index=8, content=" there")
(itemId="Alice724", index=8, content="re")

Pass 2 of cleanup:

(itemId="Alice724", index=2, content="Hey") -- complete
(itemId="Alice724", index=3, content=" ") -- next smallest index

-- overlapping items (startIndex <= 3) and thus trimmed
` the` -> (itemId="Alice724", index=6, content="the")
` there` -> (itemId="Alice724", index=8, content="there")

-- non-overlapping
(itemId="Alice724", index=8, content="there")
(itemId="Alice724", index=8, content="re")

etc.

I'm excited for this approach. Seems like it could work well & without crazy storage requirements.

@tantaman Yes, that looks correct.

When you do the union, I believe it is okay to skip duplicate primary keys (itemId, index). E.g. above, you only need one of the three rows with (itemId="Alice724", index=8); whichever you pick, the cleanup algorithm will give the same final answer. So it is okay to stick with the one you already have and ignore the new ones.
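
If it helps, here is one way the union + trim could look in SQL, assuming the peer's rows have been copied into a table called incoming with the same schema (an untested sketch; it also assumes content is non-NULL, i.e. no tombstones yet):

-- Union: copy the peer's rows, skipping duplicate primary keys.
INSERT OR IGNORE INTO fugue SELECT * FROM incoming;

-- For every sub-item, find the largest same-item "index" that is smaller than
-- its own but still inside its range (startIndex = "index" - length(content) + 1).
CREATE TEMP TABLE trims AS
SELECT a.itemId AS itemId, a."index" AS idx, max(b."index") AS cutoff
FROM fugue AS a
JOIN fugue AS b
  ON b.itemId = a.itemId
 AND b."index" < a."index"
 AND b."index" >= a."index" - length(a.content) + 1
GROUP BY a.itemId, a."index";

-- Trim: keep only the characters after the cutoff.
UPDATE fugue
SET content = substr(
      content,
      (SELECT cutoff FROM trims WHERE itemId = fugue.itemId AND idx = fugue."index")
        - ("index" - length(content) + 1) + 2)
WHERE EXISTS (
  SELECT 1 FROM trims WHERE itemId = fugue.itemId AND idx = fugue."index");

DROP TABLE trims;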

On a separate topic, @mweidner037, are you aware of the work on Portals in Sync9? It is a new concept to support the notion of replacing text. The really awesome thing is that it seems to solve the rich text problem as rich text edits become replace operations. E.g., foo -- replace --> **foo**

The idea is that a portal represents a view into a document at a prior version so you can get the exact text the user intended to move or replace. It is presented in more detail here: https://braid.org/meeting-62

On a separate topic, @mweidner037, are you aware of the work on Portals in Sync9?

Thanks, I had not seen these before. Portals look interesting, though they might be hard to implement efficiently - you may need to track versions/concurrency with vector clocks, which get large quickly.

What do you think about deletes?

Seems like there are two ways:

  1. Insert a new item to indicate characters are removed from its parentItem
  2. Update and tombstone the items themselves. Splitting into sub-items as needed.

The latter seems nice given previously inserted characters can fully be scrubbed out of the document.

Yes, I think you can make deletes work using "tombstone sub-items", represented as rows with content=NULL (or similar).

  • When you delete a single element, split its sub-item as needed into sub-items that are either purely deleted (tombstone sub-items) or purely present (normal sub-items). E.g.:
    Starting state:
    (itemId="Alice724", index=6, content="Hey you")
    
    After deleting the 2nd "y":
    (itemId="Alice724", index=3, content="Hey ")
    (itemId="Alice724", index=4, content=NULL)
    (itemId="Alice724", index=6, content="ou")
    
  • Merging updates starts normally: union the rows (skipping duplicate primary keys) and trim contents so they don't overlap. Next, for each primary key (itemId, lastIndex) whose content is NULL in either source table, set its content to NULL in the merged table.
    • I believe this will work even if you have "overlapping" splits. E.g.:
      Alice:
      (itemId="Alice724", index=6, content="Hey you")
      (itemId="Alice724", index=12, content=NULL) # Tombstone for " there"
      
      Bob:
      (itemId="Alice724", index=3, content="Hey ") # Split due to a child item (not shown)
      (itemId="Alice724", index=12, content="you there)
      
      Merge Alice into Bob, normal pass:
      (itemId="Alice724", index=3, content="Hey ")
      (itemId="Alice724", index=6, content="you")
      (itemId="Alice724", index=12, content=" there")
      
      Next do tombstone pass:
      (itemId="Alice724", index=3, content="Hey ")
      (itemId="Alice724", index=6, content="you")
      (itemId="Alice724", index=12, content=NULL)
      
      Merging Bob into Alice gives the same result.
      
  • Deleting chars one at a time will initially create a separate tombstone sub-item for each char. But adjacent deletions within the same item are logically equivalent to one big tombstone sub-item. You are free to "coalesce" these: delete the non-rightmost tombstone sub-items. Except, don't delete any sub-items that have children (see the SQL sketch after this list). E.g.:
    Result of deleting "you" one char at a time from (itemId="Alice724", index=6, content="Hey you"):
    (itemId="Alice724", index=3, content="Hey ")
    (itemId="Alice724", index=4, content=NULL)
    (itemId="Alice724", index=5, content=NULL)
    (itemId="Alice724", index=6, content=NULL)
    
    Coalesced version:
    (itemId="Alice724", index=3, content="Hey ")
    (itemId="Alice724", index=6, content=NULL)
    
    Coalesced version if (itemId="Alice724", index=5) was referenced elsewhere as a parent:
    (itemId="Alice724", index=3, content="Hey ")
    (itemId="Alice724", index=5, content=NULL)
    (itemId="Alice724", index=6, content=NULL)
    
    I believe it is okay to do coalescing at-will: eagerly, lazily, whenever. Merges should give some correct result whether or not the inputs are coalesced, and of course you can coalesce the merged state at-will too. (Have not checked this thoroughly.)
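
A rough SQL sketch of the tombstone pass and the coalescing rule (untested, and reusing the hypothetical incoming table from the merge sketch above):

-- Tombstone pass: if the incoming table has NULL content for a primary key,
-- the merged row becomes NULL too (rows already NULL locally stay NULL).
UPDATE fugue
SET content = NULL
WHERE EXISTS (
  SELECT 1 FROM incoming
  WHERE incoming.itemId = fugue.itemId
    AND incoming."index" = fugue."index"
    AND incoming.content IS NULL);

-- Coalescing: drop a tombstone whose next same-item sub-item is also a
-- tombstone, unless some sub-item references it as a parent.
DELETE FROM fugue
WHERE content IS NULL
  AND (SELECT nxt.content IS NULL
         FROM fugue AS nxt
        WHERE nxt.itemId = fugue.itemId AND nxt."index" > fugue."index"
        ORDER BY nxt."index" LIMIT 1)
  AND NOT EXISTS (
    SELECT 1 FROM fugue AS child
     WHERE child.parentItemId = fugue.itemId
       AND child.parentIndex = fugue."index");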

Thanks @mweidner037. This has been super helpful.

Closing this as I think enough research has been done and I'm ready to start implementing -- #323