rocicorp/replicache-old

Feature Request: JSONPatch for ClientView

Closed this issue · 3 comments

Problem

Replicache's basic model, in which the client view always returns a full snapshot, is easy to implement and easy to get right.

But developers commonly find that constantly re-requesting the entire snapshot from the authoritative store is too expensive. For example, consider an application built against a third-party API like Google Calendar. Fetching all calendar data for a user every time the client view is queried could quickly exhaust usage limits against the API, and it also limits the responsiveness of sync.

Proposal

Allow developers to respond to a client view request with a JSON patch. We apply the patch against the existing client view to yield the result.
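To make the proposal concrete, here is a minimal sketch of applying a patch to the previous client view. It assumes the patch follows RFC 6902 and only handles the add, replace, and remove operations; the function name `apply_patch` and the calendar-shaped sample data are hypothetical, not part of Replicache.

```python
import copy


def _unescape(token):
    # RFC 6901 escaping: "~1" decodes to "/" and "~0" decodes to "~".
    return token.replace("~1", "/").replace("~0", "~")


def apply_patch(view, patch):
    """Return a new client view with the add/replace/remove ops applied."""
    result = copy.deepcopy(view)  # never mutate the previous view
    for op in patch:
        tokens = [_unescape(t) for t in op["path"].split("/")[1:]]
        if not tokens:
            raise ValueError("patching the root is not supported in this sketch")
        # Walk down to the container that holds the target.
        parent = result
        for t in tokens[:-1]:
            parent = parent[int(t)] if isinstance(parent, list) else parent[t]
        last, kind = tokens[-1], op["op"]
        if isinstance(parent, list):
            if kind == "add":
                index = len(parent) if last == "-" else int(last)
                parent.insert(index, op["value"])
            elif kind == "replace":
                parent[int(last)] = op["value"]
            elif kind == "remove":
                del parent[int(last)]
            else:
                raise ValueError("unsupported op: " + kind)
        else:
            if kind == "add":
                parent[last] = op["value"]
            elif kind == "replace":
                if last not in parent:
                    raise KeyError(last)  # replace requires an existing target
                parent[last] = op["value"]
            elif kind == "remove":
                del parent[last]
            else:
                raise ValueError("unsupported op: " + kind)
    return result


# Hypothetical previous client view and the patch a developer might return.
previous_view = {"events": {"e1": {"title": "Standup"}}, "user": {"name": "Ann"}}
patch = [
    {"op": "add", "path": "/events/e2", "value": {"title": "Lunch"}},
    {"op": "replace", "path": "/user/name", "value": "Anna"},
    {"op": "remove", "path": "/events/e1"},
]
next_view = apply_patch(previous_view, patch)
```

The developer only has to send the three operations rather than the whole calendar, but the result is only correct if the patch was computed against exactly the view the client already holds.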

Concerns

This makes it easier for developers to create incorrect client views:

  • The developer must correctly store information associated with each state and carefully compute a correct patch.
  • If the schema of the client view changes, developers must remember to send a snapshot or the appropriate wider-ranging diff.
  • Being able to store small bits of data in the client view as in #74 can make this easier, but it is still much more difficult to get right than the snapshot-based client view.

Alternative

Developers can also cache the last known snapshot on their side (e.g., in memory), then fetch only changes from their backend resources, apply those changes to the cached snapshot, and return the updated snapshot to Replicache. However, at that point the developer is duplicating some of the work that diff-server is doing.

Out of band we have been discussing the difficulty of keeping a checksum of the map in the data layer in a way that doesn't require the data layer to store all the data. A merkle-izing approach seems promising, but it might require the data layer to store some state for each place in the "tree" of JSON values where something could be patched. JSON patch lets you make changes in lots of places, including the middle of an array, so that's potentially a lot of places where things could change. It might help if we could constrain the patch points to be named object fields/properties. That is, require the patch path to end in a string, e.g. for:

{
   "foo": [ { }, { "key": true } ]
}

we would allow you to patch /foo/1/key but not /foo/1 or /foo/-.

Is this a reasonable constraint?
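One way to picture the constraint is as a check run before a patch operation is accepted. The sketch below is an assumption about how such a check could look, not an existing Replicache API; `is_allowed_patch_path` is a hypothetical name. It needs the current document, because a final token like "1" could be either an array index or an object key.

```python
def is_allowed_patch_path(doc, path):
    """True if the final token of `path` names a field of a JSON object in `doc`."""
    if not path.startswith("/"):
        return False
    # RFC 6901 escaping: "~1" decodes to "/" and "~0" decodes to "~".
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    parent = doc
    # Walk to the container holding the final token.
    for t in tokens[:-1]:
        if isinstance(parent, list):
            if not (t.isascii() and t.isdigit()) or int(t) >= len(parent):
                return False
            parent = parent[int(t)]
        elif isinstance(parent, dict):
            if t not in parent:
                return False
            parent = parent[t]
        else:
            return False  # cannot descend into a scalar
    # Allowed only when the container is an object, so the last token is a field name.
    return isinstance(parent, dict)


doc = {"foo": [{}, {"key": True}]}
allowed = is_allowed_patch_path(doc, "/foo/1/key")      # field of an object
array_slot = is_allowed_patch_path(doc, "/foo/1")       # element of an array
array_append = is_allowed_patch_path(doc, "/foo/-")     # append to an array
```

Under this reading, adding a new field such as /foo/0/new is also allowed, since the container at /foo/0 is an object.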

In the scheme I'm contemplating, for each "patch point" (a field where we might patch data in) we have to keep a record of its path and checksum, plus the checksums of each step up the path to the root. So if we want to patch at /foo/1/bar/3/baz, we (the data layer, say) would have to keep these records:

{
   "/": <4 byte value>,
   "/foo": <4 byte value>,
   "/foo/1": <4 byte value>,
   "/foo/1/bar": <4 byte value>,
   "/foo/1/bar/3": <4 byte value>,
   "/foo/1/bar/3/baz": <4 byte value>
}

This could get pretty expensive if there are a lot of patch points.
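To get a rough feel for the cost, the sketch below lists the records needed for a patch point and counts checksum bytes across several patch points. The function names are hypothetical, the 4-byte size comes from the listing above, and the estimate ignores the bytes needed to store the paths themselves.

```python
def checksum_record_paths(patch_point):
    """List every path from the root down to the patch point, inclusive."""
    tokens = patch_point.split("/")[1:]
    paths = ["/"]  # the root, written "/" as in the listing above
    for i in range(1, len(tokens) + 1):
        paths.append("/" + "/".join(tokens[:i]))
    return paths


def total_record_bytes(patch_points, checksum_size=4):
    """Checksum bytes needed when records for shared prefixes are stored once."""
    unique = set()
    for point in patch_points:
        unique.update(checksum_record_paths(point))
    return len(unique) * checksum_size


records = checksum_record_paths("/foo/1/bar/3/baz")
# Two sibling patch points share five of their six records.
shared = total_record_bytes(["/foo/1/bar/3/baz", "/foo/1/bar/3/qux"])
```

Sibling patch points share most of their ancestors, so the cost grows with the number of distinct nodes on the paths rather than with depth times the number of patch points. Widely scattered patch points share little and cost correspondingly more.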

In an ideal world they could declare their patch points in the data itself, eg with a naming convention or special property. Or maybe we could limit the depth of the points at which you can patch? Hmmm.

Proposal for now to do incremental checksumming with field level patching: rocicorp/repc#207
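As a rough illustration of what incremental checksumming over field-level patches could look like, here is a sketch that keeps one 4-byte checksum per patched field and combines them with XOR. This is an assumption for illustration only and not necessarily what rocicorp/repc#207 proposes; CRC32, the XOR combination, and the names `entry_checksum` and `FieldChecksums` are all choices made here.

```python
import json
import zlib


def entry_checksum(path, value):
    """4-byte checksum of one (path, value) pair using a canonical JSON encoding."""
    encoded = json.dumps([path, value], sort_keys=True, separators=(",", ":"))
    return zlib.crc32(encoded.encode("utf-8"))


class FieldChecksums:
    """Tracks an overall checksum without storing the field values themselves."""

    def __init__(self):
        self.by_path = {}  # path -> checksum of the current value at that path
        self.total = 0     # XOR of all per-field checksums

    def patch(self, path, value):
        new = entry_checksum(path, value)
        # XOR out the old contribution (0 if the field is new), XOR in the new one.
        self.total ^= self.by_path.get(path, 0) ^ new
        self.by_path[path] = new

    def remove(self, path):
        self.total ^= self.by_path.pop(path)


# Applying patches one at a time...
a = FieldChecksums()
a.patch("/foo/1/key", True)
a.patch("/user/name", "Ann")
a.patch("/user/name", "Anna")
a.remove("/foo/1/key")

# ...yields the same total as checksumming the final state directly.
b = FieldChecksums()
b.patch("/user/name", "Anna")
```

Each patch costs one checksum computation regardless of how large the client view is, and the data layer keeps only a path and four bytes per field. XOR of CRC32 values is order-independent but weak against collisions, so a real design would likely want a stronger combination.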