How to avoid re-fetching objects after syncing changes?

Question

judofyr opened this issue 4 years ago · 5 comments

Here's a scenario I've been wondering about:

The client is currently synced up to state=s1 for an object type.
Something triggers the client to fetch a new object (with ID o1). The response includes state=s5.
The client needs to sync so it invokes /changes with sinceState=s1 to get all of the changes. The response returns oldState=s1, newState=s7 and o1 is marked as created.

We know that o1 was created between s1 and s7, and we already have the object at state s5, but we don't know if the object has changed between s5 and s7.

What's the recommended way of dealing with this?

Some alternatives I've been thinking about:

We keep the object, and once /changes returns we invalidate it if the newState from /changes doesn't match the state from /get. The idea is that the state is unlikely to change between receiving the /get response and invoking /changes. However, this will always invalidate the object if the server decides to use the fancy "serve newest changes first" approach, because newState will then always be a different state.
We can make two calls to /changes, one from s1 and one from s5, and then subtract one result from the other. The disadvantages are that the two responses might contain a lot of duplicate data, and that the subtraction might not even be possible with a single call each if they return hasMoreChanges.

Thoughts? Am I overcomplicating this issue?

Answer 1 · 2020-04-28T12:12:42.000Z

I would do something like your first option. If /changes returns o9 as updated and o1 as created (say), I would just invalidate both if they are cached in my store. Most of the time, anything in created will not be cached (especially if you are using push; you are most likely to get a push, call /changes and possibly fetch the created in the same call using back references, but the objects will still be returned after the /changes in the call), so there's not much to gain from trying to do something more clever.
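The "same call using back references" idea mentioned above could look roughly like the following sketch. It assumes the object type is Email and uses the JMAP core and mail capability URNs; `buildSyncRequest` and the call ids `c0`/`c1` are hypothetical names chosen for illustration, and the code only builds the request body rather than sending it.

```typescript
// One JMAP method call: [method name, arguments, client-chosen call id].
type Invocation = [string, Record<string, unknown>, string];

interface JmapRequest {
  using: string[];
  methodCalls: Invocation[];
}

// Hypothetical helper: asks for the changes since `sinceState` and, in the
// same request, fetches every object that the first call reports as created.
function buildSyncRequest(accountId: string, sinceState: string): JmapRequest {
  return {
    using: ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"],
    methodCalls: [
      ["Email/changes", { accountId, sinceState }, "c0"],
      [
        "Email/get",
        {
          accountId,
          // Back reference: the server fills in `ids` from the "created"
          // array in the response to call "c0".
          "#ids": { resultOf: "c0", name: "Email/changes", path: "/created" },
        },
        "c1",
      ],
    ],
  };
}

const request = buildSyncRequest("a1", "s1");
console.log(JSON.stringify(request, null, 2));
```

Because the Email/get response comes after the Email/changes response in the same reply, the created objects are never older than the newState just received, which is why they rarely need invalidating.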

If you are keeping track of state strings on a per-object rather than per-type basis (I don't, because you don't really need to in general) you could skip invalidating it if the object's state string matches the newState of the /changes response, as you suggest.

Does that make sense?
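As a minimal sketch of the invalidation rule described in this answer: the response fields follow the /changes response shape, while `CachedObject`, `fetchedAtState` and `applyChanges` are hypothetical names, and the per-object state is an assumption that only applies if the client records the /get state next to each cached object.

```typescript
interface ChangesResponse {
  oldState: string;
  newState: string;
  hasMoreChanges: boolean;
  created: string[];
  updated: string[];
  destroyed: string[];
}

// Assumption: the client stores the state string returned by /get
// alongside each cached object.
interface CachedObject {
  fetchedAtState: string;
  data: unknown;
}

// Drops destroyed and stale objects from the cache and returns the ids of
// the invalidated objects, so the caller can refetch them in one /get.
function applyChanges(
  cache: Map<string, CachedObject>,
  changes: ChangesResponse,
): string[] {
  const toRefetch: string[] = [];
  for (const id of changes.destroyed) {
    cache.delete(id);
  }
  for (const id of changes.created.concat(changes.updated)) {
    const cached = cache.get(id);
    if (cached === undefined) continue; // never cached, nothing to invalidate
    // Optional optimisation: the object was fetched at exactly newState,
    // so it already reflects every change in this response.
    if (cached.fetchedAtState === changes.newState) continue;
    cache.delete(id);
    toRefetch.push(id);
  }
  return toRefetch;
}
```

With the scenario from the question (o1 fetched at s5, /changes returning newState s7), this returns o1 for refetching; the skip only helps when the /get state and newState happen to be equal.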

Answer 2 · 2020-04-28T14:27:24.000Z

> I would do something like your first option. If /changes returns o9 as updated and o1 as created (say), I would just invalidate both if they are cached in my store. Most of the time, anything in created will not be cached (especially if you are using push; you are most likely to get a push, call /changes and possibly fetch the created in the same call using back references, but the objects will still be returned after the /changes in the call), so there's not much to gain from trying to do something more clever.

The context here is a webapp where you have cached the objects in IndexedDB. If the user clicks on a link to an object that is not cached (e.g. /items/123), it would be preferable if I could kick off a separate /get request (potentially fetching more data using back references) to improve the first load, and then in the background also do the regular syncing of existing cached data (so that if the user clicks through to another page, it's fast). I'm aware that this isn't the use case JMAP was originally designed for, but it would be nice if there was a way to accomplish it.

> If you are keeping track of state strings on a per-object rather than per-type basis (I don't, because you don't really need to in general) you could skip invalidating it if the object's state string matches the newState of the /changes response, as you suggest.

Even if I have per-object state tracking, it's not possible to detect this today in the example I gave. The /changes call returns the differences between [s1, s7] and o1 is included there. I still don't know if it was changed after s5 (which is the object's latest known state). This gets a lot worse if the server uses the "return recent changes first" approach, since that creates intermediate states that match neither s5 nor s7.

Answer 3 · 2020-04-30T12:27:33.000Z

Yes, this sounds fine. It depends on the application of course, but for most it should be fine to fetch the object and use the cached data while you do the rest of the sync in the background.

> Even if I have per-object state tracking it's not possible to detect this today in the example I gave. The /changes call returns the differences between [s1, s7] and o1 is included there.

Sorry, I must not have been clear. I'm saying the common case is likely to be that the object from the /get was at s7 and your /changes call gives you [s1, s7]. In this case, you don't need to invalidate the object. But it's really only a minor optimisation: yes, you have to refetch any mutable properties and probably none have changed, but it's unlikely to be for many objects, and unless each one is particularly large or expensive to fetch it won't add much overhead.

Example.

Store has objects cached: o1, o2, o3 … o99 with state s1.
User clicks link with id o123.
App fetches o123 and gets the data in state s3.
App shows the data while in the background doing /changes from s1. The ultimate state is s5 and o123 is included as created in one of the pages of changes.
The app has to invalidate o123 as it may have been updated since it was fetched. However, refetching the mutable properties of one object is generally negligible, especially if you're bundling it in a request for updates to other invalidated objects.
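The example above can be traced with a small sketch that folds several pages of /changes responses into one list of ids to refetch. It is only an illustration under assumptions: `foldPages`, `ChangesPage` and `SyncResult` are hypothetical names, the pages are passed in as plain data rather than fetched, and the intermediate state s4 is invented for the walk-through.

```typescript
interface ChangesPage {
  oldState: string;
  newState: string;
  hasMoreChanges: boolean;
  created: string[];
  updated: string[];
  destroyed: string[];
}

interface SyncResult {
  state: string;     // state reached after applying every page
  refetch: string[]; // cached ids to fetch again in one bundled /get
  removed: string[]; // cached ids that were destroyed
}

// Walks the pages in order and collects, for cached objects only, which ids
// were invalidated and which were destroyed.
function foldPages(
  cachedIds: Set<string>,
  startState: string,
  pages: ChangesPage[],
): SyncResult {
  const refetch = new Set<string>();
  const removed = new Set<string>();
  let state = startState;
  for (const page of pages) {
    if (page.oldState !== state) {
      throw new Error("expected a page starting at " + state);
    }
    for (const id of page.created.concat(page.updated)) {
      if (cachedIds.has(id)) refetch.add(id);
    }
    for (const id of page.destroyed) {
      if (cachedIds.has(id)) {
        removed.add(id);
        refetch.delete(id); // no point refetching a destroyed object
      }
    }
    state = page.newState;
  }
  return { state, refetch: Array.from(refetch), removed: Array.from(removed) };
}

// o123 was fetched on its own at s3 while the rest of the store is at s1.
const cached = new Set<string>(["o1", "o2", "o3", "o123"]);
const result = foldPages(cached, "s1", [
  { oldState: "s1", newState: "s4", hasMoreChanges: true,
    created: ["o123"], updated: ["o2"], destroyed: [] },
  { oldState: "s4", newState: "s5", hasMoreChanges: false,
    created: [], updated: [], destroyed: ["o3"] },
]);
console.log(result);
```

Here o123 ends up in the same refetch list as o2, so the extra cost of invalidating it is one more id in a /get the client was going to send anyway.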

Answer 4 · 2020-05-05T07:18:09.000Z

Then it seems like I've understood how JMAP works today. I have some ideas for slightly modified /changes semantics which could handle more complicated state changes, but I'll leave that for now and investigate it later if it turns out that over-fetching is a problem for my use case.

Answer 5 · 2020-05-05T07:20:57.000Z

Sure, always happy to hear more ideas. Generally, there's a trade-off in terms of extra state data having to be transferred with every request in order to potentially save some data in certain resync cases later. Whether that's worth it will depend on the use case to a certain extent, but we've tried to pick a broadly applicable model for the base JMAP standard.