mattkrick/meatier

Wishlist feature: offline-first

Opened this issue · 29 comments

See React.js Conf 2016 - Aditya Punjani - Building a Progressive Web App.

This requires some careful scaffolding that can easily be reused between apps, so it makes a prime candidate for meatier inclusion IMHO.

Basically requires:

  • Service worker that handles caching and maybe notifications
  • Encryption
  • Redux state serialization?
  • Forgiving server access layer in the app?
  • Also, best practice: show a "loading" view where appropriate. 0 items vs no-items-yet, placeholder views for photos and text, that kind of thing.

For service workers, see https://changelog.com/essential-reading-list-for-getting-started-with-service-workers/.
For encryption, there is already #80.

The big advantage of offline-first is native-like loading speed:
[image: page-load speed comparison]
("SW + App Shells" is Service Worker + pre-rendered-and-cached page views that look the same for all visitors. Basically, don't put user-specific data in pre-rendered views. Perhaps a result cache based on URL?)

A PR does infinitely more good than a +1 😉

I understand the sentiment, but I'd argue a +1 demonstrates interest for others to do PRs as well.


Yeah... that's what's broken in a bunch of OSS. There's no white knight going around making PRs on the issues that have the most "+1"s, awesome as that sounds. As a positive example, look at Linux or Node.js: someone needs a feature, their company sponsors them to write the PR, others validate it, and you've got a healthy feature added.

I've got no problems if someone files a feature request as an issue, but saying "+1" tells me the feature is not desirable enough to spend your own time & brain juice on it, which ultimately tells me it's not that important. Even if this project were venture backed & looking to prioritize a next sprint, I'd personally select the next sprint based on the issue with the most thoughtful comments. We're all in this thing together. That's what makes this a community 😄

What's missing is a "star" or a more integrated +1 feature, then tools to show the most requested features of all projects in aggregate. I do agree with most of your sentiment.


This probably also needs service worker invalidation mechanics. Getting rid of a borked service worker can be really painful. A well-tested service worker wrapper would go a long way to making sure we can always bypass/remove the service worker if necessary.
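
For reference, the escape hatch itself is small; a sketch of a bypass that unregisters everything (the wrapper would decide when to call it):

```js
// Remove all service worker registrations so the next load bypasses them.
function removeAllServiceWorkers() {
  if (!('serviceWorker' in navigator)) return Promise.resolve();
  return navigator.serviceWorker.getRegistrations()
    .then((regs) => Promise.all(regs.map((reg) => reg.unregister())));
}
```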

The includeStates option on RethinkDB changefeeds makes it easy to set a loading view.
I assume auth state would need to be cached as well.
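
A sketch with the official rethinkdb JS driver: includeStates emits {state: 'initializing'} and {state: 'ready'} markers around the initial results, which is enough to flip a loading flag (the table name and dispatched actions are hypothetical):

```js
import r from 'rethinkdb';

async function watchLanes(dispatch) {
  const conn = await r.connect({host: 'localhost', port: 28015});
  const cursor = await r.table('lanes')
    .changes({includeInitial: true, includeStates: true})
    .run(conn);
  cursor.each((err, change) => {
    if (err) throw err;
    if (change.state === 'initializing') dispatch({type: 'SET_LOADING', payload: true});
    else if (change.state === 'ready') dispatch({type: 'SET_LOADING', payload: false});
    else dispatch({type: 'UPSERT_LANE', payload: change.new_val});
  });
}
```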

offline-plugin seems to work okay for react-boilerplate. Would a PR with something like that work?

@wtgtybhertgeghgtwtg I've seen that, but haven't played with it much. I think it would work, but ultimately it'd be nice to arrive at a solution like what's described here: www.pocketjavascript.com/blog/2015/11/23/introducing-pokedex-org where it sends a toast when you're offline, etc. Not sure if that's possible with the webpack plugin or if more work is required in sw.js. I've still got a stack of offline-first books I need to read 😄
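
The toast part, at least, is easy to sketch independent of the webpack plugin, using the browser's online/offline events (store is the redux store; showToast is a hypothetical action creator):

```js
window.addEventListener('offline', () => {
  store.dispatch(showToast('You are offline. Changes will sync when you reconnect.'));
});
window.addEventListener('online', () => {
  store.dispatch(showToast('Back online!'));
});
```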

On that note, have you given any consideration to taking a progressive web app approach?

@mattkrick The Pokedex.org example is awesome, but it's really showcasing the multi-master replication of CouchDB, using PouchDB to synchronize a local database (the better adapter is selected depending on the browser) from a remote CouchDB, handling possible conflicts based on the documents' _rev.

I agree that this "progressive" webapp approach is basically the better user experience, and it would be great to have something similar with RethinkDB.

However, how would the offline-plugin or sw.js solve the data querying problem? As I understand it, they wouldn't, right?

I guess the offline cache should be something aware of how GraphQL queries work, so would it be something to solve in the cashay project?

You'd probably have to sync Rethink with a local IndexedDB or something. I was under the impression that the Service Worker would just handle the shell or static content.

Resumable changefeeds are being tracked in rethinkdb/rethinkdb#3471

A graphql+crdt example could be neat. If there's interest I can contribute one.

I haven't actually looked at the cache-control headers on the webpack chunks. If they're not being fingerprinted and set to a long-term expiry, we should do that first.

I've had some good chats about data expiration recently (gah, that sounds nerdy). There is some data that can be long-lived, such as a list of countries. Typically, long-lived stuff won't come in through a changefeed. Other stuff, say a list of Kanban lanes, will probably be invalidated on every visit. I think it's the job of the cache to provide a timestamp (eg receivedAt), but it's the job of the service worker or something near to it to provide the expiry logic (if Table = Countries, then expiry = receivedAt + 100 days).
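
A sketch of that split, with illustrative table names and TTLs: the cache only stamps receivedAt, and the expiry rules live elsewhere:

```js
// Expiry rules keyed by table; the cache only provides receivedAt.
const TTL_BY_TABLE = {
  Countries: 100 * 24 * 60 * 60 * 1000, // long-lived reference data
  Lanes: 0                              // invalidated on every visit
};

const isStale = (entry) =>
  Date.now() - entry.receivedAt > (TTL_BY_TABLE[entry.table] || 0);
```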

On the cache side, GraphQL actually keeps us from being cache-efficient. That's because if I request getTop5Posts, I don't know what posts it will send back. It may be that I already have all 5 posts stored locally, but I don't know that they're the top 5. For this reason, it's generally not useful to store frequently invalidated data, because when we refresh the page we'll have to rerun all those queries to make sure they're accurate. The doomsday scenario is us writing buggy code, the client refreshing the page to try to fix the bug, and the bug being stored locally so we just give 'em the bug again.

@wenzowski I'm really curious about your use of graphql + crdt. What's it look like?? The only time I've used CRDTs is with swarm.js, and it's not document-based & frankly I'm not sure how to make it document-based. That said, I'd love to build a client cache that supports infrequent queries, frequent document updates (subscriptions), and frequent collaborative changes (CmRDT). I still dunno what that'd look like...

If you want to sync arbitrary documents over a high-latency/interruptible network connection, I highly recommend ShareJS. In another (experimental) app, I have a few Riak buckets providing special fields for a few object types, where each field has its own set of mutation methods.

That doesn't sound like GraphQL preventing us from being cache-efficient, but rather an application concern that prevents it. If you were to, say, getUserById, you have some reasonable expectation that subsequent fetches will return the same user object, but no guarantee that the user won't have updated their attributes. Avatars, for instance, are much more volatile than usernames. One solution to this is documented in the DNS RR format: assign a TTL value to each field, indicating a refetch frequency.

Particularly long-lived data (like a country list) could easily be compiled into webpack chunks. Speaking of which, PR for maxAge coming up.

@wenzowski wow, it's like you're in my head... so here's the thing with getUserById. That GraphQL query supposedly returns the document from the DB, but we don't know if it does anything else to it (eg divide a field by 2). So, even if I cache getTop5Posts and 1 of those is Post:123, if I call getPostById(id:'123'), there's no way I can know locally that it means "get the local item in the Post collection with id == 123". I could build that functionality into a cache, but then it'd still be up to the user to say getTop5PostIds and make a 2nd call saying getPostsByIds(['123','124','125'...]). Basically, we'd need a way to say "hey, getPostsByIds is a special query that only returns documents with those Ids"...
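
A sketch of what that "special query" hint might look like on the client (everything here is hypothetical, not cashay API):

```js
// Declare which queries are pure by-id lookups.
const cacheHints = {
  getPostsByIds: {type: 'Post', ids: (args) => args.ids}
};

// Serve locally only if every requested doc is already in the store.
function resolveLocally(queryName, args, store) {
  const hint = cacheHints[queryName];
  if (!hint) return null; // unknown query shape: must hit the server
  const table = store[hint.type] || {};
  const docs = hint.ids(args).map((id) => table[id]);
  return docs.every(Boolean) ? docs : null;
}
```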

Query-level TTLs came up in facebook/relay#720. Possibly elsewhere?

If we were to define reasonable field-level cache expectations with a TTL, then perhaps these could go directly in the schema, allowing the graphql http server to correctly set maxAge to the lowest field value and enabling the websocket server to provide an equivalent mechanism.

This would allow a key-value cache as appears in relay docs to be as consistent as specified by the schema.
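
A sketch of the idea with illustrative numbers: per-field TTLs live next to the schema, and the HTTP layer sets max-age to the lowest TTL among the requested fields:

```js
// Per-field TTLs in seconds (illustrative values).
const userFieldTTLs = {
  id: Infinity,
  username: 24 * 60 * 60, // usernames change rarely
  avatarUrl: 5 * 60       // avatars are far more volatile
};

// Cache-Control: max-age for a query is bounded by its most volatile field.
const maxAgeFor = (requestedFields) =>
  Math.min(...requestedFields.map((f) => userFieldTTLs[f] || 0));

maxAgeFor(['username', 'avatarUrl']); // 300
```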

Ehhh, I think that's getting too Apollo-ish. The Meteor folks are solving this (I think) by setting up an invalidation server. Personally, I think we should keep as much logic on the client as possible & detached from the data & data-transport layer.

It'd be amazing if Firefox had an equivalent to Chrome's console.memory, but without it we could still run an invalidation check every, eg, 5 minutes. After that expiration, every Cashay listener that redux calls will get a true flag. Then, we just roll through the false ones and delete them.

The same logic holds true for a TTL. After 5 mins, roll through each document, make a queue of queries to invalidate, and then do a refresh. I'd keep it at the document level instead of the field level to stay performant, but either way the logic is dead simple.
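
A sketch of that sweep; cache and refetch are hypothetical stand-ins for cashay internals:

```js
const SWEEP_INTERVAL = 5 * 60 * 1000;

setInterval(() => {
  const now = Date.now();
  const staleQueries = new Set();
  // Document-level check: queue every query that sourced an expired doc.
  for (const doc of cache.allDocuments()) {
    if (now - doc.receivedAt > doc.ttl) {
      doc.sourceQueries.forEach((q) => staleQueries.add(q));
    }
  }
  staleQueries.forEach((q) => refetch(q));
}, SWEEP_INTERVAL);
```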

Skipping over two hard problems, your comment reminds me that there are a few gems mixed in with The State of Meteor Part 1 and Part 2 should you wish to explore the rabbit hole.

I firmly believe mapping GraphQL queries to TTLs is the right way to go, and I hope you'll permit me a brief time warp to explain why I suspect this is the case.

The ANSI-SPARC three-level architecture has had a lasting impact on both database design and, by extension, data-driven document generation.

[image: ANSI-SPARC three-level architecture diagram, via upload.wikimedia.org]

ANSI/SPARC Database Architecture

  1. the external or user view which is concerned with the way data is viewed by end users,
  2. the conceptual or community view which amalgamates diverse external views into a consistent and unified composite, and
  3. the internal or implementation view which is concerned with the way that data is actually stored.

(Sandhu '94)

If we were to describe GraphQL/Relay in these terms, its role is both to define a set of composable, (mostly) immutable conceptual schemata and to handle mappings between the external and internal views, decoupling both.

The magic of Meteor is in seamlessly synchronizing document state between users, correctly piping changes that occur internally (mongo documents) to external observers (loaded html documents) by way of DDP.

With GraphQL, each client only ever sees an external representation (its fetched/subscribed document) yet cache invalidation happens on the client based on remote server-side changes to the internal representation (rethinkdb documents in this case). If we are able to rewind each client's graphql subscription to the point where that client lost connectivity by going offline and replay all remote changes, resolving conflicts or generating siblings for future resolution, then we can cache all subscribed queries indefinitely and throw out the concept of a validity threshold. If we're talking about subscribing to absolutely everything like, say, derby does, and are talking about caching GraphQL fetches, then I think perhaps the TTL route is necessary.

Going with a bubble-up TTL approach would allow developers some kind of knowledge about necessary propagation delays that modifications to the internal model will inherently be subjected to: beyond the TTL threshold an offline client will have purged the stale data and will be forced to reconnect before taking any action that relies on it. Without this foreknowledge, I think we open ourselves up to production-only heisenbugs.

Given the nested nature of GraphQL schema definitions, I would suspect that operating at the field-level is necessary even if only one TTL is allowed per GraphQLObjectType: a field could easily be a GraphQLList of a different GraphQLObjectType that requires a lower TTL than the parent collection we are requesting it by, and thus queries to the same collection which request different fields could easily require differing TTL settings.

The best frameworks are in my opinion extracted, not envisioned.
(@dhh '07)

I think I have a need for the TTL mechanics in an app I'm working on. If this turns out to be the case, I'll extract.

I like where this is headed, but one thing bothers me:

If we are able to rewind each client's graphql subscription to the point where that client lost connectivity by going offline and replay all remote changes

We don't know exactly when they go offline. For example, meatier has a heartbeat every 15 seconds; DDP is similar. Lost connectivity shorter than the heartbeat interval means we can't be guaranteed that the document made it to the client (unless we use durable messaging, but there goes our scaling).

If you put a TTL on every rootObject (stripping away the non-nulls & the Lists), and invalidate a single one, you still don't have a way to refetch that particular doc unless the client provides you with a function to do so. Basically, for every GraphQLObjectType the client cache would have to be given a getXById function that it could call. And then, how do we know how much of the object to get? The fields pulled should not depend on what's currently in the cached store, but what's at the view layer (eg if they visited a data-intensive page yesterday, why bother refetching those queries today in a non-lazy fashion?).
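
A sketch of that registry; the query strings and cashay.query call are illustrative, not real API:

```js
// One refetch function per GraphQLObjectType, registered by the developer.
const refetchers = {
  Post: (id, fields) => cashay.query(`{getPostById(id: "${id}") {${fields.join(' ')}}}`),
  User: (id, fields) => cashay.query(`{getUserById(id: "${id}") {${fields.join(' ')}}}`)
};

// Refetch only the fields the view layer currently needs,
// not whatever happens to be sitting in the cached store.
const refreshDoc = (type, id, fieldsInView) => refetchers[type](id, fieldsInView);
```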

I think we should split this issue in two: full offline support is different from applying progressive web application techniques to combat latency.

We're never going to know exactly when any event happens. I'd like to estimate clock offset as part of socket initiation, but haven't opened an issue for that yet. If we have a static object order then the server doesn't need to know which messages have reached the client: the client can advertise the last object received through an additional parameter when it reconnects.

Basically, for every GraphQLObjectType the client cache would have to be given a getXById function that it could call.

The relay way leverages globally unique object IDs, something this app already has, and a Node interface accessible by the node field on the root query type to allow refetches of any object that implements Node.
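
For reference, a sketch with graphql-relay's global ID helpers: the opaque ID round-trips the type, so a single node(id) field can refetch anything without scanning (the table-per-type mapping is an assumption):

```js
import {toGlobalId, fromGlobalId} from 'graphql-relay';
import r from 'rethinkdb';

const gid = toGlobalId('Post', '123'); // "UG9zdDoxMjM=" (base64 of "Post:123")

// node(id) resolver: decode the type and go straight to the right table.
const resolveNode = (globalId, conn) => {
  const {type, id} = fromGlobalId(globalId);
  return r.table(type).get(id).run(conn); // assumes table name === type name
};
```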

Thinking about subscribe, if we're to implement something better than random-writes-are-lost (last-write-wins) for concurrent object updates then we need to ship around the mutation itself, not just the results from the server's perspective of applying the mutation.

@wenzowski I considered adding a clock offset to the socket handshake protocol, but we decided against it. At the end of the day, it's still a heuristic & that latency is going to change, especially on a mobile connection.

WRT the relay way, that GUID contains an opaque Type, which essentially tells it what database table to use. Even though I'm using UUIDs, it'd still be wildly inefficient to scan each table until I get a hit. Ideally, I'd have a client cache that sends in the Type + ID (obfuscated or not). The problem here is deciding how much we should dictate what the developer does to their server. Relay makes you change your entire server & client, and that heavy all-or-nothing cost is IMO why not many have adopted it.

I think for 95% of use cases, last write wins is perfectly fine. For those other 5%, it's best not to rely on a single source of truth (ie use CmRDT). I imagine carving out a separate piece of state for CmRDT and letting something like swarm.js do the heavy lifting. That way, we have document-level changes for regular stuff, and... I dunno what to call it, field-location-level changes? for the collaboration pieces.

Regarding clock offset and mobile jitter causing significant variance: an offset prevents client timestamps from being off by hours (client timezone set incorrectly or never ntp synced). Cold start over Edge vs established socket puts RTT variance in the range of ~4s. Yes, network instability could make that worse, though I think it could be useful to have at least minute-level consistency in client-generated timestamps.
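
A sketch of the estimate during socket init, assuming symmetric latency (the 'time' event, ack callback, and dispatched action are illustrative, socket.io-style):

```js
const t0 = Date.now();
socket.emit('time', null, (serverNow) => {
  const rtt = Date.now() - t0;
  // Assume the server stamped its clock roughly mid-flight.
  const offset = serverNow - (t0 + rtt / 2);
  // Error is bounded by ~rtt/2: even a 4s cold-start RTT keeps
  // client-generated timestamps consistent to within seconds.
  store.dispatch({type: 'SET_CLOCK_OFFSET', payload: offset});
});
```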

I really strongly dislike the "last write wins" term: in a carefully designed distributed system you can record causality yet still cannot necessarily determine operation order. If you cannot determine operation order, then you cannot determine "last" and your system is now very clearly a "some arbitrary writes lost" consistency mechanic.

So, I will agree that "some writes lost" is perfectly fine if your system is designed to accommodate this; if all your objects are immutable and your collections append-only, for instance, consistency can eventually be obtained by detecting and re-requesting the lost writes. If inconsistency is tolerable (streaming logs, for instance) then data loss is tolerable to your application-defined threshold. Going the CmRDT route can be useful for the particular field types that CmRDT operations can be defined upon, provided you need automatic merging. Another approach is to store conflicting writes as siblings.

The idea of sets of field-level mutators is exactly what I was expecting would be needed to support something akin to swarm.js and I agree that some mutations simply need to be emitted while online: they need to wait on acceptance by whatever the single source of truth is, and need to block certain additional operations until they're accepted.

The nice thing about building an example app is that we don't have to predict the future and guess the proportion of use cases that require one thing or the other; we can focus instead on the specific use cases in the example app.

Yes, in order to provide node(id: $id) there ought to be some map of id <=> type. A client cache makes sense to me. Can we think of a reason such a cache would be an implementation burden?

@wenzowski yeah, I think this convo is getting a little too abstract for me to add much value without a concrete example. Let's touch base Thursday.

Great! In terms of actually taking a stab at an offline-first architecture, shall we start with toast messages to indicate offline state?

Guys, I enjoyed your discussion, but I'm afraid I missed quite a bit. My simple question is: is it possible to bundle a Meatier app in Cordova/Electron and use a Service Worker to cache files client-side and do updates when files change on the server? Is this implemented and working out of the box, or is it possible just in theory? Personally, I'm not looking to cache any real data, just JS/CSS files.

I don't think service workers are meant for native implementations; you'd just push a new version of the app and patch it in, yeah?


I have a question about the plan for the offline feature: will it be built into meatier or ride on relay/cashay? How soon can we start to see an alpha version of this?

Yeah, this absolutely will exist in the application layer (ie meatier).
The reason is that there are a lot of things you DON'T want to persist (the router, socket connections, subscription streams stored inside cashay, etc).
To do so, I use a fork of redux-storage. It persists the redux store any time a cashay action is dispatched:
https://github.com/ParabolInc/action/blob/master/src/client/makeStore.js#L10-L19

Then, when I rehydrate from localStorage/localForage, I only include certain reducers:
https://github.com/ParabolInc/action/blob/master/src/universal/redux/storageMerger.js#L5
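
For anyone wanting to approximate this with vanilla redux-storage instead of the fork, a sketch (the storage key, reducer names, and whitelisted action type are illustrative):

```js
import {createStore, combineReducers, applyMiddleware} from 'redux';
import * as storage from 'redux-storage';
import createEngine from 'redux-storage-engine-localstorage';
import filter from 'redux-storage-decorator-filter';

// Hypothetical reducer map; only `cashay` and `auth` get persisted.
const reducers = {cashay: cashayReducer, auth: authReducer, routing: routingReducer};

// Persist only safe slices: never the router, socket connections,
// or live subscription streams.
const engine = filter(createEngine('meatier-store'), ['cashay', 'auth']);

// Wrap the root reducer so LOAD actions can rehydrate state, and only
// write to storage when a whitelisted (cashay) action is dispatched.
const reducer = storage.reducer(combineReducers(reducers));
const middleware = storage.createMiddleware(engine, [], ['@@cashay/INSERT_QUERY']);
const store = createStore(reducer, applyMiddleware(middleware));

storage.createLoader(engine)(store); // rehydrate on boot
```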

Service workers will also live in the application layer; I'll get to them after we get an MVP out the door (and seed $$ from investors 😄)