Consider relying on eTags (or other headers) for service worker dependencies to check for updates
delapuente opened this issue · 57 comments
For the sake of modularization and isolation: could the update algorithm be improved to rely on the ETag / Content-Length / other headers sent by the server to decide when a service worker has changed? Right now we need to include a mark of change in the SW itself, and this forces a lot of developers to postprocess their service worker files.
If the request is made with `If-Modified-Since` or `If-None-Match`, and the response is 200, we could assume this is a new SW even if it's byte-identical. Although this would cause unnecessary service worker updates on servers that send out `ETag` or `Last-Modified` but don't correctly return 304.
Else we just add `Service-Worker-Force-Update: 1` or something.
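i.e. the server's response for the SW script would look something like this (the header name is just this thread's strawman, values illustrative):

```
HTTP/1.1 200 OK
Content-Type: application/javascript
Service-Worker-Force-Update: 1
```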
More than checking if the main SW file changed, I'm referring to its dependencies imported via `importScripts()`.
Yep, I understand that: you'd be sending an ETag (or Last-Modified) that represented the SW and its dependencies.
Maybe that's too hacky?
I think that undermines the purpose of the ETag, which is to digest the content being served. It would do the trick, but I think it's better to come up with something more pluggable into the current ecosystem that doesn't require hacking the semantics. ;)
Yeah, I think you're right. `Service-Worker-Force-Update: 1` would work.
I think we should have a JS API for this, `reg.update({force: true})` perhaps.
F2F: agreement on `serviceworker.skipWaiting()` and `reg.update({force: true})` - but may need to look at naming of `force`.
Lots of problems using etags for this, e.g. if a CDN stops serving etags for some reason.
Rough idea around `some-header-name: value`, where value is a digest like an ETag.
`reg.update({force: true})` will leave you with the existing worker if the update fails, as usual.
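For illustration, a page might call it like this (the `force` option is only the strawman from this thread, not a shipped API):

```js
navigator.serviceWorker.ready.then(reg => {
  // Re-fetch the worker script and install it even if it's byte-identical.
  return reg.update({ force: true });
});
```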
Was anyone able to dig up the previous proposal from @ehsan and I? I can't find it.
+1 for this
There should be a way to update the imported scripts even when the main code of the service worker is unchanged.
This is also a big issue for BaaS providers whose scripts are imported into their customers' service workers: I have described our situation more extensively in this blog post.
@jakearchibald Can you remind me why we don't just check importScripts() for byte changes as well? I seem to recall it was since importScripts could be called async way back when, but now we've made that throw. Since we only allow sync importScripts maybe we can just include it in the byte check.
See issue #639 for the importScripts discussion, which roughly concluded with "let's revisit this later".
Can you remind me why we don't just check importScripts() for byte changes as well?
It could mean a lot of network requests just for an update check. I was worried about that. But also it means a change to a third party script would result in a whole new SW install, which sounds a bit… invasive.
I like `reg.update({force: true})` as it gives us a script-land way to do Chrome's "update on reload", but maybe for third party scripts we need to revisit the idea of providing access to the cache the SW uses to store its scripts.
```js
importScripts('//example.com/whatever.js');

// then later…
self.registration.getCache().then(cache => {
  cache.add('//example.com/whatever.js');
});
```
My big worry here is security. We'd only want the SW to be able to access this, as giving access from pages would turn a minor XSS into a huge problem.
It could mean a lot of network requests just for an update check.
Sure, but this is a trade off sites can make for themselves. Do I want the convenience of structuring my code in separate files? Or do I want to minimize the number of SW script 304s my server has to send?
I could see smaller shops opting for developer convenience while huge sites compact everything into a single file to minimize server load.
But also it means a change to a third party script would result in a whole new SW install, which sounds a bit… invasive.
What I'm hearing is that this is what developers expect to happen and they are surprised when it doesn't.
I like reg.update({force: true}) as it gives us a script-land way to do Chrome's "update on reload"
I like this too, but I think it's a different use case. From what I can tell, developers want to compose their service worker scripts from decoupled sources and have things just update to the latest. Every step we add to get the updates to trigger creates friction and requires tighter coupling between modules.
At the very least it seems we could do this as an opt-in to `register()`. Something like `update-checks-imports: true`.
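Purely as a sketch, something along these lines (the option name here is invented for illustration, nothing is specced):

```js
// Hypothetical opt-in — neither the name nor the behaviour exists today.
navigator.serviceWorker.register('/sw.js', {
  updateChecksImports: true // also byte-compare scripts pulled in via importScripts()
});
```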
`update-checks-imports: true` is interesting. Or something called during the install event to set which scripts should be checked.
If I show a "please refresh for latest version" message when there's an update waiting, I'm not sure I'd want that just because Google Analytics or whoever had updated something.
Then again, third party services like Analytics will live in Foreign Fetch instead.
this is what developers expect to happen and they are surprised when it doesn't.
Exactly! IMHO the web is so great (and JavaScript is becoming pervasive) because of its simplicity. Please don't create a giant and complex monster: leave that to native apps. And please don't fall into premature optimization.
`update-checks-imports: true` is interesting
I agree. But I don't think that choice should be left to the user: what if they decline? Then they would never get updates and the scripts would eventually break.
I think that if you don't want all scripts to be refreshed by default, the choice should be left to the developer. For example: `importScripts('//example.com/whatever.js', { checkForUpdates: true });`. That way the developer can prevent the refresh for large files (like Analytics) and allow smaller, more useful scripts to be refreshed automatically.
update-checks-imports:true is interesting. Or something called during the install event to set which scripts should be checked.
I was thinking about this:

```js
// Makes the update algorithm byte-to-byte compare this dependency.
importScripts('//3rd-party.com/whatever.js', { forceUpdate: true });
```

This way, the developer can mark which dependencies trigger updates; the trade-off @wanderview was talking about is made explicit, and at the same time you can preserve file sanity via modularization.
We could make `forceUpdate` default to `true` (we'd have to change the current spec, but it would end up as a more predictable API) or `false` (preserving the current spec). Furthermore, if at some point the developer no longer wants a dependency to be part of the check, she simply flips the flag.
Having the option in `importScripts` makes real sense. Nice. Unfortunately it doesn't cover JavaScript modules so well.
But do we want to make a service-worker-specific `importScripts()` interface? This would not work if someone uses a library that then uses the existing `importScripts()` API internally. The top-level import would get updated but not any of its dependencies.
I think it would be better to put this on the install or activate event personally. It can then automatically apply to all `importScripts` calls, modules, or other future methods of bringing in script.
Sure. The thing I liked about the `importScripts` solution is it was at a resource level, but I'm sure we can achieve that via another API.
*pondering* If the API was something like `alsoCheckTheByteEqualityOfThese(requests)`, could you include things that you weren't using in modules and `importScripts`? That would enable you to have a single resource that echoed the version number. Dunno if that's useful.
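A rough sketch, purely hypothetical (assuming the method hangs off the registration and takes a list of URLs — none of this is specced):

```js
// Hypothetical API, name and shape taken from the musing above.
addEventListener('install', event => {
  // Include a resource in the byte-for-byte update check even though
  // nothing in this worker actually imports it, e.g. a tiny version file.
  self.registration.alsoCheckTheByteEqualityOfThese(['/sw-version.txt']);
});
```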
Sure. The thing I liked about the importScripts solution is it was at a resource level, but I'm sure we can achieve that via another API.
That was the idea.
But do we want to make a service worker specific importScripts() interface? This would not work if someone uses a library that then uses the existing importScripts() API internally.
Well, what I would find extremely uncomfortable is having to re-declare my imports for marking purposes only. Perhaps introduce a new import function (`importScriptForcingUpdate(...)`)?
Dealing with ES6 modules is complicated, but what about a pragma:

```js
import "my-library";
"force update";
import "my-other-library";
```

I don't really like it, and I don't really know if there is a standard mechanism to introduce "use strict"-like pragmas in ES6, but declarative APIs have this kind of inconvenience.
F2F:
- Should we check all imported scripts by default? Yes
- Check the flattened imported scripts & the main script; if any of them are byte-by-byte different, including !ok responses, trigger an update (where a !ok response will fail the update)
- The browser may optimise for this, e.g. if the main script has changed it doesn't need to check its imported scripts
- No opt-out of this
FWIW, as a service worker tooling author, I'm really happy about this.
`sw-precache` currently forces developers to use its output as a top-level service worker script, because it relies on the byte-by-byte check of the versioning information it includes inline to trigger the Install flow.
It sounds like after this change is implemented, developers can start pulling in the `sw-precache` output via `importScripts()` while maintaining the same ability to trigger the Install flow.
Same feedback as @jeffposnick: we have created some tools (gulp-sww), and right now we have to do nasty tricks (like creating an unused variable with a timestamp to ensure a change in the registered script).
This will be pretty useful :)
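For anyone curious, a minimal sketch of the kind of trick meant above (the variable name and value are illustrative, not what gulp-sww actually emits):

```js
// Injected at build time: the value changes on every build, so the registered
// script is never byte-identical and the browser's update check always fires.
const __BUILD_TIMESTAMP__ = '2016-09-21T10:42:00.000Z'; // unused, only forces a byte change

importScripts('app-sw-logic.js'); // the real worker code lives in the import
```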
I created a bug for this functionality at https://bugs.chromium.org/p/chromium/issues/detail?id=648295 in case anyone can pick that up on the Chromium side of things.
Heads up that I'm starting to look into implementing this in Chromium due to high demand. It'd be nice to have spec text.
We had an implementation question about this today actually.
Currently the spec allows `importScripts()` to run asynchronously up until the install event `waitUntil()` resolves. See:
- Step 2 of https://w3c.github.io/ServiceWorker/#importscripts
- And step 15 of https://w3c.github.io/ServiceWorker/#install
We don't currently implement this in Firefox. We still require `importScripts()` to run synchronously at script evaluation time. @mattto, does Chrome implement async `importScripts()` like this spec text yet?
If not, I think it would be helpful to set the "imported scripts updated flag" immediately after script evaluation time instead of waiting until after the end of install processing.
Thoughts?
Edit: I mean, it might be helpful in order to implement the update check. We can simply evaluate the script again. Or maybe it's orthogonal to this issue.
Chrome does sync importScripts. I'm not sure I understand how the spec allows async importScripts() since https://w3c.github.io/ServiceWorker/#importscripts doesn't mention "in parallel" so I'd assume the steps run in sequence. If I understand correctly, the "updated flag" just indicates "read from network" vs "read from installed service worker storage".
Well, I think it's more about when that flag is set. AFAICT, it's set when the install processing completes, which is an async job. Or is it set elsewhere and I am confused?
Ah, yes. I don't understand why it's set at that point.
I think Chrome expects `importScripts()` to run on initial script evaluation and will error or not work correctly otherwise. Would have to test that.
@mattto @wanderview I've opened #1021 for the `importScripts` flag discussion.
Ok. Happy to see the "importScripts() after script eval" issue tackled separately from this issue. We will likely implement the byte-for-byte check for importScripts() first and then consider the delayed importScripts() issue later.
So, this thread is now for #839 (comment).
Check the flattened imported scripts & the main script; if any of them are byte-by-byte different, including !ok responses, trigger an update (where a !ok response will fail the update)
The timing of the byte-by-byte difference check depends on the decision in #1021. But for the work in this thread I'll assume the check is done before invoking Install.
I think the byte-by-byte check should happen before running the SW. I think authors wouldn't expect a new SW to be spawned on each soft update attempt, and it'd be quite wasteful.
I think the timing doesn't necessarily depend on whether delayed importScripts() is supported. When a SW is installed, you can store the URL of each script (main and imported) and their bytes. Then when an update starts, you fetch those URLs and compare. If they are the same you abort the update.
Then when an update starts, you fetch those URLs and compare. If they are the same you abort the update.
Isn't the resolution of this issue to include imported scripts in the byte-by-byte check for updates? How can we detect whether imported scripts have been byte-changed or not without calling `importScripts()`?
Also, based on the resolution here, I think the byte-by-byte check should be done before invoking Install: #1021 (comment).
How can we detect whether imported scripts have been byte-changed or not without calling importScripts()?
Maybe I'm missing something, but I thought you'd just fetch the script and then compare it to the one on disk. That is, when a SW is installed you have on disk all its scripts (both main and imported) and those URLs. To do a byte-to-byte comparison, you'd fetch all the scripts and compare the bytes on disk.
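Rough pseudocode to make that flow concrete (helper names invented; this is not spec text):

```js
// installedScripts: Map<url, ArrayBuffer> saved when the worker was installed.
async function scriptsHaveChanged(installedScripts) {
  for (const [url, storedBytes] of installedScripts) {
    // Revalidate against the server rather than trusting the HTTP cache (cf. #893).
    const response = await fetch(url, { cache: 'no-cache' });
    if (!response.ok) return true; // a !ok response triggers (and then fails) the update
    const freshBytes = await response.arrayBuffer();
    if (!buffersEqual(freshBytes, storedBytes)) return true;
  }
  return false; // everything byte-identical: abort the update
}

function buffersEqual(a, b) {
  if (a.byteLength !== b.byteLength) return false;
  const va = new Uint8Array(a);
  const vb = new Uint8Array(b);
  for (let i = 0; i < va.length; i++) {
    if (va[i] !== vb[i]) return false;
  }
  return true;
}
```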
I think I got your idea. That'd certainly be better!
I was worried about the case where the sequence of imported scripts changes… but in that case the original script would have been altered anyway.
Are the byte-for-byte checks shared across different service workers? (If two service workers import the same script, are the etags/hashes "shared"?)
If so, in some circumstances the browser will immediately know (i.e. without needing to do a network byte-for-byte check) that a dependency in the "middle" of a dependency graph has been updated. What happens next?
```
A
+- B
|  +- C
+- D
```
If the browser knows that `B` has been updated (because some other service worker also had a dependency on `B`), which parts of the tree are re-checked? Also, for how long is a freshness check considered to be valid?
Are the byte-for-byte checks shared across different service workers? (If two service workers import the same script, are the etags/hashes "shared"?)
When a service worker fetches a script (either the main script or imported), it will go to the network (optionally) via the HTTP cache. It won't go to the cache API or the script cache of any other service workers.
@jakearchibald Can that lead to a situation where the resources `A` and `B` are both updated at the origin, but the browser only notices that `B` has changed (because of timing-related artefacts of the HTTP cache), and so updates the SW using the "old" version of `A` and the "new" version of `B`?
@ithinkihaveacat yep. Same is true for HTML documents. We have made the HTTP cache opt-in because of developer confusion around this (#893). Developers who opt into the HTTP cache should understand how it works.
Yea, I agree they are orthogonal issues.
An interesting point has been raised internally: is it possible we could damage sites relying on caching by making this change? I'll reach out to our biggest users and see how they feel. Worst comes to worst, we could make no-cache opt-in.
@jakearchibald When you talk to these sites, can you also mention there is a workaround if they are serving uniquely hashed resources? They can set `Cache-Control: immutable` with a very large `max-age` to avoid these network requests entirely in Firefox/Chrome.
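i.e. serving the hashed scripts with something like (values illustrative):

```
Cache-Control: public, max-age=31536000, immutable
```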
@wanderview Sites that are able to generate unique hashed resources wouldn't really need this feature though, right?
I thought the point was to make it possible for service workers to do e.g. `importScripts('https://www.gstatic.com/firebasejs/firebase-app.js');` and be able to quickly and reliably pick up changes to https://www.gstatic.com/firebasejs/firebase-app.js even if the service worker itself remained byte-for-byte identical.
If the default for all network activity related to service worker update checks becomes `no-cache` (as per #839 (comment)), then that's going to result in a lot of 304s for any widely deployed resource. (However, not doing this would lead to browsers potentially getting themselves into an inconsistent state, #839 (comment).)
I thought the point was to make it possible for service workers to do e.g. importScripts('https://www.gstatic.com/firebasejs/firebase-app.js'); and be able to quickly and reliably pick up changes to https://www.gstatic.com/firebasejs/firebase-app.js even if the service worker itself remained byte-for-byte identical.
I guess I thought people typically versioned 3rd party dependencies. Allowing external dependencies to float at-will in production seems kind of crazy to me.
I guess I thought people typically versioned 3rd party dependencies. Allowing external dependencies to float at-will in production seems kind of crazy to me.
I suppose it depends on the use case. Something like https://www.google-analytics.com/ga.js isn't versioned, and that works out fine.
https://www.hodinkee.com/OneSignalSDKWorker.js consists of one line:

```js
importScripts('https://cdn.onesignal.com/sdks/OneSignalSDK.js');
```

Obviously whatever's in OneSignalSDK.js could be inlined into OneSignalSDKWorker.js (it would even save a network request), but then OneSignal would need to get Hodinkee to deploy a new version every time they update their SDK.
Is this already implemented in Chrome or Firefox?
Is this already implemented in Chrome or Firefox?
Updating based on `importScripts()` in FF has been started, but not completed:
https://bugzilla.mozilla.org/show_bug.cgi?id=1290951
Related to this, defaulting updates to `no-cache` is already implemented in FF53.
@KenjiBaheux & I should email SW users to make sure big users of SW are aware of this.
The F2F resolution was to check importScripts in the byte-for-byte comparison; however, issue #893 later changed things so that `useCache` would specify caching of the importScripts by default.
While working on this issue with @mattto, I found out we need to discuss when to fetch and compare the imported classic scripts for Update. (See #1283 (comment).) Now we have two options:
1. Fetch imported scripts during the first evaluation of the main script in Update.
2. Fetch imported scripts (of newestWorker) before evaluating the main script.
(2) allows us to return early, even before starting a worker. @jakearchibald seemed to be concerned about double-download (https://github.com/w3c/ServiceWorker/pull/1023/files#r92201798) here, but we can avoid `importScripts()` in the main script downloading the scripts from the network, because we fill the cache before that point.
But if the imported scripts in (2) have errors, we can't avoid running the main script and the cached scripts before catching those errors anyway.
Thoughts?
/cc @jakearchibald @wanderview @aliams @cdumez
EDIT: I tried it with (1) in #1283.
I responded in #1283 (comment), but to reiterate here, I'm much more concerned about needlessly starting a service worker in the common case than in the error case. I think we should avoid starting a service worker until the byte-to-byte update check (including importScripts) shows that an update is possible. Otherwise almost every navigation will start a new service worker to do an update check which will usually just be wasted.
That's a fair point. I agree with (2). We can do this without a double-download.