apache/incubator-pagespeed-ngx

etag not set on gzip-encoded resources in 1.3.6

jeffkaufman opened this issue · 9 comments

To reproduce:

curl -D- -H 'Accept-Encoding: gzip' http://localhost:8050/mod_pagespeed_example/styles/yellow.css.pagespeed.ce.lzJ8VcVi1l.css| head -n 15

vs:

curl -D- http://localhost:8050/mod_pagespeed_example/styles/yellow.css.pagespeed.ce.lzJ8VcVi1l.css| head -n 15

Note that the latter has Etag: W/"0" while the former does not.

This fails in nginx 1.3.6 but not in 1.2.4.

New versions of nginx have a call to ngx_http_clear_etag(r) at the end of their ngx_http_gzip_header_filter(). We're setting a weak etag, and then nginx is removing it.

While nginx would be right to remove a strong etag (because it's gzip-compressing and so changing bytes) I think it should not be removing a weak one.

On the other hand, these resource are all cacheable for a year and have a content-hash in the url, so the etag shouldn't be needed. Looking at the mod_pagespeed source and talking to people here it's to work around an image caching bug with IE8. It's possible that this only applies if we also send a Vary header, which neither we nor mod_pagespeed currently do on images. I'm going to run some tests on IE8's caching.

Test page:

http://www.jefftk.com/test/image-caching/p.jpg

Serves with headers:

HTTP/1.1 200 OK
Date: Tue, 20 Nov 2012 21:07:00 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Mon, 12 Nov 2012 21:03:35 GMT
Content-Length: 18833
Cache-Control: max-age=31536000
Expires: Tue, 12 Nov 2013 21:03:35 GMT
Content-Type: image/jpeg

These are identical to what mod_pagespeed gives you, except that would contain an Etag:

Etag: W/"0"

In IE8, through webpagetest, we get:

http://www.webpagetest.org/result/121120_7T_MGX/

which shows that IE8's caching worked properly.

Here's the same image loaded directly from mod_pagespeed.com, served with Etag:

http://www.webpagetest.org/result/121120_1S_MJT/

Caching also works, which is what we would expect.

This suggests to me that we can remove the Etag-setting from mod_pagespeed and remove checking for it from the system test.

To be thorough I also tested on the other browsers webpagestest supports:

All of them cache the image for the repeat view.

I've taken this up internally; on hold for now.

  • I remember that PageSpeed Insight may complain about not having an Etag: W/"0", we will loose points :-)
  • Images should not have a gzip encoding, I would expect the etag to be left as-is by nginx? Your test case does not have the vary header in it, so I would expect it to succeed.
  • For compressible content types, I would expect nginx not only to remove (or alternatively recompute or weaken) the etag, but also append a vary: accept-encoding header. That vary header is where some versions of IE http caching will have trouble behaving I think.

Another solution would be to add a new filter that runs as late as possible, after gzipping, and ads etags for .pagespeed. resources.

working on this one

Fixed by #340, released in 1.5.27.3

I'm still getting this problem. Using nginx-extras from dotdeb.