laravel-mix/laravel-mix

Versioning no longer Load Balancer friendly

bbluemel opened this issue Β· 54 comments

Hi

Upgraded from 0.10 today to 1.2.2 due to some issues I was having on our live servers, and have realised it has created another issue, now with the versioning.

Let's say you have 2...n servers, that all individually update their code, and run npm run production, they are all set up so they share a session state so, for the most part, it doesn't matter which server they hit.

In the pre 1.0 versioning you might get a code update that for a short period of time creates an inconsistent state:
The browser gets told by server 1 /css/app.5ee7141a759a5fb7377a.css is current, due to true Load Balancing it might try getting it from server 2 which hasn't quite finished npm run production yet. It gets a 404, no big deal, the browser requests that file next time and either manages to get it from a different server, or the other one had finished processing it.

Post 1.0, this creates a larger issue:
The browser gets told by server 1 /css/app..css?id=5ee7141a759a5fb7377a is current, if server 2 hasn't finished compiling yet, it might serve that client the old CSS code, the browser will cache this old code for days/weeks/months.

If we throw a CDN in the mix (something that I'm planning on doing), this creates an even larger issue, because it is possible in that small window, that it caches the old code (even though cachebusting on the new ID), and will serve that to many browsers.
CDN even means that any Load Balancer stickiness (for browsers) is completely off the table as well.

It would be great to have the old versioning back (even as an option),
I realise the query string keeps things a bit tidier, but it does mean in multi-server setups there's no guarantee that the file version being served is actually the right version - having the file existing/not existing is really the only guarantee here.

Thanks

Ben

Hmm, let me think for a bit. We can probably return a production-specific way to do file-based versioning. I'll try to knock it out soon.

Was going to open an issue but then saw this. I prefer the chunk hash way that was done pre version 1.0.

Maybe in Api.js add a method for selecting a version strategy, query_string (default) or chunkhash. That way it doesn't introduce a BC but then gives people the choice.

Seems like a job for your deploy strategy. Similar issues are perpetually discussed on stackoverflow. One example: https://stackoverflow.com/questions/22620971/deployment-race-condition-causing-cdn-to-cache-old-or-broken-files

@devcircus the deploy strategy is fine.
With the current versioning setup in mix there is not a solution that doesn't require you to go down to a single server serving the content.
Because the reverse is also an issue.
Browser hits Server 2 (assuming server 2 is currently going through the deploy), gets told to grab the old JS/CSS code. Server 1 (updated) still has the old versions as well so can serve them, in the new versioning server 1 will serve the new js/css for an old version of the application, as it isn't version aware.

Bottom line, it isn't versioning - you cannot use an "old version" on the querystring to get an old version of the underlying code, it would need to rely on it being cached elsewhere.

Your deploy should be setup to fully deploy and make sure those apps are working before rolling your LB to sending requests to them... ServerC and ServerD should have exact same state before put in LB and then take ServerA and ServerB out of rotation at the same time as to avoid this issue.. It really shouldn't be on asset bundler to be aware of your deployment situation.

@joshmanders It doesn't have to be aware of the deployment situation.

Even way back in Elixir, versioning was done with the filename - it removed old versions, but it still meant that there was an exact correlation between the version being requested by the client and the version served, there was never any chance of a client being served the wrong version.
That versioning has been completely lost and replaced with simple cache busting, but apparently, it's my deployment setup's fault?

Yes because even if they make a request for app.as23r2ga.js on a server that doesn't have it yet, it's broken.

Ignoring the deployment part, which does have some issues. Query string versioning can make some load balancers think the content is dynamic e.g like a search and not cache.

Laravel Mix is a great tool and I preferred the hash file name. It's the main reason I started to use it.

@joshmanders yes, and that's more desirable than app.js?id=as23r2ga being requested and having version bt34s3hb being served, which would be cached for days/weeks/months.

app.as23r2ga.js not being found is a temporary issue, and the browser will keep retrying to fetch the file from then on because it does not cache a 404. but that bt34s3hb version being served when as23r2ga was requested makes it broken for the lifetime of that file being cached, or until the next version iteration.

The difference being is the app.as23r2ga.js not being served is easy to diagnose and fix, having bt34s3hb served in its place is almost impossible.

Kinda related, but is query strings versioning considered okay for everything now even though some proxies don't handle it correctly (https://gtmetrix.com/remove-query-strings-from-static-resources.html)? Maybe my information is outdated.

Yeah, I'd like to see a return to filename based versioning rather than query string versioning. I dropped my 2 cents in when this was discussed on Twitter but I think it's more appropriate to use distinct files and allows for better caching. I understand it might be a little less pretty under the hood but I think it's something important that needs to be reconsidered.

Was this fixed? I'm using v1.4.2 and still uses the new versioning instead of file based!

AWS CDN does not forward query strings to the origin server, or cache by query strings by default - this setting needs to be specifically enabled. Just flagging this as it's a popular CDN and people might get caught out if upgrading laravel-mix and not knowing to change their CDN config.

Edit: - Saw this was mentioned in the 1.0 docs with Cloudflare as an example. A hash option avoids needing to change CDN settings though.

v1.4.0 is the (current) latest version. From what I can see the other relating issue was closed as a dupe of this. As this is still open I would assume its not fixed.

I hope it is soon though it was the main reason I used it.

Jeffrey usually comments on issue when he fixes it.
This is definitely not fixed by now.

He may be working on something as a convenience, however there nothing broke here.

@devcircus not sure how you can claim nothing is broken here.

Previously you could reference old versions of the compiled code, now you cannot. Query string cache busting does not guarantee the version requested matches the version built for the version.
The file based versioning from previous versions did.

@bbluemel that is not broken.

@joshmanders umm, yes it is, something that worked previously, and doesn't anymore is broken.

Just because you didn't use it like that doesn't make it any less broken. It is clear from this issue and the other one linked that others were also relying on this feature that was removed in favour of an inferior cache busting solution.

The previous behavior wasn't necessarily intended. It just happened to work with this particular deploy strategy. It surely can't be the responsibility of the package to make sure you don't have conflicting file versions in a custom, multiple server, load balancing setup. This is a common situation that has to be taken into consideration when considering your deployment. On the other hand, maybe Jeff will consider a PR.

@devcircus it's plugged as versioning. If it was plugged as just cache busting then I would agree.

Do a npmjs.com search for asset versioning and the top results (most of the first page) are based on file name based hashes rather than query-strings (and usually if it's default is query-string you can change it)

Then do a search for asset cache busting and you will get the majority of those using query-string parameters, and a few of them with file name based hashes.

The main thing here is that name based hashes are valid Cache Busting techniques, but varying the query-string is not a valid versioning technique.

If I were to do a PR it would be to restore the old functionality πŸ˜ƒ

It surely can't be the responsibility of the package to make sure you don't have conflicting file versions in a custom, multiple server, load balancing setup.

/thread

@joshmanders feel free to have that rather narrow-minded viewpoint.

Scalability is important for a lot of decent developers. Bottom line, query string cache busting is not scalable, filename hashes are.

I have seen no comments in this issue stating any benefit of the query string cache busting either.

Rather than arguing, feel free to submit a PR. Otherwise, I'll add a file-based option soon.

@bbluemel it's not a narrow-minded viewpoint, it's called separation of concerns. Why is my asset bundling system aware of how my deployment is set up?

Your use case is flawed, and you should look for better ways to achieve what you want.

@joshmanders it isn't aware... The asset bundling spits out a version, and you can look on as a way to reference the compiled code as a dependency for the site, meaning the html (or server side code) can be linked to specific versions for the client code, ensuring that both are in absolute harmony. Old site/server side code references an old version still works for as long as necessary whilst a new version of the site/server side code can reference the new version of the client side code without issue.

Having it as a query string makes that versioning moot, as the only way to guarantee the requested version actually matches the version provided at compile time is if it is stored as a file version.

Your package.json/yarn.lock states what version of the dependencies are to be used, the way I'm using the version (previously) provided by mix is not really any different to that.

My use case may be flawed, but it scales well, it can work with 2, 20 or 2000 servers. Query string based cache busting does not, and to guarantee a version match would require dropping to a single server set up to do so during an update. Which in many multi-server setups is not an option. Even your solution from a few weeks ago has the possibility of error in this regard if the LB switch out happens in-between the time that the page and assets are requested - highly likely on busy sites.

I could improve my set up if I wished (on the older version of mix) to have a single server that gets the updated code base first solely dedicated to bundling the assets prior to any other server, so when another server (in or out of service) requests the file version, it is already readily available. Again, something that query string based cache busting doesn't allow for as it will only serve the current version as any version requested. This could be fronted by a CDN to attempt to mitigate the issue, but wouldn't necessarily guarantee that the version requested matched the version compiled, and in the cases where it isn't, will cause further issues as the wrong version would be served and cached.

The only way I can see a workable case with using QueryStrings is with some Rewrite trickery - check if the version requested exists in its cache (likely a file system folder) if it does serve it; if it doesn't, checksum the currently available file, if it matches, cache it and serve it, if not, 404. And with that, we are doing a hacky solution to do essentially the same as the old mix versioning, and what the webserver does by default.

@JeffreyWay Thank you, I'm fine with waiting at the moment πŸ˜ƒ I currently do have the scope to drop to a single server if I have a deployment that I have concerns that the code deployed will cause issues for our users.

I will try to get a PR in today or tomorrow.

@JeffreyWay In #1070 you said:

I try to shy away from making every single thing a config option.

For this issue, should I make my PR with a config option (since there are clearly some people who want to keep this the way it is), or should I just put it back to the way it used to be with no option?

@kohenkatz

Rather than arguing, feel free to submit a PR. Otherwise, I'll add a file-based option soon

Had problems deploying to production because of this behavior.

Looks like Jeffrey's planning on adding an option to use the old behavior, however, this can be handled when configuring your deployment.

For those looking for a temporary fix - you can bypass the .version method and use .then to write the logic for renaming the files w/ hashes and writing a proper mix-manifest.json once the build is complete.

The fastest way is to manually change the css and js name then edit the public/mix-manifest.json file (Yes, i felt stupid, but this is the fastest fix), and waiting for newly fix come out.

I just dont understand why making this change. Is a laravel main package, and the author can never thought developer will using it for CDN?

Can you share a CDN that doesn't work with this? This possibility was considered before the change was made. I personally use cloudflare and it works great.

If it's the load balancing issue, that should be on the developer to correctly configure the deployment.

@devcircus - I disagree, the load balancing issue is one of the key reasons to use versioning. How do you suggest to fix it with the deployment without taking the website offline during deployment? Without taking the site down for deployment - which you would not want to do on a super-busy site - you run the risk that the CDN will cache an old css or js with the new query string - the odds of which increase with a super-busy site. I can add a call to clear the CDN cache after deployment, but even that will result in a length of time where the stale CSS is being served with the new site. You've mentioned this twice without explanation, which means you seem to think the solution is obvious - I am just apparently dense enough to not be able to see it.

Bottom line for me, this is really simple to do in raw webpack - e.g. js:

output: {
	filename: 'js/[name]-[chunkhash].js'
},

css:

new ExtractTextPlugin("css/[name]-[contenthash].css"),

and impossible to do with mix in its current state.

As for CDNs, Both CloudFront and Highwinds don't handle the querystring as a variation by default, and need extra configuring to do so.

Your solution may not necessarily be simple or obvious. Managing a load balancer, utilizing zero downtime deployments, and guaranteeing your static assets are "current" when a page loads, can actually be quite complex with many possible solutions. None of which, in my opinion, should be the responsibility of a javascript package. If we want to use all the shiny tools of modern web development, sometimes we have to put in the work. Server and deployment configuration is part of that.

@bbluemel You are correct, there are some CDNs that require a quick, simple configuration change for querystrings to work.

*On a side note, one thing that IS quite simple, is writing a script that creates a "lookup" file for your assets. You can do just as @bbluemel did above, create the manifest with a custom script, and go right back the the old way of versioning.

Also, in the very near future, I'll be using load balancing for a new project. I'll likely be using Laravel and Mix. I look forward to being able to dig in to this. Maybe my eyes will be opened to this issue....

The thing is, it is implied that app.css?id=123 is a different file than app.css?id=234, but to the web server if you ask for app.css?id=234 and it has app.css?id=123, you don't actually get app.css?id=234. Heck, forget CDNs, even the cache-busting strategy fails if you have multiple servers without LB stickiness.

@devcircus actually the bits I mention above do not work with mix. If you try using the webpack way of doing thing in mix via webpackConfig you create an unusable mess, if I recall (as it was some months ago I tried it), you end up with 2 manifest files and neither of which are particularly useful.

Options are basically either:
a) use mix and battle with not being able to proper load balance your code or
b) don't use mix.

For me on a new project I was working on recently had requirements that were far beyond mix's capabilities so I had to bite the bullet and properly learn webpack.

@devcircus In my case, I am using qiniu.com, they have the options to enable CDN parameters, BUT if this option is enabled, the image process functionality will not work, I use image process function a lot, this is not what I want.

https://dn-phphub.qbox.me/uploads/images/201709/08/16295/z7M84mSaC8.jpeg?imageView2/1/w/224/h/224

The above link is the image that uses this image process function, qiniu.com using the parameters to accomplish this.

I agree @jdavidbakr , adjust the service settings in a busy website is not a good idea, but event if its a less busier web site, perform this kind of action is rather redundant. I have couple dozen website use CDN that I have to configure this. If u think this is personal, think of how many people there have to change they setting.

100% of CDN and load balancer will clear the cache if the file name is changes by default, how many are they default will support parameters cache? Why not chose a 100% solution?

Another problem pls see - #331 .

I donβ€˜t get how people have still to be educated that correct (!) caching canβ€˜t work with query parameters. There are enough race conditions in the process of deploying a new version and a client accessing the application leading to caching the wrong file with the new version parameter. Only distinct files for every version can solve all problems. And the hash approach of mix was the best because it used content adressing, not changing e.g. css but js leaded to the css still cached by client.

Hello,
could anyone suggest me how to use mix.then () to fix "this problem" and get (for example) style.hash.css?
I suppose something like:

.version() .then((stats) => { // do something here })

Thank you

I am doing it like this:

const fs = require('file-system');
const del = require('del');
const _ = require('lodash');
const jsonfile = require('jsonfile');

const mixManifest = 'public/mix-manifest.json';

...

.version()
.then(function () {
        jsonfile.readFile(mixManifest, function (err, obj) {
            const newJson = {};
            _.forIn(obj, function (value, key) {
                const newFilename = value.replace(/([^\.]+)\.([^\?]+)\?id=(.+)$/g, '$1.$3.$2');
                const oldAsGlob = value.replace(/([^\.]+)\.([^\?]+)\?id=(.+)$/g, '$1.*.$2');
                // delete old versioned file
                del.sync(['public' + oldAsGlob]);
                // copy as new versioned
                fs.copyFile('public' + key, 'public' + newFilename, function(err) {
                    if (err) console.error(err);
                });
                newJson[key] = newFilename;
            });
            jsonfile.writeFile(mixManifest, newJson, {spaces: 2}, function (err) {
                if (err) console.error(err);
            });
        });
});

It reads the mix-mainfest.json file and copies e.g. app.css to app.hash.css and then writes a new manifest. Also deletes old versioned files. Onyl works if every entry in manifest is versioned.

@m-lotze-7o7 it works like a charm

@m-lotze-7o7 thank you!! That was a super easy fix to apply and works great!

I hope @JeffreyWay can maybe use this to add it built-in to the app.. I have been having a ton of issues with cache busting when using the query hash since v1.0.. hopefully this makes it better

Thanks @m-lotze! The .then() solution is working well. Will implement this until a config option is added.

We ran into a limitation with the query string parameter when using S3 behind CloudFront for CDN and setting 'Expires' and 'Cache-Control' headers - even when configuring CloudFront to treat query strings as separate cache entries, S3 does not return a fresh object as it has not "expired" even if the source changes. Pretty annoying... Anyone else experienced this issue?

If anyone prefers AirBNB style JavaScript >= ES6, here's the same snippet reformatted:

const _ = require('lodash')
const del = require('del')
const fs = require('fs')
const jsonFile = require('jsonfile')
const mix = require('laravel-mix')

const mixManifest = 'public/mix-manifest.json'

...

        .version()
        .then(() => {
            // Parse the mix-manifest file
            jsonFile.readFile(mixManifest, (err, obj) => {
                const newJson = {}

                _.forIn(obj, (value, key) => {
                    // Get the hash from the ?id= query string parameter and move it into the file name e.g. 'app.abcd1234.css'
                    const newFilename = value.replace(/([^.]+)\.([^?]+)\?id=(.+)$/g, '$1.$3.$2')
                    // Create a glob pattern of all files with the new file naming style e.g. 'app.*.css'
                    const oldAsGlob = value.replace(/([^.]+)\.([^?]+)\?id=(.+)$/g, '$1.*.$2')
                    // Delete old versioned file(s) that match the glob pattern
                    del.sync([`public${oldAsGlob}`])
                    // Copy as new versioned file name
                    fs.copyFile(`public${key}`, `public${newFilename}`, (err) => {
                        if (err) console.error(err)
                    })
                    newJson[key] = newFilename
                })
                // Write the new contents of the mix manifest file
                jsonFile.writeFile(mixManifest, newJson, { spaces: 4 }, (err) => {
                    if (err) console.error(err)
                })
            })
        })

I'm now using the above, which works well, but it doesn't delete the old unversioned file, which is no longer needed. Not sure the reason for this. I've added the following 2 lines.

    oldFiles.push(key)
...
    _.forEach(oldFiles, (key) => {
        del.sync([`public${key}`]);
    });

So heres my tweaked version:

const _ = require('lodash')
const del = require('del')
const fs = require('fs')
const jsonFile = require('jsonfile')
const mix = require('laravel-mix')

const mixManifest = 'public/mix-manifest.json'

...

        .version()
        .then(() => {
            // Parse the mix-manifest file
            jsonFile.readFile(mixManifest, (err, obj) => {
                const newJson = {}
                const oldFiles = []

                _.forIn(obj, (value, key) => {
                    // Get the hash from the ?id= query string parameter and move it into the file name e.g. 'app.abcd1234.css'
                    const newFilename = value.replace(/([^.]+)\.([^?]+)\?id=(.+)$/g, '$1.$3.$2')
                    // Create a glob pattern of all files with the new file naming style e.g. 'app.*.css'
                    const oldAsGlob = value.replace(/([^.]+)\.([^?]+)\?id=(.+)$/g, '$1.*.$2')
                    // Delete old versioned file(s) that match the glob pattern
                    del.sync([`public${oldAsGlob}`])
                    // Copy as new versioned file name
                    fs.copyFile(`public${key}`, `public${newFilename}`, (err) => {
                        if (err) console.error(err)
                    })
                    newJson[key] = newFilename
                    oldFiles.push(key)
                })

                _.forEach(oldFiles, (key) => {
                    del.sync([`public${key}`]);
                });

                // Write the new contents of the mix manifest file
                jsonFile.writeFile(mixManifest, newJson, { spaces: 4 }, (err) => {
                    if (err) console.error(err)
                })
            })
        })

@m-lotze I got an idea from you without using .version(), in this way, you can control your filename hash length and async components built from /* webpackChunkName: "xxx" */ also perform well~:

const _ = require('lodash');
const jsonfile = require('jsonfile');
const mixManifest = 'public/mix-manifest.json';
mix.webpackConfig({
  output: {
    filename: '[name].[hash:6].js',
    chunkFilename: '[name].[hash:6].js'
  }
})
.then(function() {
    // hash
    jsonfile.readFile(mixManifest, function(err, obj) {
      const newJson = {};
      _.forIn(obj, function(value, key) {
        const removeHashFromKeyRegex = /\.(.+)\.(.+)$/g;
        key = key.replace(removeHashFromKeyRegex, '.$2');
        newJson[key] = value`;
      });
      jsonfile.writeFile(mixManifest, newJson, { spaces: 2 }, function(err) {
        if (err) console.error(err);
      });
    });
  });

ouput:

{
  "/tig/js/components/login.js": "/tig/js/components/login.ac01dc.js",
  "/tig/js/app/homepage.js": "/tig/js/app/homepage.ac01dc.js",
  "/tig/js/app/login.js": "/tig/js/app/login.ac01dc.js"
}

😝😝😝

Did somebody have a look at the user extensions feature of laravel-mix 2.1 and whether it could be used to move the filename hash to a plugin.

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This is still something I'd like to see resolved in a future version. Maybe it will be easier to get going once the Webpack 4 version has shipped.

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ctf0 , your package works fine, thanks... This is still something I'd like to see resolved in a future version.