lix-pm/lix.client

Race condition when multiple instances of `lix` are installing the same lib simultaneously

Closed · 1 comment

This can happen in any environment where multiple builds run concurrently, e.g. on a CI agent.
If multiple instances of `lix download` run concurrently and install the same lib at the same time, it's highly likely that one of them will fail. Here's an example of the error message when this happens:

15:38:00  Failed to move /opt/jenkins_agents/haxe_agent_1_lubet/haxe/downloads/download@1690810677266/ to /opt/jenkins_agents/haxe_agent_1_lubet/haxe/haxe_libraries/modular/0.14.0/haxelib_http%3A%2F%2F95.128.124.110%3A2000

Possible solutions:

  • Before downloading a specific version of a package from a specific location, take a lockfile for that package/version/location combination. The first instance of `lix` to obtain the lock does the actual downloading and unpacking, while the second (third, etc.) instance blocks until the lock is released. It would be good to also add a timeout, if possible, so that a waiting instance cannot hang forever.
  • Only allow one instance of `lix download` to run at a time, using a single global lockfile. The first instance of `lix` to obtain the lock does its job, while the second (third, etc.) instances block and wait for their turn. This is cruder, but it should be slightly easier to implement and is probably good enough.

For cross-platform lockfile management, a module such as `proper-lockfile` can be used.
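To make the first option more concrete, here is a minimal sketch in TypeScript of a per-package lock with a timeout. It is not how `lix` works today and it does not use `proper-lockfile`; it substitutes the simpler technique of creating a lock directory with `fs.mkdirSync`, which fails with `EEXIST` if the directory already exists. The names `lockPathFor` and `withLock`, the `.lock` naming scheme, and the 50 ms polling interval are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: one lock directory per package/version/location.
function lockPathFor(root: string, name: string, version: string, source: string): string {
  return path.join(root, `${name}@${version}@${encodeURIComponent(source)}.lock`);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task` while holding the lock. Other callers poll until the lock is
// released, and give up with an error once `timeoutMs` has elapsed.
async function withLock<T>(lockDir: string, timeoutMs: number, task: () => Promise<T>): Promise<T> {
  fs.mkdirSync(path.dirname(lockDir), { recursive: true });
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      fs.mkdirSync(lockDir); // throws EEXIST if someone else holds the lock
      break;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
      if (Date.now() >= deadline) throw new Error(`Timed out waiting for ${lockDir}`);
      await sleep(50); // assumed polling interval
    }
  }
  try {
    return await task();
  } finally {
    fs.rmdirSync(lockDir); // release the lock even if the task failed
  }
}
```

Inside `task`, an instance that had to wait should first check whether the target directory already exists and skip the download if so, since the lock holder before it has most likely finished the install. The second option (one global lock) is the same sketch with a fixed `lockDir`. Note that this sketch leaves a stale lock behind if the process is killed while holding it; `proper-lockfile` has a `stale` option for that case, which is one reason to prefer a library over a hand-rolled lock.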

@kevinresol @back2dos Please review and if ok, merge/release a new version. Thanks!