fpco/cache-s3

createSymbolicLink: already exists (File exists) - this time with docker in the path

Closed this issue · 4 comments

With cache-s3 v0.1.2, this is still happening:

cache-s3: /home/xxxxxxxx/xxxxxxxxxxx/.stack-work/docker/_home/.ssh: createSymbolicLink: already exists (File exists)

The build is using Stack's Docker support.

Under normal conditions it should remove any existing symlink with the same name before creating a new one. I don't have any project currently set up with Docker to test this myself, so could you please check the ownership and permissions of that link and verify that the current user has enough privileges to remove it?

At the same time, a symlink's permissions are always 0o777, so this is unlikely to be the cause of the problem.
Here is the code that is responsible for this error: https://github.com/snoyberg/tar-conduit/blob/master/src/Data/Conduit/Tar/Unix.hs#L78-L79
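The 0o777 claim above is easy to check on Linux, where the permission bits of a symlink itself are fixed and ignored by the kernel. A minimal sketch (the /tmp path is a hypothetical scratch location, not anything from the report):

```haskell
import Control.Exception (SomeException, try)
import Data.Bits ((.&.))
import qualified System.Posix.Files as Posix
import Text.Printf (printf)

main :: IO ()
main = do
  let link = "/tmp/cache-s3-mode-check"  -- hypothetical scratch path
  -- Best-effort cleanup in case a previous run left the link behind.
  _ <- try (Posix.removeLink link) :: IO (Either SomeException ())
  -- The target does not need to exist to create a symlink.
  Posix.createSymbolicLink "target" link
  -- lstat the link itself rather than following it.
  st <- Posix.getSymbolicLinkStatus link
  -- Keep only the permission bits of the mode; on Linux this prints 777.
  printf "link mode: %o\n" (fromIntegral (Posix.fileMode st .&. 0o7777) :: Int)
  Posix.removeLink link
```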

So, the only way to debug this problem is to try to remove that link manually from ghci and see what error it throws:

λ> import System.Posix.Files as Posix
λ> Posix.removeLink "/home/xxxxxxxx/xxxxxxxxxxx/.stack-work/docker/_home/.ssh"
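The same probe can also be run as a standalone program instead of from ghci. A sketch, where `probeRemove` is a hypothetical helper and the path is the redacted placeholder from the report:

```haskell
import Control.Exception (SomeException, try)
import qualified System.Posix.Files as Posix

-- Try to delete a path with removeLink and report the underlying error
-- directly, instead of letting a later createSymbolicLink mask it.
probeRemove :: FilePath -> IO ()
probeRemove path = do
  result <- try (Posix.removeLink path) :: IO (Either SomeException ())
  case result of
    Left err -> putStrLn ("removeLink failed: " ++ show err)
    Right () -> putStrLn "removeLink succeeded"

main :: IO ()
main = probeRemove "/home/xxxxxxxx/xxxxxxxxxxx/.stack-work/docker/_home/.ssh"
```

If the failure is a permissions problem on the parent directory, the printed exception should say "permission denied" rather than "does not exist".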

I suspect you're right and there were some permissions issues on that runner. The runner is gone so I can't investigate the exact conditions. However, the error message in case of a permissions error should be different.

I've worked with Docker many times and I am quite sure this issue is caused by permissions, more specifically the ownership and permissions of the folder /home/xxxxxxxx/xxxxxxxxxxx/.stack-work/docker/_home/. The error message, although perhaps a bit misleading, is exactly what is expected: the link is already there and cannot be replaced. Here is why; consider a directory bar and a file bar/foo, both owned by root:

λ> Posix.removeLink "bar/foo"
*** Exception: bar/foo: removeLink: permission denied (Permission denied)
λ> _ <- tryAny $ Posix.removeLink "bar/foo"
λ> Posix.createSymbolicLink "baz" "bar/foo"
*** Exception: baz: createSymbolicLink: already exists (File exists)

The point is, we currently ignore any errors when trying to remove the link, because we define the restore operation as best-effort.
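That best-effort logic amounts to something like the following (a simplified sketch, not the actual tar-conduit code; plain `Control.Exception.try` stands in for `tryAny` here). Because the removal error is swallowed, a "permission denied" on removeLink only surfaces later as createSymbolicLink's "already exists":

```haskell
import Control.Exception (SomeException, try)
import qualified System.Posix.Files as Posix

-- Best-effort restore of a symlink: any failure while removing a stale
-- link is ignored, so if removeLink fails with "permission denied",
-- createSymbolicLink is the call that finally errors with "already exists".
restoreSymlink :: FilePath -> FilePath -> IO ()
restoreSymlink target linkPath = do
  _ <- try (Posix.removeLink linkPath) :: IO (Either SomeException ())
  Posix.createSymbolicLink target linkPath

main :: IO ()
main = do
  let link = "/tmp/cache-s3-restore-demo"  -- hypothetical scratch path
  _ <- try (Posix.removeLink link) :: IO (Either SomeException ())
  restoreSymlink "old-target" link
  restoreSymlink "new-target" link  -- replaces the existing link
  Posix.readSymbolicLink link >>= putStrLn  -- prints "new-target"
  Posix.removeLink link
```

When the user owns the parent directory, the second call cleanly replaces the link; when the stale link's parent is root-owned, removeLink fails silently and createSymbolicLink raises the error seen in this issue.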

Regardless of the error message, I don't think there is a workaround that can be implemented in cache-s3 or tar-conduit. The only thing I can think of is adding an extra command, cache-s3 rm, which would remove all files/folders that would otherwise be cached, so it could be executed separately with sudo. I'll add a separate ticket for that.