TheDiscordian/ipfs-sync

Check for existing files

kallisti5 opened this issue · 3 comments

I'm playing with this on our big software mirror. When I run the sync daemon and add a test file to trigger ipfs-sync, I get pages and pages of:

2021/10/01 11:04:42 Error adding file: cp: cannot put node in path /haikuports/master/x86_gcc2/current/packages/neocd_libretro_x86_source-0.5_20210425-1-source.hpkg: directory already has entry by that name
2021/10/01 11:04:42 Adding file from /home/ipfs/haikuports/master/x86_64/current/packages/assimp_debuginfo-5.0.1-1-x86_64.hpkg to /haikuports/master/x86_64/current/packages/assimp_debuginfo-5.0.1-1-x86_64.hpkg ...
2021/10/01 11:04:42 Error adding file: cp: cannot put node in path /haikuports/master/x86_64/current/packages/assimp_debuginfo-5.0.1-1-x86_64.hpkg: directory already has entry by that name
2021/10/01 11:04:42 Adding file from /home/ipfs/haikuports/master/x86_64/current/packages/libreoffice_cs-7.1.0.3-2-any.hpkg to /haikuports/master/x86_64/current/packages/libreoffice_cs-7.1.0.3-2-any.hpkg ...
2021/10/01 11:04:42 Error adding file: cp: cannot put node in path /haikuports/master/x86_64/current/packages/libreoffice_cs-7.1.0.3-2-any.hpkg: directory already has entry by that name

It seems like checking whether the file already exists in MFS would be better than letting the add fail? (Maybe also compare the size of the file in MFS with the size of the file on disk.)

Ideally it would check the CID of the file on disk vs the CID of the file in MFS, but that might not work out well :-)
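To make the suggestion concrete, here is a minimal sketch of an existence-plus-size check in Go. It is not ipfs-sync's actual code: the names `statMFS`, `decide`, `mfsEntry` and `Action` are hypothetical, and it assumes the daemon's HTTP RPC endpoint `/api/v0/files/stat` is reachable and reports a missing path with an error message containing "does not exist".

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// Action is what the sync loop should do for one local file (hypothetical type).
type Action int

const (
	Add     Action = iota // no entry in MFS yet
	Skip                  // entry exists and the sizes match
	Replace               // entry exists but the sizes differ
)

// mfsEntry holds the two fields of the files/stat response used here.
type mfsEntry struct {
	Hash string // CID of the MFS entry
	Size int64  // content size in bytes
}

// statMFS asks the daemon about an MFS path. It returns (nil, nil) when the
// daemon says the path does not exist. Matching on the error message is an
// assumption of this sketch, not a documented contract.
func statMFS(apiBase, mfsPath string) (*mfsEntry, error) {
	endpoint := apiBase + "/api/v0/files/stat?arg=" + url.QueryEscape(mfsPath)
	resp, err := http.Post(endpoint, "", nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		var apiErr struct{ Message string }
		_ = json.Unmarshal(body, &apiErr)
		if strings.Contains(apiErr.Message, "does not exist") {
			return nil, nil
		}
		return nil, fmt.Errorf("files/stat %s: %s", mfsPath, resp.Status)
	}

	var entry mfsEntry
	if err := json.Unmarshal(body, &entry); err != nil {
		return nil, err
	}
	return &entry, nil
}

// decide compares the on-disk size with the MFS entry (nil means absent).
func decide(localSize int64, entry *mfsEntry) Action {
	switch {
	case entry == nil:
		return Add
	case entry.Size == localSize:
		return Skip
	default:
		return Replace
	}
}

func main() {
	// No daemon is contacted here; only the decision logic is exercised.
	fmt.Println(decide(1024, nil) == Add)                      // true
	fmt.Println(decide(1024, &mfsEntry{Size: 1024}) == Skip)   // true
	fmt.Println(decide(1024, &mfsEntry{Size: 512}) == Replace) // true
}
```

A size match is only a heuristic: two different files can have the same length, which is why comparing CIDs would be the stricter check.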

@kallisti5 is this a duplicate of #45? If so, I'll tag that issue for the 0.8.0 release and try to get 0.8.0 out sooner.

> Ideally it would check the CID of the file on disk vs the CID of the file in MFS, but that might not work out well :-)

I can totally do this now. Perhaps I should do away with xxhash and store CIDs instead, which would make behaviour like this easy to add. (I need to think about this more: xxhash is very fast, and AFAIK being less strict is what allows it to be so fast.)
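One possible middle ground, sketched below in Go, is to keep a fast hash for local change detection and store the CID alongside it. This is only an illustration under assumptions: `record`, `fastHash` and `needsAdd` are hypothetical names, and FNV-1a from the standard library stands in for xxhash so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// record is a hypothetical per-path database entry.
type record struct {
	FastHash uint64 // cheap local change detector
	CID      string // CID reported when the file was last added to MFS
}

// fastHash stands in for xxhash; it uses FNV-1a from the standard library.
func fastHash(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// needsAdd reports whether a file should be (re-)added to MFS.
// mfsCID is the CID currently at the MFS path, or "" if nothing is there.
func needsAdd(rec *record, data []byte, mfsCID string) bool {
	if rec == nil {
		return true // never seen this path before
	}
	if rec.FastHash != fastHash(data) {
		return true // contents changed on disk
	}
	return rec.CID != mfsCID // MFS entry is missing or holds something else
}

func main() {
	data := []byte("example package contents")
	rec := &record{FastHash: fastHash(data), CID: "example-cid"}

	fmt.Println(needsAdd(rec, data, "example-cid"))              // false
	fmt.Println(needsAdd(rec, []byte("changed"), "example-cid")) // true
	fmt.Println(needsAdd(rec, data, ""))                         // true
}
```

With this layout the expensive CID computation happens only when a file is actually added, while the fast hash still handles the common "nothing changed" case. A path with no record at all would still need its CID computed (for example with `ipfs add --only-hash`) before it could be compared against an existing MFS entry.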

Also, if you're copying your IPFS db along with ipfs-sync, then as long as the paths are identical and you also copied the ipfs-sync db over, it should not go over all the existing files again. That said, #45 still totally needs to be fixed, as it'd be WAY faster not to hit MFS with all those failing adds.