nfdi4plants/nfdi4plants.knowledgebase

[guide] handling ARCs across clones / machines

Opened this issue · 1 comment

Handling ARCs across machines (laptop, desktop, server, HPC, etc.) is not intuitive.

We need a practical guide / recommendation, e.g.:

  1. DataHUB as the "original" clone with all data
  2. Laptop / desktop for small data: creating the ARC, structuring and annotating metadata, writing scripts, etc.
  3. Clone to server / HPC (ideally via git), where the large data is stored for computations
  • Needs to cover working with Git LFS and git / ARC Commander handling.
  • Reminders about ARC syncing (keeping clones up to date)
  • Notes on working with branches
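The three-tier setup above can be sketched with plain git (ARC Commander's `arc sync` wraps similar operations under the hood; the DataHUB URL and ARC name below are placeholders, not real paths):

```shell
# 1. DataHUB holds the complete "original" clone (the git remote).
DATAHUB_URL="https://git.nfdi4plants.org/<user>/<my-arc>.git"

# 2. Laptop / desktop: clone, structure the ARC, annotate metadata, add scripts.
git clone "$DATAHUB_URL" my-arc
cd my-arc
git add .
git commit -m "Add metadata annotation and analysis scripts"
git push origin main

# 3. Server / HPC: clone the same remote where the large data lives.
git clone "$DATAHUB_URL" my-arc
```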

I think the above is a great basis for recommendations!
I have worked on an ARC across machines, and, as suggested, using git pull worked well to fetch the changes made on the respective other machine, so pulling regularly could also be a good general recommendation.
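The pull-before-work habit on each machine could look like this (a sketch, assuming the default branch is called main):

```shell
# Before starting work on any machine, fetch what the other machines pushed
git pull origin main

# ... work on the ARC ...

# When done, publish your changes so the other clones can pull them
git add .
git commit -m "Update assay annotations"
git push origin main
```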

However, this was before I added the raw data to the ARC. Now I am wondering how to, for example, synchronize my laptop without pulling all those large files. If we find a solution, this would also be an important recommendation.