[guide] handling ARCs across clones / machines
Opened this issue · 1 comment
Brilator commented
Handling ARCs across machines (laptop, desktop, server, hpc, etc.) is not intuitive.
We need a practical guide / recommendation, e.g.:
- DataHUB as the "original" clone holding all data
- Laptop / desktop for small data: creating the ARC, structuring it, annotating metadata, writing scripts, etc.
- Clone to server / HPC (ideally via git), where the large data is stored for computations
- Needs to cover Git LFS and git / ARC Commander handling
- Reminders about ARC syncing (keeping clones up to date)
- Notes on working with branches
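The multi-machine setup above could be sketched as a plain git workflow. A minimal sketch, using a local bare repository as a stand-in for the DataHUB remote; all paths, file names, and the assay layout are placeholders, not ARC specification details:

```shell
# Minimal sketch: a local bare repo stands in for the DataHUB remote.
set -e
rm -rf /tmp/arc-demo && mkdir -p /tmp/arc-demo && cd /tmp/arc-demo
git init --bare datahub.git            # the "original" repo on the DataHUB

# Laptop clone: create the structure, annotate metadata, write scripts.
git clone datahub.git laptop
cd laptop
git config user.name "demo" && git config user.email "demo@example.org"
mkdir -p assays/MyAssay/dataset        # hypothetical assay layout
echo "sample,condition" > assays/MyAssay/isa.assay.csv
git add . && git commit -m "Add assay metadata"
git push origin HEAD                   # sync back to the DataHUB
cd ..

# HPC clone: receives the metadata and will later hold the large data.
git clone datahub.git hpc
ls hpc/assays/MyAssay                  # the metadata arrived via the remote
```

On each machine, regular `git pull` / `git push` (or the ARC Commander's sync equivalent) keeps the clones up to date with the DataHUB copy.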
micwij commented
I think the above are a great basis for recommendations!
I have worked on an ARC across machines, and, as suggested, `git pull`
worked well to fetch the changes made on the respective other machine, so running it regularly could also be a good general recommendation.
However, this was before I added the raw data to the ARC. Now I am wondering how to, e.g., synchronize my laptop without pulling all those large files. If we find a solution, that would also be an important recommendation.
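One possible answer, assuming the large files are tracked with standard Git LFS: cloning with `GIT_LFS_SKIP_SMUDGE=1` checks out only the small pointer files, and individual datasets can then be fetched selectively. A sketch (the repository URL and the assay path are placeholders):

```shell
# Clone without downloading any LFS-tracked content; large files stay
# as small pointer files until explicitly requested.
GIT_LFS_SKIP_SMUDGE=1 git clone https://git.example.org/mygroup/my-arc.git
cd my-arc

# Make this clone skip LFS downloads permanently (per-repository setting).
git lfs install --local --skip-smudge

# Later, fetch just one dataset when it is actually needed.
git lfs pull --include="assays/MyAssay/dataset/**"
```

With this setup, `git pull` on the laptop still syncs all metadata and scripts, while the raw data only ever lands on the machines that run `git lfs pull` for it.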