EleutherAI/gpt-neox

files in multi-node training

mackmake opened this issue · 2 comments

I want to train on multiple nodes. Is it necessary to copy the preprocessed dataset files and the tokenizer to all nodes, or do all nodes read these files from the master node?

Typically we store them on a shared file storage system that every node has access to, but if you are training multi-node and need the data stored locally, then yes, you'll need to copy them to each node's storage.
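For reference, the dataset and tokenizer locations are just paths in the training config, so using a shared filesystem only means pointing those paths at the shared mount on every node. A minimal sketch (the `/mnt/shared` paths are hypothetical, adjust to your setup):

```yaml
# Excerpt of a gpt-neox config (e.g. your copy of local_setup.yml).
# The paths below are illustrative: they assume an NFS or similar mount
# that is visible at the same location on every node.
{
  "data-path": "/mnt/shared/data/mydataset_text_document",
  "vocab-file": "/mnt/shared/tokenizer/gpt2-vocab.json",
  "merge-file": "/mnt/shared/tokenizer/gpt2-merges.txt",
}
```

If you instead copy the files to local disk on each node, the same keys work as long as the local paths are identical across nodes.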

Thanks, shared storage was a nice idea ^_^