/embeddings

An attempt to shard large embedding tables across multiple devices, paying attention to the communication costs of parallel inference.
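As a rough illustration of the idea (not the repository's actual code), the sketch below simulates row-wise sharding of an embedding table with NumPy and counts how many bytes a lookup would pull from the shards. The function names `shard_rows` and `sharded_lookup`, the contiguous row-wise split, and the byte-counting model are all assumptions made for this example; the "devices" are simply separate arrays in one process.

```python
import numpy as np


def shard_rows(table, num_devices):
    """Split an embedding table row-wise into contiguous shards (hypothetical helper)."""
    return np.array_split(table, num_devices, axis=0)


def sharded_lookup(shards, ids):
    """Gather embedding rows for `ids` from row-wise shards.

    Returns the gathered vectors and an upper bound on the bytes moved,
    assuming the requester holds no shard itself (an assumption of this sketch).
    """
    sizes = [s.shape[0] for s in shards]
    # offsets[d] is the global row index where shard d starts.
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    dim = shards[0].shape[1]
    out = np.empty((len(ids), dim), dtype=shards[0].dtype)

    # Map every global id to the shard that owns it.
    owner = np.searchsorted(offsets, ids, side="right") - 1

    bytes_moved = 0
    for dev, shard in enumerate(shards):
        mask = owner == dev
        if not mask.any():
            continue
        local_ids = ids[mask] - offsets[dev]  # global -> shard-local row index
        vecs = shard[local_ids]
        out[mask] = vecs
        bytes_moved += vecs.nbytes  # each returned vector counts as traffic
    return out, bytes_moved


if __name__ == "__main__":
    table = np.arange(40, dtype=np.float32).reshape(10, 4)
    shards = shard_rows(table, num_devices=4)
    ids = np.array([0, 9, 3, 3])
    vectors, traffic = sharded_lookup(shards, ids)
    print(vectors.shape, traffic)
```

In this model the traffic grows with the number of ids requested times the embedding width, regardless of table size. Duplicate ids (such as `3` above) are fetched twice, so deduplicating before the lookup is one obvious way to cut communication.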

Primary Language: Python
