aashiqmuhamed/GRASS
GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
MIT
Issues
- 0
About the LLaMA-1B trained on C4
#1 opened by shixiangsong