Pinned Repositories
FedScale
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Hydra
Hydra adds resilience and high availability to remote memory solutions.
Infiniswap
Infiniswap enables unmodified applications to efficiently use disaggregated memory.
Justitia
Justitia provides RDMA isolation between applications with diverse requirements.
Leap
Prefetching and an efficient data path for memory disaggregation
ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
Oobleck
A resilient distributed training framework
Oort
Efficient Federated Learning via Guided Participant Selection
Salus
Fine-grained GPU sharing primitives
Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.