Pinned Repositories
cachew
ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).
cachew_experiments
ML Input Data Processing as a Service
cloud-comp-arch-project
Starter code for the semester project in the Cloud Computing Architecture course at ETH Zurich
cloudlab_extension
Copy node connection information easily
deltazip
Compression for Foundation Models
dirigent
Dirigent: Lightweight Serverless Orchestration
fmengine
Utilities for Training Very Large Models
mixtera-model-eval
Nuts and bolts for evaluating models trained in the context of Mixtera
modyn
Modyn is a research platform for training ML models on growing datasets.
orion
An interference-aware scheduler for fine-grained GPU sharing
eth-easl's Repositories
eth-easl/orion
An interference-aware scheduler for fine-grained GPU sharing
eth-easl/fmengine
Utilities for Training Very Large Models
eth-easl/cachew
ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).
eth-easl/modyn
Modyn is a research platform for training ML models on growing datasets.
eth-easl/dirigent
Dirigent: Lightweight Serverless Orchestration
eth-easl/deltazip
Compression for Foundation Models
eth-easl/cachew_experiments
ML Input Data Processing as a Service
eth-easl/cloud-comp-arch-project
Starter code for the semester project in the Cloud Computing Architecture course at ETH Zurich
eth-easl/cloudlab_extension
Copy node connection information easily
eth-easl/adaptdl
Resource-adaptive cluster scheduler for deep learning training.
eth-easl/airflow
eth-easl/cglm-metadata
Hosts CGLM metadata
eth-easl/CheckFreq
eth-easl/elastic
PyTorch elastic training
eth-easl/elastic-learning-rate-evaluation
eth-easl/memcache-perf-dynamic
Load generator for memcached (multi-threaded, multi-machine)
eth-easl/mixtera-model-eval
Nuts and bolts for evaluating models trained in the context of Mixtera
eth-easl/mlibc
Portable C standard library
eth-easl/ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
eth-easl/rWasm
A cross-platform high-performance provably-safe sandboxing Wasm-to-native compiler
eth-easl/serving
Kubernetes-based, scale-to-zero, request-driven compute
eth-easl/varuna
eth-easl/nanotron
Minimalistic large language model 3D-parallelism training
eth-easl/pccheck
eth-easl/pecan-experiments
Contains instructions and scripts for the ATC'24 Pecan artifact evaluation.
eth-easl/triteia
Useful Kernels for ML in Triton
eth-easl/vidur
A large-scale simulation framework for LLM inference