Pinned Repositories
attention-cnn
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
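The paper's core result is that multi-head self-attention with relative positional encodings can express convolution. A minimal numpy sketch of that observation (illustrative only, not code from the repository; all names and dimensions are made up): when each head's attention is a one-hot on a fixed relative offset, summing the heads reproduces a convolution.

```python
# Sketch: attention whose weights depend only on relative position is a convolution.
import numpy as np

T, D = 16, 4                                  # sequence length, feature dim
x = np.random.randn(T, D)

offsets = [-1, 0, 1]                          # a width-3 kernel, one "head" per offset
W = [np.random.randn(D, D) for _ in offsets]  # per-head value projections

# Attention-style computation: head k puts all its weight on position t+offset.
out_attn = np.zeros((T, D))
for off, Wv in zip(offsets, W):
    A = np.zeros((T, T))                      # attention matrix, one-hot per row
    for t in range(T):
        s = t + off
        if 0 <= s < T:
            A[t, s] = 1.0
    out_attn += A @ x @ Wv

# Direct convolution with the same per-offset weights matches exactly.
out_conv = np.zeros((T, D))
for t in range(T):
    for off, Wv in zip(offsets, W):
        s = t + off
        if 0 <= s < T:
            out_conv[t] += x[s] @ Wv

assert np.allclose(out_attn, out_conv)
```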
collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
disco
DISCO is a code-free, installation-free browser platform that lets non-technical users collaboratively train machine learning models without sharing any private data.
dynamic-sparse-flash-attention
federated-learning-public-code
landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
ML_course
EPFL Machine Learning Course, Fall 2025
OptML_course
EPFL Course - Optimization for Machine Learning - CS-439
powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
sent2vec
General purpose unsupervised sentence representations
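A rough sketch of the composition step behind sent2vec (toy random vectors standing in for learned embeddings; `embed_sentence` below is a hypothetical helper, not the library's API): a sentence embedding is an average over the embeddings of the sentence's words and word n-grams.

```python
# Toy illustration of sentence embedding as an average of word/bigram vectors.
import numpy as np

dim = 8
rng = np.random.default_rng(0)
emb = {}                                        # token or bigram -> vector (toy)

def vec(token):
    if token not in emb:
        emb[token] = rng.standard_normal(dim)   # stand-in for learned vectors
    return emb[token]

def embed_sentence(sentence):
    words = sentence.lower().split()
    units = words + [f"{a}_{b}" for a, b in zip(words, words[1:])]  # words + bigrams
    return np.mean([vec(u) for u in units], axis=0)

print(embed_sentence("general purpose sentence representations").shape)  # (8,)
```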
EPFL Machine Learning and Optimization Laboratory's Repositories
epfml/ML_course
EPFL Machine Learning Course, Fall 2025
epfml/OptML_course
EPFL Course - Optimization for Machine Learning - CS-439
epfml/landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
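A heavily simplified numpy sketch of the landmark idea (assumptions: the paper trains dedicated landmark tokens and folds block selection into the attention softmax; here each block's landmark is just the mean of its keys, and causal masking is omitted): each query scores one landmark per block and attends only within its top-k blocks, giving random access into a long context.

```python
# Simplified block-retrieval attention via per-block landmark keys.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

T, D, B, k = 64, 16, 8, 2                 # tokens, dim, block size, blocks kept
q = np.random.randn(T, D)
K = np.random.randn(T, D)
V = np.random.randn(T, D)

nb = T // B
landmarks = K.reshape(nb, B, D).mean(axis=1)   # one landmark key per block

out = np.zeros((T, D))
for t in range(T):
    # pick the k most relevant blocks for this query via landmark scores
    blocks = np.argsort(q[t] @ landmarks.T)[-k:]
    idx = np.concatenate([np.arange(b * B, (b + 1) * B) for b in blocks])
    w = softmax(q[t] @ K[idx].T / np.sqrt(D))  # attend only inside chosen blocks
    out[t] = w @ V[idx]
```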
epfml/disco
DISCO is a code-free, installation-free browser platform that lets non-technical users collaboratively train machine learning models without sharing any private data.
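DISCO itself is a browser platform, so the sketch below only illustrates the principle it builds on, in the spirit of federated averaging (all names and numbers are invented): clients train on their private data and share model weights, never the data itself.

```python
# Federated averaging on a toy linear-regression problem with four clients.
import numpy as np

rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(4):                           # four clients with private data
    X = rng.standard_normal((50, 2))
    y = X @ w_true + 0.1 * rng.standard_normal(50)
    clients.append((X, y))

w = np.zeros(2)                              # shared global model
for _ in range(20):                          # communication rounds
    local = []
    for X, y in clients:                     # each client trains locally
        wi = w.copy()
        for _ in range(5):                   # a few private SGD steps
            wi -= 0.05 * (2 / len(y)) * X.T @ (X @ wi - y)
        local.append(wi)                     # only the weights leave the client
    w = np.mean(local, axis=0)               # server averages the local models

print(np.round(w, 2))                        # ≈ [ 2. -1.]
```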
epfml/collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
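A toy sketch of the paper's reparameterization (invented dimensions, not repository code): rather than giving every head its own query/key projections, all heads share one projection pair and each head re-weights the shared dimensions with a learned mixing vector.

```python
# Collaborative heads: shared Q/K projections plus per-head mixing vectors.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

T, D, Dk, H = 10, 32, 16, 4           # tokens, model dim, shared key dim, heads
x = np.random.randn(T, D)
Wq = np.random.randn(D, Dk)           # shared across all heads
Wk = np.random.randn(D, Dk)
M = np.random.rand(H, Dk)             # one mixing vector per head

Q, K = x @ Wq, x @ Wk                 # computed once, reused by every head
for h in range(H):
    scores = (Q * M[h]) @ K.T / np.sqrt(Dk)   # equals Q diag(M[h]) K^T
    A = softmax(scores)                        # head-specific attention pattern
    print(h, A.shape)                          # (T, T)
```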
epfml/dynamic-sparse-flash-attention
epfml/powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
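A minimal single-node sketch of the rank-r compressor at the heart of PowerSGD (the repository implements the distributed version with error feedback; names here are illustrative): one power-iteration step with a warm-started factor, so only the two small factors need to be communicated.

```python
# Rank-r gradient compression by a single power-iteration step.
import numpy as np

def powersgd_step(M, Q):
    """Approximate gradient matrix M with rank-r factors P @ Q.T."""
    P = M @ Q                      # (n, r): first factor
    P, _ = np.linalg.qr(P)         # orthogonalize for a stable iteration
    Q = M.T @ P                    # (m, r): second factor, reused next step
    return P, Q                    # communicate P and Q instead of full M

n, m, r = 256, 128, 4
M = np.random.randn(n, m)          # stand-in for a layer's gradient
Q = np.random.randn(m, r)          # warm-started across iterations
P, Q = powersgd_step(M, Q)
M_hat = P @ Q.T                    # decompressed gradient
print(M_hat.shape, f"compression {n * m / (n * r + m * r):.1f}x")
```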
epfml/llm-baselines
nanoGPT-like codebase for LLM training
epfml/schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
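A sketch of the warmup-stable-decay ("cooldown") schedule the paper studies (parameter values are illustrative): a long constant phase means one run can be cooled down at many different durations, instead of committing to a training horizon up front.

```python
# Warmup-stable-decay learning-rate schedule with a linear cooldown.
def lr_at(step, max_steps, warmup=100, cooldown_frac=0.2, peak=3e-4):
    cooldown_start = int(max_steps * (1 - cooldown_frac))
    if step < warmup:                         # linear warmup
        return peak * step / warmup
    if step < cooldown_start:                 # long constant phase
        return peak
    # linear decay to zero over the final cooldown_frac of training
    return peak * (max_steps - step) / (max_steps - cooldown_start)

print([round(lr_at(s, 1000), 6) for s in (0, 50, 500, 900, 999)])
```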
epfml/DenseFormer
epfml/optML-pku
Summer school materials
epfml/error-feedback-SGD
SGD with compressed gradients and error-feedback: https://arxiv.org/abs/1901.09847
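A minimal sketch of the error-feedback mechanism (toy quadratic objective and a top-k compressor; not the repository's code): the residual dropped by each compression step is remembered and re-injected, so compression error does not accumulate.

```python
# EF-SGD: compress updates, keep the residual, add it back next step.
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

dim, lr, k = 100, 0.1, 10
w = np.random.randn(dim)
e = np.zeros(dim)                       # error-feedback memory

for step in range(200):
    g = 2 * w                           # gradient of f(w) = ||w||^2 (toy)
    corrected = lr * g + e              # add back what was dropped before
    update = top_k(corrected, k)        # transmit only k coordinates
    e = corrected - update              # remember the compression residual
    w -= update

print(np.linalg.norm(w))                # → ~0: converges despite compression
```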
epfml/llm-optimizer-benchmark
Benchmarking Optimizers for LLM Pretraining
epfml/getting-started
epfml/pam
epfml/REQ
epfml/relaysgd
Code for the paper "RelaySum for Decentralized Deep Learning on Heterogeneous Data"
epfml/CoTFormer
epfml/easy-summary
Difficulty-guided text summarization
epfml/personalized-collaborative-llms
Exploration of on-device, self-supervised, collaborative fine-tuning of large language models with limited local data availability, using Low-Rank Adaptation (LoRA). We introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based, and validation performance-based.
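A sketch of the first of those schemes, weight-similarity-based trust (simplified to flat parameter vectors rather than LoRA adapters; all names are illustrative): each client weights its peers' gradients by how similar the peers' parameters are to its own.

```python
# Trust-weighted gradient aggregation from parameter similarity.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

rng = np.random.default_rng(0)
params = [rng.standard_normal(16) for _ in range(4)]    # per-client parameters
grads = [rng.standard_normal(16) for _ in range(4)]     # per-client gradients

i = 0                                                   # the aggregating client
trust = softmax(np.array([cosine(params[i], p) for p in params]))
agg = sum(t * g for t, g in zip(trust, grads))          # trust-weighted gradient
print(np.round(trust, 2))
```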
epfml/ghost-noise
epfml/fineweb2-hq
Code for the paper "Enhancing Multilingual LLM Pretraining with Model-Based Data Selection"
epfml/CoMiGS
epfml/TiMoE
A time-aware language modeling framework
epfml/CoBo
epfml/DoGE
Codebase for the ICML submission "DoGE: Domain Reweighting with Generalization Estimation"
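A simplified sketch of the domain-reweighting step (a loose reading of the idea, not the repository's algorithm): domains whose gradients align with the gradient of a held-out generalization objective get their sampling weight multiplicatively increased.

```python
# Multiplicative domain-weight update from gradient alignment.
import numpy as np

rng = np.random.default_rng(0)
num_domains, dim, eta = 3, 8, 0.5
domain_grads = [rng.standard_normal(dim) for _ in range(num_domains)]
val_grad = rng.standard_normal(dim)        # generalization-estimation gradient

w = np.ones(num_domains) / num_domains     # domain sampling weights
align = np.array([g @ val_grad for g in domain_grads])
w = w * np.exp(eta * align)                # reward aligned domains
w /= w.sum()                               # renormalize to a distribution
print(np.round(w, 3))
```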
epfml/epfml-utils
Tools for experimentation and for using run:ai. These aim to be small, self-contained utilities used by multiple people.
epfml/getting-started-lauzhack
epfml/grad-norm-smooth
Official implementation of "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"
epfml/semester-project-personalization