Pinned Repositories
attention-cnn
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
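The paper's core result is that multi-head self-attention with relative positional encodings can express convolution. A minimal numpy sketch of that observation (illustrative only, not code from the repository; all names and dimensions are made up): when each head's attention is a one-hot on a fixed relative offset, summing the heads reproduces a convolution.

```python
# Sketch: attention whose weights depend only on relative position is a convolution.
import numpy as np

T, D = 16, 4                                  # sequence length, feature dim
x = np.random.randn(T, D)

offsets = [-1, 0, 1]                          # a width-3 kernel, one "head" per offset
W = [np.random.randn(D, D) for _ in offsets]  # per-head value projections

# Attention-style computation: head k puts all its weight on position t+offset.
out_attn = np.zeros((T, D))
for off, Wv in zip(offsets, W):
    A = np.zeros((T, T))                      # attention matrix, one-hot per row
    for t in range(T):
        s = t + off
        if 0 <= s < T:
            A[t, s] = 1.0
    out_attn += A @ x @ Wv

# Direct convolution with the same per-offset weights matches exactly.
out_conv = np.zeros((T, D))
for t in range(T):
    for off, Wv in zip(offsets, W):
        s = t + off
        if 0 <= s < T:
            out_conv[t] += x[s] @ Wv

assert np.allclose(out_attn, out_conv)
```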
collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
disco
DISCO is a code-free, installation-free browser platform that lets non-technical users collaboratively train machine learning models without sharing any private data.
dynamic-sparse-flash-attention
federated-learning-public-code
landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
ML_course
EPFL Machine Learning Course, Fall 2025
OptML_course
EPFL Course - Optimization for Machine Learning - CS-439
powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
sent2vec
General purpose unsupervised sentence representations
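A rough sketch of the composition step behind sent2vec (toy random vectors standing in for learned embeddings; `embed_sentence` below is a hypothetical helper, not the library's API): a sentence embedding is an average over the embeddings of the sentence's words and word n-grams.

```python
# Toy illustration of sentence embedding as an average of word/bigram vectors.
import numpy as np

dim = 8
rng = np.random.default_rng(0)
emb = {}                                        # token or bigram -> vector (toy)

def vec(token):
    if token not in emb:
        emb[token] = rng.standard_normal(dim)   # stand-in for learned vectors
    return emb[token]

def embed_sentence(sentence):
    words = sentence.lower().split()
    units = words + [f"{a}_{b}" for a, b in zip(words, words[1:])]  # words + bigrams
    return np.mean([vec(u) for u in units], axis=0)

print(embed_sentence("general purpose sentence representations").shape)  # (8,)
```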
EPFL Machine Learning and Optimization Laboratory's Repositories
epfml/ML_course
EPFL Machine Learning Course, Fall 2025
epfml/OptML_course
EPFL Course - Optimization for Machine Learning - CS-439
epfml/landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
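A heavily simplified numpy sketch of the landmark idea (assumptions: the paper trains dedicated landmark tokens and folds block selection into the attention softmax; here each block's landmark is just the mean of its keys, and causal masking is omitted): each query scores one landmark per block and attends only within its top-k blocks, giving random access into a long context.

```python
# Simplified block-retrieval attention via per-block landmark keys.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

T, D, B, k = 64, 16, 8, 2                 # tokens, dim, block size, blocks kept
q = np.random.randn(T, D)
K = np.random.randn(T, D)
V = np.random.randn(T, D)

nb = T // B
landmarks = K.reshape(nb, B, D).mean(axis=1)   # one landmark key per block

out = np.zeros((T, D))
for t in range(T):
    # pick the k most relevant blocks for this query via landmark scores
    blocks = np.argsort(q[t] @ landmarks.T)[-k:]
    idx = np.concatenate([np.arange(b * B, (b + 1) * B) for b in blocks])
    w = softmax(q[t] @ K[idx].T / np.sqrt(D))  # attend only inside chosen blocks
    out[t] = w @ V[idx]
```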
epfml/disco
DISCO is a code-free, installation-free browser platform that lets non-technical users collaboratively train machine learning models without sharing any private data.
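DISCO itself is a browser platform, so the sketch below only illustrates the principle it builds on, in the spirit of federated averaging (all names and numbers are invented): clients train on their private data and share model weights, never the data itself.

```python
# Federated averaging on a toy linear-regression problem with four clients.
import numpy as np

rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(4):                           # four clients with private data
    X = rng.standard_normal((50, 2))
    y = X @ w_true + 0.1 * rng.standard_normal(50)
    clients.append((X, y))

w = np.zeros(2)                              # shared global model
for _ in range(20):                          # communication rounds
    local = []
    for X, y in clients:                     # each client trains locally
        wi = w.copy()
        for _ in range(5):                   # a few private SGD steps
            wi -= 0.05 * (2 / len(y)) * X.T @ (X @ wi - y)
        local.append(wi)                     # only the weights leave the client
    w = np.mean(local, axis=0)               # server averages the local models

print(np.round(w, 2))                        # ≈ [ 2. -1.]
```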
epfml/collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
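A toy sketch of the paper's reparameterization (invented dimensions, not repository code): rather than giving every head its own query/key projections, all heads share one projection pair and each head re-weights the shared dimensions with a learned mixing vector.

```python
# Collaborative heads: shared Q/K projections plus per-head mixing vectors.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

T, D, Dk, H = 10, 32, 16, 4           # tokens, model dim, shared key dim, heads
x = np.random.randn(T, D)
Wq = np.random.randn(D, Dk)           # shared across all heads
Wk = np.random.randn(D, Dk)
M = np.random.rand(H, Dk)             # one mixing vector per head

Q, K = x @ Wq, x @ Wk                 # computed once, reused by every head
for h in range(H):
    scores = (Q * M[h]) @ K.T / np.sqrt(Dk)   # equals Q diag(M[h]) K^T
    A = softmax(scores)                        # head-specific attention pattern
    print(h, A.shape)                          # (T, T)
```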
epfml/dynamic-sparse-flash-attention
epfml/powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
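A minimal single-node sketch of the rank-r compressor at the heart of PowerSGD (the repository implements the distributed version with error feedback; names here are illustrative): one power-iteration step with a warm-started factor, so only the two small factors need to be communicated.

```python
# Rank-r gradient compression by a single power-iteration step.
import numpy as np

def powersgd_step(M, Q):
    """Approximate gradient matrix M with rank-r factors P @ Q.T."""
    P = M @ Q                      # (n, r): first factor
    P, _ = np.linalg.qr(P)         # orthogonalize for a stable iteration
    Q = M.T @ P                    # (m, r): second factor, reused next step
    return P, Q                    # communicate P and Q instead of full M

n, m, r = 256, 128, 4
M = np.random.randn(n, m)          # stand-in for a layer's gradient
Q = np.random.randn(m, r)          # warm-started across iterations
P, Q = powersgd_step(M, Q)
M_hat = P @ Q.T                    # decompressed gradient
print(M_hat.shape, f"compression {n * m / (n * r + m * r):.1f}x")
```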
epfml/llm-baselines
nanoGPT-like codebase for LLM training
epfml/schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
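A sketch of the warmup-stable-decay ("cooldown") schedule the paper studies (parameter values are illustrative): a long constant phase means one run can be cooled down at many different durations, instead of committing to a training horizon up front.

```python
# Warmup-stable-decay learning-rate schedule with a linear cooldown.
def lr_at(step, max_steps, warmup=100, cooldown_frac=0.2, peak=3e-4):
    cooldown_start = int(max_steps * (1 - cooldown_frac))
    if step < warmup:                         # linear warmup
        return peak * step / warmup
    if step < cooldown_start:                 # long constant phase
        return peak
    # linear decay to zero over the final cooldown_frac of training
    return peak * (max_steps - step) / (max_steps - cooldown_start)

print([round(lr_at(s, 1000), 6) for s in (0, 50, 500, 900, 999)])
```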
epfml/DenseFormer
epfml/optML-pku
Summer school materials
epfml/error-feedback-SGD
SGD with compressed gradients and error-feedback: https://arxiv.org/abs/1901.09847
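A minimal sketch of the error-feedback mechanism (toy quadratic objective and a top-k compressor; not the repository's code): the residual dropped by each compression step is remembered and re-injected, so compression error does not accumulate.

```python
# EF-SGD: compress updates, keep the residual, add it back next step.
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

dim, lr, k = 100, 0.1, 10
w = np.random.randn(dim)
e = np.zeros(dim)                       # error-feedback memory

for step in range(200):
    g = 2 * w                           # gradient of f(w) = ||w||^2 (toy)
    corrected = lr * g + e              # add back what was dropped before
    update = top_k(corrected, k)        # transmit only k coordinates
    e = corrected - update              # remember the compression residual
    w -= update

print(np.linalg.norm(w))                # → ~0: converges despite compression
```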
epfml/llm-optimizer-benchmark
Benchmarking Optimizers for LLM Pretraining
epfml/getting-started
epfml/pam
epfml/REQ
epfml/relaysgd
Code for the paper "RelaySum for Decentralized Deep Learning on Heterogeneous Data"
epfml/CoTFormer
epfml/easy-summary
Difficulty-guided text summarization
epfml/personalized-collaborative-llms
Exploration of on-device, self-supervised, collaborative fine-tuning of large language models with limited local data availability, using Low-Rank Adaptation (LoRA). We introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based, and validation performance-based.
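A sketch of the first of those schemes, weight-similarity-based trust (simplified to flat parameter vectors rather than LoRA adapters; all names are illustrative): each client weights its peers' gradients by how similar the peers' parameters are to its own.

```python
# Trust-weighted gradient aggregation from parameter similarity.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

rng = np.random.default_rng(0)
params = [rng.standard_normal(16) for _ in range(4)]    # per-client parameters
grads = [rng.standard_normal(16) for _ in range(4)]     # per-client gradients

i = 0                                                   # the aggregating client
trust = softmax(np.array([cosine(params[i], p) for p in params]))
agg = sum(t * g for t, g in zip(trust, grads))          # trust-weighted gradient
print(np.round(trust, 2))
```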
epfml/ghost-noise
epfml/fineweb2-hq
Code for the paper "Enhancing Multilingual LLM Pretraining with Model-Based Data Selection"
epfml/CoMiGS
epfml/TiMoE
A time-aware language modeling framework
epfml/CoBo
epfml/DoGE
Codebase for the ICML submission "DoGE: Domain Reweighting with Generalization Estimation"
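A simplified sketch of the domain-reweighting step (a loose reading of the idea, not the repository's algorithm): domains whose gradients align with the gradient of a held-out generalization objective get their sampling weight multiplicatively increased.

```python
# Multiplicative domain-weight update from gradient alignment.
import numpy as np

rng = np.random.default_rng(0)
num_domains, dim, eta = 3, 8, 0.5
domain_grads = [rng.standard_normal(dim) for _ in range(num_domains)]
val_grad = rng.standard_normal(dim)        # generalization-estimation gradient

w = np.ones(num_domains) / num_domains     # domain sampling weights
align = np.array([g @ val_grad for g in domain_grads])
w = w * np.exp(eta * align)                # reward aligned domains
w /= w.sum()                               # renormalize to a distribution
print(np.round(w, 3))
```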
epfml/epfml-utils
Tools for experimentation and for using run:ai. These aim to be small, self-contained utilities used by multiple people.
epfml/getting-started-lauzhack
epfml/grad-norm-smooth
Official implementation of "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"
epfml/semester-project-personalization