ml-efficiency

There are 3 repositories under the ml-efficiency topic.

mosaicml/composer
Supercharge Your Model Training
Language: Python
stsxxx/MoDM
MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.
Language: Python
MyDarapy/SmolLM-experiments-with-grouped-query-attention
(Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped query attention (GQA)
Language: Python