ml-efficiency

There are 3 repositories under the ml-efficiency topic.

  • mosaicml/composer

    Supercharge Your Model Training

    Language: Python · 5.4k · 49563452
  • stsxxx/MoDM

    MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.

    Language: Python · 3
  • MyDarapy/SmolLM-experiments-with-grouped-query-attention

    (Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped query attention (GQA).

    Language: Python · 1100