Supercharge Your Model Training

Composer fork for Mamba models

This repository is a fork of the Composer library, extended for training Mamba models with the following features:

  • Custom block-wise activation checkpointing (see the first sketch after this list)
  • Custom FSDP layer wrapping for Mamba
  • The WSD (warmup-stable-decay) learning-rate scheduler (see the second sketch below)
  • FLOPs computation for Mamba (see the third sketch below)
  • Custom, efficient dataloading
  • Improved logging
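
As a rough illustration of the first item, here is a minimal sketch of block-wise activation checkpointing using PyTorch's `torch.utils.checkpoint`. The helper name and the `checkpoint_every` knob are hypothetical, not this fork's API; see the mamba directory for the actual implementation.

```python
from torch.utils.checkpoint import checkpoint


def forward_with_blockwise_checkpointing(blocks, x, checkpoint_every=2):
    """Hypothetical sketch: checkpoint every k-th block in a stack.

    Checkpointed blocks discard their intermediate activations in the
    forward pass and recompute them during backward, trading compute
    for memory.
    """
    for i, block in enumerate(blocks):
        if i % checkpoint_every == 0:
            # Recompute this block's activations in the backward pass.
            x = checkpoint(block, x, use_reentrant=False)
        else:
            x = block(x)
    return x
```

Checkpointing only a subset of blocks (rather than all of them) is the usual way to tune the memory/compute trade-off per model size.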
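The WSD schedule holds the learning rate constant between a linear warmup and a final decay phase. Below is a minimal sketch, assuming linear warmup, linear decay, and illustrative phase fractions; the function name and defaults are placeholders, and the scheduler this fork actually provides lives in the codebase.

```python
def wsd_lr(step: int, max_steps: int, peak_lr: float,
           warmup_frac: float = 0.01, decay_frac: float = 0.1,
           min_lr: float = 0.0) -> float:
    """Hypothetical warmup-stable-decay (WSD) learning-rate schedule."""
    warmup_steps = max(1, int(warmup_frac * max_steps))
    decay_start = max_steps - int(decay_frac * max_steps)
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable plateau at peak_lr.
        return peak_lr
    # Linear decay from peak_lr down to min_lr.
    decay_steps = max(1, max_steps - decay_start)
    frac = min(1.0, (step - decay_start) / decay_steps)
    return peak_lr + (min_lr - peak_lr) * frac
```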
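For the FLOPs item, the only generic statement that can be made here is the common dense-training rule of thumb of roughly 6 FLOPs per parameter per token (forward plus backward); Mamba-specific accounting must additionally count the selective-scan (SSM) recurrence, and the exact counting used by this fork is in the codebase. A sketch of the baseline estimate only:

```python
def approx_training_flops(num_params: int, num_tokens: int) -> float:
    """Generic ~6*N*D dense-training FLOPs estimate (an assumption;
    NOT this fork's Mamba-specific accounting)."""
    return 6.0 * num_params * num_tokens
```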

More details and instructions on how to use and train Mamba models with this codebase can be found in the dedicated mamba directory.