Spotty

Train Deep Learning models on AWS Spot Instances


Spotty drastically simplifies training of deep learning models on AWS:

  • it makes training on AWS GPU instances as simple as training on your local computer
  • it automatically manages all necessary AWS resources including AMIs, volumes, snapshots and SSH keys
  • it lets anyone train your model on AWS with a couple of commands
  • it uses tmux to easily detach remote processes from their terminals
  • it saves you up to 70% of the costs by using Spot Instances
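To make the "up to 70%" figure concrete, here is a minimal sketch of the arithmetic. The hourly rate and the number of hours are hypothetical placeholders, not real AWS prices, and the actual Spot discount varies by instance type, region and time.

```python
def spot_cost(on_demand_hourly: float, hours: float, discount: float = 0.70) -> float:
    """Estimate the cost of a training run at a given Spot discount.

    `discount` is the fraction saved relative to the On-Demand price;
    0.70 is the best case quoted above, not a guaranteed value.
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return on_demand_hourly * hours * (1.0 - discount)


if __name__ == "__main__":
    rate = 0.90   # hypothetical On-Demand price in USD per hour
    hours = 100   # hypothetical length of a training run
    print(f"On-Demand: ${rate * hours:.2f}")
    print(f"Spot (best case): ${spot_cost(rate, hours):.2f}")
```

Check the current Spot price for your instance type and region before relying on an estimate like this.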

Documentation

Installation

Requirements:

Use pip to install or upgrade Spotty:

$ pip install -U spotty

Get Started

  1. Prepare a spotty.yaml file and put it in the root directory of your project:

    • See the file specification here.
    • Read this article for a real-world example.
  2. Create an AMI. Run the following command from the root directory of your project:

    $ spotty aws create-ami

    In a few minutes you will have an AMI with NVIDIA Docker, which Spotty will use for all your projects within that AWS region.

  3. Start an instance:

    $ spotty start

    This launches a Spot Instance, restores snapshots if there are any, synchronizes the project with the running instance, and starts the Docker container with the environment.

  4. Train a model or run notebooks.

    You can run custom scripts inside the Docker container using the spotty run <SCRIPT_NAME> command. Read more about custom scripts in the documentation: Configuration: "scripts" section.

    To connect to the running container via SSH, use the following command:

    $ spotty ssh

    This opens a tmux session, so you can detach from it at any time by pressing Ctrl + b, then d. To reattach to that session later, just run the spotty ssh command again.
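If you want to drive steps 3 and 4 from a Python script rather than typing them by hand, the following is a minimal sketch. It only wraps the spotty start and spotty run <SCRIPT_NAME> commands shown above; the helper names and the script name "train" are hypothetical, and the sketch assumes the spotty executable is on your PATH and that a script with that name exists in your spotty.yaml.

```python
import shlex
import subprocess
from typing import List


def spotty_commands(script_name: str) -> List[List[str]]:
    """Return the CLI invocations for starting an instance and running a script."""
    return [
        ["spotty", "start"],
        ["spotty", "run", script_name],
    ]


def run_workflow(script_name: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in spotty_commands(script_name):
        print("$ " + " ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # check=True stops the workflow if a command fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "train" is a hypothetical script name; dry_run=True only prints the commands.
    run_workflow("train", dry_run=True)
```

Run it from the root directory of your project, and pass dry_run=False only once you are ready to launch a real instance.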

Contributions

Any feedback or contributions are welcome! Please check out the guidelines.

License

MIT License