FedScale

FedScale is a scalable and extensible open-source federated learning (FL) engine. It provides high-level APIs to implement FL algorithms and to deploy and evaluate them at scale across diverse hardware and software backends. FedScale also includes the largest FL benchmark, with tasks ranging from image classification and object detection to language modeling and speech recognition, along with datasets that faithfully emulate the environments where FL will realistically be deployed.

http://fedscale.ai

Getting Started

Installing From Source

FedScale requires Python 3.7 or later. We use Anaconda environments to manage its dependencies. FedScale also requires the FEDSCALE_HOME environment variable to be set to the path of the FedScale installation directory.

Once you have Anaconda installed, run the following commands, assuming your current directory is where you cloned FedScale.

export FEDSCALE_HOME=$PWD

conda init bash
. ~/.bash_profile
conda env create -f environment.yml
conda activate fedscale
pip install -e .
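
A quick way to confirm the setup is a short Python check. This is a minimal sketch; it assumes the package installs under the name fedscale (matching the fedscale/ source directory) and that FEDSCALE_HOME was exported as above.

# Post-install sanity check (assumes the importable package is named `fedscale`).
import importlib.util
import os

# FEDSCALE_HOME should point at your FedScale clone, as exported above.
print("FEDSCALE_HOME =", os.environ.get("FEDSCALE_HOME", "<not set>"))

# `pip install -e .` should make the fedscale package importable.
print("fedscale importable:", importlib.util.find_spec("fedscale") is not None)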

Finally, install NVIDIA CUDA 10.2 or above if you want to use FedScale with GPU support.
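
After installing CUDA, you can check that the environment actually sees your GPUs. This is a minimal sketch; it assumes PyTorch is available in the fedscale conda environment.

# GPU visibility check (assumes PyTorch is installed in the fedscale environment).
import torch

print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())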

Quick installation on Linux

You can simply run install.sh to install FedScale without CUDA support.

source install.sh  
pip install -e .

You can add --cuda if you want CUDA 10.2.

source install.sh --cuda
pip install -e .

Update install.sh if you prefer different versions of conda and/or CUDA.

Tutorials

Now that you have FedScale installed, you can start exploring it by following one of these introductory tutorials (a conceptual sketch of FL aggregation follows the list).

  1. Explore FedScale datasets
  2. Deploy your FL experiment
  3. Implement an FL algorithm
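
To give a flavor of the kind of logic the third tutorial covers, here is a minimal, framework-agnostic sketch of FedAvg-style aggregation. It is illustrative only and does not use the FedScale API; all names below are hypothetical.

# Conceptual FedAvg aggregation -- illustrative only, not the FedScale API.
from typing import Dict, List

def fedavg_aggregate(client_updates: List[Dict[str, List[float]]],
                     client_sizes: List[int]) -> Dict[str, List[float]]:
    """Average client model weights, weighted by each client's sample count."""
    total = sum(client_sizes)
    aggregated: Dict[str, List[float]] = {}
    for name in client_updates[0]:
        # Weighted sum of this parameter across all participating clients.
        aggregated[name] = [
            sum(update[name][i] * size / total
                for update, size in zip(client_updates, client_sizes))
            for i in range(len(client_updates[0][name]))
        ]
    return aggregated

# Two toy clients, each holding one 2-element parameter "w".
updates = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
print(fedavg_aggregate(updates, client_sizes=[10, 30]))  # {'w': [2.5, 3.5]}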

FedScale Datasets

We are adding more datasets! Please contribute!

FedScale consists of 20+ large-scale, heterogeneous FL datasets covering computer vision (CV), natural language processing (NLP), and miscellaneous tasks. Each dataset comes with its own training, validation, and testing splits. Please go to the ./dataset directory and follow the dataset README for more details.
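
FL datasets are partitioned by client rather than shuffled globally. The sketch below shows one way to summarize such a client-to-sample mapping; the file name and column names are hypothetical, so check the dataset README for the actual layout.

# Hypothetical example: count how many samples each client holds.
# The file name and column names are illustrative assumptions, not
# FedScale's exact on-disk layout -- see the dataset README for specifics.
import csv
from collections import Counter

samples_per_client = Counter()
with open("client_data_mapping.csv", newline="") as f:
    for row in csv.DictReader(f):
        samples_per_client[row["client_id"]] += 1

print("clients:", len(samples_per_client))
print("largest client:", samples_per_client.most_common(1))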

FedScale Runtime

FedScale Runtime is a scalable and extensible deployment and evaluation platform that simplifies and standardizes FL experimental setup and model evaluation. It evolved from our prior system, Oort, which has been shown to scale well and can emulate FL training of thousands of clients in each round.

Please go to the ./core directory and follow the README to set up FL training scripts.
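
To illustrate what emulating thousands of clients per round involves, here is a minimal, self-contained sketch of a round-based simulation loop. It is not the FedScale Runtime API; the constants and the toy "model" are placeholders.

# Minimal sketch of round-based FL emulation -- illustrative only, not the
# FedScale Runtime API. Each round samples a subset of the client population,
# runs a toy local update per client, and aggregates the results.
import random

NUM_CLIENTS = 10_000      # emulated client population
CLIENTS_PER_ROUND = 100   # participants sampled each round
NUM_ROUNDS = 3

global_model = 0.0        # stand-in for real model parameters

for rnd in range(NUM_ROUNDS):
    participants = random.sample(range(NUM_CLIENTS), CLIENTS_PER_ROUND)
    # A real emulator would run local training on each client's data partition.
    updates = [global_model + random.gauss(0, 0.1) for _ in participants]
    global_model = sum(updates) / len(updates)  # FedAvg-style averaging
    print(f"round {rnd}: {len(participants)} clients -> model = {global_model:.4f}")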

Repo Structure

Repo Root
|---- dataset     # FedScale benchmarking datasets
|---- fedscale    # FedScale source code
  |---- core      # Experiment platform of FedScale
|---- examples    # Examples of new plugins
|---- evals       # Backend for FL job submission
    

References

Please read and/or cite the following papers as appropriate if you use FedScale code or data, or to learn more about FedScale.

@inproceedings{fedscale-icml,
  title={FedScale: Benchmarking Model and System Performance of Federated Learning at Scale},
  author={Fan Lai and Yinwei Dai and Sanjay S. Singapuram and Jiachen Liu and Xiangfeng Zhu and Harsha V. Madhyastha and Mosharaf Chowdhury},
  booktitle={ICML},
  year={2022}
}

and

@inproceedings{oort-osdi21,
  title={Oort: Efficient Federated Learning via Guided Participant Selection},
  author={Fan Lai and Xiangfeng Zhu and Harsha V. Madhyastha and Mosharaf Chowdhury},
  booktitle={USENIX Symposium on Operating Systems Design and Implementation (OSDI)},
  year={2021}
}

Contributions and Communication

Please submit issues and pull requests as you find bugs or make improvements to FedScale.

If you have any questions or comments, please join our Slack channel, or email us (fedscale@googlegroups.com).