[ICLR 2024] FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices Using a Computing Power-Aware Scheduler

Primary language: Python. License: MIT.

🧭 FedCompass - Efficient Cross-Silo Federated Learning with a Computing Power Aware Scheduler.

Introduction

FedCompass is a semi-asynchronous federated learning (FL) algorithm. It addresses the time-efficiency problem of synchronous FL algorithms and the model-performance problem of asynchronous FL algorithms (caused by model staleness) by using a COMputing Power Aware Scheduler (COMPASS) to adaptively assign different numbers of local steps to different FL clients and synchronize the arrival of client local models.

This repository is built upon the open-source and highly extensible FL framework APPFL and employs gRPC as the communication protocol, so that you can easily launch FL experiments using FedCompass among distributed FL clients.
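
To make the scheduling idea from the introduction concrete, here is a minimal sketch of assigning local steps by computing power. It is a simplified illustration, not the scheduler implemented in this repository: the function name `assign_local_steps`, the `ms_per_step` speed estimate, and the `q_min`/`q_max` bounds are all assumptions made for the example.

```python
def assign_local_steps(ms_per_step, q_min=20, q_max=100):
    """Pick a local step count per client so that all clients finish together.

    ms_per_step: dict mapping client id -> estimated milliseconds per local step
                 (a stand-in for measured computing power; hypothetical input).
    q_min/q_max: hypothetical lower and upper bounds on local steps.
    """
    # The slowest client runs q_min steps; that fixes the shared time budget.
    budget_ms = max(ms_per_step.values()) * q_min
    steps = {}
    for client_id, ms in ms_per_step.items():
        # Faster clients fit more steps into the same budget, capped at q_max.
        steps[client_id] = max(q_min, min(q_max, budget_ms // ms))
    return steps


speeds = {"slow": 100, "medium": 50, "fast": 10}
print(assign_local_steps(speeds))  # {'slow': 20, 'medium': 40, 'fast': 100}
```

In this toy setting the slow and medium clients both need about 2000 ms, so their local models arrive together. The fast client hits the `q_max` cap and returns early.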

Installation

Install the package by cloning this repository and installing it locally. We also highly recommend creating a virtual environment for easier dependency management.

conda create -n fedcompass python=3.8
conda activate fedcompass
git clone https://github.com/APPFL/FedCompass.git && cd FedCompass
pip install -e .

Launch First Example Experiment

In the examples folder, we provide example scripts to train a CNN on the MNIST dataset using federated learning by running serial simulation, MPI simulation, and gRPC deployment. Specifically, in this repository, we refer to

  • simulation as federated learning experiments that run only on a single machine or a cluster
  • deployment as federated learning experiments that run on multiple distributed machines

Serial Simulation

Go to the examples folder first, then run the following command:

python serial/run_serial.py \
    --server_config config/server_fedavg.yaml \
    --client_config config/client_1.yaml \
    --num_clients 5

where --server_config is the path to the configuration file for the FL server. Note up front that serial simulation only makes sense for synchronous federated learning algorithms. We currently provide three configuration files for the FL server, corresponding to three different FL algorithms:

  • config/server_fedcompass.yaml: FL server for the FedCompass algorithm
  • config/server_fedavg.yaml: FL server for the FedAvg algorithm
  • config/server_fedasync.yaml: FL server for the FedAsync algorithm

--client_config is the path to the base configuration file for the FL clients, and --num_clients is the number of FL clients you would like to simulate.
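
As a rough picture of what a serial simulation does, and why it only suits synchronous algorithms, the sketch below runs the clients one after another inside a single process and then averages their results, FedAvg style. It is a toy with a one-parameter model and is not the code in serial/run_serial.py. The names `local_train`, `fedavg_round`, and `run_serial` are hypothetical.

```python
def local_train(w, samples, lr=0.1, steps=10):
    """Toy local trainer: gradient descent on the mean squared error (w - x)^2."""
    mean = sum(samples) / len(samples)
    for _ in range(steps):
        w -= lr * 2 * (w - mean)  # gradient of the mean of (w - x)^2 over samples
    return w


def fedavg_round(w_global, client_data):
    """One synchronous round: every client trains, then the server averages."""
    # Clients run sequentially here, so there is no real arrival order or
    # staleness to schedule around; all updates are available at once.
    updates = [(local_train(w_global, data), len(data)) for data in client_data]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total  # sample-weighted average


def run_serial(client_data, rounds=20):
    w = 0.0
    for _ in range(rounds):
        w = fedavg_round(w, client_data)
    return w


data = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]  # three simulated clients
print(run_serial(data))
```

Because the loop waits for every client before aggregating, asynchronous and semi-asynchronous behavior cannot be observed this way. The MPI and gRPC modes are the ones to use for those algorithms.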

MPI Simulation

Go to the examples folder first, then run the following command:

mpiexec -n 6 python mpi/run_mpi.py \
    --server_config config/server_fedcompass.yaml \
    --client_config config/client_1.yaml 

where mpiexec -n 6 means that we start 6 MPI processes, and there will be 6-1=5 FL clients, as one MPI process will serve as the FL server.
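
The process-to-role split described above can be sketched as a small function. Which rank acts as the server is an assumption here (rank 0 is used for illustration), and `role_for_rank` is a hypothetical helper, not part of this repository. In a real MPI run the rank and world size would come from the MPI communicator.

```python
def role_for_rank(rank, world_size):
    """Map an MPI rank to an FL role: one server, the rest clients."""
    if world_size < 2:
        raise ValueError("need at least 2 MPI processes: one server and one client")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    # Assumption for this sketch: rank 0 is the server, ranks 1..n-1 are clients.
    return "server" if rank == 0 else f"client_{rank - 1}"


roles = [role_for_rank(r, 6) for r in range(6)]
print(roles)  # one 'server' followed by client_0 .. client_4
```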

gRPC Deployment

Go to the examples folder first. To launch a server, run the following command:

python grpc/run_server.py --config config/server_fedcompass.yaml

The above command launches an FL server at localhost:50051 that waits for connections from two FL clients. To launch the two FL clients, open two separate terminals, go to the examples folder in each, and run the following two commands, one per terminal. This starts an FL experiment with two clients and a server running the specified algorithm.

python grpc/run_client_1.py
python grpc/run_client_2.py

Features

  • Server aggregation algorithm customization
  • Server scheduling algorithm customization
  • Client local trainer customization
  • Synchronous federated learning
  • Asynchronous federated learning
  • Semi-asynchronous federated learning
  • Model and dataset customization
  • Loss function and evaluation metric customization
  • Heterogeneous data partition
  • Lossy compression using SZ compressors
  • Single-node serial federated learning simulation
  • MPI federated learning simulation
  • Real federated learning deployment using gRPC
  • Authentication in gRPC using Globus Identity
  • wandb visualization

Citation

If you find FedCompass and this repository useful for your research, please consider citing the following papers:

@article{li2023fedcompass,
  title={FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices using a Computing Power Aware Scheduler},
  author={Li, Zilinghan and Chaturvedi, Pranshu and He, Shilan and Chen, Han and Singh, Gagandeep and Kindratenko, Volodymyr and Huerta, Eliu A and Kim, Kibaek and Madduri, Ravi},
  journal={arXiv preprint arXiv:2309.14675},
  year={2023}
}
@inproceedings{ryu2022appfl,
  title={APPFL: open-source software framework for privacy-preserving federated learning},
  author={Ryu, Minseok and Kim, Youngdae and Kim, Kibaek and Madduri, Ravi K},
  booktitle={2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)},
  pages={1074--1083},
  year={2022},
  organization={IEEE}
}