FedCompass: Efficient Cross-Silo Federated Learning with a Computing Power Aware Scheduler
FedCompass is a semi-asynchronous federated learning (FL) algorithm that addresses the time-efficiency challenge of synchronous FL algorithms and the model-performance challenge of asynchronous FL algorithms (caused by model staleness) by using a COMputing Power Aware Scheduler (COMPASS) to adaptively assign different numbers of local steps to different FL clients and synchronize the arrival of client local models.
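The core idea can be pictured with a toy Python sketch (purely illustrative, not the actual COMPASS implementation): clients with different speeds receive different numbers of local steps so that their local training finishes at roughly the same wall-clock time. The function name, parameters, and clamping bounds below are assumptions for illustration only.

```python
# Toy sketch of computing-power-aware step assignment (NOT the real COMPASS
# algorithm): give each client as many local steps as fit in a target round
# time, clamped to a minimum and maximum number of steps.
def assign_local_steps(step_times, target_round_time, q_min=5, q_max=50):
    """step_times: estimated seconds per local step for each client id."""
    steps = {}
    for cid, t in step_times.items():
        q = int(target_round_time / t)          # steps that fit in the round
        steps[cid] = max(q_min, min(q_max, q))  # clamp to the allowed range
    return steps

# A fast client (0.1 s/step) is assigned more steps than a slow one
# (0.5 s/step), so both finish near the same time and the server can
# synchronize their model arrivals.
steps = assign_local_steps({"fast": 0.1, "slow": 0.5}, target_round_time=10.0)
```

The real scheduler additionally handles clients joining mid-round and groups model arrivals; see the FedCompass paper cited below for the full algorithm.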
This repository is built upon the open-source and highly extensible FL framework APPFL and employs gRPC as the communication protocol, so you can easily launch FL experiments with FedCompass among distributed FL clients.
Users can install the package by cloning this repository and installing it locally. We also highly recommend creating a virtual environment for easy dependency management.
conda create -n fedcompass python=3.8
conda activate fedcompass
git clone https://github.com/APPFL/FedCompass.git && cd FedCompass
pip install -e .
In the examples folder, we provide example scripts to train a CNN on the MNIST dataset using federated learning via serial simulation, MPI simulation, and gRPC deployment. Specifically, in this repository, we refer to
- simulation as federated learning experiments that run on a single machine or a cluster
- deployment as federated learning experiments that run across multiple distributed machines
Please go to the examples folder first, and then run the following command
python serial/run_serial.py \
--server_config config/server_fedavg.yaml \
--client_config config/client_1.yaml \
--num_clients 5
where --server_config is the path to the configuration file for the FL server. We currently provide three server configuration files, corresponding to three different FL algorithms. Note, however, that serial simulation is only meaningful for synchronous federated learning algorithms.
- config/server_fedcompass.yaml: FL server for the FedCompass algorithm
- config/server_fedavg.yaml: FL server for the FedAvg algorithm
- config/server_fedasync.yaml: FL server for the FedAsync algorithm
--client_config is the path to the base configuration file for the FL clients, and --num_clients is the number of FL clients you would like to simulate.
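For orientation, a client configuration file typically collects the local training hyperparameters. The YAML sketch below is illustrative only; every field name in it is an assumption, not APPFL's exact schema, so refer to config/client_1.yaml for the actual format.

```yaml
# Illustrative client configuration sketch. Field names are assumptions,
# NOT APPFL's exact schema; see config/client_1.yaml for the real format.
client_id: Client1
train_configs:
  device: cpu            # or cuda, if a GPU is available
  num_local_steps: 100   # local training steps per communication round
  optim: Adam
  optim_args:
    lr: 0.001
  train_batch_size: 64
```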
Please go to the examples folder first, and then run the following command
mpiexec -n 6 python mpi/run_mpi.py \
--server_config config/server_fedcompass.yaml \
--client_config config/client_1.yaml
where mpiexec -n 6 starts 6 MPI processes: one process serves as the FL server, and the remaining 6-1=5 processes act as FL clients.
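The rank-to-role split can be pictured with a small Python sketch (purely illustrative; APPFL's MPI runner handles this internally, and the helper below is not part of its API):

```python
# Illustrative sketch (NOT actual APPFL code): how MPI ranks map to FL roles
# in an MPI simulation launched with `mpiexec -n 6`.
def role_for_rank(rank: int) -> str:
    # Rank 0 serves as the FL server; every other rank becomes an FL client.
    return "server" if rank == 0 else f"client_{rank}"

# With 6 MPI processes, ranks 0..5 yield one server and 6-1=5 clients.
roles = [role_for_rank(r) for r in range(6)]
clients = [r for r in roles if r.startswith("client")]
```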
Please go to the examples folder first. To launch a server, run the following command,
python grpc/run_server.py --config config/server_fedcompass.yaml
The above command launches an FL server at localhost:50051 that waits for connections from two FL clients. To launch the two FL clients, open two separate terminals, go to the examples folder in each, and run the following two commands, one per terminal. This starts an FL experiment with two clients and a server running the specified algorithm.
python grpc/run_client_1.py
python grpc/run_client_2.py
- Server aggregation algorithm customization
- Server scheduling algorithm customization
- Client local trainer customization
- Synchronous federated learning
- Asynchronous federated learning
- Semi-asynchronous federated learning
- Model and dataset customization
- Loss function and evaluation metric customization
- Heterogeneous data partition
- Lossy compression using SZ compressors
- Single-node serial federated learning simulation
- MPI federated learning simulation
- Real federated learning deployment using gRPC
- Authentication in gRPC using Globus Identity
- wandb visualization
If you find FedCompass and this repository useful for your research, please consider citing the following papers:
@article{li2023fedcompass,
title={FedCompass: Efficient Cross-Silo Federated Learning on Heterogeneous Client Devices using a Computing Power Aware Scheduler},
author={Li, Zilinghan and Chaturvedi, Pranshu and He, Shilan and Chen, Han and Singh, Gagandeep and Kindratenko, Volodymyr and Huerta, Eliu A and Kim, Kibaek and Madduri, Ravi},
journal={arXiv preprint arXiv:2309.14675},
year={2023}
}
@inproceedings{ryu2022appfl,
title={APPFL: open-source software framework for privacy-preserving federated learning},
author={Ryu, Minseok and Kim, Youngdae and Kim, Kibaek and Madduri, Ravi K},
booktitle={2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)},
pages={1074--1083},
year={2022},
organization={IEEE}
}