Fair Resource Allocation in Federated Learning (ICLR '20)

This repository contains the code and experiments for the paper:

Fair Resource Allocation in Federated Learning

ICLR '20

Preparation

Download Dependencies

pip3 install -r requirements.txt

Generate Datasets

See the README files in separate data/$dataset folders for instructions on preprocessing and/or sampling each dataset.

For example,

under fair_flearn/data/fmnist, we clearly describe how to generate and preprocess the Fashion MNIST dataset.

In order to run the following demo on the Vehicle dataset, please go to fair_flearn/data/vehicle, download, and generate the Vehicle dataset following the README file under that directory.

Get Started

Example: the Vehicle dataset

[We provide a quick demo on the Vehicle dataset here. You don't need to change any default parameters in any of the scripts.]

First, specify the GPU ids. For Vehicle with a linear SVM, CPUs are enough, so the variable can be left empty:

export CUDA_VISIBLE_DEVICES=

Then go to the fair_flearn directory, and start running:

bash run.sh $dataset $method $data_partition_seed $q $sampling_device_method | tee $log

For Vehicle, $dataset is vehicle, $method is qffedavg or qffedsgd, $data_partition_seed can be set to 1, and $q is 0 for FedAvg and 5 for q-FedAvg (the proposed objective). For sampling with weights proportional to the number of data points, $sampling_device_method is 2; for uniform sampling (one of the baselines), $sampling_device_method is 1. $log is the path of the log file to write. The exact command lines are as follows.

(1) Experiments to verify the fairness of the q-FFL objective, and compare with uniform sampling schemes:

mkdir log_vehicle
bash run.sh vehicle qffedavg 1 0 2 | tee log_vehicle/ffedavg_run1_q0
bash run.sh vehicle qffedavg 1 5 2 | tee log_vehicle/ffedavg_run1_q5
bash run.sh vehicle qffedavg 1 0 1 | tee log_vehicle/fedavg_uniform_run1
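
For intuition about the role of q in the commands above: the q-FFL objective in the paper weights each device's loss F_k as p_k / (q + 1) * F_k^(q + 1), so q = 0 recovers the weighted average that FedAvg minimizes, while a larger q puts more emphasis on devices with higher loss. The following is a minimal sketch of that formula only. The helper names q_ffl_objective and loss_shares, the choice of p_k proportional to the number of local data points, and the loss values are assumptions for illustration and are not taken from the repository code.

```python
def q_ffl_objective(losses, weights, q):
    """Return sum_k p_k / (q + 1) * F_k ** (q + 1) for per-device losses F_k."""
    if q < 0:
        raise ValueError("q must be non-negative")
    total = sum(weights)
    probs = [w / total for w in weights]  # normalize so the p_k sum to 1
    return sum(p / (q + 1) * loss ** (q + 1) for p, loss in zip(probs, losses))


def loss_shares(losses, weights, q):
    """Fraction of the objective value contributed by each device."""
    total = sum(weights)
    terms = [(w / total) / (q + 1) * loss ** (q + 1)
             for w, loss in zip(weights, losses)]
    objective = sum(terms)
    return [t / objective for t in terms]


# Illustrative placeholder values, not results from the paper.
losses = [0.2, 0.5, 1.5]    # hypothetical per-device losses
weights = [100, 100, 100]   # hypothetical number of data points per device

print(q_ffl_objective(losses, weights, q=0))  # plain weighted average of the losses
print(loss_shares(losses, weights, q=0))
print(loss_shares(losses, weights, q=5))      # the high-loss device dominates
```

With q = 5 almost all of the objective comes from the device with the largest loss, which is why the optimizer is pushed toward a more uniform accuracy distribution.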

Plot to reproduce the results in the manuscript:

(we use seaborn to draw the fitting curves of accuracy distributions)

pip install seaborn
python plot_fairness.py

We can then compare the generated fairness_vehicle.pdf with Figure 1 (the Vehicle subfigure) and Figure 2 (the Vehicle subfigure) in the paper to validate reproducibility. Note that the accuracy distributions reported in the paper (in both figures and tables) are averaged across 5 different train/test/validation data partitions, with data partition seeds 1, 2, 3, 4, and 5.
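
The averaging over partitions can be sketched as follows. This is a minimal illustration, not the logic of plot_fairness.py: the function names, the choice of summary statistics (average, worst 10%, best 10%, variance), and the toy accuracies are all assumptions made for the example.

```python
import statistics


def summarize(accuracies):
    """Summarize one run's per-device test accuracies."""
    ordered = sorted(accuracies)
    tail = max(1, len(ordered) // 10)  # size of the worst/best 10% groups
    return {
        "average": statistics.fmean(ordered),
        "worst_10pct": statistics.fmean(ordered[:tail]),
        "best_10pct": statistics.fmean(ordered[-tail:]),
        "variance": statistics.pvariance(ordered),
    }


def average_over_seeds(runs):
    """Average each summary statistic over runs keyed by data partition seed."""
    summaries = [summarize(acc) for acc in runs.values()]
    return {
        key: statistics.fmean([s[key] for s in summaries])
        for key in summaries[0]
    }


# Toy per-device accuracies for two seeds (placeholders, not real results).
runs = {
    1: [0.70, 0.80, 0.90, 1.00],
    2: [0.60, 0.80, 0.80, 1.00],
}
print(average_over_seeds(runs))
```

In a real comparison, runs would hold one list of per-device accuracies for each of the seeds 1 through 5, collected from the corresponding log files.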

(2) Experiments to demonstrate the communication-efficiency of the proposed method q-FedAvg:

bash run.sh vehicle qffedsgd 1 5 2 | tee log_vehicle/ffedsgd_run1_q5

Plot to reproduce the results in the paper:

python plot_efficiency.py

We can then compare the generated efficiency_qffedavg.pdf with Figure 3 (the Vehicle subfigure) to verify reproducibility.

Run on other datasets

  • First, configure run.sh with the hyper-parameters (e.g., batch size, learning rate, etc.) reported in the manuscript (appendix B.2.3).
  • If you would like to run on Sent140, you also need to download a pre-trained embedding file using the following commands (this may take 3-5 minutes):
cd fair_flearn/flearn/models/sent140
bash get_embs.sh
  • We use different models for different datasets, so you need to change the model name specified by --model. The model associated with each dataset is described in fair_flearn/models/$dataset/$model.py. For instance, if you would like to run on the Shakespeare dataset, you can find the model name under fair_flearn/models/shakespeare/, which is stacked_lstm, and pass it as --model='stacked_lstm'.
  • You also need to specify the total number of communication rounds using --num_rounds. The suggested numbers of rounds, based on our previous experiments, are:
Vehicle: default
synthetic: 20000
sent140: 200
shakespeare: 80
fashion mnist: 6000
adult: 600

For the fairness and efficiency experiments, we use four datasets: Vehicle, Synthetic, Sent140, and Shakespeare. $method can be chosen from [qffedavg, qffedsgd]. $sampling is 2 (devices are sampled with weights proportional to their number of local data points).

mkdir log_$dataset
bash run.sh $dataset $method $seed $q $sampling | tee log_$dataset/${method}_run${seed}_q$q

In particular, $dataset can be chosen from [vehicle, synthetic, sent140, shakespeare], in accordance with the data directory names under the fair_flearn/data/ folder.
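
The two values of $sampling differ in how devices are drawn in each round. The sketch below illustrates that distinction under stated assumptions: sample_devices is a hypothetical helper, devices are drawn with replacement for simplicity, and the repository's own selection code may differ in detail (for example, in how the sampled updates are aggregated afterwards).

```python
import random


def sample_devices(num_samples, clients_per_round, sampling, seed=0):
    """Pick device indices for one communication round.

    sampling == 1: every device is equally likely (uniform baseline).
    sampling == 2: device k is drawn with probability proportional to
                   num_samples[k], its number of local data points.
    """
    rng = random.Random(seed)
    indices = list(range(len(num_samples)))
    if sampling == 1:
        weights = [1] * len(num_samples)
    elif sampling == 2:
        weights = list(num_samples)
    else:
        raise ValueError("sampling must be 1 or 2")
    # random.choices draws with replacement, so an index can repeat.
    return rng.choices(indices, weights=weights, k=clients_per_round)


# Toy device sizes (placeholders): device 2 holds most of the data.
num_samples = [10, 10, 980]
print(sample_devices(num_samples, clients_per_round=5, sampling=1))
print(sample_devices(num_samples, clients_per_round=5, sampling=2))
```

Under sampling 2 the large device is selected in most rounds, while under sampling 1 all three devices are selected equally often on average.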

Compare with AFL. We compare with the AFL baseline on two datasets (sampled Fashion MNIST and Adult), following the AFL paper.

  • Generate data (the data generation process is as described above).
  • Specify parameters. Set method to afl to run the AFL algorithm. Set data_partition_seed to 0 so that the datasets are not randomly partitioned into train/test/validation splits; this lets us use the same standard public test set as the AFL paper. Set track_individual_accuracy to 1. Here is an example run.sh for the Adult dataset:
python3  -u main.py --dataset=$1 --optimizer=$2  \
            --learning_rate=0.1 \
            --learning_rate_lambda=0.1 \
            --num_rounds=600 \
            --eval_every=1 \
            --clients_per_round=2 \
            --batch_size=10 \
            --q=$4 \
            --model='lr' \
            --sampling=$5  \
            --num_epochs=1 \
            --data_partition_seed=$3 \
            --log_interval=100 \
            --static_step_size=0 \
            --track_individual_accuracy=1 \
            --output="./log_$1/$2_samp$5_run$3_q$4"

And then run:

mkdir -p log_adult
bash run.sh adult qffedsgd 0 5 2 | tee log_adult/qffedsgd_q5
bash run.sh adult afl 0 0 2 | tee log_adult/afl
  • You can find the accuracy numbers in the log files log_adult/qffedsgd_q5 and log_adult/afl, respectively.

References

See our Fair Federated Learning manuscript for more details as well as all references.