
FedNAS: Federated Deep Learning via Neural Architecture Search


Deployment

PyTorch 1.4.0, MPI4Py 3.0.3 (https://pypi.org/project/mpi4py), Python 3.7.4, wandb, Anaconda 4.9.2
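
If you prefer not to build from the provided spec list, a minimal sketch of installing the pinned dependencies by hand might look like the following (the conda channel, cudatoolkit version, and torchvision version are assumptions, not taken from this repository):

# Sketch: manual installation of the pinned dependencies (assumed channel/versions).
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
pip install mpi4py==3.0.3 wandb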

Setup

  1. Create the conda environment

conda create --name fednas --file spec-list.txt

conda activate fednas

  2. Install other packages: wandb & torchsummaryX

pip install --upgrade wandb; pip install torchsummaryX

  3. Configure NFS (Network File System) in your cluster (a minimal sketch follows this list)

  4. Set up SSH log-in without a password between nodes (see the sketch after this list)

  5. Wandb login

wandb login $api-key
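
For step 3, NFS setup is a standard Linux administration task rather than anything FedNAS-specific. A minimal sketch, assuming a Debian/Ubuntu cluster where the management node (node1) exports a shared directory; the path /home/fednas and the subnet are hypothetical placeholders:

# On the management (NFS server) node: export a shared directory.
sudo apt install nfs-kernel-server
echo "/home/fednas 192.168.0.0/24(rw,sync,no_subtree_check)" | sudo tee -a /etc/exports
sudo exportfs -ra

# On each compute node: mount the share at the same path.
sudo apt install nfs-common
sudo mount -t nfs node1:/home/fednas /home/fednas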
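For step 4, a typical way to enable passwordless SSH from the launching node to every worker node (the hostnames are the placeholders used in the host-file example below):

# Generate a key pair once (press Enter to accept the defaults and an empty passphrase).
ssh-keygen -t rsa
# Copy the public key to each node listed in mpi_host_file.
ssh-copy-id node1
ssh-copy-id node2
ssh-copy-id node3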

Configure the MPI host file based on your deployment

Modify the hostname list in "mpi_host_file" to match your actual physical network topology. For example, assume a network with one management node and four compute nodes (hostnames: node1, node2, node3, node4). If you want to use node1, node2, and node3 to run our program, "mpi_host_file" should be:

node1
node2
node3
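
MPI launchers read this file to decide where to place processes. FedNAS's run scripts invoke the launcher for you, but as a rough illustration of how the host file is consumed (the entry point main.py and the process count are assumptions, not taken from the scripts; -hostfile is Open MPI syntax, MPICH uses -f):

# Sketch: place 5 MPI processes (e.g., 1 server + 4 clients) across the hosts above.
mpirun -np 5 -hostfile mpi_host_file python main.py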

Experiments

Once the hardware and software environments are both ready, you can use the following commands to run FedNAS. Note:

  1. You may find that other packages are missing; please install them with "conda" or "pip" accordingly.
  2. Our default setting is 16 workers. Please change the parameters in "run_fednas_search.sh" based on your own physical servers and requirements.
  • Homogeneous distribution (IID) experiment:
# search
sh run_fednas_search.sh 4 darts homo 50 5 64

# train
sh run_fednas_train.sh 4 darts homo 500 15 64
  • Heterogeneous distribution (Non-IID) experiment:
# search
sh run_fednas_search.sh 4 darts hetero 50 5 64

# train
sh run_fednas_train.sh 4 darts hetero 500 15 64

We can also run the code on a single server with 4 NVIDIA RTX 2080 Ti GPUs. In this case, we should decrease the batch size (the last argument, from 64 to 8 below) to guarantee that all 17 processes can be loaded into GPU memory. The running script for this setting is:

# search
sh run_fednas_search.sh 4 darts hetero 50 5 8
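
When squeezing all 17 processes onto four GPUs, it can help to watch memory headroom while the job starts up; this is a generic tip, not part of the FedNAS scripts:

# Refresh GPU utilization and memory readings every second.
watch -n 1 nvidia-smi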