Federated Learning Demo

Demo of federated learning.

Setting up requirements

These instructions assume Ubuntu 16.04; the steps may differ on other operating systems.

Clone this repository and cd to it.

Install mosquitto, an MQTT broker.
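Before starting the server and clients, it can help to confirm that the broker is accepting connections. The sketch below is a minimal check and makes two assumptions: that the broker runs on localhost, and that it listens on port 1883, the standard unencrypted MQTT port. This project may be configured differently. The helper name `broker_is_reachable` is hypothetical, and the check only opens a TCP connection; it does not speak the MQTT protocol.

```python
import socket


def broker_is_reachable(host="localhost", port=1883, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    The default host and port are assumptions, not values read from
    this project's configuration.
    """
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("broker reachable:", broker_is_reachable())
```

If this prints False, check that the mosquitto service is running before starting server.py.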

Install virtualenv if you don't have it already.

pip install virtualenv

Then create a new virtualenv and switch to it.

virtualenv ENV
source ENV/bin/activate

Then install the requirements.

pip install -r requirements.txt

Running

Starting the server

source ENV/bin/activate
python server.py

Starting clients (add as many clients as you want)

source ENV/bin/activate
python client.py

Starting a task

Navigate to http://localhost:5000 and configure your task.

Contributing

Server/Client code

Changing the learning task

The model used is the PersonBinaryClassifier object defined in common/models.py.

As long as the replacement model is a PyTorch Module object, no other changes should be needed.

The train and test functions are defined in the same file, common/models.py.

In general, the ModelRunner object is used throughout the project to run the learning task. The project gets this object from the get_model_runner function in common/person_classifier.py. To see every place this function is called, use the find-references feature of your IDE or of GitHub.

Changing the training data

The client and server code that reads the training and test data expects a specific format. Any dataset can be used as long as its images are preprocessed into this format. The data must be stored as a .pkl (Python pickle) file containing an array of tuples. The first element of each tuple is the image itself (e.g. a 2D array of pixel values) and the second element is an array of labels. The labels are specific to the learning task; in our case each array must contain exactly one of "person" or "no-person". The same array also lists every cluster the image can belong to (e.g. "sky", "water", "ground").
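As a rough illustration of this format, the sketch below builds a two-sample dataset, checks that each sample carries exactly one task label, and writes it to a .pkl file. The helper names (`validate_samples`, `save_samples`), the file name, the use of nested lists as images, and the use of a Python list as the outer array are all assumptions made for illustration; they are not taken from the project code.

```python
import os
import pickle
import tempfile

# Task labels named in this README; exactly one must appear per sample.
TASK_LABELS = {"person", "no-person"}


def validate_samples(samples):
    """Raise ValueError if a sample does not match the (image, labels) layout."""
    for index, sample in enumerate(samples):
        if not isinstance(sample, tuple) or len(sample) != 2:
            raise ValueError(f"sample {index} is not an (image, labels) tuple")
        _image, labels = sample
        task_labels = [label for label in labels if label in TASK_LABELS]
        if len(task_labels) != 1:
            raise ValueError(
                f"sample {index} needs exactly one of {sorted(TASK_LABELS)}")


def save_samples(samples, path):
    """Validate the samples, then pickle them to `path`."""
    validate_samples(samples)
    with open(path, "wb") as handle:
        pickle.dump(samples, handle)


# Hypothetical 2x2 grayscale images stand in for real preprocessed images.
example_samples = [
    ([[0, 255], [255, 0]], ["person", "ground"]),
    ([[10, 20], [30, 40]], ["no-person", "sky", "water"]),
]

example_path = os.path.join(tempfile.mkdtemp(), "example_data.pkl")
save_samples(example_samples, example_path)

# Read the file back the same way a consumer of the data would.
with open(example_path, "rb") as handle:
    loaded_samples = pickle.load(handle)
```

Running a check like this over a new dataset before handing it to the server and clients catches samples with a missing or duplicated task label early.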

The Jupyter notebook we used to preprocess the COCO dataset into this format is provided for reference in data/data_setup/.

In server.py, there is a global variable CLUSTER_NAMES. This variable is an array of the cluster names that the server is allowed to assign to clients. It is set by the developer and must be updated whenever the dataset or its cluster labels change.
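When the dataset changes, one way to work out which names CLUSTER_NAMES should hold is to collect every non-task label from the preprocessed samples. The sketch below does that. The helper name `collect_cluster_names` is hypothetical, it assumes the (image, labels) tuple format described above, and the server may expect a particular order or a hand-picked subset rather than every label found.

```python
# Task labels named in this README; everything else is treated as a cluster.
TASK_LABELS = {"person", "no-person"}


def collect_cluster_names(samples):
    """Return the sorted, de-duplicated cluster labels found in `samples`."""
    clusters = set()
    for _image, labels in samples:
        clusters.update(label for label in labels if label not in TASK_LABELS)
    return sorted(clusters)


# Images are irrelevant here, so None stands in for the pixel data.
samples = [
    (None, ["person", "ground"]),
    (None, ["no-person", "sky", "water"]),
    (None, ["person", "sky"]),
]

cluster_names = collect_cluster_names(samples)
print(cluster_names)  # ['ground', 'sky', 'water']
```

The printed list can then be copied into CLUSTER_NAMES in server.py by hand.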

Miscellaneous

After making changes, make sure to format your code.

python3 -m autopep8 --in-place --aggressive --aggressive --recursive *.py common utils

React

Install dependencies. Make sure you have Node and npm installed. Our configuration uses Node v10.16.3 and npm v6.14.4.

cd website
npm install

To make changes, it is recommended to use the live server.

npm start

After making changes, create a minified production build for Flask to serve.

npm run build

davidzlchen/federated-learning-model