Demo of federated learning.
These instructions assume Ubuntu 16.04; the steps may differ on other operating systems.
Clone this repository and cd to it.
Install mosquitto, an MQTT broker.
Install virtualenv if you don't have it already.
pip install virtualenv
Then create a new virtualenv and switch to it.
virtualenv ENV
source ENV/bin/activate
Then install the requirements.
pip install -r requirements.txt
Start the server in one terminal.
source ENV/bin/activate
python server.py
In a separate terminal, start a client.
source ENV/bin/activate
python client.py
Navigate to http://localhost:5000 and configure your task.
Changing the learning task
The model used is the PersonBinaryClassifier object defined in common/models.py.
To use a different model, replace it with any other PyTorch Module object; no other changes should be needed.
The train and test functions are defined in the same file, common/models.py.
The ModelRunner object is used throughout the project to run the learning task. The project obtains it through the get_model_runner function in common/person_classifier.py. To see every place this function is called, use the find-references feature of your IDE or of GitHub.
Changing the training data
The client and server code that reads training and test data expects a specific format. Any dataset can be used as long as its images are preprocessed into this format. The data must be stored in a .pkl (Python pickle) file containing an array of tuples. The first element of each tuple is the image itself (e.g. a 2D array of pixels) and the second element is an array of labels. The labels are specific to the learning task; in our case, each array must contain exactly one of "person" or "no-person". The same array also lists every cluster the image can belong to (e.g. "sky", "water", "ground").
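As an illustration of the format described above, here is a minimal sketch that builds, validates, and round-trips a tiny dataset using only the standard library. The names validate_dataset and TASK_LABELS, the file name example.pkl, and the nested-list "images" are assumptions made for this example; they are not part of the project's code, and real images would normally be pixel arrays.

```python
import pickle
import tempfile
from pathlib import Path

# Task labels for the person classifier; exactly one must appear per sample.
TASK_LABELS = {"person", "no-person"}


def validate_dataset(samples):
    """Raise ValueError if a sample breaks the (image, labels) convention.

    Hypothetical helper for illustration; not part of the project.
    """
    for index, sample in enumerate(samples):
        if not isinstance(sample, tuple) or len(sample) != 2:
            raise ValueError(f"sample {index}: expected an (image, labels) tuple")
        _image, labels = sample
        task_labels = [label for label in labels if label in TASK_LABELS]
        if len(task_labels) != 1:
            raise ValueError(
                f"sample {index}: expected exactly one of {sorted(TASK_LABELS)}")
    return samples


# Tiny 2x2 nested lists stand in for real image pixel arrays.
samples = [
    ([[0, 255], [255, 0]], ["person", "ground"]),
    ([[12, 34], [56, 78]], ["no-person", "sky", "water"]),
]

# Write the dataset to a .pkl file, then read it back and re-check it.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "example.pkl"
    with path.open("wb") as f:
        pickle.dump(validate_dataset(samples), f)
    with path.open("rb") as f:
        loaded = validate_dataset(pickle.load(f))

print(len(loaded), loaded[0][1])  # 2 ['person', 'ground']
```

Running a check like this before handing a new dataset to the client or server can catch a missing or duplicated task label early.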
The Jupyter Notebook we used to preprocess the COCO dataset into this format is provided for reference in data/data_setup/.
In server.py, the global variable CLUSTER_NAMES is an array of the cluster names that the server is allowed to assign to clients. It is set by the developer and must be updated whenever the dataset or its cluster labels change.
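When switching datasets, one way to see which cluster labels the new data actually contains is to scan the label arrays. The sketch below assumes that every label other than "person" or "no-person" is a cluster label; cluster_labels and missing_clusters are hypothetical helpers, and the CLUSTER_NAMES value shown is only an example, not the value in server.py.

```python
# Task labels are excluded; everything else is treated as a cluster label
# (an assumption made for this sketch).
TASK_LABELS = {"person", "no-person"}


def cluster_labels(samples):
    """Return the sorted cluster labels found across all samples."""
    found = set()
    for _image, labels in samples:
        found.update(label for label in labels if label not in TASK_LABELS)
    return sorted(found)


def missing_clusters(samples, cluster_names):
    """Return dataset cluster labels that are absent from cluster_names."""
    return [name for name in cluster_labels(samples) if name not in cluster_names]


# Example value only; the real list lives in server.py.
CLUSTER_NAMES = ["sky", "water"]

# Tiny placeholder samples in the (image, labels) format.
samples = [
    ([[0, 255], [255, 0]], ["person", "ground"]),
    ([[12, 34], [56, 78]], ["no-person", "sky", "water"]),
]

print(cluster_labels(samples))                   # ['ground', 'sky', 'water']
print(missing_clusters(samples, CLUSTER_NAMES))  # ['ground']
```

Whether every cluster in the dataset should appear in CLUSTER_NAMES is a decision for the developer; the check only surfaces the difference.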
Miscellaneous
After making changes, make sure to format your code.
python3 -m autopep8 --in-place --aggressive --aggressive --recursive *.py common utils
Install the website dependencies. Make sure you have Node and npm installed; our configuration uses Node v10.16.3 and npm v6.14.4.
cd website
npm install
While making changes, we recommend using the live server.
npm start
After making changes, build the minified, compiled version that Flask will serve.
npm run build