This work, "A Framework for Federated Reinforcement Learning with Interaction and Communication Efficiency", has been submitted to INFOCOM 2024.
We propose Momentum-assisted Federated Policy Optimization (MFPO), an algorithm capable of jointly optimizing both interaction and communication complexities. Specifically, we introduce a new federated reinforcement learning (FRL) framework that uses momentum, importance sampling, and extra server-side updating to control the variance of stochastic policy gradients and improve the efficiency of data utilization.
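To make the combination of momentum and importance sampling more concrete, here is a minimal NumPy sketch of a generic STORM-style momentum-corrected policy-gradient estimator for a softmax policy. It is not taken from this repository and is not claimed to be the exact MFPO update: the helper names (`softmax`, `grad_log_pi`, `momentum_gradient`) are hypothetical, the one-step (bandit-like) setting is a simplification, and treating `c` as the same quantity as the `--c` momentum parameter is an assumption.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()


def grad_log_pi(theta, action):
    """Gradient of log softmax(theta)[action] with respect to theta."""
    g = -softmax(theta)
    g[action] += 1.0
    return g


def momentum_gradient(theta, theta_prev, u_prev, actions, rewards, c):
    """Momentum-corrected gradient estimate (hypothetical helper, STORM-style).

    `actions` and `rewards` are assumed to be sampled under the current
    policy softmax(theta). The same samples are reused to estimate the
    gradient at the previous parameters via importance weights.
    """
    pi, pi_prev = softmax(theta), softmax(theta_prev)
    g_new = np.zeros_like(theta)
    g_old = np.zeros_like(theta)
    for a, r in zip(actions, rewards):
        w = pi_prev[a] / pi[a]  # importance weight: previous policy over current policy
        g_new += r * grad_log_pi(theta, a)
        g_old += w * r * grad_log_pi(theta_prev, a)
    g_new /= len(actions)
    g_old /= len(actions)
    # c = 1 recovers the plain stochastic gradient; smaller c leans on the
    # previous estimate, corrected by the gradient difference g_new - g_old.
    return c * g_new + (1.0 - c) * (u_prev + g_new - g_old)
```

In this sketch, reusing one batch for both the current and the previous parameters is what saves environment interaction: no extra samples are drawn to evaluate the old policy.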
- Python == 3.7 (Anaconda or Miniconda is recommended)
- PyTorch == 1.8.1
- MuJoCo == 2.3.6
- NVIDIA GPU (RTX A6000) + CUDA 11.1
- Clone repo
git clone [https://github.com/HansenHua/MFPO-Online-Federated-Reinforcement-Learning.git](https://github.com/HansenHua/MFPO-INFOCOM24.git)
cd MFPO-Online-Federated-Reinforcement-Learning
- Install dependent packages
pip install -r requirements.txt
Get the usage information of the project
cd code
python main.py -h
The usage information is then shown as follows:
usage: main.py [-h] [--env_name ENV_NAME] [--method METHOD] [--gamma GAMMA] [--batch_size BATCH_SIZE]
[--local_update LOCAL_UPDATE] [--num_worker NUM_WORKER] [--average_type AVERAGE_TYPE] [--c C]
[--seed SEED] [--lr_a LR_A] [--lr_c LR_C]
mode max_iteration
positional arguments:
mode train or test
max_iteration maximum training iteration
optional arguments:
-h, --help show this help message and exit
--env_name ENV_NAME the name of environment
--method METHOD method name
--gamma GAMMA gamma
--batch_size BATCH_SIZE
batch_size
--local_update LOCAL_UPDATE
frequency of local update
--num_worker NUM_WORKER
number of federated agents
--average_type AVERAGE_TYPE
average type (target/network/critic)
--c C momentum parameter
--seed SEED random seed
--lr_a LR_A learning rate of actor
--lr_c LR_C learning rate of critic
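For readers who want to see how a help message of this shape could be produced, here is a minimal `argparse` sketch that mirrors the listed arguments. It is a hypothetical reconstruction, not the actual `main.py`: the `build_parser` name, the argument types, and the absence of defaults are all assumptions, and the real script may parse its arguments differently (the example commands in this README pass the environment and method positionally).

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage text above; types are assumed."""
    p = argparse.ArgumentParser(prog="main.py")
    # Positional arguments
    p.add_argument("mode", type=str, help="train or test")
    p.add_argument("max_iteration", type=int, help="maximum training iteration")
    # Optional arguments (no defaults set, since the real ones are not documented here)
    p.add_argument("--env_name", type=str, help="the name of environment")
    p.add_argument("--method", type=str, help="method name")
    p.add_argument("--gamma", type=float, help="gamma")
    p.add_argument("--batch_size", type=int, help="batch_size")
    p.add_argument("--local_update", type=int, help="frequency of local update")
    p.add_argument("--num_worker", type=int, help="number of federated agents")
    p.add_argument("--average_type", type=str, help="average type (target/network/critic)")
    p.add_argument("--c", type=float, help="momentum parameter")
    p.add_argument("--seed", type=int, help="random seed")
    p.add_argument("--lr_a", type=float, help="learning rate of actor")
    p.add_argument("--lr_c", type=float, help="learning rate of critic")
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Under these assumptions, an invocation such as `python main.py train 1000 --env_name CartPole-v1 --method MFPO` would populate `args.mode`, `args.max_iteration`, `args.env_name`, and `args.method`.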
Test the trained models provided in MFPO-Momentum-assisted Federated Policy Optimization.
python main.py CartPole-v1 MFPO test
We provide the complete training code for MFPO.
You can adapt it to your own needs.
```
python main.py CartPole-v1 MFPO train
```
The log files will be stored in [MFPO-Online-Federated-Reinforcement-Learning/code/log](https://github.com/HansenHua/MFPO-INFOCOM24/tree/main/code/log).
- Testing
python main.py CartPole-v1 MFPO test
- Illustration
We also provide illustrations of our model's performance. The illustration videos are stored in MFPO-Online-Federated-Reinforcement-Learning/performance.
If you have any questions, please email xingyuanhua@bit.edu.cn.