Read/download the finished thesis here
Channel Selection and Power Control for D2D Communication via Online Reinforcement Learning implementations
Individual actor-critic implementation - IAC.py
Multi-headed individual actor-critic implementation - MHIAC.py
Centralized COMA implementation - coma_centralized.py
Partially Centralized COMA implementation - coma_partially_centralized.py
Dual-Critic implementation - dual_critic.py
Partially Centralized Dual-Critic implementation - dual_critic_partially_centralized.py
NOTE: Some TRFL files were edited for this work. Before running the implementations, replace the corresponding files in your installed TRFL package with the edited files of the same name, which are available here: base_ops.py, discrete_policy_gradient_ops.py
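One possible way to do the replacement is sketched below. It is not part of this repository: the helper names `find_trfl_dir` and `replace_trfl_files` are hypothetical, and the sketch assumes that both files sit at the top level of the installed `trfl` package directory and that the edited copies have already been downloaded into a local folder. Originals are backed up with an `.orig` suffix so the change can be undone.

```python
import importlib.util
import shutil
from pathlib import Path

# File names taken from the note above.
EDITED_FILES = ("base_ops.py", "discrete_policy_gradient_ops.py")


def find_trfl_dir():
    """Return the installed trfl package directory, or None if it is not found.

    find_spec locates the package without importing it, so TensorFlow
    is not loaded just to look up a path.
    """
    spec = importlib.util.find_spec("trfl")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def replace_trfl_files(edited_dir, package_dir, backup=True):
    """Copy the edited files from edited_dir over those in package_dir.

    Assumption: the files live directly inside package_dir.
    Returns the list of destination paths that were written.
    """
    edited_dir, package_dir = Path(edited_dir), Path(package_dir)

    # Check everything up front so a missing file cannot leave a partial install.
    missing = [name for name in EDITED_FILES if not (edited_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"edited files not found in {edited_dir}: {missing}")

    replaced = []
    for name in EDITED_FILES:
        src, dst = edited_dir / name, package_dir / name
        if backup and dst.exists():
            # Keep the original next to the replacement, e.g. base_ops.py.orig
            shutil.copy2(dst, dst.with_name(dst.name + ".orig"))
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced


# Example usage (uncomment after downloading the edited files into ./edited):
# trfl_dir = find_trfl_dir()
# if trfl_dir is not None:
#     replace_trfl_files("edited", trfl_dir)
```

Copying the two files by hand into the TRFL directory of your environment's `site-packages` has the same effect; the script only automates the lookup and the backup.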