Implementations of Channel Selection and Power Control for D2D Communication via Online Reinforcement Learning
D2D_A2C_multi.py - benchmark code
D2D_A2C_multi_two_out.py - working code for dual policy distribution output model
D2D_A2C_multi_two_out_BN.py - same as "D2D_A2C_multi_two_out.py" but using batch normalisation
D2D_A2C_multi_two_out_deeper.py - same as "D2D_A2C_multi_two_out.py" but with an altered network architecture (best convergence performance)
D2D_env_discrete.py - environment simulation code (provided by supervisor)
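
For readers unfamiliar with the "dual policy distribution output" idea, the sketch below shows one way an actor-critic model can emit two separate categorical distributions, one over channels and one over power levels, from a shared hidden layer. It is an illustration only and is not taken from the scripts listed above: the class name TwoHeadActorCritic, the layer sizes, the tanh activation and the use of plain NumPy are all assumptions, and the real models may be built differently.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


class TwoHeadActorCritic:
    """Hypothetical two-output policy: one head per action component."""

    def __init__(self, state_dim, n_channels, n_power_levels, hidden=32, seed=0):
        self.rng = np.random.default_rng(seed)
        # Shared feature layer (sizes are illustrative assumptions).
        self.W_h = self.rng.normal(0.0, 0.1, (state_dim, hidden))
        self.b_h = np.zeros(hidden)
        # Two policy heads and one value head on top of the shared features.
        self.W_ch = self.rng.normal(0.0, 0.1, (hidden, n_channels))
        self.W_pw = self.rng.normal(0.0, 0.1, (hidden, n_power_levels))
        self.W_v = self.rng.normal(0.0, 0.1, (hidden, 1))

    def forward(self, state):
        h = np.tanh(state @ self.W_h + self.b_h)
        p_channel = softmax(h @ self.W_ch)   # distribution over channels
        p_power = softmax(h @ self.W_pw)     # distribution over power levels
        value = float((h @ self.W_v)[0])     # critic's state-value estimate
        return p_channel, p_power, value

    def act(self, state):
        p_channel, p_power, value = self.forward(state)
        channel = int(self.rng.choice(len(p_channel), p=p_channel))
        power = int(self.rng.choice(len(p_power), p=p_power))
        # The heads are sampled independently, so the joint log-probability
        # is the sum of the two per-head log-probabilities.
        log_prob = float(np.log(p_channel[channel]) + np.log(p_power[power]))
        return channel, power, log_prob, value


if __name__ == "__main__":
    model = TwoHeadActorCritic(state_dim=4, n_channels=3, n_power_levels=5)
    print(model.act(np.ones(4)))
```

Splitting the output this way keeps each head small (n_channels and n_power_levels outputs rather than their product), at the cost of treating the two choices as independent given the state.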