CSKrishna/Optimal-bidding-policy-using-Policy-Gradient-in-a-Multi-agent-Contextual-Bandit-setting
We use policy gradient methods to help agents learn optimal bidding policies in a competitive multi-agent contextual bandit setting.
Jupyter Notebook
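The description above does not specify the auction format, the reward, or how the policy is parameterized, so the following is only a minimal sketch under stated assumptions, not the repository's actual code. It assumes a first-price auction with discrete bids, a private value derived from each agent's context, a tabular softmax policy per agent, and a plain REINFORCE update without a baseline. The names `train`, `softmax`, and `sample`, and all parameters, are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def train(n_agents=2, n_contexts=3, n_bids=4, steps=5000, lr=0.1, seed=0):
    """Train independent softmax bidding policies with REINFORCE.

    Assumptions (not taken from the repository):
      - context c means the agent's private value is c + 1
      - bids are the integers 0 .. n_bids - 1
      - highest bid wins, ties are broken uniformly at random
      - the winner's reward is value - bid, everyone else gets 0
    """
    rng = random.Random(seed)
    # logits[agent][context][bid]: one tabular policy per agent
    logits = [[[0.0] * n_bids for _ in range(n_contexts)]
              for _ in range(n_agents)]

    for _ in range(steps):
        # Each agent observes its own context and samples a bid.
        contexts = [rng.randrange(n_contexts) for _ in range(n_agents)]
        probs = [softmax(logits[a][contexts[a]]) for a in range(n_agents)]
        bids = [sample(p, rng) for p in probs]

        # Resolve the auction.
        top = max(bids)
        winner = rng.choice([a for a in range(n_agents) if bids[a] == top])

        # One-step episode: each agent updates on its own reward.
        for a in range(n_agents):
            value = contexts[a] + 1
            reward = float(value - bids[a]) if a == winner else 0.0
            for b in range(n_bids):
                # Gradient of log softmax: indicator(b == chosen) - prob(b)
                grad = (1.0 if b == bids[a] else 0.0) - probs[a][b]
                logits[a][contexts[a]][b] += lr * reward * grad

    return logits
```

Calling `train()` returns the learned logits, and `softmax(logits[a][c])` gives agent `a`'s bid distribution in context `c`. Because every agent learns at the same time, each one faces a non-stationary opponent, which is what makes the setting competitive rather than a single-agent bandit.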