Implementation of Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling.
Some of the code was reused from the Catalyst demo notebook and higgsfield's RL-Adventure, both of which helped a lot.
TODO:
- Implement validation
- Change training process (switch to sessions)
- Add Prioritized Experience Replay
Special TODO:
- Add Ornstein–Uhlenbeck noise for better exploration
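
For the Ornstein–Uhlenbeck item above, a minimal sketch of how such noise could be generated and added to a continuous DDPG action. This is not the repository's implementation: the class name `OUNoise`, the helper `noisy_action`, and the default values `theta=0.15` and `sigma=0.2` are assumptions chosen for illustration.

```python
import numpy as np


class OUNoise:
    """Discretised Ornstein-Uhlenbeck process (hypothetical helper).

    Update rule: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, action_dim, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = mu * np.ones(action_dim)   # long-run mean the noise reverts to
        self.theta = theta                   # speed of mean reversion
        self.sigma = sigma                   # scale of the random perturbation
        self.dt = dt                         # time step of the discretisation
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process at its mean, e.g. at the start of each episode.
        self.state = self.mu.copy()

    def sample(self):
        # Drift pulls the state back towards mu; diffusion adds Gaussian noise.
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.state.shape)
        self.state = self.state + drift + diffusion
        return self.state


def noisy_action(action, noise):
    # Exploration: perturb the actor's output with temporally correlated noise.
    return action + noise.sample()
```

Because consecutive samples are correlated, the perturbation drifts smoothly instead of jumping independently at every step, which is the usual motivation for pairing it with DDPG. Calling `reset()` at the start of each episode keeps noise from one episode from leaking into the next.
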

| Model | nDCG@10 | hit_rate@10 |
|---|---|---|
| DDPG with OU noise | 0.280 | 0.502 |
| DDPG | 0.254 | 0.454 |
| Neural Collaborative Filtering | 0.238 | 0.430 |
| Random (for comparison) | ~0.05 | ~0.1 |
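
For reference, a minimal sketch of how the two metrics in the table are commonly computed for a single user, assuming binary relevance and a ranked list of unique item ids. The function names and the evaluation protocol are assumptions; the exact procedure behind the numbers above is not described here.

```python
import math


def hit_rate_at_k(ranked_items, relevant_items, k=10):
    """1.0 if any relevant item appears in the top-k, else 0.0."""
    return float(any(item in relevant_items for item in ranked_items[:k]))


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """Binary-relevance nDCG@k: DCG of the ranking divided by the ideal DCG."""
    # Each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant_items), k)
    if ideal_hits == 0:
        return 0.0
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg
```

Averaging these per-user values over all test users gives dataset-level scores comparable in form to the table entries.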