Bug report
matthewyuhb opened this issue · 0 comments
I'm using this project to train my reinforcement learning agent. During training the agent got stuck in a local optimum, and I observed the following phenomenon.
I use a trace with a fixed capacity of 600k and a duration of 180s:
First, I manually fixed the RL agent's bandwidth at 1000k, and the result made sense (the base RTT is about 200ms):
However, my trained RL agent got trapped in this:
The RTT stays at the minimum RTT even at a very high sending rate! What's more, the receiving rate observed on the sender side is constantly about 500k and the loss rate is 0%. The fairly high receiving rate and the very low delay make the RL agent think it has learned a good model, so it stops optimizing...
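To make the inconsistency concrete, here is a rough sanity check. It assumes a simple fixed-capacity FIFO bottleneck, which may not match how the gym actually models the link. The function names, the 20ms RTT slack, and the 1000k sending rate used in the example are all hypothetical and chosen only for illustration. Under that assumption, sending above capacity for a sustained period should show up as either loss or growing queueing delay:

```python
def expected_loss_rate(send_rate: float, capacity: float) -> float:
    """Steady-state fraction of traffic a fixed-capacity link must drop
    once its queue is full. Both rates must use the same unit."""
    if send_rate <= capacity:
        return 0.0
    return 1.0 - capacity / send_rate


def overload_symptoms_missing(send_rate: float, capacity: float,
                              loss_rate: float, rtt_ms: float,
                              base_rtt_ms: float,
                              rtt_slack_ms: float = 20.0) -> bool:
    """True when the sender exceeds capacity yet neither loss nor
    queueing delay shows up in the feedback (the suspicious case)."""
    overloaded = send_rate > capacity
    no_loss = loss_rate <= 0.0
    # rtt_slack_ms is an assumed tolerance for measurement jitter
    no_queueing = rtt_ms <= base_rtt_ms + rtt_slack_ms
    return overloaded and no_loss and no_queueing


if __name__ == "__main__":
    # Hypothetical sending rate of 1000k against the 600k trace
    print(expected_loss_rate(1000.0, 600.0))
    print(overload_symptoms_missing(1000.0, 600.0, loss_rate=0.0,
                                    rtt_ms=200.0, base_rtt_ms=200.0))
```

With these numbers the simple model expects roughly 40% loss in steady state, while the feedback I see reports 0% loss and no extra delay, so the second check flags the situation as suspicious.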
Is this a bug in the gym?