Fix CartPoleCost not going to reference problem
Closed this issue · 2 comments
rickstaa commented
Describe the bug
When training in the CartPoleCost environment, the trained algorithm keeps the pole balanced but doesn't steer the cart to the reference (zero) position.
To Reproduce
Steps to reproduce the behaviour:
- Activate the blc conda environment.
- Train the LAC or SAC algorithms in the CartPoleCost-v0 environment.
- Watch the performance using the test_policy utility.
- See that the CartPole does not converge to zero.
Expected behaviour
The algorithm should hold the pole upright while also steering the cart to the reference position.
rickstaa commented
Debug Report
Steps
- Check if this was also the case in Minghao's code.
- Change the cost function to give more importance to getting the cart to zero.
- Check if the episode length is long enough for the algorithm to be able to get the cart to zero.
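To make the second step concrete, here is a minimal sketch of a weighted quadratic cost in which the cart-position term can be emphasised. This is not the actual CartPoleCost cost function; the function name, the weights `w_pos` and `w_angle`, and the normalisation constants `x_max` and `theta_max` are all assumptions made for illustration.

```python
def cartpole_cost(x, theta, x_ref=0.0, w_pos=1.0, w_angle=1.0,
                  x_max=10.0, theta_max=0.35):
    """Hypothetical weighted quadratic cost for a cart-pole.

    x         -- cart position
    theta     -- pole angle in radians (0 is upright)
    x_ref     -- reference cart position (zero in this issue)
    w_pos     -- weight on the cart-position error (assumed name)
    w_angle   -- weight on the pole-angle error (assumed name)
    x_max     -- assumed normalisation constant for the position
    theta_max -- assumed normalisation constant for the angle
    """
    # Normalise both errors so the weights are comparable.
    pos_term = w_pos * ((x - x_ref) / x_max) ** 2
    angle_term = w_angle * (theta / theta_max) ** 2
    return pos_term + angle_term


# Raising w_pos makes the same cart offset more expensive,
# so the agent has more incentive to drive the cart to x_ref.
low_emphasis = cartpole_cost(x=2.0, theta=0.05, w_pos=1.0)
high_emphasis = cartpole_cost(x=2.0, theta=0.05, w_pos=20.0)
print(low_emphasis, high_emphasis)
```

If the angle term dominates the total cost, a policy can score well by balancing the pole while ignoring the cart offset, which would match the behaviour described in this issue.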
Check Minghao's code
In Minghao's latest repository, the cart also did not go to zero.
Performance looks better, but this is because I trained longer.
rickstaa commented
This is no longer the case with the new CartPoleCost environment. It was probably caused by an incorrect cost function for what I was trying to achieve.