dickreuter/neuron_poker

Last raiser should not take action again if all call/fold afterwards

unonth opened this issue · 8 comments

Describe the bug
When running the DQN training, I noticed something strange: the last raiser would sometimes fold even though nobody had re-raised. In other words, the last raiser can act again even if everyone else calls or folds afterwards.

To Reproduce
Steps to reproduce the behavior:

  1. Run python main.py dqn_train
  2. In the log you will see the last raiser act again after everyone else has called or folded.

Take the log below as an example. Only Seat 4 and Seat 3 are still in the hand. Seat 4 raised 3bb and Seat 3 called, yet Seat 4 was still able to make a CALL action. (Sometimes the last raiser would FOLD instead.)

INFO - ===ROUND: Stage.FLOP ===
INFO - Seat 3 (Random): Action.CHECK - Remaining stack: 46, Round pot: 0, Community pot: 280, player pot: 0
INFO - Seat 4 (Random): Action.RAISE_3BB - Remaining stack: 35, Round pot: 6, Community pot: 280, player pot: 6
INFO - Seat 3 (Random): Action.CALL - Remaining stack: 40, Round pot: 12, Community pot: 280, player pot: 6
INFO - Seat 4 (Random): Action.CALL - Remaining stack: 35, Round pot: 12, Community pot: 280, player pot: 6

I looked at the code, and the max-steps check after a raise seems to be wrong: it should use >= rather than >, at the line below in gym_env/env.py#PlayerCycle.next_player

if self.max_steps_after_raiser and (self.counter > self.max_steps_after_raiser + raiser_reference):
=>
if self.max_steps_after_raiser and (self.counter >= self.max_steps_after_raiser + raiser_reference):
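To make the off-by-one concrete, here is a minimal toy model of such a step counter. It is not the project's PlayerCycle: the function name `turns_after_raise`, the `inclusive` flag, and the assumptions that the counter is incremented before the check and that `max_steps_after_raiser` equals the number of players still in the hand are all hypothetical, chosen only so that the sketch reproduces the behaviour seen in the log above.

```python
def turns_after_raise(num_players, inclusive):
    """Count the turns granted after a raise in a toy step-counter model.

    Assumptions (hypothetical, not taken from gym_env/env.py):
    the counter is incremented before the check, and
    max_steps_after_raiser equals the number of players still in the hand.
    """
    max_steps_after_raiser = num_players
    counter = 1                  # the raiser's own turn is step 1
    raiser_reference = counter   # step at which the raise happened
    turns = 0
    while True:
        counter += 1             # move on to the next player
        limit = max_steps_after_raiser + raiser_reference
        reached = counter >= limit if inclusive else counter > limit
        if reached:
            return turns         # round ends, nobody else acts
        turns += 1


print(turns_after_raise(2, inclusive=False))  # 2: the raiser acts again
print(turns_after_raise(2, inclusive=True))   # 1: only the caller responds
```

Under these assumptions the strict comparison grants one turn too many, which matches the extra CALL by Seat 4. Whether the real counter and reference are set up the same way would need to be checked against the actual code.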

OK thanks, I will merge this, but ideally you could add a test.

Hey, first of all thank you for sharing your code @dickreuter. I think this fix introduced a new bug, which you can see in the preflop stage. If nobody raises in a round, the last player to act should be the big blind, but with this fix the small blind is the last one. With your old code this problem does not occur.

@dickreuter thank you for your fast answer.
Here is a simple example of the bug described above, after the fix.
The preflop round ends before Seat 0 (big blind) can take any action.
Fixing this is simple: just go back to your old code, but then the old bug returns. I will try to implement two tests, one for each scenario, and suggest a fix.

INFO - ++++++++++++++++++
INFO - Dealer is at position 0
INFO - Player 0 got ['8H', '7D'] and $124
INFO - Player 5 got ['3D', '3C'] and $476
INFO -
INFO - ===Round: Stage: PREFLOP
INFO - Seat 5 (keras-rl): Action.SMALL_BLIND - Remaining stack: 475, Round pot: 1, Community pot: 0, player pot: 1
INFO - Seat 0 (equity/50/70): Action.BIG_BLIND - Remaining stack: 122, Round pot: 3, Community pot: 0, player pot: 2
INFO - Previous action reward for seat 5: -24
INFO - Chosen action by keras-rl 1 - probabilities: [0.08428754 0.36692145 0.10843648 0.14432381 0.10479218 0.16336744
0.0278711 ]
INFO - Seat 5 (keras-rl): Action.CALL - Remaining stack: 474, Round pot: 4, Community pot: 0, player pot: 2
INFO - End round - no current player returned
INFO - Cards on table: ['4H', 'TS', '6H']
INFO - --------------------------------
INFO - ===ROUND: Stage.FLOP ===
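The two scenarios can be expressed without a step counter at all, by tracking which seats still owe an action. The sketch below is only an illustration of that rule, not the fix that was merged: the class `BettingRound` and its `act`/`finished` members are hypothetical names, and it ignores stacks, bet sizes, and all-ins.

```python
class BettingRound:
    """Toy betting round that tracks which seats still owe an action."""

    def __init__(self, seats):
        # `seats` is in acting order; preflop the big blind goes last.
        self.seats = list(seats)
        self.pending = list(seats)   # everyone gets one turn to start with

    def act(self, seat, action):
        if not self.pending or self.pending[0] != seat:
            raise ValueError(f"Seat {seat} is not next to act")
        self.pending.pop(0)
        if action == "fold":
            self.seats.remove(seat)
        elif action == "raise":
            # A raise reopens the action for every other seat, not for the raiser.
            i = self.seats.index(seat)
            self.pending = self.seats[i + 1:] + self.seats[:i]

    @property
    def finished(self):
        return not self.pending or len(self.seats) < 2


# Scenario 1 (flop log): Seat 4 raises, Seat 3 calls, and the round is over.
flop = BettingRound([3, 4])
flop.act(3, "check")
flop.act(4, "raise")
flop.act(3, "call")
print(flop.finished)      # True: Seat 4 gets no further turn

# Scenario 2 (preflop log): small blind calls, big blind still has the option.
preflop = BettingRound([5, 0])
preflop.act(5, "call")
print(preflop.finished)   # False: Seat 0 (big blind) must still act
preflop.act(0, "check")
print(preflop.finished)   # True
```

Both checks could serve as starting points for the two tests: one asserting that the last raiser is not asked to act again, and one asserting that the big blind acts last when nobody raises.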

I added a pull request. I will look for the root cause of this issue once the pull request has been accepted.

While trying to fix the root cause, I found the following bug in the current code; see the log below.
Seat 3 plays RAISE_3BB and Seat 0 calls. Normally the round would now be over, but Seat 3 checks again.
I will try to implement a test and a fix.

INFO - Dealer is at position 0
INFO - Player 0 got ['6H', 'AC'] and $352
INFO - Player 3 got ['7D', '8H'] and $248
INFO -
INFO - ===Round: Stage: PREFLOP
INFO - Seat 3 (Random): Action.SMALL_BLIND - Remaining stack: 247, Round pot: 1, Community pot: 0, player pot: 1
INFO - Seat 0 (equity/50/70): Action.BIG_BLIND - Remaining stack: 350, Round pot: 3, Community pot: 0, player pot: 2
INFO - Seat 3 (Random): Action.RAISE_3BB - Remaining stack: 239, Round pot: 11, Community pot: 0, player pot: 9
INFO - Seat 0 (equity/50/70): Action.CALL - Remaining stack: 344, Round pot: 17, Community pot: 0, player pot: 8
INFO - Seat 3 (Random): Action.CHECK - Remaining stack: 239, Round pot: 17, Community pot: 0, player pot: 9
INFO - End round - no current player returned
INFO - Cards on table: ['7C', '9D', '3H']
INFO - --------------------------------
INFO - ===ROUND: Stage.FLOP ===