A snake that knows how to win
For training:
- You can modify the model architecture in `model.py` and the model parameters in `agent.py`
- Specify the trained model name in `agent.py`
- Run `python agent.py train`
For evaluating:
- Specify the model name and path in `agent.py`
- Run `python agent.py eval`
Reward:
- eat: +10
- lose: -10
- else: 0
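
The reward values above can be sketched as a small function. The name `compute_reward` and its two boolean arguments are hypothetical and are not taken from `agent.py`; this only illustrates the mapping.

```python
def compute_reward(ate_food: bool, game_over: bool) -> int:
    """Return the per-step reward: +10 for eating, -10 for losing, 0 otherwise."""
    if game_over:
        return -10  # lose
    if ate_food:
        return 10   # eat
    return 0        # else
```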
Action:
- [1,0,0] -> straight
- [0,1,0] -> turn right
- [0,0,1] -> turn left
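
Because the actions are relative to the current heading, they have to be turned into an absolute direction before the snake moves. A minimal sketch, assuming screen coordinates where y grows downward (so right, down, left, up is clockwise); `CLOCKWISE` and `next_direction` are hypothetical names, not necessarily those used in the code:

```python
from typing import Sequence

# Headings in clockwise order (assumes y grows downward on screen).
CLOCKWISE = ["right", "down", "left", "up"]

def next_direction(current: str, action: Sequence[int]) -> str:
    """Map a one-hot relative action onto the next absolute heading."""
    idx = CLOCKWISE.index(current)
    action = list(action)
    if action == [1, 0, 0]:
        return current                       # straight: keep heading
    if action == [0, 1, 0]:
        return CLOCKWISE[(idx + 1) % 4]      # turn right: one step clockwise
    if action == [0, 0, 1]:
        return CLOCKWISE[(idx - 1) % 4]      # turn left: one step counter-clockwise
    raise ValueError("action must be one-hot with 3 entries")
```

For example, a snake facing up that receives `[0,1,0]` ends up facing right.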
Is danger close?
- Straight Danger
- Right Danger
- Left Danger
Which way is the snake facing?
- Up Direction
- Down Direction
- Left Direction
- Right Direction
Where is the food relative to the snake?
- Up Food
- Down Food
- Right Food
- Left Food
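
The 11 values listed above can be packed into a single 0/1 vector. The sketch below is one possible encoding, not necessarily the one in `agent.py`: it assumes a grid with (x, y) cells, y growing downward, a `body` list that excludes the head, and "danger" meaning that the next cell is a wall or a body segment. All names (`get_state`, `is_collision`, `STEPS`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) grid cell; y grows downward (assumption)

CLOCKWISE = ["right", "down", "left", "up"]
STEPS = {"right": (1, 0), "down": (0, 1), "left": (-1, 0), "up": (0, -1)}

def is_collision(p: Point, body: Sequence[Point], width: int, height: int) -> bool:
    """True if p lies outside the board or on a body segment."""
    x, y = p
    return x < 0 or x >= width or y < 0 or y >= height or p in body

def get_state(head: Point, body: Sequence[Point], direction: str,
              food: Point, width: int, height: int) -> List[int]:
    """Build the 11-entry state in the order listed above."""
    idx = CLOCKWISE.index(direction)

    def danger(turn: int) -> bool:
        # turn: 0 = straight, +1 = right, -1 = left (relative to heading)
        dx, dy = STEPS[CLOCKWISE[(idx + turn) % 4]]
        return is_collision((head[0] + dx, head[1] + dy), body, width, height)

    flags = [
        danger(0), danger(1), danger(-1),            # straight, right, left danger
        direction == "up", direction == "down",
        direction == "left", direction == "right",   # heading
        food[1] < head[1], food[1] > head[1],        # food up, food down
        food[0] > head[0], food[0] < head[0],        # food right, food left
    ]
    return [int(f) for f in flags]
```

For a snake in the top-left corner facing up, with food somewhere to its right, the straight and left danger flags are set along with "Up Direction" and "Right Food".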
State (11) --> Hidden (?) --> Action (3)
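
To make the shape of this network concrete, here is a framework-free sketch of the forward pass. It is not the code in `model.py`: the hidden size of 16 is a placeholder (the README leaves it open), the ReLU activation is an assumption, and every function name is hypothetical.

```python
import random

STATE_SIZE, ACTION_SIZE = 11, 3
HIDDEN_SIZE = 16  # hypothetical placeholder; the hidden size is left open above

def init_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def linear(x, layer):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def forward(state, hidden_layer, output_layer):
    """State (11) -> hidden (ReLU, an assumption) -> one Q value per action (3)."""
    hidden = [max(0.0, h) for h in linear(state, hidden_layer)]
    return linear(hidden, output_layer)

def to_action(q_values):
    """One-hot encode the action with the highest Q value."""
    best = max(range(len(q_values)), key=q_values.__getitem__)
    return [int(i == best) for i in range(len(q_values))]
```

The output holds one Q value per action, so `to_action([0.1, 0.9, -0.3])` gives `[0, 1, 0]`, i.e. turn right.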
- Init with some Q values
- Predict Action (or Random for exploration)
- Perform Action
- Measure Reward
- Update the Q value and train with the following:
  - NewQ(S,a) = Q(S,a) + alpha * [R + gamma * maxQ'(S',a') - Q(S,a)]
  - loss = (NewQ(S,a) - Q(S,a))^2
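
The loop above can be illustrated with plain numbers. This is a minimal sketch of the update rule and loss only, not the training code in `agent.py`; the function names, the epsilon-greedy exploration scheme, and the `done` flag (which drops the future term on a terminal step) are assumptions.

```python
import random

def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def new_q(q_sa, reward, next_q_values, alpha, gamma, done=False):
    """NewQ(S,a) = Q(S,a) + alpha * [R + gamma * max Q(S',a') - Q(S,a)]."""
    # Assumption: a terminal step has no future value.
    max_next = 0.0 if done else max(next_q_values)
    return q_sa + alpha * (reward + gamma * max_next - q_sa)

def loss(new_q_sa, q_sa):
    """Squared difference between the updated and the current Q value."""
    return (new_q_sa - q_sa) ** 2
```

For example, with Q(S,a) = 1.0, R = 10, next Q values [0.5, 2.0, 1.0], alpha = 0.1 and gamma = 0.9, the target is 10 + 0.9 * 2.0 = 11.8, so the updated value is 1.0 + 0.1 * 10.8 = 2.08.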