duyminh1998/pycmo

Create custom rewards handler

Why

As a

user of pyCMO

I want

to be able to specify different reward models for my scenarios

So that

I can train RL agents

Acceptance Criteria

Given

we currently export only the player side's total score as the reward

When

we implement a way for users to specify a reward model

Then

we get closer to being able to train RL agents

Notes

One idea is to create a custom RewardHandler class that is passed into CMOEnv and calculates the reward from the current observation.
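A rough sketch of what the RewardHandler idea could look like. Everything here is an assumption for illustration: the `compute` method name, the dict-shaped observation with `"score"` and `"units"` keys, and `SketchEnv`, which only stands in for CMOEnv and does not reflect its real constructor or `step` signature.

```python
from abc import ABC, abstractmethod
from typing import Any, Mapping, Optional, Tuple


class RewardHandler(ABC):
    """Hypothetical interface: maps the current observation to a scalar reward."""

    @abstractmethod
    def compute(self, observation: Mapping[str, Any]) -> float:
        ...


class TotalScoreReward(RewardHandler):
    """Mirrors the current behaviour: the reward is the side's total score."""

    def compute(self, observation: Mapping[str, Any]) -> float:
        # Assumes the observation exposes the score under a "score" key.
        return float(observation["score"])


class UnitsAliveReward(RewardHandler):
    """Example of a custom model: reward each unit that is still alive."""

    def __init__(self, per_unit: float = 1.0) -> None:
        self.per_unit = per_unit

    def compute(self, observation: Mapping[str, Any]) -> float:
        # Assumes the observation lists the side's units under a "units" key.
        return self.per_unit * len(observation["units"])


class SketchEnv:
    """Stand-in for CMOEnv, showing only where a handler would plug in."""

    def __init__(self, reward_handler: Optional[RewardHandler] = None) -> None:
        # Fall back to the total-score reward when no handler is given.
        self.reward_handler = reward_handler or TotalScoreReward()

    def step(self, observation: Mapping[str, Any]) -> Tuple[Mapping[str, Any], float]:
        # A real environment would produce the observation itself;
        # here it is passed in to keep the sketch self-contained.
        return observation, self.reward_handler.compute(observation)
```

Defaulting to the total-score handler would keep existing scenarios working unchanged, while users who want a different reward model pass their own handler at construction time.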

gymnasium also provides reward wrappers (gymnasium.RewardWrapper), which may be another option.