garlicdevs/Fruit-API

number of actions that vary

TyrandeWhisperwind opened this issue · 2 comments

I have a number of actions that varies and shrinks with each step. Does MODQNLearner take this into consideration and sample randomly from the actions still available at each step?

In my engine I'm using this function to calculate the action space:

    def get_action_space(self):
        # One index per action that is still possible at this step
        return range(len(self.get_possible_actions()))

It gets smaller each time, so if MODQNLearner calls it every time it performs an action, I guess it takes that into account.

No, because the network's output size cannot change over time. You can use some tricks to work around this, such as limiting the valid action values over time. I think get_action_space is called only once.
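One common way to "limit the possible values" while keeping the network output fixed is action masking: the Q-vector keeps one entry per action, and both the random and the greedy choice only look at the indices that are still valid. The sketch below is not Fruit-API code. `select_action` and its parameters are hypothetical names, and it assumes a single scalar Q-value per action (a multi-objective learner would first have to scalarize its Q-values).

```python
import random

def select_action(q_values, valid_actions, epsilon, rng=random):
    """Epsilon-greedy choice over a fixed-size Q-vector, restricted to valid actions.

    q_values      -- one Q-value per action; its length never changes
    valid_actions -- indices of the actions still available at this step
    epsilon       -- probability of exploring with a random valid action
    """
    valid = list(valid_actions)
    if not valid:
        raise ValueError("no valid actions left")
    if rng.random() < epsilon:
        # Explore: sample only among the remaining actions
        return rng.choice(valid)
    # Exploit: best Q-value among the remaining actions only
    return max(valid, key=lambda a: q_values[a])

q = [0.1, 0.9, 0.5, 0.3]
remaining = [0, 2, 3]  # action 1 has already been used up
greedy = select_action(q, remaining, epsilon=0.0)
```

Here `greedy` is action 2: action 1 has the highest Q-value overall but is skipped because it is no longer in `remaining`.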

What should be modified? The code is huge :l ... I want it to pick randomly from the remaining possible actions ...
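If changing the learner is too much work, one possible workaround is to handle this entirely inside the engine: report a fixed action space, and when the learner picks an action that is no longer available, replace it with a random one from those remaining. This is only a sketch under assumptions. `FixedActionEngine`, `resolve_action`, and `step` are hypothetical names, not part of Fruit-API, and here `get_possible_actions` is assumed to return action indices.

```python
import random

class FixedActionEngine:
    """Hypothetical engine whose action space stays fixed while actions get used up."""

    def __init__(self, num_actions, rng=random):
        self.num_actions = num_actions
        self.remaining = set(range(num_actions))
        self.rng = rng

    def get_action_space(self):
        # Always the full range, so the network output size never changes
        return range(self.num_actions)

    def get_possible_actions(self):
        # Indices of the actions that can still be taken
        return sorted(self.remaining)

    def resolve_action(self, action):
        # Keep the learner's choice if it is still valid,
        # otherwise fall back to a random remaining action
        if action in self.remaining:
            return action
        if not self.remaining:
            raise ValueError("no valid actions left")
        return self.rng.choice(self.get_possible_actions())

    def step(self, action):
        chosen = self.resolve_action(action)
        self.remaining.discard(chosen)  # this action is now used up
        return chosen

engine = FixedActionEngine(3)
first = engine.step(1)   # valid, taken as-is
second = engine.step(1)  # no longer valid, remapped to 0 or 2
```

The drawback of this design is that the learner is credited for an action it did not actually choose, so masking inside the learner is cleaner when it is feasible.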