This is a modular architecture for model-based reinforcement learning with search.
The components are decoupled, which makes it easier to build new agents and to extend the existing components.
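
As a rough illustration of what decoupled components can look like, here is a minimal sketch. All names in it (`Model`, `Planner`, `Agent`, `CounterModel`, `GreedyPlanner`) are hypothetical and invented for this example; they are not the classes used in this repository.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, Tuple


class Model(Protocol):
    """Hypothetical interface: maps observations to states and simulates steps."""

    def initial_state(self, observation: int) -> int: ...

    def step(self, state: int, action: int) -> Tuple[int, float]: ...


class Planner(Protocol):
    """Hypothetical interface: picks an action by querying a model."""

    def choose(self, model: Model, state: int, actions: Sequence[int]) -> int: ...


@dataclass
class CounterModel:
    """Toy model: the state is an integer and the reward is the negative distance to a target."""

    target: int

    def initial_state(self, observation: int) -> int:
        return observation

    def step(self, state: int, action: int) -> Tuple[int, float]:
        next_state = state + action
        return next_state, -abs(next_state - self.target)


class GreedyPlanner:
    """Toy planner: one-step lookahead, picks the action with the highest simulated reward."""

    def choose(self, model: Model, state: int, actions: Sequence[int]) -> int:
        return max(actions, key=lambda a: model.step(state, a)[1])


@dataclass
class Agent:
    """An agent is just a model plus a planner; either part can be swapped independently."""

    model: Model
    planner: Planner

    def act(self, observation: int, actions: Sequence[int]) -> int:
        state = self.model.initial_state(observation)
        return self.planner.choose(self.model, state, actions)


agent = Agent(model=CounterModel(target=3), planner=GreedyPlanner())
chosen = agent.act(observation=1, actions=[-1, 0, 1, 2])
```

Because `Agent` only depends on the two interfaces, a deeper search could replace `GreedyPlanner` without touching the model, and vice versa.
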
The current implementation learns the hidden states used for planning, as well as an action mask.
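
One way a learned action mask can be used during search is to restrict the policy priors to actions the mask predicts as valid. The sketch below assumes the mask is given as one probability per action and uses a 0.5 threshold; the function name `masked_priors` and the threshold are assumptions made for illustration, not this repository's API.

```python
import math
from typing import List, Sequence


def masked_priors(
    logits: Sequence[float],
    mask_probs: Sequence[float],
    threshold: float = 0.5,
) -> List[float]:
    """Softmax over logits, restricted to actions whose mask probability meets the threshold."""
    allowed = [p >= threshold for p in mask_probs]
    if not any(allowed):
        # Assumed fallback: if the mask rejects every action, ignore it.
        allowed = [True] * len(logits)

    # Subtract the largest allowed logit for numerical stability.
    max_logit = max(l for l, ok in zip(logits, allowed) if ok)
    exps = [math.exp(l - max_logit) if ok else 0.0 for l, ok in zip(logits, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# The third action has the highest logit but is masked out.
priors = masked_priors(logits=[0.0, 0.0, 5.0], mask_probs=[0.9, 0.8, 0.1])
```

Masked actions receive a prior of exactly zero, so a search that expands children in proportion to their priors would never visit them.
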
To try it, run test.py: it asks you to choose among the available components and then runs with your selection.