The taxi problem solved with reinforcement learning (MAXQ value fonction decomposition)
Primary LanguagePython