OpenAI Gym Taxi-v3 environment solved using the Sarsamax (Q-learning) algorithm.
Primary language: Jupyter Notebook
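
For readers unfamiliar with Sarsamax, the following is a minimal sketch of the tabular update it relies on, not the notebook's actual code. The function names `epsilon_greedy` and `sarsamax_update` and the default values of `alpha` and `gamma` are assumptions chosen for illustration. The table size reflects Taxi-v3's 500 discrete states and 6 actions. The sketch uses plain Python lists so it runs without Gym installed.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Greedy choice; ties resolve to the lowest action index.
    return q_row.index(max(q_row))


def sarsamax_update(Q, state, action, reward, next_state, done,
                    alpha=0.1, gamma=0.99):
    """Apply one Sarsamax (Q-learning) update in place.

    Q[s][a] <- Q[s][a] + alpha * (r + gamma * max_a' Q[s'][a'] - Q[s][a])

    The bootstrap term is dropped when the episode has terminated.
    Returns the updated value of Q[state][action].
    """
    target = reward if done else reward + gamma * max(Q[next_state])
    Q[state][action] += alpha * (target - Q[state][action])
    return Q[state][action]


# Taxi-v3 has 500 discrete states and 6 actions.
N_STATES, N_ACTIONS = 500, 6
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

# One hypothetical transition: step penalty of -1, episode not finished.
rng = random.Random(0)
action = epsilon_greedy(Q[0], epsilon=0.1, rng=rng)
sarsamax_update(Q, state=0, action=action, reward=-1, next_state=1, done=False)
```

In a full training loop, the transition values would come from the environment's `step` call rather than being hard-coded, and `epsilon` would typically be decayed over episodes.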