Performance evaluation of Q-learning and SARSA on the Taxi ride problem

This project explores a Taxi environment in which an agent must pick up passengers and drop them off at specific locations, maneuvering through open roads with the aim of completing trips efficiently. There is a large reward for successfully picking up and dropping off a passenger, and penalties for bumping into walls or making wrong drop-offs. We use principles of reinforcement learning to train an agent with different algorithms and measure its performance at this task. We implement a random agent, a SARSA agent, a Q-learning agent, and a smart Q-learning agent for this problem and present a comparative performance analysis. The smart Q-learning agent includes a few improvements over the standard version to better handle the exploration-exploitation trade-off and to achieve good performance in fewer iterations. In addition, we run experiments using grid search to choose the best set of hyperparameters for this problem.
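To make the difference between the two learning agents concrete, here is a minimal sketch of the standard tabular update rules for SARSA and Q-learning. It is not taken from this project's code: the function names, the list-of-lists Q-table, and the parameters `alpha` (learning rate), `gamma` (discount factor), and `epsilon` (exploration rate) are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def q_learning_update(Q, s, a, r, s_next, alpha, gamma, done=False):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done=False):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def make_q_table():
    """Tiny hypothetical Q-table: 2 states, 2 actions (not the Taxi table)."""
    return [[0.0, 0.0], [1.0, 5.0]]


# Same transition (s=0, a=0, r=-1, s_next=1) under both rules.
q_table_ql = make_q_table()
q_learning_update(q_table_ql, 0, 0, -1.0, 1, alpha=0.5, gamma=0.9)

q_table_sarsa = make_q_table()
# Assume the behaviour policy happened to choose the worse action 0 next.
sarsa_update(q_table_sarsa, 0, 0, -1.0, 1, 0, alpha=0.5, gamma=0.9)
```

On this toy transition, Q-learning moves `Q[0][0]` to about 1.75 because it assumes the best next action will be taken, while SARSA moves it to about -0.05 because it uses the exploratory action that was actually selected. That difference in the bootstrap target is the only thing separating the two update rules.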

Slides