A reinforcement learning algorithm that learns the most advantageous strategy for a simplified version of Blackjack

Blackjack variation used in the project

The game assumes an infinite deck of cards, so every card is drawn with replacement and the draw probabilities never change. The player and the dealer each have only two available actions: Hit or Stand.
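
The two assumptions above can be sketched in Python as follows. This is only a minimal illustration, not the project's actual code: the names `Action`, `RANKS`, and `draw_card` are hypothetical, and the use of the 13 standard ranks with equal probability is an assumption, since the text does not specify how cards are represented or valued.

```python
import random
from enum import Enum


class Action(Enum):
    """The only two actions available to the player and the dealer."""
    HIT = "hit"
    STAND = "stand"


# Assumption: the 13 standard ranks, each equally likely.
# The project text does not state card ranks or their point values.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def draw_card(rng: random.Random) -> str:
    """Draw one card from an infinite deck.

    An infinite deck is modelled as sampling with replacement:
    previous draws never change the probability of the next one.
    """
    return rng.choice(RANKS)


if __name__ == "__main__":
    rng = random.Random()
    hand = [draw_card(rng) for _ in range(2)]
    print("Initial hand:", hand)
    print("Available actions:", [a.name for a in Action])
```

Because draws are independent, there is no deck state to track, which keeps the state space of the learning problem small.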