What is Reinforcement Learning?
In reinforcement learning, an agent takes action in the environment, the environment changes the state, and the agent gets rewards, like:
This process is repeated over and over again to understand what actions lead to positive rewards and favorable states and what actions lead to negative rewards and unfavorable states.
Apply the Bellman Equation to the states collected through actions to calculate the probability of each action being taken and then take the action. Additionally, Living Penalty is applied to avoid infinite loops.