
Living Penalty

What is a Living Penalty?

A living penalty is a small negative reward that the agent receives at every step. In Reinforcement Learning it is applied so that the agent does not wander in infinite loops. Let’s assume that we are trying to find a way out of a maze:



First, we set the living penalty to 0 ($R(s) = 0$). The value function $V(s)$ will then be:



Next, we set the living penalty to -0.04 ($R(s) = -0.04$):



Note that the living penalty is applied each time the agent leaves a box (a state). The $V(s)$ will then be:


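One common way to write where the living penalty enters the value is the Bellman equation below. This is a sketch under assumptions: the discount factor $\gamma$ and the transition probabilities $P(s' \mid s, a)$ are not given in this post and are introduced here only for illustration.

$$
V(s) = R(s) + \gamma \max_{a} \sum_{s'} P(s' \mid s, a)\, V(s')
$$

Because $R(s)$ is added once for every box the agent leaves, a path that takes $n$ steps to reach the exit collects the penalty $n$ times. For example, with $\gamma = 1$ and deterministic moves, a box that is two steps away from an exit worth $+1$ would have $V(s) = 2 \cdot (-0.04) + 1 = 0.92$.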

If we set the living penalty to -0.5 ($R(s) = -0.5$), then the $V(s)$ will be:



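This is more suitable than $R(s) = -0.04$ or $R(s) = 0$, because the larger penalty gives the agent a stronger incentive to leave the maze quickly. But if we go too far and set $R(s) = -2.0$, then the $V(s)$ will be: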



This is because, with such a large penalty, living for a long time costs more than going through the red box, so the agent prefers to end the episode as fast as possible, even in the red box.
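The effect of the different penalties can be reproduced with a small value iteration sketch. Everything about the setup is an assumption made for illustration, not the maze from the figures above: a hypothetical 3x4 grid with one wall, a green exit worth $+1$, a red box worth $-1$, deterministic moves, and $\gamma = 1$. The names `step`, `value_iteration`, and `greedy_action` are likewise made up for this example.

```python
# Hypothetical 3x4 maze (an assumption, not the post's original figure):
#   row 0: .  .  .  G    G = green exit, terminal, value +1
#   row 1: .  #  .  R    R = red box, terminal, value -1; # = wall
#   row 2: .  .  .  .
ROWS, COLS = 3, 4
WALL = (1, 1)
TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Deterministic move; bumping into the wall or the border keeps the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (r + dr, c + dc)
    if nxt == WALL or not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS):
        return state
    return nxt


def value_iteration(living_penalty, gamma=1.0, tol=1e-9, max_iters=1000):
    """Return V(s) for every box; the penalty is charged when the agent leaves a box."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]
    values = {s: TERMINALS.get(s, 0.0) for s in states}
    for _ in range(max_iters):
        new_values = dict(values)
        for s in states:
            if s in TERMINALS:
                continue  # terminal values stay fixed
            best_next = max(values[step(s, a)] for a in ACTIONS)
            new_values[s] = living_penalty + gamma * best_next
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    return values


def greedy_action(values, state):
    """Action whose next box has the highest value."""
    return max(ACTIONS, key=lambda a: values[step(state, a)])


if __name__ == "__main__":
    below_red_box = (2, 3)  # the box directly under the red box
    for penalty in (0.0, -0.04, -0.5, -2.0):
        v = value_iteration(penalty)
        print(penalty, round(v[below_red_box], 2), greedy_action(v, below_red_box))
```

In this toy maze, with $R(s) = 0$ every box that can reach the exit ends up with the same value, so the agent has no reason to hurry. With $R(s) = -0.04$ the box under the red box gets a value of about $0.84$ and the greedy move is `left`, away from the red box. With $R(s) = -2.0$ the same box gets $-3.0$ and the greedy move is `up`, straight into the red box, which matches the explanation above.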

This post is licensed under CC BY 4.0 by the author.