Living Penalty

Posted Apr 23, 2023 Updated Jun 14, 2024

What is a Living Penalty?

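A living penalty is a small negative reward that the agent receives at every step. In Reinforcement Learning it is applied to avoid infinite loops, because wandering around is no longer free. Let’s assume that we are trying to find a way out of a maze: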

At first, we set the living penalty to 0 ($R(s) = 0$). Then, the $V(s)$ will be:

Next, we set the living penalty to -0.04 ($R(s) = -0.04$). One thing to note is that the living penalty is applied each time the agent leaves a box. Then, the $V(s)$ will be:

If we set the living penalty to -0.5 ($R(s) = -0.5$), then the $V(s)$ will be:

This is more suitable than $R(s) = -0.04$ or $R(s) = 0$. But if we set $R(s) = -2.0$, then the $V(s)$ will be:

With such a large penalty, the agent heads for the nearest exit, even if that exit is the red box. This is because living for a long time incurs a bigger penalty than going through the red box.
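To make the comparison concrete, here is a minimal value-iteration sketch. It does not reproduce the maze from this post. It assumes a hypothetical 4x3 grid with a green exit worth +1, a red exit worth -1, one wall, deterministic moves, and no discounting ($\gamma = 1$), and it uses the update $V(s) = R(s) + \gamma \max_a V(s')$ for every non-exit box. The names `GRID`, `step`, and `value_iteration` are made up for this illustration.

```python
# Hypothetical 4x3 maze (an assumption, not this post's original figure):
# 'G' = green exit (+1), 'R' = red exit (-1), '#' = wall, '.' = free box.
GRID = [
    "...G",
    ".#.R",
    "....",
]
EXIT_VALUE = {"G": 1.0, "R": -1.0}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, move):
    """Deterministic move; bumping into the wall or the border keeps the agent in place."""
    r, c = state[0] + MOVES[move][0], state[1] + MOVES[move][1]
    if 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] != "#":
        return (r, c)
    return state


def value_iteration(living_penalty, gamma=1.0, sweeps=100):
    """Return (V, policy) using V(s) = R(s) + gamma * max_a V(s') on free boxes."""
    states = [(r, c) for r, row in enumerate(GRID)
              for c, ch in enumerate(row) if ch != "#"]
    free = [s for s in states if GRID[s[0]][s[1]] == "."]
    # Exit boxes keep their fixed value; free boxes start at 0.
    V = {s: EXIT_VALUE.get(GRID[s[0]][s[1]], 0.0) for s in states}
    for _ in range(sweeps):
        V_new = dict(V)
        for s in free:
            # The living penalty is paid every time the agent leaves a free box.
            V_new[s] = living_penalty + gamma * max(V[step(s, m)] for m in MOVES)
        V = V_new
    # Greedy policy: pick the move whose next box has the highest value.
    policy = {s: max(MOVES, key=lambda m: V[step(s, m)]) for s in free}
    return V, policy


if __name__ == "__main__":
    corner = (2, 3)  # the box directly below the red exit
    for penalty in (0.0, -0.04, -0.5, -2.0):
        V, policy = value_iteration(penalty)
        print(f"R(s) = {penalty:5.2f}  V(corner) = {V[corner]:5.2f}  move = {policy[corner]}")
```

In this toy grid, the box below the red exit prefers the long way around to the green exit when $R(s) = -0.04$, but steps straight into the red exit when $R(s) = -2.0$. With $R(s) = 0$ every free box ends up with the same value, so the greedy choice is a tie and nothing pushes the agent toward the exit.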

This post is licensed under CC BY 4.0 by the author.