Bayes’ Theorem
The $n$ events $B_{i}, i=1, 2, \ldots , n$, are mutually exclusive, as shown in the image below. Therefore, $P(B_{i}, B_{j})=0 \ \forall \ i \neq j$.
Considering the entire sample space, we have $\sum_{i=1}^{n}P(B_{i})=1$. Then, the probability of a random event $A$ is expressed as shown in equation [1]:
$P(A)=\sum_{i=1}^{n}P(A \mid B_{i})P(B_{i})$ [1]
Equation [1] is called the total probability theorem.
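As a quick numerical check of equation [1], the marginal probability $P(A)$ can be computed directly from a partition. The priors and conditional probabilities below are made-up illustrative values, not from the text:

```python
# Total probability theorem: P(A) = sum_i P(A|B_i) * P(B_i).
# The B_i form a partition: mutually exclusive and exhaustive.

# Hypothetical partition priors P(B_i); they must sum to 1.
p_B = [0.5, 0.3, 0.2]

# Hypothetical conditional probabilities P(A | B_i).
p_A_given_B = [0.1, 0.4, 0.7]

# Equation [1]: marginalize over the partition.
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))

print(p_A)  # 0.5*0.1 + 0.3*0.4 + 0.2*0.7 = 0.31
```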
In order to derive Bayes’ theorem, we first need conditional probability. The conditional probability of $B_{i}$ given $A$ is:
$P(B_{i} \mid A)=\frac{P(A \mid B_{i})P(B_{i})}{P(A)}$ [2]
If we substitute the law of total probability into equation [2], we get:
$P(B_{i} \mid A)=\frac{P(A \mid B_{i})P(B_{i})}{\sum_{j=1}^{n}P(A \mid B_{j})P(B_{j})}$ [3]
Equation [3] is called Bayes’ theorem.
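Equation [3] can be sketched numerically by dividing each term of the total-probability sum by the sum itself; the numbers here are the same hypothetical values used above for equation [1]:

```python
# Bayes' theorem (equation [3]):
#   P(B_i | A) = P(A|B_i) P(B_i) / sum_j P(A|B_j) P(B_j)
# Hypothetical priors and likelihoods for illustration only.
p_B = [0.5, 0.3, 0.2]
p_A_given_B = [0.1, 0.4, 0.7]

# Denominator: total probability of A (equation [1]).
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))

# Posterior over the partition after observing A.
posterior = [pa * pb / p_A for pa, pb in zip(p_A_given_B, p_B)]

print(posterior)  # posteriors over a partition always sum to 1
```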
It can also be expressed using the probability density function as shown in equation [4].
$p_{X \mid Y}(x \mid y)=\frac{p_{Y \mid X}(y \mid x)p_{X}(x)}{\int_{-\infty}^{\infty}p_{Y \mid X}(y \mid x)p_{X}(x)dx} = \frac{p_{XY}(x,y)}{p_{Y}(y)}$ [4]

where $p_{X}(x)$ is the prior probability density function and $p_{X \mid Y}(x \mid y)$ is the posterior probability density function.
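The density form in equation [4] can be illustrated with a minimal sketch: a Gaussian prior $p_{X}(x)$, a Gaussian likelihood $p_{Y \mid X}(y \mid x)$, and a Riemann sum for the normalizing integral. The model parameters and the observed value are assumptions chosen for the example:

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density N(x; mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Assumed model: prior p_X(x) = N(0, 1); likelihood p_{Y|X}(y|x) = N(y; x, 0.5^2).
prior = lambda x: gauss(x, 0.0, 1.0)
likelihood = lambda y, x: gauss(y, x, 0.5)

y_obs = 1.0  # hypothetical observation

# Evidence p_Y(y): Riemann sum of likelihood * prior over a wide x-grid,
# approximating the integral in the denominator of equation [4].
n, lo, hi = 20000, -8.0, 8.0
dx = (hi - lo) / n
evidence = sum(likelihood(y_obs, lo + k * dx) * prior(lo + k * dx) for k in range(n + 1)) * dx

# Posterior density p_{X|Y}(x | y_obs), evaluated pointwise.
posterior_at = lambda x: likelihood(y_obs, x) * prior(x) / evidence

print(posterior_at(0.8))
```

For this conjugate Gaussian pair the posterior is again Gaussian, which gives a closed-form check on the numerical result.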
When events and random variables are mixed, Bayes’ theorem is given by equation [5].
$P(B_{i} \mid y) = \frac{p_{Y \mid B_{i}}(y \mid B_{i})P(B_{i})}{\sum_{j=1}^{n}p_{Y \mid B_{j}}(y \mid B_{j})P(B_{j})}$ [5]
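Equation [5] can be sketched as a discrete posterior over hypotheses computed from continuous likelihoods. The two hypotheses, their priors, the Gaussian likelihood parameters, and the measurement are all hypothetical values for illustration:

```python
import math

def gauss(y, mu, sigma):
    """Gaussian density N(y; mu, sigma^2)."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical setup: two hypotheses B_1, B_2, each assigning Y a
# different Gaussian density p_{Y|B_i}(y | B_i).
priors = [0.6, 0.4]
likelihoods = [lambda y: gauss(y, 0.0, 1.0),  # Y | B_1 ~ N(0, 1)
               lambda y: gauss(y, 3.0, 1.0)]  # Y | B_2 ~ N(3, 1)

y_obs = 2.0  # hypothetical measurement

# Equation [5]: mix densities and prior probabilities, then normalize.
num = [L(y_obs) * p for L, p in zip(likelihoods, priors)]
posterior = [v / sum(num) for v in num]

print(posterior)  # y_obs = 2.0 lies closer to 3, so B_2 is favored
```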