
MMSE Estimator

MMSE (Minimum Mean-Square Error) Estimator

When a set of measurement vectors is given as $Z_{k} = z_{k}$, the MMSE estimator is defined as the estimator that minimizes the conditional mean-square estimation error, as follows:

$J = \mathbb{E}[(X-\hat{x})^{T}(X-\hat{x}) \mid Z_{k} = z_{k}]$
$= \int_{-\infty}^{\infty}(x-g(z_{k}))^{T}(x-g(z_{k}))p_{X \mid Z_{k}}(x\mid z_{k})dx$


In short, the MMSE estimate can be written as:

$\hat{x}^{MMSE} = \arg\min_{\hat{x}}\mathbb{E}[(X-\hat{x})^{T}(X-\hat{x}) \mid Z_{k} = z_{k}]$


Since $\hat{x} = g(z_{k})$ is a constant value, $J = \mathbb{E}[X^{T}X-X^{T}\hat{x} - \hat{x}^{T}X + \hat{x}^{T}\hat{x} \mid Z_{k} = z_{k}] \ = \mathbb{E}[X^{T}X \mid Z_{k} = z_{k}] - 2\hat{x}^{T}\mathbb{E}[X \mid Z_{k}=z_{k}]+\hat{x}^{T}\hat{x}$.

Because $J$ is a quadratic function of $\hat{x}$ that is convex (opens upward), the minimum is attained where $\frac{dJ}{d\hat{x}} = 0$.

Therefore, $\frac{dJ}{d\hat{x}} = -2(\mathbb{E}[X\mid Z_{k}=z_{k}] - \hat{x}) = 0$. Note that $\frac{d(a^{T}\hat{x})}{d\hat{x}} = \frac{d(\hat{x}^{T}a)}{d\hat{x}} = a$ for a constant vector $a$.

$\therefore \hat{x}^{MMSE} = \mathbb{E}[X \mid Z_{k}=z_{k}]$ $= \int_{-\infty}^{\infty}xp_{X \mid Z_{k}}(x \mid z_{k})dx$ $= \frac{\int_{-\infty}^{\infty}xp_{Z_{k} \mid X}(z_{k} \mid x)p_{X}(x)dx}{\int_{-\infty}^{\infty}p_{Z_{k} \mid X}(z_{k} \mid x)p_{X}(x)dx}$
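
As a quick numerical illustration of the last expression, the sketch below evaluates the two integrals on a grid. The model is only an assumption chosen for illustration (scalar $X \sim N(0,1)$, $Z = X + V$ with $V \sim N(0, 0.5^{2})$, observed $z_{k} = 0.8$), not something specified in this post.

```python
import numpy as np

# Minimal sketch (illustrative assumptions): scalar X ~ N(0, 1) prior,
# Z = X + V with V ~ N(0, 0.5^2), and an observed measurement z_k = 0.8.
x = np.linspace(-6.0, 6.0, 4001)     # integration grid for x (uniform spacing)
z_k = 0.8

p_x = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)                                   # prior p_X(x)
p_z_given_x = np.exp(-0.5 * ((z_k - x) / 0.5)**2) / (0.5 * np.sqrt(2 * np.pi))   # likelihood

# Bayes form of the MMSE estimate: ratio of two integrals over x.
# The grid spacing dx cancels in the ratio, so plain sums suffice.
numerator = np.sum(x * p_z_given_x * p_x)
denominator = np.sum(p_z_given_x * p_x)
x_hat_mmse = numerator / denominator

# For this Gaussian case the closed form is z_k * P_XX / (P_XX + R) = 0.8 / 1.25 = 0.64.
print(x_hat_mmse)   # ~0.64
```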

Similarly, if the measurement vector is given as a random vector rather than a fixed value, the MMSE estimator is expressed as follows, and the estimate is itself a random vector.

$\hat{X}^{MMSE} = \mathbb{E}[X \mid Z_{k}]$

There are four kinds of MMSE estimators, covered below: the joint Gaussian MMSE estimator, the joint Gaussian MMSE estimator for linear measurements, the linear MMSE estimator, and the linear MMSE estimator for linear measurements.

The Performance of The MMSE Estimator



Mean of The MMSE Estimator

The estimation error $\tilde{X}$ is defined as follows:

$\tilde{X} = X - \hat{X}^{MMSE}$


Then, the average of $\tilde{X}$ is:

$\mathbb{E}[\tilde{X}] = \mathbb{E}[X-\hat{X}^{MMSE}]$
$= \mathbb{E}[X] - \mathbb{E}[\mathbb{E}[X \mid Z_{k}]]$
$= \mathbb{E}[X] - \mathbb{E}[X] = 0$

Because the mean of $\tilde{X}$ is 0, the expected value of the MMSE estimate equals the expected value of the unknown random vector $X$; that is, the MMSE estimator is an unbiased estimator.
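
A small Monte Carlo check of this unbiasedness property is sketched below. The scalar model ($X \sim N(0,1)$, $Z = 2X + V$, $V \sim N(0, 0.5^{2})$) is an illustrative assumption, chosen so that $\mathbb{E}[X \mid Z]$ has a simple closed form.

```python
import numpy as np

# Minimal Monte Carlo sketch that the MMSE estimation error has zero mean.
# Illustrative model: X ~ N(0, 1), Z = 2X + V, V ~ N(0, 0.5^2).
rng = np.random.default_rng(3)
n = 200_000

X = rng.normal(0.0, 1.0, n)
Z = 2.0 * X + rng.normal(0.0, 0.5, n)

# For this jointly Gaussian model, E[X | Z] = (P_XZ / P_ZZ) * Z = (2 / 4.25) * Z.
X_hat = (2.0 / 4.25) * Z
err = X - X_hat

print(np.mean(err))        # ~0, consistent with E[X - X_hat] = 0 (unbiasedness)
```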



Covariance of The Estimation Error

The covariance of the estimation error $\tilde{X}$ is given as the mean of the conditional covariance of $X$ conditioned on the set of measurement vectors $Z_{k}$, as follows:

$P_{\tilde{X}\tilde{X}} = \mathbb{E}[(\tilde{X}-\mathbb{E}[\tilde{X}])(\tilde{X}-\mathbb{E}[\tilde{X}])^{T}]$
$=\mathbb{E}[\tilde{X}\tilde{X}^{T}]$
$=\mathbb{E}[\mathbb{E}[\tilde{X}\tilde{X}^{T} \mid Z_{k}]]$
$=\mathbb{E}[P_{XX\mid Z_{k}}]$


If the measurement vector is given as $Z_{k} = z_{k}$, the covariance of the estimation error $\tilde{X}$ is as follows:

$P_{\tilde{X}\tilde{X}} = \mathbb{E}[\tilde{X}\tilde{X}^{T} \mid Z_{k}=z_{k}]$
$= \mathbb{E}[(X-\mathbb{E}[X \mid Z_{k}=z_{k}])(X-\mathbb{E}[X \mid Z_{k}=z_{k}])^{T} \mid Z_{k}=z_{k}]$
$=P_{XX \mid Z_{k}}$

Also, the estimation error is always orthogonal to any function $g(Z)$ of the measurement vector. Written as an equation:

$\mathbb{E}[(X - \hat{X}^{MMSE})g^{T}(Z)] = 0$

where $Z$ is a measurement vector.

To prove this,

$\mathbb{E}[(X - \hat{X}^{MMSE})g^{T}(Z)] = \mathbb{E}[Xg^{T}(Z)]-\mathbb{E}[\mathbb{E}[X \mid Z]g^{T}(Z)]$
$=\mathbb{E}[Xg^{T}(Z)]-\mathbb{E}[\mathbb{E}[Xg^{T}(Z) \mid Z]]$
$= \mathbb{E}[Xg^{T}(Z)] - \mathbb{E}[Xg^{T}(Z)] = 0$

In particular, letting $g(Z) = Z$ gives $\mathbb{E}[(X-\hat{X}^{MMSE})Z^{T}] = 0$.

According to the equation above, $\hat{X}_{i}^{MMSE}$ can be interpreted as the projection of $X_{i}$ onto the span formed by linear combinations of the measurement variables $Z_{i}$, and the estimation error is orthogonal to this span.
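
The orthogonality property can be checked numerically. The sketch below assumes, purely for illustration, a jointly Gaussian scalar model ($X \sim N(1, 4)$, $Z = X + V$, $V \sim N(0,1)$) so that $\mathbb{E}[X \mid Z]$ is available in closed form, and verifies that the error is uncorrelated with several functions of $Z$.

```python
import numpy as np

# Minimal Monte Carlo sketch of the orthogonality property (illustrative values).
# Assumed model: X ~ N(1, 4), Z = X + V with V ~ N(0, 1) independent of X,
# so E[X | Z] = mu_X + (P_XZ / P_ZZ) * (Z - mu_Z).
rng = np.random.default_rng(0)
n = 200_000

X = rng.normal(1.0, 2.0, n)
V = rng.normal(0.0, 1.0, n)
Z = X + V

P_XZ, P_ZZ = 4.0, 5.0                      # cov(X, Z) and var(Z) for this model
X_hat = 1.0 + (P_XZ / P_ZZ) * (Z - 1.0)    # conditional mean E[X | Z]
err = X - X_hat

# The error should be (numerically) orthogonal to any function of Z.
for g in (Z, Z**2, np.sin(Z)):
    print(np.mean(err * g))                # all close to 0 up to Monte Carlo noise
```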





Joint Gaussian MMSE Estimator

Let’s get the MMSE estimation value $\hat{X}^{MMSE}$ of random vector $X$ when the two random vectors $X$ and $Z$ have a joint Gaussian distribution.
If the two random vectors are joint Gaussian vectors, each random vector follows a Gaussian distribution. Let $X$ and $Z$ have the following probability density functions:

$X \sim N(\mu_{X}, P_{XX}),$
$Z \sim N(\mu_{Z}, P_{ZZ})$

Also, assume that the joint probability density function of two random vectors is given as:

$Y = \begin{bmatrix} X \\ Z \end{bmatrix} \sim N(\mu_{Y}, P_{YY})$

Then, $\mu_{Y}$ and $P_{YY}$ are:

$\mu_{Y}= \begin{bmatrix} \mu_{X} \\ \mu_{Z} \end{bmatrix}, P_{YY} = \begin{bmatrix} P_{XX} & P_{XZ} \\ P_{ZX} & P_{ZZ} \end{bmatrix}$

where $P$ represents the covariance matrix.

To obtain $\hat{X}^{MMSE}$, we need the conditional probability density function of $X$ given $Z = z$. The conditional probability density function $p_{X\mid Z}(x \mid z)$ is given as:

$p_{X \mid Z}(x \mid z) = \frac{p_{XZ}(x, z)}{p_{Z}(z)} = \frac{p_{Y}(y)}{p_{Z}(z)}$
$= \frac{\sqrt{(2 \pi)^{p}\det{P_{ZZ}}}}{\sqrt{(2 \pi)^{n+p}\det{P_{YY}}}}\exp\left(-\frac{1}{2}\left[(y-\mu_{Y})^{T}P^{-1}_{YY}(y-\mu_{Y})-(z-\mu_{Z})^{T}P^{-1}_{ZZ}(z-\mu_{Z})\right]\right)$
$= \frac{1}{\sqrt{(2 \pi)^{n}\det{P_{XX\mid Z}}}} \exp\left(-\frac{1}{2}(x-\mathbb{E}[X \mid Z=z])^{T}P_{XX \mid Z}^{-1}(x-\mathbb{E}[X \mid Z=z])\right)$

where

  • $\mathbb{E}[X \mid Z=z]$ : $\mu_{X} + P_{XZ}P_{ZZ}^{-1}(z-\mu_{Z})$

  • $P_{XX \mid Z}$ : $P_{XX} - P_{XZ} P_{ZZ}^{-1} P_{ZX}$

  • $n$, $p$ : the dimensions of $X$ and $Z$, respectively

Thus, the MMSE estimation value $\hat{X}^{MMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ are:

$\hat{X}^{MMSE}(z) = \mu_{X} + P_{XZ}P_{ZZ}^{-1}(z-\mu_{Z})$
$P_{\tilde{X}\tilde{X}} = P_{XX \mid Z} = P_{XX}-P_{XZ}P_{ZZ}^{-1}P_{ZX}$
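
These two formulas translate directly into code. The sketch below is a minimal example; the means, covariances, and observed $z$ are hand-picked assumptions for illustration only.

```python
import numpy as np

# Minimal sketch of the joint Gaussian MMSE formulas above (illustrative values).
mu_X = np.array([1.0, 0.0])
mu_Z = np.array([2.0])
P_XX = np.array([[2.0, 0.3],
                 [0.3, 1.0]])
P_XZ = np.array([[0.5],
                 [0.2]])
P_ZZ = np.array([[1.5]])
z = np.array([2.7])                  # observed measurement

K = P_XZ @ np.linalg.inv(P_ZZ)       # gain P_XZ P_ZZ^{-1}
x_hat = mu_X + K @ (z - mu_Z)        # MMSE estimate
P_err = P_XX - K @ P_XZ.T            # P_XX - P_XZ P_ZZ^{-1} P_ZX

print(x_hat)
print(P_err)
```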





Joint Gaussian MMSE Estimator for Linear Measurements

Let an unknown random vector $X$ and a measurement vector $Z$ have a linear relationship as:

$Z = HX + V$

where $X$ and the measurement noise $V$ are given as Gaussian random vectors:

$X \sim N(\mu_{X}, P_{XX}),$
$V \sim N(0, R),$
$\mathbb{E}[(X-\mu_{X})V^{T}]=0$

and assume that the two random vectors are uncorrelated.

Since $X$ and $V$ are uncorrelated Gaussian random vectors, and $X$ and $Z$ have a linear relationship, $X$ and $Z$ have a joint Gaussian distribution. Thus, the estimate $\hat{X}^{MMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ of the random vector $X$ conditioned on the random vector $Z$ are:

$\hat{X}^{MMSE}(z) = \mu_{X}+P_{XX}H^{T}(HP_{XX}H^{T} + R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}} = (P_{XX}^{-1} + H^{T}R^{-1}H)^{-1}$

To prove this, we need to get $\mu_{Z}$ first.

$\mu_{Z} = \mathbb{E}[Z] = \mathbb{E}[HX + V] = H\mathbb{E}[X] = H\mu_{X}$

Second, get $P_{ZZ}$.

$P_{ZZ} = \mathbb{E}[(Z-\mu_{Z})(Z-\mu_{Z})^{T}]$
$= \mathbb{E}[(H(X-\mu_{X})+V)(H(X-\mu_{X})+V)^{T}]$
$= HP_{XX}H^{T} + \mathbb{E}[H(X-\mu_{X})V^{T}]+\mathbb{E}[V(X-\mu_{X})^{T}H^{T}]+ R$
$=HP_{XX}H^{T}+R$

Third, get the cross-covariance $P_{XZ}$.

$P_{XZ} = \mathbb{E}[(X-\mu_{X})(Z-\mu_{Z})^{T}]$
$= \mathbb{E}[(X-\mu_{X})(H(X-\mu_{X})+V)^{T}]$
$= P_{XX}H^{T} + \mathbb{E}[(X-\mu_{X})V^{T}]$
$= P_{XX}H^{T}$

Finally, to get $\hat{X}^{MMSE}(z)$ and $P_{\tilde{X}\tilde{X}}$, use the equations of joint Gaussian MMSE estimator above.

$\hat{X}^{MMSE}(z) = \mu_{X}+P_{XX}H^{T}(HP_{XX}H^{T} + R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}} = P_{XX\mid Z}$
$= P_{XX}-P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}HP_{XX}$
$= (P_{XX}^{-1} + H^{T}R^{-1}H)^{-1}$

where the last equality follows from the matrix inversion lemma.
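
A minimal numerical sketch of this linear-measurement case is given below. The matrices $P_{XX}$, $H$, $R$ and the observed $z$ are illustrative assumptions; the script also checks numerically that the two expressions for the error covariance coincide, as the matrix inversion lemma implies.

```python
import numpy as np

# Minimal sketch of the Gaussian MMSE update for Z = H X + V (illustrative values).
mu_X = np.array([0.0, 1.0])
P_XX = np.array([[1.0, 0.2],
                 [0.2, 2.0]])
H = np.array([[1.0, 0.0],
              [1.0, 1.0]])
R = 0.5 * np.eye(2)                  # measurement noise covariance
z = np.array([0.3, 1.4])             # observed measurement

S = H @ P_XX @ H.T + R               # H P_XX H^T + R
K = P_XX @ H.T @ np.linalg.inv(S)    # gain P_XX H^T (H P_XX H^T + R)^{-1}

x_hat = mu_X + K @ (z - H @ mu_X)
P_err_1 = P_XX - K @ H @ P_XX
P_err_2 = np.linalg.inv(np.linalg.inv(P_XX) + H.T @ np.linalg.inv(R) @ H)

print(x_hat)
print(np.allclose(P_err_1, P_err_2))  # True: the two covariance forms coincide
```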





Linear MMSE Estimator

When a measurement vector $z$ and an estimate $\hat{x}$ of an unknown random vector $X$ have a linear relationship given by $\hat{x}(z) = Az + b$, we refer to the estimator as a linear estimator.

If the measurement vector is not fixed as $Z=z$ and is instead given as a random vector, the linear estimator is expressed as $\hat{X}(Z) = AZ + b$.

The Linear Minimum Mean-Square Error (LMMSE) estimator is defined as an estimator that minimizes the following objective function.

$J = \mathbb{E}[(X-\hat{X}^{LMMSE})^{T}(X-\hat{X}^{LMMSE})]$
$= tr\mathbb{E}[(X-\hat{X}^{LMMSE})(X-\hat{X}^{LMMSE})^{T}]$
$= tr\mathbb{E}[(X - AZ -b)(X-AZ-b)^{T}]$
$= tr\mathbb{E}[(X-AZ-b-\mathbb{E}[X]+\mathbb{E}[X])(X-AZ-b-\mathbb{E}[X]+\mathbb{E}[X])^{T}]$
$= tr\{P_{XX}+A(P_{ZZ}+\mathbb{E}[Z](\mathbb{E}[Z])^{T})A^{T}+(\mathbb{E}[X]-b)(\mathbb{E}[X]-b)^{T}-2A\mathbb{E}[Z](\mathbb{E}[X]-b)^{T}-2AP_{ZX}\}$

where

  • $tr$ : trace (sum of diagonal components)

  • $A$ : a deterministic matrix, chosen to minimize $J$

  • $b$ : a deterministic vector, chosen to minimize $J$

The necessary conditions for minimizing $J$ are:

$\frac{\partial J}{\partial b} = -2(\mathbb{E}[X]-b)+2A\mathbb{E}[Z]=0$
$\frac{\partial J}{\partial A} = 2A(P_{ZZ}+\mathbb{E}[Z](\mathbb{E}[Z])^{T}) - 2P_{XZ}-2(\mathbb{E}[X]-b)(\mathbb{E}[Z])^{T}=0$

Solving the two equations above for $A$ and $b$ gives:

  • $A$ : $P_{XZ}P_{ZZ}^{-1}$

  • $b$ : $\mathbb{E}[X]-P_{XZ}P_{ZZ}^{-1}\mathbb{E}[Z]$

Then, the LMMSE estimator is given as:

$\hat{X}^{LMMSE}(Z) = \mathbb{E}[X] + P_{XZ}P_{ZZ}^{-1}(Z-\mathbb{E}[Z])$

If the measurement vector is given as $Z=z$, the LMMSE estimate is:

$\hat{X}^{LMMSE}(z) = \mathbb{E}[X] + P_{XZ}P_{ZZ}^{-1}(z-\mathbb{E}[Z])$

Additionally, $\mathbb{E}[\hat{X}^{LMMSE}(Z)]$ and $P_{\tilde{X}\tilde{X}}$ are:

$\mathbb{E}[\hat{X}^{LMMSE}(Z)] = \mathbb{E}[\mathbb{E}[X]] + P_{XZ}P_{ZZ}^{-1}(\mathbb{E}[Z]-\mathbb{E}[Z]) = \mathbb{E}[X]$
$P_{\tilde{X}\tilde{X}} = \mathbb{E}[(\tilde{X}-\mathbb{E}[\tilde{X}])(\tilde{X}-\mathbb{E}[\tilde{X}])^{T}]$
$=\mathbb{E}[\tilde{X}\tilde{X}^{T}] = \mathbb{E}[(X-\hat{X}^{LMMSE})(X-\hat{X}^{LMMSE})^{T}]$
$= P_{XX}-P_{XZ}P_{ZZ}^{-1}P_{ZX}$

Since $\mathbb{E}[\hat{X}^{LMMSE}(Z)] = \mathbb{E}[X]$, the LMMSE estimator is unbiased.
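
Because the LMMSE estimator uses only first and second moments, it applies even when $X$ and $Z$ are not jointly Gaussian. The Monte Carlo sketch below assumes, for illustration only, a scalar uniform $X$ with a nonlinear measurement $Z = X^{3} + V$, estimates the needed moments from samples, and forms $A$ and $b$ as derived above.

```python
import numpy as np

# Minimal Monte Carlo sketch of the LMMSE estimator built from sample moments.
# Illustrative assumption: X ~ U(-1, 1), Z = X^3 + V with V ~ N(0, 0.1^2),
# so X and Z are not jointly Gaussian, yet A = P_XZ P_ZZ^{-1} and
# b = E[X] - A E[Z] still give the best *linear* estimator.
rng = np.random.default_rng(1)
n = 500_000

X = rng.uniform(-1.0, 1.0, n)
V = rng.normal(0.0, 0.1, n)
Z = X**3 + V

m_X, m_Z = X.mean(), Z.mean()
P_XZ = np.mean((X - m_X) * (Z - m_Z))
P_ZZ = np.mean((Z - m_Z) ** 2)

A = P_XZ / P_ZZ
b = m_X - A * m_Z
X_hat = A * Z + b

print(np.mean(X_hat))                # ~E[X]: the LMMSE estimator is unbiased
print(np.mean((X - X_hat) ** 2))     # mean-square error of the linear estimator
```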





Linear MMSE Estimator for Linear Measurements

Let an unknown random vector $X$ and a measurement vector $Z$ be related by the linear equation $Z = HX + V$. Assume that $X$ and the measurement noise $V$ are random vectors with arbitrary probability distributions and are uncorrelated with each other as:

$X \sim (\mu_{X},P_{XX})$,
$V \sim (0, R)$,
$\mathbb{E}[(X-\mu_{X})V^{T}] = 0$

The estimate $\hat{X}^{LMMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ of the random vector $X$ conditioned on the random vector $Z = z$ are:

$\hat{X}^{LMMSE}(z) = \mu_X + P_{XX}H^{T}(HP_{XX}H^{T} + R)^{-1}(z - H\mu_X)$
$P_{\tilde{X}\tilde{X}} = (P_{XX}^{-1}+H^{T}R^{-1}H)^{-1}$

To prove this, we need to get $\mu_{Z}$ first.

$\mu_{Z}=\mathbb{E}[Z] = \mathbb{E}[HX+V] = H\mathbb{E}[X] = H\mu_{X}$

Second, get $P_{ZZ}$.

$P_{ZZ} = \mathbb{E}[(Z-\mu_{Z})(Z-\mu_{Z})^{T}]$
$=\mathbb{E}[(H(X-\mu_{X})+V)(H(X-\mu_{X})+V)^{T}]$
$= HP_{XX}H^{T} + \mathbb{E}[H(X-\mu_{X})V^{T}]+ \mathbb{E}[V(X-\mu_{X})^{T}H^{T}]+R$
$=HP_{XX}H^{T} + R$

Third, get the cross-covariance $P_{XZ}$.

$P_{XZ} = \mathbb{E}[(X-\mu_{X})(Z-\mu_{Z})^{T}]$
$= \mathbb{E}[(X-\mu_{X})(H(X-\mu_{X})+V)^{T}]$
$= P_{XX}H^{T} + \mathbb{E}[(X-\mu_{X})V^{T}]$
$=P_{XX}H^{T}$

Finally, to get $\hat{X}^{LMMSE}(z)$ and $P_{\tilde{X}\tilde{X}}$, use the equations of the LMMSE estimator given above.

$\hat{X}^{LMMSE}(z) = \mu_{X}+P_{XX}H^{T}(HP_{XX}H^{T} +R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}}=P_{XX}-P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}HP_{XX}$
$=(P_{XX}^{-1}+H^{T}R^{-1}H)^{-1}$
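
As a final check, the sketch below applies exactly these formulas to a non-Gaussian example (the distributions, $H$, and $R$ are illustrative assumptions) and compares the empirical error covariance with the closed-form $P_{\tilde{X}\tilde{X}}$; they agree because the LMMSE result depends only on means and covariances.

```python
import numpy as np

# Minimal Monte Carlo sketch: the LMMSE formulas for Z = H X + V hold for
# non-Gaussian X and V, since only first and second moments are used.
rng = np.random.default_rng(2)
n = 500_000

mu_X = np.array([0.5, -0.5])
# Non-Gaussian X: independent uniform components shifted to have mean mu_X.
X = mu_X + rng.uniform(-1.0, 1.0, size=(n, 2))
P_XX = np.diag([1.0 / 3.0, 1.0 / 3.0])         # var of U(-1, 1) is 1/3

H = np.array([[1.0, 2.0]])
# Non-Gaussian noise: zero-mean Laplace with variance 2 * 0.5^2 = 0.5.
V = rng.laplace(0.0, 0.5, size=(n, 1))
R = np.array([[0.5]])
Z = X @ H.T + V

S = H @ P_XX @ H.T + R
K = P_XX @ H.T @ np.linalg.inv(S)              # gain P_XX H^T (H P_XX H^T + R)^{-1}
X_hat = mu_X + (Z - mu_X @ H.T) @ K.T          # LMMSE estimate for every sample

err = X - X_hat
P_emp = err.T @ err / n                        # empirical error covariance
P_formula = np.linalg.inv(np.linalg.inv(P_XX) + H.T @ np.linalg.inv(R) @ H)

print(P_emp)
print(P_formula)                               # the two should be close
```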
This post is licensed under CC BY 4.0 by the author.