MMSE (Minimum Mean-Square Error) Estimator
When a set of measurement vectors is given as $Z_{k} = z_{k}$, the MMSE estimator is defined as the estimator $\hat{x} = g(z_{k})$ that minimizes the conditional mean-square estimation error:

$J = \mathbb{E}[(X-g(z_{k}))^{T}(X-g(z_{k})) \mid Z_{k} = z_{k}]$
$= \int_{-\infty}^{\infty}(x-g(z_{k}))^{T}(x-g(z_{k}))p_{X \mid Z_{k}}(x\mid z_{k})dx$
To simplify, expand the objective function $J$. Since $\hat{x} = g(z_{k})$ is a constant vector once $z_{k}$ is given,

$J = \mathbb{E}[X^{T}X-X^{T}\hat{x} - \hat{x}^{T}X + \hat{x}^{T}\hat{x} \mid Z_{k} = z_{k}] = \mathbb{E}[X^{T}X \mid Z_{k} = z_{k}] - 2\hat{x}^{T}\mathbb{E}[X \mid Z_{k}=z_{k}]+\hat{x}^{T}\hat{x}$
Because $J$ is a quadratic function of $\hat{x}$ that is convex (opens upward), the minimum is attained where $\frac{dJ}{d\hat{x}} = 0$.
Therefore, $\frac{dJ}{d\hat{x}} = -2(\mathbb{E}[X\mid Z_{k}=z_{k}] - \hat{x}) = 0$, which gives $\hat{x}^{MMSE}(z_{k}) = \mathbb{E}[X \mid Z_{k} = z_{k}]$. Note that $\frac{d(\hat{x}^{T}a)}{d\hat{x}} = \frac{d(a^{T}\hat{x})}{d\hat{x}} = a$ for any constant vector $a$.
Similarly, if the set of measurement vectors is a random vector $Z_{k}$ rather than a fixed realization, the MMSE estimator is given by $\hat{X}^{MMSE}(Z_{k}) = \mathbb{E}[X \mid Z_{k}]$, and its value is also a random vector.
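As a quick numerical illustration (a minimal sketch with an assumed scalar toy model, not part of the original derivation), the conditional mean can be checked by Monte Carlo to minimize the conditional mean-square error:

```python
import numpy as np

# Toy model (assumed): X ~ N(1, 4), Z = X + V with V ~ N(0, 1).
# For a fixed measurement Z = z, the conditional MSE
# J(x_hat) = E[(X - x_hat)^2 | Z = z] is a convex quadratic
# minimized at the conditional mean E[X | Z = z].
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(1.0, 2.0, n)          # unknown X
z = x + rng.normal(0.0, 1.0, n)      # measurement Z

# Approximate conditioning on Z = 2 with a narrow bin around z = 2.
zbin = np.abs(z - 2.0) < 0.05
cond_mean = x[zbin].mean()           # Monte Carlo estimate of E[X | Z = 2]

for x_hat in (cond_mean - 0.5, cond_mean, cond_mean + 0.5):
    j = np.mean((x[zbin] - x_hat) ** 2)   # conditional MSE at this x_hat
    print(f"x_hat = {x_hat:+.3f}  J = {j:.4f}")
# J is smallest at x_hat = cond_mean, i.e. at E[X | Z = z].
```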
Four kinds of MMSE estimators are developed in the sections below: the joint Gaussian MMSE estimator, the joint Gaussian MMSE estimator for linear measurements, the linear MMSE (LMMSE) estimator, and the LMMSE estimator for linear measurements.
The Performance of the MMSE Estimator
Mean of the MMSE Estimator
The estimation error $\tilde{X}$ is defined as follows:

$\tilde{X} = X - \hat{X}^{MMSE}(Z_{k})$
Then, the mean of $\tilde{X}$ is:

$\mathbb{E}[\tilde{X}] = \mathbb{E}[X - \hat{X}^{MMSE}(Z_{k})]$
$= \mathbb{E}[X] - \mathbb{E}[\mathbb{E}[X \mid Z_{k}]]$
$= \mathbb{E}[X] - \mathbb{E}[X] = 0$

where the second line uses the law of iterated expectations.
Because the mean of $\tilde{X}$ is 0, the expected value of the MMSE estimate equals the expected value of the unknown random vector $X$. That is, the MMSE estimator is an unbiased estimator.
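A minimal sketch of this unbiasedness, under the same assumed Gaussian toy model as above: the sample mean of the MMSE estimate $\mathbb{E}[X \mid Z]$ matches the sample mean of $X$ itself.

```python
import numpy as np

# Toy model (assumed): X ~ N(1, 4), Z = X + V with V ~ N(0, 1).
rng = np.random.default_rng(5)
n = 1_000_000
x = rng.normal(1.0, 2.0, n)
z = x + rng.normal(0.0, 1.0, n)

# Closed-form conditional mean for this Gaussian model:
# E[X | Z] = mu_X + (P_XZ / P_ZZ) (Z - mu_Z) with P_XZ = 4, P_ZZ = 5.
x_hat = 1.0 + (4.0 / 5.0) * (z - 1.0)

print(x.mean(), x_hat.mean())   # both ~1.0: E[X_hat] = E[X]
```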
Covariance of The Estimation Error
The covariance of the estimation error $\tilde{X}$ is given as the mean of the conditional covariance of $X$ conditioned on the set of measurement vectors $Z_{k}$, as follows:
$P_{\tilde{X}\tilde{X}} = \mathbb{E}[(\tilde{X}-\mathbb{E}[\tilde{X}])(\tilde{X}-\mathbb{E}[\tilde{X}])^{T}] = \mathbb{E}[\tilde{X}\tilde{X}^{T}]$
$=\mathbb{E}[\mathbb{E}[\tilde{X}\tilde{X}^{T} \mid Z_{k}]]$
$=\mathbb{E}[P_{XX\mid Z_{k}}]$
If the measurement vector is given as $Z_{k} = z_{k}$, the covariance of the estimation error $\tilde{X}$ is as follows:
$\mathbb{E}[\tilde{X}\tilde{X}^{T} \mid Z_{k}=z_{k}] = \mathbb{E}[(X-\mathbb{E}[X \mid Z_{k}=z_{k}])(X-\mathbb{E}[X \mid Z_{k}=z_{k}])^{T} \mid Z_{k}=z_{k}]$
$=P_{XX \mid Z_{k}}$
Also, the estimation error is always orthogonal to any function $g(Z)$ of the measurement vector:

$\mathbb{E}[(X - \hat{X}^{MMSE})g^{T}(Z)] = 0$

where $Z$ is a measurement vector.
To prove this, use the law of iterated expectations together with the fact that $\hat{X}^{MMSE} = \mathbb{E}[X \mid Z]$ and $g(Z)$ are fixed once $Z$ is given:

$\mathbb{E}[(X - \hat{X}^{MMSE})g^{T}(Z)] = \mathbb{E}[Xg^{T}(Z)]-\mathbb{E}[\mathbb{E}[X \mid Z]g^{T}(Z)]$
$=\mathbb{E}[Xg^{T}(Z)]-\mathbb{E}[\mathbb{E}[Xg^{T}(Z) \mid Z]]$
$= \mathbb{E}[Xg^{T}(Z)] - \mathbb{E}[Xg^{T}(Z)] = 0$
In particular, letting $g(Z) = Z$ gives $\mathbb{E}[(X-\hat{X}^{MMSE})Z^{T}] = 0$.
According to the equation above, $\hat{X}_{i}^{MMSE}$ can be interpreted as the projection of $X_{i}$ onto the span of linear combinations of the measurement variables, and the estimation error is orthogonal to this span.
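A Monte Carlo sketch of this orthogonality property, under an assumed jointly Gaussian scalar model where $\mathbb{E}[X \mid Z]$ has a closed form:

```python
import numpy as np

# Toy model (assumed): X ~ N(0, 1), Z = X + V with V ~ N(0, 0.25).
rng = np.random.default_rng(1)
n = 1_000_000
x = rng.normal(0.0, 1.0, n)
z = x + rng.normal(0.0, 0.5, n)

# For this model E[X | Z] = (P_XZ / P_ZZ) Z = (1 / 1.25) Z (zero means).
x_hat = (1.0 / 1.25) * z
err = x - x_hat                      # estimation error X - X_hat

# The error is (approximately) orthogonal to any function g(Z).
for g in (z, z**2, np.sin(z)):
    print(np.mean(err * g))          # each value is close to 0
```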
Joint Gaussian MMSE Estimator
Let’s derive the MMSE estimate $\hat{X}^{MMSE}$ of the random vector $X$ when the two random vectors $X$ and $Z$ have a joint Gaussian distribution.
If the two random vectors are jointly Gaussian, each random vector itself follows a Gaussian distribution. Let $X$ and $Z$ have the following distributions:

$X \sim N(\mu_{X}, P_{XX}), \quad Z \sim N(\mu_{Z}, P_{ZZ})$
Also, let $Y$ stack the two random vectors, and assume their joint probability density function is the Gaussian density of

$Y = \begin{bmatrix} X \\ Z \end{bmatrix} \sim N(\mu_{Y}, P_{YY})$

Then, $\mu_{Y}$ and $P_{YY}$ are:

$\mu_{Y} = \begin{bmatrix} \mu_{X} \\ \mu_{Z} \end{bmatrix}, \quad P_{YY} = \begin{bmatrix} P_{XX} & P_{XZ} \\ P_{ZX} & P_{ZZ} \end{bmatrix}$

where $P$ represents the covariance matrix.
To obtain $\hat{X}^{MMSE}$, we need the conditional probability density function of $X$ given $Z = z$. The conditional probability density function $p_{X\mid Z}(x \mid z)$ is given as:
$p_{X\mid Z}(x \mid z) = \frac{p_{Y}(y)}{p_{Z}(z)} = \frac{\sqrt{(2 \pi)^{p}\det{P_{ZZ}}}}{\sqrt{(2 \pi)^{n+p}\det{P_{YY}}}}\exp\left(-\frac{1}{2}\left[(y-\mu_{Y})^{T}P^{-1}_{YY}(y-\mu_{Y})-(z-\mu_{Z})^{T}P^{-1}_{ZZ}(z-\mu_{Z})\right]\right)$
$= \frac{1}{\sqrt{(2 \pi)^{n}\det{P_{XX\mid Z}}}} \exp\left(-\frac{1}{2}(x-\mathbb{E}[X \mid Z=z])^{T}P_{XX \mid Z}^{-1}(x-\mathbb{E}[X \mid Z=z])\right)$
where
$\mathbb{E}[X \mid Z=z]$ : $\mu_{X} + P_{XZ}P_{ZZ}^{-1}(z-\mu_{Z})$
$P_{XX \mid Z}$ : $P_{XX} - P_{XZ} P_{ZZ}^{-1} P_{ZX}$
Thus, the MMSE estimate $\hat{X}^{MMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ are:

$\hat{X}^{MMSE}(z) = \mathbb{E}[X \mid Z=z] = \mu_{X} + P_{XZ}P_{ZZ}^{-1}(z-\mu_{Z})$
$P_{\tilde{X}\tilde{X}} = P_{XX \mid Z} = P_{XX}-P_{XZ}P_{ZZ}^{-1}P_{ZX}$
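These formulas are straightforward to evaluate numerically. Below is a minimal sketch with assumed (made-up) block moments and an assumed measurement $z$:

```python
import numpy as np

# Assumed block moments of Y = [X; Z] (illustrative values only).
mu_x = np.array([1.0, 0.0])
mu_z = np.array([2.0])
P_xx = np.array([[2.0, 0.3],
                 [0.3, 1.0]])
P_xz = np.array([[0.5],
                 [0.2]])
P_zz = np.array([[1.5]])

z = np.array([2.5])                  # an observed measurement

K = P_xz @ np.linalg.inv(P_zz)       # gain P_XZ P_ZZ^{-1}
x_hat = mu_x + K @ (z - mu_z)        # MMSE estimate E[X | Z = z]
P_err = P_xx - K @ P_xz.T            # P_XX - P_XZ P_ZZ^{-1} P_ZX

print(x_hat)   # MMSE estimate
print(P_err)   # estimation error covariance
```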
Joint Gaussian MMSE Estimator for Linear Measurements
Let an unknown random vector $X$ and a measurement vector $Z$ have a linear relationship:

$Z = HX + V$
where $X$ and the measurement noise $V$ are Gaussian random vectors,

$X \sim N(\mu_{X}, P_{XX}), \quad V \sim N(0, R),$

and the two random vectors are assumed uncorrelated:

$\mathbb{E}[(X-\mu_{X})V^{T}]=0$
Since $X$ and $V$ are uncorrelated Gaussian random vectors, and $X$ and $Z$ have a linear relationship, $X$ and $Z$ have a joint Gaussian distribution. Thus, the estimate $\hat{X}^{MMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ of the random vector $X$ conditioned on the random vector $Z$ are:

$\hat{X}^{MMSE}(z) = \mu_{X} + P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}} = (P_{XX}^{-1} + H^{T}R^{-1}H)^{-1}$
To prove this, we first need $\mu_{Z}$:

$\mu_{Z} = \mathbb{E}[HX+V] = H\mu_{X}$
Second, get $P_{ZZ}$.

$P_{ZZ} = \mathbb{E}[(Z-\mu_{Z})(Z-\mu_{Z})^{T}] = \mathbb{E}[(H(X-\mu_{X})+V)(H(X-\mu_{X})+V)^{T}]$
$= HP_{XX}H^{T} + \mathbb{E}[H(X-\mu_{X})V^{T}]+\mathbb{E}[V(X-\mu_{X})^{T}H^{T}]+ R$
$=HP_{XX}H^{T}+R$
Third, get the cross-covariance $P_{XZ}$.
$P_{XZ} = \mathbb{E}[(X-\mu_{X})(Z-\mu_{Z})^{T}] = \mathbb{E}[(X-\mu_{X})(H(X-\mu_{X})+V)^{T}]$
$= P_{XX}H^{T} + \mathbb{E}[(X-\mu_{X})V^{T}]$
$= P_{XX}H^{T}$
Finally, to get $\hat{X}^{MMSE}(z)$ and $P_{\tilde{X}\tilde{X}}$, substitute these moments into the joint Gaussian MMSE equations above:

$\hat{X}^{MMSE}(z) = \mu_{X} + P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}} = P_{XX\mid Z}$
$= P_{XX}-P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}HP_{XX}$
$=(P_{XX}^{-1} + H^{T}R^{-1}H)^{-1}$

where the last equality follows from the matrix inversion lemma.
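The last step is easy to check numerically. The sketch below, with randomly generated $H$, $P_{XX}$, and $R$ (assumed values, not from the text), verifies that the two covariance expressions agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 2                          # dimensions of X and Z (assumed)
H = rng.normal(size=(p, n))

A = rng.normal(size=(n, n))
P_xx = A @ A.T + n * np.eye(n)       # a random SPD prior covariance
B = rng.normal(size=(p, p))
R = B @ B.T + p * np.eye(p)          # a random SPD noise covariance

S = H @ P_xx @ H.T + R               # innovation covariance
form1 = P_xx - P_xx @ H.T @ np.linalg.solve(S, H @ P_xx)
form2 = np.linalg.inv(np.linalg.inv(P_xx) + H.T @ np.linalg.solve(R, H))

print(np.allclose(form1, form2))     # True, by the matrix inversion lemma
```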
Linear MMSE Estimator
When a measurement vector $z$ and an estimate $\hat{x}$ of an unknown random vector $X$ have a linear relationship given by $\hat{x}(z) = Az + b$, we refer to the estimator as a linear estimator.
If the measurement vector is not fixed as $Z=z$ and is instead given as a random vector, the linear estimator is expressed as $\hat{X}(Z) = AZ + b$.
The Linear Minimum Mean-Square Error (LMMSE) estimator is defined as an estimator that minimizes the following objective function.
$J = \mathbb{E}[(X-\hat{X}^{LMMSE})^{T}(X-\hat{X}^{LMMSE})] = tr\mathbb{E}[(X-\hat{X}^{LMMSE})(X-\hat{X}^{LMMSE})^{T}]$
$= tr\mathbb{E}[(X - AZ -b)(X-AZ-b)^{T}]$
$= tr\mathbb{E}[(X-AZ-b-\mathbb{E}[X]+\mathbb{E}[X])(X-AZ-b-\mathbb{E}[X]+\mathbb{E}[X])^{T}]$
$= tr\{P_{XX}+A(P_{ZZ}+\mathbb{E}[Z](\mathbb{E}[Z])^{T})A^{T}+(\mathbb{E}[X]-b)(\mathbb{E}[X]-b)^{T}-2A\mathbb{E}[Z](\mathbb{E}[X]-b)^{T}-2AP_{ZX}\}$
where
$tr$ : trace (sum of diagonal components)
$A$ : a deterministic matrix, to be chosen to minimize $J$
$b$ : a deterministic vector, to be chosen to minimize $J$
The necessary conditions for minimizing $J$ are:

$\frac{\partial J}{\partial A} = 2A(P_{ZZ}+\mathbb{E}[Z](\mathbb{E}[Z])^{T}) - 2P_{XZ}-2(\mathbb{E}[X]-b)(\mathbb{E}[Z])^{T}=0$
$\frac{\partial J}{\partial b} = -2(\mathbb{E}[X]-b)+2A\mathbb{E}[Z]=0$
Solving these two equations for $A$ and $b$ gives:

$A = P_{XZ}P_{ZZ}^{-1}$
$b = \mathbb{E}[X]-P_{XZ}P_{ZZ}^{-1}\mathbb{E}[Z]$
Then, the LMMSE estimator is given as:

$\hat{X}^{LMMSE}(Z) = AZ + b = \mathbb{E}[X] + P_{XZ}P_{ZZ}^{-1}(Z-\mathbb{E}[Z])$
If the measurement vector is fixed as $Z=z$, the LMMSE estimate is given as:

$\hat{x}^{LMMSE}(z) = \mathbb{E}[X] + P_{XZ}P_{ZZ}^{-1}(z-\mathbb{E}[Z])$
Additionally, $\mathbb{E}[\hat{X}^{LMMSE}(Z)]$ and $P_{\tilde{X}\tilde{X}}$ are:

$\mathbb{E}[\hat{X}^{LMMSE}(Z)] = \mathbb{E}[X] + P_{XZ}P_{ZZ}^{-1}(\mathbb{E}[Z]-\mathbb{E}[Z]) = \mathbb{E}[X]$
$P_{\tilde{X}\tilde{X}} = \mathbb{E}[(\tilde{X}-\mathbb{E}[\tilde{X}])(\tilde{X}-\mathbb{E}[\tilde{X}])^{T}]$
$=\mathbb{E}[\tilde{X}\tilde{X}^{T}] = \mathbb{E}[(X-\hat{X}^{LMMSE})(X-\hat{X}^{LMMSE})^{T}]$
$= P_{XX}-P_{XZ}P_{ZZ}^{-1}P_{ZX}$
Since $\mathbb{E}[\hat{X}^{LMMSE}(Z)] = \mathbb{E}[X]$, the LMMSE estimator is unbiased.
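Since only first- and second-order moments enter $A$ and $b$, the LMMSE estimator can be built from sample moments even when $(X, Z)$ is far from Gaussian. A minimal sketch under an assumed non-Gaussian toy model:

```python
import numpy as np

# Toy model (assumed): non-Gaussian X with Z linearly related to X.
rng = np.random.default_rng(3)
n = 1_000_000
x = rng.exponential(1.0, n)              # X is deliberately non-Gaussian
z = 2.0 * x + rng.uniform(-1, 1, n)      # Z correlated with X

P_xz = np.cov(x, z)[0, 1]
P_zz = np.var(z)
A = P_xz / P_zz                          # A = P_XZ P_ZZ^{-1}
b = x.mean() - A * z.mean()              # b = E[X] - A E[Z]

x_hat = A * z + b                        # LMMSE estimate
err = x - x_hat
print(err.mean())                        # ~0: the LMMSE estimator is unbiased
print(np.var(x) - P_xz**2 / P_zz)        # P_XX - P_XZ P_ZZ^{-1} P_ZX
print(err.var())                         # matches the line above
```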
Linear MMSE Estimator for Linear Measurements
Let an unknown random vector $X$ and a measurement vector $Z$ be related by the linear equation $Z = HX + V$. Assume that $X$ and the measurement noise $V$ are random vectors with arbitrary probability distributions and are uncorrelated with each other as:
$X \sim (\mu_{X}, P_{XX}), \quad V \sim (0, R),$
$\mathbb{E}[(X-\mu_{X})V^{T}] = 0$
The estimate $\hat{X}^{LMMSE}$ and the estimation error covariance $P_{\tilde{X}\tilde{X}}$ of the random vector $X$ given the measurement $Z = z$ are:

$\hat{X}^{LMMSE}(z) = \mu_{X} + P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}} = (P_{XX}^{-1}+H^{T}R^{-1}H)^{-1}$
To prove this, we first need $\mu_{Z}$:

$\mu_{Z} = \mathbb{E}[HX+V] = H\mu_{X}$
Second, get $P_{ZZ}$.

$P_{ZZ} = \mathbb{E}[(Z-\mu_{Z})(Z-\mu_{Z})^{T}] = \mathbb{E}[(H(X-\mu_{X})+V)(H(X-\mu_{X})+V)^{T}]$
$= HP_{XX}H^{T} + \mathbb{E}[H(X-\mu_{X})V^{T}]+ \mathbb{E}[V(X-\mu_{X})^{T}H^{T}]+R$
$=HP_{XX}H^{T} + R$
Third, get the cross-covariance $P_{XZ}$.
$P_{XZ} = \mathbb{E}[(X-\mu_{X})(Z-\mu_{Z})^{T}] = \mathbb{E}[(X-\mu_{X})(H(X-\mu_{X})+V)^{T}]$
$= P_{XX}H^{T} + \mathbb{E}[(X-\mu_{X})V^{T}]$
$=P_{XX}H^{T}$
Finally, to get $\hat{X}^{LMMSE}(z)$ and $P_{\tilde{X}\tilde{X}}$, substitute these moments into the LMMSE equations above:

$\hat{X}^{LMMSE}(z) = \mu_{X} + P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}(z-H\mu_{X})$
$P_{\tilde{X}\tilde{X}}=P_{XX}-P_{XX}H^{T}(HP_{XX}H^{T}+R)^{-1}HP_{XX}$
$=(P_{XX}^{-1}+H^{T}R^{-1}H)^{-1}$

where, as before, the last equality follows from the matrix inversion lemma. Note that this coincides with the joint Gaussian MMSE estimator for linear measurements, even though no Gaussian assumption was made here.
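A Monte Carlo sketch of this result, with assumed non-Gaussian distributions for $X$ and $V$ chosen to match the stated means and covariances, checks that the empirical error covariance approaches $(P_{XX}^{-1}+H^{T}R^{-1}H)^{-1}$:

```python
import numpy as np

# Assumed setup: Z = H X + V with non-Gaussian, uncorrelated X and V.
rng = np.random.default_rng(4)
n_samp = 500_000
H = np.array([[1.0, 0.0],
              [1.0, 1.0]])
mu_x = np.array([1.0, -1.0])
P_xx = np.diag([1.0, 2.0])
R = np.diag([0.5, 0.5])

# Non-Gaussian draws matching the stated means and covariances:
# Laplace(0, b) has variance 2 b^2; Uniform(-a, a) has variance a^2 / 3.
x = mu_x + rng.laplace(0.0, np.sqrt(np.diag(P_xx) / 2), (n_samp, 2))
v = rng.uniform(-np.sqrt(1.5), np.sqrt(1.5), (n_samp, 2))
z = x @ H.T + v

S = H @ P_xx @ H.T + R                   # innovation covariance
K = P_xx @ H.T @ np.linalg.inv(S)        # LMMSE gain
x_hat = mu_x + (z - mu_x @ H.T) @ K.T    # row-wise mu_x + K (z - H mu_x)
err = x - x_hat

print(np.cov(err.T))                     # Monte Carlo error covariance
print(np.linalg.inv(np.linalg.inv(P_xx) + H.T @ np.linalg.inv(R) @ H))
```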