Statistical Inverse Theory
From Theory of Measurements Wiki
Basic Elements
The basic elements of statistical inverse theory are:
- $x$: the unknown
- $m$: the measurement
- $D_{pr}(x)$: the a priori density of the unknown
- $D(m \mid x)$: the conditional density of the measurement, given that the unknown is known. This function completely describes the measurement process.
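Inverse solution: The inverse solution is the conditional density $D(x \mid m)$ of the unknown when the measurement is known. This is simply given by
$$D(x \mid m) = \frac{D(m \mid x)\, D_{pr}(x)}{\int D(m \mid x')\, D_{pr}(x')\, dx'} \propto D(m \mid x)\, D_{pr}(x),$$
where $D(m \mid x)$ is the conditional density of the measurement, $D_{pr}(x)$ is the a priori density of the unknown, and $m$ is fixed.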
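For a scalar unknown, the inverse solution can be evaluated directly on a grid. The following is a minimal sketch, not a general-purpose implementation: the function name `grid_posterior`, the flat prior on [-5, 5], the measurement model (unknown plus Gaussian noise of width `sigma`) and the measured value 1.2 are all assumptions chosen for illustration.

```python
import numpy as np

def grid_posterior(x_grid, prior, likelihood, m):
    """Normalized posterior density on a uniform 1-D grid for a fixed measurement m."""
    dx = x_grid[1] - x_grid[0]
    unnorm = likelihood(m, x_grid) * prior(x_grid)   # likelihood times prior
    return unnorm / (unnorm.sum() * dx)              # divide by the integral over x

# Hypothetical setup: flat prior on [-5, 5], measurement = unknown + Gaussian noise
sigma = 0.5
x_grid = np.linspace(-5.0, 5.0, 2001)
prior = lambda x: np.ones_like(x) / 10.0
# The constant factor of the Gaussian is omitted; it cancels in the normalization.
likelihood = lambda m, x: np.exp(-0.5 * ((m - x) / sigma) ** 2)

post = grid_posterior(x_grid, prior, likelihood, m=1.2)
x_map = x_grid[np.argmax(post)]                      # point of maximum posterior density
```

Because the measurement is held fixed, the denominator is only a normalization constant, which is why unnormalized likelihoods are enough here.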
Non-linear example with Gaussian errors
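Let us suppose $m = f(x) + \varepsilon$, where $\varepsilon$ is a Gaussian error term with covariance $\Sigma$, independent of $x$:
$$D(m \mid x) \propto \exp\left(-\tfrac{1}{2}\,(m - f(x))^T\, \Sigma^{-1}\, (m - f(x))\right). \qquad (1)$$
Suppose now that $D_{pr}(x) =$ constant and that the covariance of the measurement errors is diagonal: $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$. The a posteriori density is then simply given by
$$D(x \mid m) \propto \exp\left(-\tfrac{1}{2} \sum_{i=1}^{N} \frac{(m_i - f_i(x))^2}{\sigma_i^2}\right),$$
with peak at the ordinary weighted least-squares solution.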
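The link between the a posteriori density and weighted least squares can be sketched numerically. The example below assumes a hypothetical one-parameter theory (an exponential decay with unknown rate), simulated data, and a brute-force grid search in place of a proper optimizer. None of these choices come from the text; they only illustrate that the posterior peak and the weighted least-squares minimum coincide.

```python
import numpy as np

def f(x, t):
    """Hypothetical non-linear theory: exponential decay with rate x."""
    return np.exp(-x * t)

rng = np.random.default_rng(0)
t = np.linspace(0.1, 4.0, 20)
sigma = np.linspace(0.01, 0.05, t.size)        # per-point error standard deviations
x_true = 0.7
m = f(x_true, t) + sigma * rng.standard_normal(t.size)   # simulated measurement

x_grid = np.linspace(0.0, 2.0, 2001)
# Weighted sum of squares for every candidate x; the weights 1/sigma_i^2
# come straight from the Gaussian error density.
resid = m[None, :] - f(x_grid[:, None], t[None, :])
cost = np.sum((resid / sigma[None, :]) ** 2, axis=1)

log_post = -0.5 * cost                         # log posterior up to a constant (flat prior)
x_hat = x_grid[np.argmax(log_post)]            # posterior peak = weighted least-squares solution
```

For more than one or two unknowns a grid search becomes impractical, and an iterative least-squares solver would normally be used instead.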
Important Notes
- Because the a priori density is constant, the point of maximum a posteriori density is also the point of maximum likelihood, so the least-squares solution is often called the maximum likelihood estimate.
- Note that we get the weights automatically: they simply appear when we calculate the formula for the a posteriori distribution.
- It is not shown here, but it can be shown in many ways that the maximum likelihood estimator (with correct weights) is optimal. For example, in the linear Gaussian case no other unbiased estimator has a smaller variance.
- What is said above is true regardless of what use is made of the analysis results afterwards. In particular, the optimal weights do not depend on whether one is interested in component 1 of the unknown or component 2 of the unknown.
- If the measurement errors correlate (the error covariance $\Sigma$ is non-diagonal), the quadratic form $(m - f(x))^T \Sigma^{-1} (m - f(x))$ in (1) cannot be written as a weighted sum of squares with any choice of weights. The conclusion is: with a non-diagonal error covariance, the optimal least-squares solution is obtained by minimizing the matrix formula in (1) instead of any weighted sum of squared differences.
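The last note can be illustrated with a two-point example. This is a sketch under assumed numbers: the covariance matrix, the residual vector and the helper name `quad_form` are hypothetical. It compares the full quadratic form with the weighted sum that ignores the off-diagonal terms.

```python
import numpy as np

def quad_form(r, cov):
    """r^T cov^{-1} r, computed via a Cholesky factor instead of an explicit inverse."""
    L = np.linalg.cholesky(cov)      # cov = L L^T
    w = np.linalg.solve(L, r)        # whitened residual: w has unit covariance
    return float(w @ w)

# Hypothetical measurement errors with strong positive correlation
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
r = np.array([1.0, -1.0])            # residual m - f(x) for some candidate x

full = quad_form(r, cov)                          # respects the correlation
diag_only = float(np.sum(r**2 / np.diag(cov)))    # treats the errors as independent
```

Here the full quadratic form evaluates to 20 while the diagonal-only sum gives 2: residuals of opposite sign are very unlikely when the errors are positively correlated, and only the matrix formula penalizes them accordingly.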
Linear example
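Let us suppose that instead of $m = f(x) + \varepsilon$, we have $m = A x + \varepsilon$, where $A$ is a matrix and $\varepsilon$ is a Gaussian error term with covariance $\Sigma$. In addition, let us suppose that the a priori density of the unknown is given by
$$D_{pr}(x) \propto \exp\left(-\tfrac{1}{2}\,(x - x_0)^T\, \Sigma_{pr}^{-1}\, (x - x_0)\right).$$
The a posteriori density is then given by
$$D(x \mid m) \propto \exp\left(-\tfrac{1}{2}\,(x - \hat{x})^T\, \Sigma_p^{-1}\, (x - \hat{x})\right),$$
where we have an analytical solution for the a posteriori covariance $\Sigma_p$ and centerpoint $\hat{x}$:
$$\Sigma_p = \left(A^T \Sigma^{-1} A + \Sigma_{pr}^{-1}\right)^{-1}, \qquad \hat{x} = \Sigma_p \left(A^T \Sigma^{-1} m + \Sigma_{pr}^{-1} x_0\right).$$
If we suppose the prior density is constant, this is simply
$$\Sigma_p = \left(A^T \Sigma^{-1} A\right)^{-1}, \qquad \hat{x} = \Sigma_p\, A^T \Sigma^{-1} m,$$
and on the other hand, if we suppose we have many independent (vector-valued) measurements
$$m_i = A_i x + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Sigma_i), \qquad i = 1, \ldots, n,$$
we have the general solution
$$\Sigma_p = \left(\Sigma_{pr}^{-1} + \sum_{i=1}^{n} A_i^T \Sigma_i^{-1} A_i\right)^{-1}, \qquad \hat{x} = \Sigma_p \left(\Sigma_{pr}^{-1} x_0 + \sum_{i=1}^{n} A_i^T \Sigma_i^{-1} m_i\right).$$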
Note: Using these formulas directly is not the most efficient way to solve linear inverse problems. Purpose-built direct numerical methods such as FLIPS are much more efficient.
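For small problems where efficiency is not a concern, the linear Gaussian solution can be evaluated directly. The sketch below assumes a Gaussian prior with mean `x_prior` and covariance `cov_prior` (or a constant prior when these are omitted). The function name `linear_posterior` and the random test matrices are hypothetical. It also checks one consequence of the formulas: using two independent measurements one after the other gives the same result as treating them jointly.

```python
import numpy as np

def linear_posterior(A, m, cov_err, x_prior=None, cov_prior=None):
    """Posterior covariance and center for m = A x + eps, eps ~ N(0, cov_err).

    With cov_prior=None the prior density is taken to be constant.
    """
    W = np.linalg.inv(cov_err)
    Q = A.T @ W @ A                  # information contributed by the measurement
    b = A.T @ W @ m
    if cov_prior is not None:
        P = np.linalg.inv(cov_prior)
        Q = Q + P                    # add the prior information
        b = b + P @ x_prior
    cov_post = np.linalg.inv(Q)
    return cov_post, cov_post @ b

rng = np.random.default_rng(1)
A1, A2 = rng.standard_normal((4, 2)), rng.standard_normal((3, 2))
m1, m2 = rng.standard_normal(4), rng.standard_normal(3)
S1, S2 = 0.5 * np.eye(4), 2.0 * np.eye(3)

# Sequential use: the posterior from measurement 1 is the prior for measurement 2
c1, x1 = linear_posterior(A1, m1, S1)
c_seq, x_seq = linear_posterior(A2, m2, S2, x_prior=x1, cov_prior=c1)

# Joint use: stack both measurements with a block-diagonal error covariance
A = np.vstack([A1, A2])
m = np.concatenate([m1, m2])
S = np.block([[S1, np.zeros((4, 3))], [np.zeros((3, 4)), S2]])
c_joint, x_joint = linear_posterior(A, m, S)
```

Explicit inverses are used only to keep the correspondence with the formulas visible; they are not a recommendation for larger problems.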
Further study of finite-dimensional a posteriori distributions
It is easy to apply the matrix formulae to finite-dimensional situations with only a few parameters. To do that, one evaluates the matrix formulae by computer and then finds the widths of the a posteriori distributions in different situations. Let us suppose that we have an a posteriori covariance matrix $\Sigma$ for the vector of the unknown quantities, with marginal variances $\sigma_i^2 = \Sigma_{ii}$ and correlation coefficients $\rho_{1i} = \Sigma_{1i} / (\sigma_1 \sigma_i)$. Suppose we then obtain additional a priori information about the first component of the unknowns with some accuracy, say $\sigma_a$. By the formulas for the finite-dimensional linear and Gaussian case, the new a posteriori covariance matrix is simply
$$\Sigma' = \left(\Sigma^{-1} + \mathrm{diag}(\sigma_a^{-2}, 0, \ldots, 0)\right)^{-1}.$$
It is not necessary to invert the matrix numerically, because we can derive the expressions:
$$\frac{1}{\sigma_1'^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_a^2}$$
and
$$\sigma_i'^2 = \sigma_i^2\, \frac{\alpha_i^2 \sigma_1^2 + \sigma_a^2}{\sigma_1^2 + \sigma_a^2}, \qquad i \ge 2,$$
where $\alpha_i = \sqrt{1 - \rho_{1i}^2}$ depends on the correlation coefficients of the original a posteriori covariance matrix.
Here, the first formula is pretty obvious. The effects on the other parameters are less obvious. Firstly, if the additional a priori information is infinitely accurate ($\sigma_a \to 0$), the widths of the final (marginal) a posteriori distributions for the rest of the parameters are reduced from $\sigma_i$ to $\alpha_i \sigma_i$, where $\alpha_i$ depends on the correlation coefficient between the parameter with additional a priori information and the parameter under study. If the additional a priori information is broad ($\sigma_a \gg \sigma_1$), it does not change the width of the a posteriori distributions for any parameters.
The most interesting results come for the intermediate cases: If $\alpha_i \sigma_1 \ll \sigma_a \ll \sigma_1$, we get
$$\frac{\sigma_i'}{\sigma_i} \approx \frac{\sigma_a}{\sigma_1},$$
that is, every parameter improves by the ratio of the width of the additional a priori information of the first parameter and the width of its original (marginal) a posteriori distribution, with the exceptions that the a posteriori distribution:
- never becomes narrower than that resulting from infinitely good additional a priori and
- never becomes broader than it originally was.
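The effect of additional a priori information on the first parameter can be checked numerically. The sketch below is only an illustration: the 3 x 3 covariance matrix and the prior width `s` are made-up values, and the update is written with the Sherman-Morrison identity, which is one standard way to avoid the matrix inversion (the text does not say which derivation was used). The closed-form variances are expressed through the squared correlation coefficients with the first parameter.

```python
import numpy as np

def update_first_component(cov, s):
    """Covariance after adding independent prior information of width s on component 0.

    Uses the Sherman-Morrison identity, so no matrix inversion is needed.
    """
    c1 = cov[:, 0]
    return cov - np.outer(c1, c1) / (s**2 + cov[0, 0])

# Hypothetical a posteriori covariance of three parameters (positive definite)
cov = np.array([[4.0, 1.8, 0.4],
                [1.8, 1.0, 0.1],
                [0.4, 0.1, 2.0]])
s = 0.5                                          # width of the additional prior information

new_cov = update_first_component(cov, s)

# Direct evaluation with explicit inverses, for comparison only
extra = np.zeros_like(cov)
extra[0, 0] = 1.0 / s**2
direct = np.linalg.inv(np.linalg.inv(cov) + extra)

# Closed-form variances in terms of the original variances and correlations
sig2 = np.diag(cov)
rho2 = cov[0, :]**2 / (sig2[0] * sig2)           # squared correlation with component 0
closed = sig2 * (1.0 - rho2 * sig2[0] / (sig2[0] + s**2))
lower = sig2 * (1.0 - rho2)                      # limit of infinitely accurate prior information
```

The new variances always lie between `lower` and the original `sig2`, which corresponds to the two exceptions listed above.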
Other studies like this appear in (Vallinkoski and Lehtinen, 1990 a and b).
