Tomography Problems
From Theory of Measurements Wiki
Contents |
Definition of the Tomographic Inversion Problem
We consider a two-dimensional target described by a function
and a grid as shown in Fig. 1. The grid can be rectangular as in Fig. 1, but different shapes of grid elements, e.g. triangular, can also be chosen. The measurements are integrals of the target function along a set of lines
,
, i.e.
and the unknowns are the values of the target function at the grid points
. The task is to determine the best values of the unknowns
, when the measurements and their error estimates are known.
The direct theory of the inversion problem is formulated by assuming bilinear interpolation of the target function inside each grid element. In the case of a rectangular grid, for instance,
When the target function is approximated using this interpolation, each measurement is a linear combination of the unknowns. By collecting all measurements into a single column vector
and all unknowns into a column vector
, we can write
where
is a matrix of the linear coefficients obtained from the integrals and
is a column vector containing the measurement errors. Each line of integration crosses only a part of the grid elements and only the unknowns at the corners of those elements contibute the corresponding measurement. Therefore
is necessarily a sparse matrix. The task of the tomographic inversion is to determine the unknown vector
from this equation.
Stochastic inversion
Stochastic inversion is a method which treats the measurements, the unknowns and the measurement errors as random variables. The joint probablility distribution function of
and
is
where
and
are the conditional densities of
and
, respectively, and
and
are their a priori densities. We are interested in the distribution of the unknowns when the measurements are fixed, i.e. on the conditional density
which is also called the a posteriori density. After the results of the measurement are known,
is fixed in this equation and therefore
is only a normalisation constant. Since
is fixed (although unknown) in each target to be measured, the conditional density of the error vector is
Hence the a posteriori density of
is
Next we assume that the measurement errors are Gaussian with zero mean, which gives
Here
is the error covariance matrix and T indicates the transpose.
The a posteriori density contains all available information on the unknowns after the measurements have been carried out. The maximum point
of the a posteriori distribution is the most likely solution of the inversion problem. If no a priori information is available,
is constant in eq. (8), and a formal solution can be written for
. This solution, however, is unstable because it contains inversions of large matrices. Therefore some sort of regularisation is needed in order to obtain a stable solution.
The instability in the solution appears as too strong point-to-point variations in the inversion results. In order to obtain smoother stable solutions, one should find a way of limiting these variations.
Let us consider two neighboring grid points
and
either in
or
direction. We consider the difference of the unknowns at these points, i.e.
By collecting all such differences at neighbouring grid points into a column vector, we obtain an equation
Here
is a matrix with elements +1, -1 and 0 in appropriate places so that each component of eq. (10) is of the form (9). Obviously
is a random vector with some distribution function
If this contains a priori information on the probabilities of the changes of the unknowns from one grid point to another,
and
In this formalism, stable solutions can be obtained by means of an a priori density, which controls statistically the variation of the unknowns from point to point.
If we assume that
is a multivariate Gaussian random variable with zero mean and a covariance matrix
, the a posteriori density is
Since
is a quadratic form of the components of
, the chosen
is called L^2 priori. The maximum point of the a posteriori density function is given by
This is the solution of the inversion problem.
Eq. (15) gives stable solutions and the smoothness of the result can be varied by changing the covariances in
. However, eq. (15) is not well suited for investigating target functions with sharp gradients, since it favours smoothly behaving solutions. Another drawback is that it does not contain any non-negativity constraint, and therefore the result may be negative within some reegions, although the target function is normally positive.
Steeper gradients of the target function can be reconstructed if, instead of an L^2 priori, an L^1 priori
is chosen. Here C is a constant and i and j get only such values that xi and xj are unknowns in neighbouring grid points either in the ξ or the ψ direction. With this a priori density, the a posteriori density is
A drawback in this density is that its maximum point cannot be obtained in a straghtforward way as was the case for the L^2 priori. Therefore different methods must be used.
Gibbs sampling
Gibbs sampling is a method of reconstructing an estimate of a density function by means of a random number generator. In the case of multivariate distributions, this is done sequentially for each component at a time. The idea is best understood in terms of a two-dimensional distribution.
We assume a joint distribution
of two random variables
and
, as shown in Fig. 2, and choose an arbitrary start point
. Then we build a random number generator for
with a distribution
and use it to obtain a random value
. Next a second random number generator is made for
, obeying a distribution
. This generator outputs some random value
. As a result of these two steps, a new random point
is obtained. The process can be continued infinitely using the distributions
and
etc. As a result, a set of random points is obtained and, since the points obey the joint distribution, most of them lie close to the maximum of the distribution function. This is dispalyed schematically in Fig. 3.
In principle, it would be possible to map the joint distribution function by collecting a very large number of random points by means of Gibbs sampling. However, it is sufficient to get a farily small number of points if we, instead of searching for the maximum point, are satisfied in finding the mean value. If the distribution is nearly symmetric these two quantities are also nearly the same.
In our case of stochastic inversion, Gibbs sampling is applied to the a posteriori density. An arbitrary initial vector
is first selected. Then its first component
is replaced by a new value
, obtained by means of a one-dimensional random number generator obeying a distribution function proportional to
. The next step is to get a new value
for the second component of
using a random number generator
with a distribution function proportional to
. Following this procedure successively for all components of
will finally lead to a new vector
. The whole process is then repeated with
as a starting point and, as a result, a second new vector
is obtained. In this manner a sequence of random vectors
with a distribution
can be generated. After the sequence is long enough, the mean value of the random points is calculated.
Gibbs sampling offers a convenient way of implementing the non-negativity constraint. The method is simply to reject the result when a random number generator gives a negative value and to try again until a positive value is obtained. This is equivalent to multiplying the a posteriori density by a step function which is unity in the region where all components of $x\,$ are positive, but zero elsewhere.
Gibbs sampling is useful if the maximum of the a posteriori density is sharp enough in all directions in the result space. If the density has an elongated ridge in some direction, a very large number of random points may be needed and the time for computing the task may be excessive.
