6.3 Estimating the Error in $ \mathbf{u}^0$

To estimate the error in the best-fitting parameter values that we find, we assume $ \mathrm{P}\left( \mathbf{u} \vert \left\{ \mathbf{x}_i, f_i, \sigma_i \right\} \right)$ to be approximated by an $ n_\mathrm{u}$-dimensional Gaussian distribution around $ \mathbf{u}^0$. Taking a Taylor expansion of $ L(\mathbf{u})$ about $ \mathbf{u}^0$, we can write:


$\displaystyle L(\mathbf{u})$ $\displaystyle =$ $\displaystyle L(\mathbf{u}^0) +
\underbrace{
\sum_{i=0}^{n_\mathrm{u}-1} \left(...
...right\vert _{\mathbf{u}^0}
}_{\textrm{Zero at $\mathbf{u}^0$ by definition}} +$ (6.4)
    $\displaystyle \sum_{i=0}^{n_\mathrm{u}-1} \sum_{j=0}^{n_\mathrm{u}-1} \frac{\le...
...ght\vert _{\mathbf{u}^0} +
\mathcal{O}\left( \mathbf{u} - \mathbf{u}^0\right)^3$  

Since the logarithm of a Gaussian distribution is a parabola, the quadratic terms in the above expansion encode the Gaussian component of the probability distribution $ \mathrm{P}\left( \mathbf{u} \vert \left\{ \mathbf{x}_i, f_i, \sigma_i \right\} \right)$ about $ \mathbf{u}^0$.6.1 We may write the sum of these terms, which we denote $ Q$, in matrix form:

$\displaystyle Q = \frac{1}{2} \left(\mathbf{u} - \mathbf{u}^0\right)^\mathbf{T} \mathbf{A} \left(\mathbf{u} - \mathbf{u}^0\right)$ (6.5)

where the superscript $ ^\mathbf{T}$ represents the transpose of the vector displacement from $ \mathbf{u}^0$, and $ \mathbf{A}$ is the Hessian matrix of $ L$, given by:

$\displaystyle A_{ij} = \nabla\nabla L = \left.\frac{\partial^2 L}{\partial u_i \partial u_j}\right\vert _{\mathbf{u}^0}$ (6.6)

This is the Hessian matrix which is output by the fit command. In general, an $ n_\mathrm{u}$-dimensional Gaussian distribution such as that given by equation (6.4) yields elliptical contours of equiprobability in parameter space, whose principal axes need not be aligned with our chosen coordinate axes - the variables $ u_0 ... u_{n_u-1}$. The eigenvectors $ \mathbf{e}_i$ of $ \mathbf{A}$ are the principal axes of these ellipses, and the corresponding eigenvalues $ \lambda_i$ equal $ 1/\sigma_i^2$, where $ \sigma _i$ is the standard deviation of the probability density function along the direction of these axes.

This can be visualised by imagining that we diagonalise $ \mathbf{A}$, and expand equation (6.5) in our diagonal basis. The resulting expression for $ L$ is a sum of square terms; the cross terms vanish in this basis by definition. The equations of the equiprobability contours become the equations of ellipses:

$\displaystyle Q = \frac{1}{2} \sum_{i=0}^{n_\mathrm{u}-1} A_{ii} \left(u_i - u^0_i\right)^2 = k$ (6.7)

where $ k$ is some constant. By comparison with the equation for the logarithm of a Gaussian distribution, we can associate $ A_{ii}$ with $ -1/\sigma_i^2$ in our eigenvector basis.

The problem of evaluating the standard deviations of our variables $ u_i$ is more complicated, however, as we are attempting to evaluate the width of these elliptical equiprobability contours in directions which are, in general, not aligned with their principal axes. To achieve this, we first convert our Hessian matrix into a covariance matrix.

Dominic Ford 2006-09-09