# Uncertainty and Standard Errors

Parameter estimation (also referred to as “model fitting”) aims to find a solution in parameter space that has a likelihood at least as good as all other nearby solutions. We know that a locally optimal solution has been found when changing any parameter in any direction decreases the likelihood (or, equivalently, increases the cost).

## Likelihood Hessian

We can also quantify the confidence in our estimates from the shape of the likelihood surface around the local optimum. Changing some parameters lowers the likelihood by a lot, suggesting that we have high confidence (low uncertainty) in their estimated values. Changing others lowers the likelihood by only a little, or possibly not at all, suggesting that we have low confidence (high uncertainty) in those estimates.

First we return to the example used in the last chapter (a one-compartment model with absorption), only here we reduce it to a one-dimensional problem by fixing CL and V to the optimal values and allowing only KA to vary. We can then plot ObjV for a range of values of KA as a line (Fig. 32).

Fig. 32 ObjV vs KA with other parameters fixed at their true values

Although ObjV as a function of KA is not symmetric, it is smooth and has a clearly defined minimum. We can therefore compute properties of the curve (e.g. its gradient) at any point and approximate it with a polynomial using a Taylor series expansion.

More specifically, a second-order Taylor series at the minimum requires only the value of ObjV at the minimum and its second derivative there, because the gradient at a minimum is zero by definition. This approximates the ObjV function with a quadratic (Fig. 33).

Fig. 33 ObjV vs KA with other parameters fixed at their true values, plus the quadratic approximation at the minimum
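The quadratic approximation can be sketched numerically. The example below is only an illustration: `objv` is a hypothetical, asymmetric stand-in for the real ObjV curve (its minimum is placed at KA = 0.3 by construction), and the second derivative is estimated with a central finite difference rather than by whatever method the fitting software uses.

```python
import numpy as np

KA_HAT = 0.3  # hypothetical location of the minimum

def objv(ka):
    """Hypothetical asymmetric stand-in for ObjV(KA), minimised at KA_HAT."""
    ratio = ka / KA_HAT
    return 10.0 + 50.0 * (ratio - 1.0 - np.log(ratio))

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

# Curvature of the objective at its minimum
curvature = second_derivative(objv, KA_HAT)

def quadratic(ka):
    """Second-order Taylor expansion about KA_HAT (the gradient term is zero)."""
    return objv(KA_HAT) + 0.5 * curvature * (ka - KA_HAT) ** 2

for ka in (0.25, 0.30, 0.35, 0.60):
    print(f"KA={ka:.2f}  ObjV={objv(ka):.4f}  quadratic={quadratic(ka):.4f}")
```

Close to the minimum the two curves agree well. Further away they diverge, which reflects the asymmetry of the stand-in objective.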

In effect, this approximates the likelihood surface with a Gaussian whose peak coincides with that of the true likelihood (Fig. 34).

Fig. 34 Likelihood vs KA with other parameters fixed at their true values, plus the Gaussian approximation at the maximum
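The link between the quadratic and the Gaussian can be written out explicitly. The following assumes that ObjV is the negative log-likelihood, which is an assumption made here for illustration; the symbols $\widehat{KA}$ (the estimate) and $h$ (the curvature at the minimum) are also introduced here.

$$
L(KA) \approx L(\widehat{KA}) \, \exp\!\left(-\tfrac{1}{2}\, h \,(KA - \widehat{KA})^2\right),
\qquad
h = \left.\frac{d^2\,\mathrm{ObjV}}{d\,KA^2}\right|_{KA = \widehat{KA}}
$$

Under this assumption the approximation is a Gaussian in KA with mean $\widehat{KA}$ and variance $1/h$. If ObjV were instead defined as minus twice the log-likelihood, the variance would be $2/h$.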

In more than one dimension, the second derivative is a matrix and is known as the Hessian. In two dimensions, for example, the quadratic approximation to ObjV resembles a basin whose shape reflects the true function at its minimum (Table 20).
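As a rough sketch of how such a matrix can be obtained numerically, the example below estimates the Hessian of a two-parameter objective with central finite differences. The function `objv` and its coefficients are hypothetical and are not the model from the text; `numerical_hessian` is a helper written for this illustration.

```python
import numpy as np

def objv(theta):
    """Hypothetical smooth objective in (KA, CL), minimised at (0.3, 3.0)."""
    ka, cl = theta
    d_ka, d_cl = ka - 0.3, cl - 3.0
    return 10.0 + 400.0 * d_ka**2 + 5.0 * d_cl**2 + 20.0 * d_ka * d_cl

def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference estimate of the matrix of second derivatives."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (
                f(theta + ei + ej) - f(theta + ei - ej)
                - f(theta - ei + ej) + f(theta - ei - ej)
            ) / (4.0 * h**2)
    return hess

hessian = numerical_hessian(objv, [0.3, 3.0])
print(hessian)
```

The diagonal entries describe the curvature along each parameter, and the off-diagonal entries describe how the parameters interact. At a proper minimum all eigenvalues of the matrix are positive.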

## Confidence Intervals

With a Gaussian approximation to the likelihood, we can sample new values for the population parameters from the distribution. By sampling many possible solutions from the distribution, we can compute a range for each parameter in which a given percentage of the sampled values lie. These ranges are called confidence intervals and typically encompass 90% or 95% of the sampled values (Fig. 35).

Fig. 35 ObjV surface with population parameters sampled from the approximating Gaussian and 90% confidence intervals in red for KA and CL

```
CI(KA) = 0.274 + [-0.0533, 0.0544]
CI(CL) = 3.24 + [-0.343, 0.338]
```
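The sampling procedure described above might look something like the following sketch. It assumes that ObjV is the negative log-likelihood, so that the covariance of the approximating Gaussian is the inverse of the Hessian. The estimate and Hessian values are hypothetical and are not the ones behind Fig. 35.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical estimate for (KA, CL) and Hessian of ObjV at that estimate
estimate = np.array([0.3, 3.0])
hessian = np.array([[800.0, 20.0],
                    [20.0, 10.0]])

# Assumption: ObjV is the negative log-likelihood, so covariance = inverse Hessian
covariance = np.linalg.inv(hessian)

# Draw many plausible parameter vectors from the approximating Gaussian
samples = rng.multivariate_normal(estimate, covariance, size=20000)

# A 90% interval spans the 5th to the 95th percentile of each parameter
lower, upper = np.percentile(samples, [5.0, 95.0], axis=0)

for name, est, lo, hi in zip(("KA", "CL"), estimate, lower, upper):
    print(f"CI({name}) = {est:.3g} + [{lo - est:.3g}, {hi - est:.3g}]")
```

Because the intervals come from a finite number of random samples, the lower and upper offsets differ slightly from run to run unless the seed is fixed.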

The purpose of the confidence interval is to compare one estimate with another, for example by confirming that our estimated confidence interval contains a value published in research literature by a third party.

When computing confidence intervals, we can get different estimates depending on how we parameterised the model. When population parameters are required to be positive, for example, we might optimise over their logarithms (which span the whole of the real line). This can make the ObjV surface closer to a quadratic, and means that we are effectively sampling from a lognormal distribution over the population parameters (Fig. 36).

Fig. 36 ObjV surface with population parameters sampled from the approximating Gaussian and 90% confidence intervals in red for log(KA) and log(CL)

One outcome of this is that, in the limit of many samples, the confidence interval on the original scale is generally not centred on the estimated value:

```
CI(KA) = 0.273 + [-0.049, 0.0606]
CI(CL) = 3.25 + [-0.323, 0.357]
```
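The asymmetry can be reproduced with a small sketch: sample from a Gaussian over the log-parameters, exponentiate the samples, and take percentiles on the original scale. The estimate and the covariance of the log-parameters below are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical estimate on the log scale and covariance of the log-parameters
log_estimate = np.log(np.array([0.3, 3.0]))   # log(KA), log(CL)
log_covariance = np.array([[0.015, 0.0],
                           [0.0, 0.012]])

# Sample in log space, then map back to the original (positive) scale
log_samples = rng.multivariate_normal(log_estimate, log_covariance, size=20000)
samples = np.exp(log_samples)

estimate = np.exp(log_estimate)
lower, upper = np.percentile(samples, [5.0, 95.0], axis=0)

below = estimate - lower   # distance from estimate to lower bound
above = upper - estimate   # distance from estimate to upper bound

for name, est, lo, hi in zip(("KA", "CL"), estimate, below, above):
    print(f"CI({name}) = {est:.3g} + [{-lo:.3g}, {hi:.3g}]")
```

Every sampled value is positive, and the upper offset is larger than the lower one because the exponential stretches the upper tail of the distribution.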

## Standard Errors

The Likelihood Hessian can be used to compute standard errors for parameter estimates, which are similar to the Confidence Intervals described above and computed in a similar manner.
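One common way to do this is sketched below: invert the Hessian to get a covariance matrix and take the square root of its diagonal. This assumes that ObjV is the negative log-likelihood, and the Hessian values are hypothetical.

```python
import numpy as np

# Hypothetical Hessian of ObjV at the estimate, for parameters (KA, CL)
hessian = np.array([[800.0, 20.0],
                    [20.0, 10.0]])

# Assumption: ObjV is the negative log-likelihood, so covariance = inverse Hessian
covariance = np.linalg.inv(hessian)

# Standard error of each parameter: square root of its variance
standard_errors = np.sqrt(np.diag(covariance))

for name, se in zip(("KA", "CL"), standard_errors):
    print(f"SE({name}) = {se:.4g}")
```

Under the Gaussian approximation, a 90% confidence interval corresponds to roughly 1.645 standard errors either side of the estimate, which is why the two quantities carry similar information.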