Fisher Matrix Confidence Bounds
This section presents an overview of the theory for obtaining approximate confidence bounds on suspended (multiply censored) data. The methodology used is the so-called Fisher matrix (FM) bounds, described in Nelson [30] and Lloyd and Lipow [24]. These bounds are employed in most other commercial statistical applications. In general, they tend to be more optimistic (narrower) than the non-parametric rank-based bounds, which may be a concern when dealing with small sample sizes. Some statisticians therefore prefer other techniques for calculating confidence bounds in that situation, such as the likelihood ratio bounds.

Approximate Estimates of the Mean and Variance of a Function
In utilizing FM bounds for functions, one must first determine the mean and variance of the function in question (e.g., the reliability function or the failure rate function). An example of the methodology and assumptions for an arbitrary function $$G$$ is presented next.

Single Parameter Case
For simplicity, consider a one-parameter distribution represented by a general function, $$G,$$ which is a function of one parameter estimator, say $$G(\widehat{\theta }).$$ For example, the mean of the exponential distribution is a function of the parameter $$\lambda $$: $$G(\lambda )=1/\lambda =\mu $$. Then, in general, the expected value of $$G\left( \widehat{\theta } \right)$$ can be found by:


 * $$E\left( G\left( \widehat{\theta } \right) \right)=G(\theta )+O\left( \frac{1}{n} \right)$$

where $$G(\theta )$$ is some function of $$\theta $$, such as the reliability function, and $$\theta $$ is the population parameter where $$E\left( \widehat{\theta } \right)=\theta $$ as $$n\to \infty $$. The term $$O\left( \tfrac{1}{n} \right)$$ is a function of $$n$$, the sample size, and tends to zero, as fast as $$\tfrac{1}{n},$$ as $$n\to \infty .$$ For example, in the case of $$\widehat{\theta }=1/\overline{x}$$ and $$G(x)=1/x$$, $$E(G(\widehat{\theta }))=\overline{x}+O\left( \tfrac{1}{n} \right)$$ where $$O\left( \tfrac{1}{n} \right)=\tfrac{{{\sigma }^{2}}}{n}$$. Thus as $$n\to \infty $$, $$E(G(\widehat{\theta }))=\mu $$ where $$\mu $$ and $$\sigma $$ are the mean and standard deviation, respectively. Using the same one-parameter distribution, the variance of the function $$G\left( \widehat{\theta } \right)$$ can then be estimated by:


 * $$Var\left( G\left( \widehat{\theta } \right) \right)=\left( \frac{\partial G}{\partial \widehat{\theta }} \right)_{\widehat{\theta }=\theta }^{2}Var\left( \widehat{\theta } \right)+O\left( \frac{1}{{{n}^{\tfrac{3}{2}}}} \right)$$
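
As a minimal sketch of the single-parameter variance approximation, the snippet below applies it to the exponential mean $$G(\lambda )=1/\lambda $$. The function name and all numerical values are hypothetical and chosen only for illustration. The higher-order remainder term is ignored, and the derivative is evaluated at the estimate because the true parameter is unknown in practice.

```python
def delta_method_variance(dG_dtheta, var_theta):
    """First-order approximation: Var(G) ~ (dG/dtheta)^2 * Var(theta_hat)."""
    return dG_dtheta ** 2 * var_theta


# Hypothetical estimate of the exponential failure rate and its variance.
lam_hat = 0.02       # failures per hour (assumed value)
var_lam = 8.0e-6     # Var(lam_hat) (assumed value)

# G(lambda) = 1/lambda, so dG/dlambda = -1/lambda^2, evaluated at lam_hat.
dG_dlam = -1.0 / lam_hat ** 2
var_mean_life = delta_method_variance(dG_dlam, var_lam)
std_mean_life = var_mean_life ** 0.5

print(f"Var(1/lam_hat) ~ {var_mean_life:.4g}, std ~ {std_mean_life:.4g}")
```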

Two-Parameter Case
Consider a Weibull distribution with two parameters $$\beta $$ and $$\eta $$. For a given value of $$t$$, $$R(t)=G(\beta ,\eta )={{e}^{-{{\left( \tfrac{t}{\eta } \right)}^{\beta }}}}$$. Repeating the previous method for the case of a two-parameter distribution, it is generally true that for a function $$G$$, which is a function of two parameter estimators, say $$G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right)$$, that:


 * $$E\left( G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \right)=G\left( {{\theta }_{1}},{{\theta }_{2}} \right)+O\left( \frac{1}{n} \right)$$

and:


 * $$\begin{align}

Var( G( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}}))= & \left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}^{2}Var({{\widehat{\theta }}_{1}})+\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}^{2}Var({{\widehat{\theta }}_{2}}) \\ & +2\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}Cov({{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}}) \\ & +O\left( \frac{1}{{{n}^{\tfrac{3}{2}}}} \right) \end{align}$$

Note that the derivatives in the variance equation above are evaluated at $${{\widehat{\theta }}_{1}}={{\theta }_{1}}$$ and $${{\widehat{\theta }}_{2}}={{\theta }_{2}},$$ where $$E\left( {{\widehat{\theta }}_{1}} \right)\simeq {{\theta }_{1}}$$ and $$E\left( {{\widehat{\theta }}_{2}} \right)\simeq {{\theta }_{2}}.$$
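
The two-parameter approximation can be illustrated for the Weibull reliability function introduced above. The following is a sketch only: the function names, the parameter estimates, and the variance and covariance values are hypothetical, and the remainder term is dropped. The partial derivatives are written out analytically for $$R(t)={{e}^{-{{(t/\eta )}^{\beta }}}}$$.

```python
import math


def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta) for the two-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))


def weibull_reliability_variance(t, beta, eta, var_beta, var_eta, cov_beta_eta):
    """First-order variance of R(t), with derivatives taken at (beta, eta)."""
    u = (t / eta) ** beta
    r = math.exp(-u)
    dR_dbeta = -r * u * math.log(t / eta)   # partial derivative w.r.t. beta
    dR_deta = r * u * beta / eta            # partial derivative w.r.t. eta
    return (dR_dbeta ** 2 * var_beta
            + dR_deta ** 2 * var_eta
            + 2.0 * dR_dbeta * dR_deta * cov_beta_eta)


# Hypothetical parameter estimates and covariance terms (assumed values).
beta_hat, eta_hat = 1.8, 500.0
var_beta, var_eta, cov_be = 0.09, 900.0, 1.2

t = 200.0
r_hat = weibull_reliability(t, beta_hat, eta_hat)
var_r = weibull_reliability_variance(t, beta_hat, eta_hat,
                                     var_beta, var_eta, cov_be)
print(f"R({t:g}) ~ {r_hat:.4f}, Var(R) ~ {var_r:.3e}")
```

Note that the covariance term enters with the plain product of the two partial derivatives, so it can reduce the total variance when the derivatives or the covariance have opposite signs.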

Parameter Variance and Covariance Determination
The determination of the variance and covariance of the parameters is accomplished via the use of the Fisher information matrix. For a two-parameter distribution, and using maximum likelihood estimates (MLE), the log-likelihood function for censored data is given by:


 * $$\begin{align}

\ln [L]= & \Lambda =\underset{i=1}{\overset{R}{\mathop \sum }}\,\ln [f({{T}_{i}};{{\theta }_{1}},{{\theta }_{2}})] \\ & \text{ }+\underset{j=1}{\overset{M}{\mathop \sum }}\,\ln [1-F({{S}_{j}};{{\theta }_{1}},{{\theta }_{2}})] \\ & \text{ }+\underset{l=1}{\overset{P}{\mathop \sum }}\,\ln \left\{ F({{I}_{{{l}_{U}}}};{{\theta }_{1}},{{\theta }_{2}})-F({{I}_{{{l}_{L}}}};{{\theta }_{1}},{{\theta }_{2}}) \right\} \end{align}$$

In the equation above, the first summation is for complete data, the second summation is for right censored data, and the third summation is for interval or left censored data. Here $${{T}_{i}}$$ are the exact failure times, $${{S}_{j}}$$ are the suspension times, and $${{I}_{{{l}_{L}}}}$$ and $${{I}_{{{l}_{U}}}}$$ are the lower and upper endpoints of the $$l$$-th interval. For more information on these data types, see Chapter 5. Then the Fisher information matrix is given by:
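
To make the three summations concrete, here is a minimal sketch of the log-likelihood for a two-parameter Weibull distribution. The function names and the data values are hypothetical. Left censored observations are represented as intervals whose lower endpoint is zero, which is an assumption of this sketch.

```python
import math


def weibull_pdf(t, beta, eta):
    return (beta / eta) * (t / eta) ** (beta - 1.0) * math.exp(-((t / eta) ** beta))


def weibull_cdf(t, beta, eta):
    return 1.0 - math.exp(-((t / eta) ** beta))


def log_likelihood(beta, eta, failures, suspensions, intervals):
    """Log-likelihood with exact, right censored and interval censored terms.

    intervals is a list of (lower, upper) pairs; a left censored point is
    written as (0.0, upper).
    """
    # First summation: exact failure times.
    ll = sum(math.log(weibull_pdf(t, beta, eta)) for t in failures)
    # Second summation: suspensions (right censored).
    ll += sum(math.log(1.0 - weibull_cdf(s, beta, eta)) for s in suspensions)
    # Third summation: interval or left censored observations.
    ll += sum(math.log(weibull_cdf(hi, beta, eta) - weibull_cdf(lo, beta, eta))
              for lo, hi in intervals)
    return ll


# Hypothetical data set, for illustration only.
failures = [16.0, 34.0, 53.0, 75.0]
suspensions = [100.0, 100.0]
intervals = [(0.0, 20.0), (40.0, 60.0)]
print(log_likelihood(1.5, 80.0, failures, suspensions, intervals))
```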


 * $${{F}_{0}}=\left[ \begin{matrix}

{{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} \right]}_{0}} & {} & {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \right]}_{0}} \\ {} & {} & {} \\   {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} \right]}_{0}} & {} & {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}} \right]}_{0}}  \\ \end{matrix} \right]$$

The subscript $$0$$ indicates that the quantity is evaluated at $${{\theta }_{1}}={{\theta }_{{{1}_{0}}}}$$ and $${{\theta }_{2}}={{\theta }_{{{2}_{0}}}},$$ the true values of the parameters. So for a sample of $$N$$ units where $$R$$ units have failed, $$M$$ have been suspended, and $$P$$ have failed within a time interval, and $$N=R+M+P,$$ one could obtain the sample local information matrix by:


 * $$F={{\left[ \begin{matrix}

-\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ {} & {} & {} \\   -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}}  \\ \end{matrix} \right]}}$$

Substituting in the values of the estimated parameters, in this case $${{\widehat{\theta }}_{1}}$$ and $${{\widehat{\theta }}_{2}}$$, and then inverting the matrix, one can then obtain the local estimate of the covariance matrix or:


 * $$\left[ \begin{matrix}

\widehat{Var}\left( {{\widehat{\theta }}_{1}} \right) & {} & \widehat{Cov}\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \\ {} & {} & {} \\   \widehat{Cov}\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) & {} & \widehat{Var}\left( {{\widehat{\theta }}_{2}} \right)  \\ \end{matrix} \right]={{\left[ \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ {} & {} & {} \\   -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}}  \\ \end{matrix} \right]}^{-1}}$$
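
The substitution and inversion steps above can be sketched numerically. In this sketch the second partial derivatives are approximated by central finite differences rather than derived analytically, which is a substitution made here for brevity. All function names, the data, and the parameter values are hypothetical; in practice the values passed in would be the maximum likelihood estimates.

```python
import math


def local_information_matrix(loglik, th1, th2, rel_step=1e-3):
    """Negative Hessian of loglik(th1, th2), by central finite differences."""
    h1 = rel_step * max(abs(th1), 1.0)
    h2 = rel_step * max(abs(th2), 1.0)
    f0 = loglik(th1, th2)
    d11 = (loglik(th1 + h1, th2) - 2.0 * f0 + loglik(th1 - h1, th2)) / h1 ** 2
    d22 = (loglik(th1, th2 + h2) - 2.0 * f0 + loglik(th1, th2 - h2)) / h2 ** 2
    d12 = (loglik(th1 + h1, th2 + h2) - loglik(th1 + h1, th2 - h2)
           - loglik(th1 - h1, th2 + h2) + loglik(th1 - h1, th2 - h2)) / (4.0 * h1 * h2)
    return [[-d11, -d12], [-d12, -d22]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("information matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def weibull_loglik(beta, eta, failures, suspensions):
    """Weibull log-likelihood for exact failures and suspensions only."""
    ll = 0.0
    for t in failures:
        ll += math.log(beta / eta) + (beta - 1.0) * math.log(t / eta) - (t / eta) ** beta
    for s in suspensions:
        ll -= (s / eta) ** beta
    return ll


# Hypothetical data and parameter values (not fitted; for illustration only).
failures = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
suspensions = [100.0, 150.0]
beta_hat, eta_hat = 1.5, 95.0

F = local_information_matrix(
    lambda b, e: weibull_loglik(b, e, failures, suspensions), beta_hat, eta_hat)
cov = invert_2x2(F)
print("Var(beta) ~", cov[0][0], " Var(eta) ~", cov[1][1], " Cov ~", cov[0][1])
```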

Then the variance of a function ($$Var(G)$$) can be estimated using the two-parameter variance equation given earlier, with the variances and covariance of the parameters taken from the inverted local information matrix above. Once they have been obtained, the approximate confidence bounds on the function are given as:


 * $$C{{B}_{R}}=E(G)\pm {{z}_{\alpha }}\sqrt{Var(G)}$$

which is the estimated value plus or minus a certain number of standard deviations. We address finding $${{z}_{\alpha }}$$ next.

Approximate Confidence Intervals on the Parameters
In general, maximum likelihood estimates of the parameters are asymptotically normal: for large sample sizes, the distribution of parameter estimates obtained from repeated samples of the same population is very close to a normal distribution. Thus, if $$\widehat{\theta }$$ is the MLE of $$\theta $$ in the case of a single-parameter distribution, estimated from a large sample of $$n$$ units, then:


 * $$z\equiv \frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}$$

approximately follows the standard normal distribution. That is:


 * $$P\left( x\le z \right)\to \Phi \left( z \right)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{z}{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt$$

for large $$n$$. We now place confidence bounds on $$\theta ,$$ at some confidence level $$\delta $$, bounded by the two end points $${{C}_{1}}$$ and $${{C}_{2}}$$ where:


 * $$P\left( {{C}_{1}}<\theta <{{C}_{2}} \right)=\delta $$

From the normal approximation for $$z$$ above:


 * $$P\left( -{{K}_{\tfrac{1-\delta }{2}}}<\frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}<{{K}_{\tfrac{1-\delta }{2}}} \right)\simeq \delta $$

where $${{K}_{\alpha }}$$ is defined by:


 * $$\alpha =\frac{1}{\sqrt{2\pi }}\int_{{{K}_{\alpha }}}^{\infty }{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi \left( {{K}_{\alpha }} \right)$$

Rearranging the probability statement above for $$\theta $$ gives the approximate two-sided confidence bounds on the parameter $$\theta $$ at a confidence level $$\delta $$:


 * $$\left( \widehat{\theta }-{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)}<\theta <\widehat{\theta }+{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)} \right)$$

The upper one-sided bounds are given by:


 * $$\theta <\widehat{\theta }+{{K}_{1-\delta }}\sqrt{Var(\widehat{\theta })}$$

while the lower one-sided bounds are given by:


 * $$\theta >\widehat{\theta }-{{K}_{1-\delta }}\sqrt{Var(\widehat{\theta })}$$
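
The bounds above can be sketched in a few lines using the standard normal quantile function from the Python standard library. The helper names and the example values are hypothetical. $${{K}_{\alpha }}$$ is computed as the upper-tail quantile, that is, the value satisfying $$\alpha =1-\Phi ({{K}_{\alpha }})$$.

```python
from statistics import NormalDist


def k_alpha(alpha):
    """K_alpha such that alpha = 1 - Phi(K_alpha)."""
    return NormalDist().inv_cdf(1.0 - alpha)


def two_sided_bounds(theta_hat, var_theta, delta):
    """Approximate two-sided bounds at confidence level delta."""
    half_width = k_alpha((1.0 - delta) / 2.0) * var_theta ** 0.5
    return theta_hat - half_width, theta_hat + half_width


def one_sided_upper(theta_hat, var_theta, delta):
    return theta_hat + k_alpha(1.0 - delta) * var_theta ** 0.5


def one_sided_lower(theta_hat, var_theta, delta):
    return theta_hat - k_alpha(1.0 - delta) * var_theta ** 0.5


# Hypothetical estimate and variance, for illustration only.
theta_hat, var_theta = 10.0, 4.0
print(two_sided_bounds(theta_hat, var_theta, 0.95))
print(one_sided_lower(theta_hat, var_theta, 0.95),
      one_sided_upper(theta_hat, var_theta, 0.95))
```

At the same confidence level, a one-sided bound lies closer to the estimate than the corresponding two-sided bound, because $${{K}_{1-\delta }}<{{K}_{\tfrac{1-\delta }{2}}}$$.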

If $$\widehat{\theta }$$ must be positive, then $$\ln \widehat{\theta }$$ is treated as normally distributed. The two-sided approximate confidence bounds on the parameter $$\theta $$, at confidence level $$\delta $$, then become:


 * $$\begin{align}

& {{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}\text{ (Two-sided upper)} \\ & {{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}}\text{ (Two-sided lower)} \end{align}$$

The one-sided approximate confidence bounds on the parameter $$\theta $$, at confidence level $$\delta ,$$ can be found from:


 * $$\begin{align}

& {{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}\text{ (One-sided upper)} \\ & {{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}}\text{ (One-sided lower)} \end{align}$$
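
A short sketch of the log-transformed bounds for a positive parameter follows. The function name and the numbers are hypothetical. The example is chosen so that an untransformed normal lower bound would be negative, while the log-transformed lower bound stays positive.

```python
import math
from statistics import NormalDist


def log_transformed_bounds(theta_hat, var_theta, delta, two_sided=True):
    """Bounds for a positive parameter, treating ln(theta_hat) as normal."""
    alpha = (1.0 - delta) / 2.0 if two_sided else 1.0 - delta
    k = NormalDist().inv_cdf(1.0 - alpha)          # K_alpha
    factor = math.exp(k * math.sqrt(var_theta) / theta_hat)
    return theta_hat / factor, theta_hat * factor  # (lower, upper)


# Hypothetical estimate with a large variance (assumed values).
theta_hat, var_theta, delta = 2.0, 4.0, 0.90

lower, upper = log_transformed_bounds(theta_hat, var_theta, delta)
k = NormalDist().inv_cdf(1.0 - (1.0 - delta) / 2.0)
plain_lower = theta_hat - k * math.sqrt(var_theta)

print(f"log-transformed: ({lower:.4f}, {upper:.4f})")
print(f"untransformed lower bound: {plain_lower:.4f}")
```

The transformed bounds are symmetric about the estimate on a multiplicative scale, so the product of the lower and upper bounds equals $${{\widehat{\theta }}^{2}}$$.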

The same procedure can be extended to distributions with two or more parameters. Lloyd and Lipow [24] further elaborate on this procedure.

Confidence Bounds on Time (Type 1)
Type 1 confidence bounds are confidence bounds around time for a given reliability. For example, when using the one-parameter exponential distribution, the corresponding time for a given exponential percentile (i.e. y-ordinate or unreliability, $$Q=1-R)$$ is determined by solving the unreliability function for the time, $$T$$, or:


 * $$\widehat{T}(Q)=-\frac{1}{\widehat{\lambda }}\ln (1-Q)=-\frac{1}{\widehat{\lambda }}\ln (R)$$

Bounds on time (Type 1) return the confidence bounds around this time value by determining the confidence intervals around $$\widehat{\lambda }$$ and substituting these values into the time equation above. The bounds on $$\lambda $$ are determined using the parameter bound equations given in the previous section, with the variance of $$\widehat{\lambda }$$ obtained from the inverted local information matrix. Note that the procedure is slightly more complicated for distributions with more than one parameter.
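
As a sketch of the Type 1 procedure for the one-parameter exponential distribution, the code below bounds $$\lambda $$ and substitutes the bounds into the time equation. It assumes the log-transformed parameter bounds are used (since $$\lambda $$ must be positive) and that $$Var(\widehat{\lambda })={{\widehat{\lambda }}^{2}}/r$$ for $$r$$ observed failures, which is what inverting the local information matrix gives for this distribution. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def exponential_time_bounds(lam_hat, var_lam, q, delta):
    """Two-sided bounds on the time at which unreliability equals q."""
    k = NormalDist().inv_cdf(1.0 - (1.0 - delta) / 2.0)
    factor = math.exp(k * math.sqrt(var_lam) / lam_hat)
    lam_lower, lam_upper = lam_hat / factor, lam_hat * factor
    t_hat = -math.log(1.0 - q) / lam_hat
    # Time decreases as lambda increases, so the bounds swap roles.
    t_lower = -math.log(1.0 - q) / lam_upper
    t_upper = -math.log(1.0 - q) / lam_lower
    return t_lower, t_hat, t_upper


# Hypothetical estimate from r = 25 failures (assumed values).
lam_hat, r = 0.01, 25
var_lam = lam_hat ** 2 / r

t_lower, t_hat, t_upper = exponential_time_bounds(lam_hat, var_lam, q=0.10, delta=0.90)
print(f"T(Q=0.10): {t_lower:.3f} < {t_hat:.3f} < {t_upper:.3f}")
```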

Confidence Bounds on Reliability (Type 2)
Type 2 confidence bounds are confidence bounds around reliability. For example, when using the one-parameter exponential distribution, the reliability function is:


 * $$\widehat{R}(T)={{e}^{-\widehat{\lambda }\cdot T}}$$

Reliability bounds (Type 2) return the confidence bounds by determining the confidence intervals around $$\widehat{\lambda }$$ and substituting these values into the reliability equation above. The bounds on $$\lambda $$ are determined using the parameter bound equations given earlier, with the variance of $$\widehat{\lambda }$$ obtained from the inverted local information matrix. Once again, the procedure is more complicated for distributions with more than one parameter.
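
The Type 2 procedure can be sketched in the same way for the exponential reliability function. As an assumption of this sketch, the bounds on $$\lambda $$ are the log-transformed two-sided bounds; the function name and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def exponential_reliability_bounds(lam_hat, var_lam, t, delta):
    """Two-sided bounds on R(t) = exp(-lambda * t) at confidence level delta."""
    k = NormalDist().inv_cdf(1.0 - (1.0 - delta) / 2.0)
    factor = math.exp(k * math.sqrt(var_lam) / lam_hat)
    lam_lower, lam_upper = lam_hat / factor, lam_hat * factor
    r_hat = math.exp(-lam_hat * t)
    # Reliability decreases as lambda increases, so the bounds swap roles.
    r_lower = math.exp(-lam_upper * t)
    r_upper = math.exp(-lam_lower * t)
    return r_lower, r_hat, r_upper


# Hypothetical estimate and variance (assumed values).
lam_hat, var_lam = 0.01, 4.0e-6

r_lower, r_hat, r_upper = exponential_reliability_bounds(lam_hat, var_lam, t=50.0, delta=0.90)
print(f"R(50): {r_lower:.4f} < {r_hat:.4f} < {r_upper:.4f}")
```

Because the bounds are obtained by transforming bounds on $$\lambda $$, they always stay between 0 and 1.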