Fisher Matrix Confidence Bounds
This section presents an overview of the theory for obtaining approximate confidence bounds on suspended (multiply censored) data. The methodology used is the so-called Fisher matrix (FM) bounds, described in Nelson [30] and Lloyd and Lipow [24]. These bounds are employed in most other commercial statistical applications. In general, they tend to be more optimistic than the non-parametric rank-based bounds, which may be a concern when dealing with small sample sizes. For such samples, some statisticians prefer other techniques for calculating confidence bounds, such as the likelihood ratio bounds.

Approximate Estimates of the Mean and Variance of a Function
To use FM bounds on a function, one must first determine the mean and variance of the function in question (e.g., the reliability function or the failure rate function). An example of the methodology and assumptions for an arbitrary function G is presented next.

Single Parameter Case

For simplicity, consider a one-parameter distribution represented by a general function G, which is a function of one parameter estimator, say $$G(\widehat{\theta }).$$ For example, the mean of the exponential distribution is a function of the parameter λ : G(λ) = 1 / λ = μ. Then, in general, the expected value of $$G\left( \widehat{\theta } \right)$$ can be found by:


 * $$E\left( G\left( \widehat{\theta } \right) \right)=G(\theta )+O\left( \frac{1}{n} \right)$$

where G(θ) is some function of θ, such as the reliability function, and θ is the population parameter where $$E\left( \widehat{\theta } \right)=\theta $$ as $$n\to \infty $$. The term $$O\left( \tfrac{1}{n} \right)$$ is a function of n, the sample size, and tends to zero, as fast as $$\tfrac{1}{n},$$ as $$n\to \infty .$$ For example, in the case of $$\widehat{\theta }=1/\overline{x}$$ and G(x) = 1 / x, then $$E(G(\widehat{\theta }))=\overline{x}+O\left( \tfrac{1}{n} \right)$$ where $$O\left( \tfrac{1}{n} \right)=\tfrac{{{\sigma }^{2}}}{n}$$. Thus as $$n\to \infty $$, $$E(G(\widehat{\theta }))=\mu $$ where μ and σ are the mean and standard deviation, respectively. Using the same one-parameter distribution, the variance of the function $$G\left( \widehat{\theta } \right)$$ can then be estimated by:


 * $$Var\left( G\left( \widehat{\theta } \right) \right)=\left( \frac{\partial G}{\partial \widehat{\theta }} \right)_{\widehat{\theta }=\theta }^{2}Var\left( \widehat{\theta } \right)+O\left( \frac{1}{{{n}^{\tfrac{3}{2}}}} \right)$$
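
As a rough numerical illustration of the variance approximation above (often called the delta method), the following Python sketch applies its first-order term to the mean life G(λ) = 1 / λ of the exponential distribution. The helper name `delta_method_variance`, the finite-difference step and the numeric values of the estimate and its variance are assumptions chosen for illustration; they do not come from the references.

```python
def delta_method_variance(g, theta_hat, var_theta, h=1e-6):
    """First-order approximation: Var(G) ~ (dG/dtheta)^2 * Var(theta_hat)."""
    # Central finite difference for dG/dtheta, evaluated at theta_hat
    dg = (g(theta_hat + h) - g(theta_hat - h)) / (2.0 * h)
    return dg * dg * var_theta

# Hypothetical estimate of the exponential failure rate and its variance
lambda_hat = 0.02
var_lambda = 1.0e-5

# Variance of the estimated mean life G(lambda) = 1 / lambda
mean_life_var = delta_method_variance(lambda x: 1.0 / x, lambda_hat, var_lambda)

# Analytic check: dG/dlambda = -1 / lambda^2, so Var(G) ~ Var(lambda) / lambda^4
analytic_var = var_lambda / lambda_hat ** 4
print(mean_life_var, analytic_var)
```

The numerical derivative is used only so that the same helper works for any smooth G; when the derivative is available in closed form, as it is here, it can be substituted directly.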

Two-Parameter Case

Consider a Weibull distribution with two parameters β and η. For a given value of t, $$R(t)=G(\beta ,\eta )={{e}^{-{{\left( \tfrac{t}{\eta } \right)}^{\beta }}}}$$. Repeating the previous method for the case of a two-parameter distribution, it is generally true that for a function G, which is a function of two parameter estimators, say $$G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right)$$, that:


 * $$E\left( G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \right)=G\left( {{\theta }_{1}},{{\theta }_{2}} \right)+O\left( \frac{1}{n} \right)$$

and:


 * $$\begin{align} Var\left( G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \right)= & \left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}^{2}Var\left( {{\widehat{\theta }}_{1}} \right)+\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}^{2}Var\left( {{\widehat{\theta }}_{2}} \right) \\ & +2\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}Cov\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \\ & +O\left( \frac{1}{{{n}^{\tfrac{3}{2}}}} \right) \end{align}$$

Note that the derivatives of the above equation are evaluated at $${{\widehat{\theta }}_{1}}={{\theta }_{1}}$$ and $${{\widehat{\theta }}_{2}}={{\theta }_{2}},$$ where E $$\left( {{\widehat{\theta }}_{1}} \right)\simeq {{\theta }_{1}}$$ and E $$\left( {{\widehat{\theta }}_{2}} \right)\simeq {{\theta }_{2}}.$$
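
The two-parameter variance approximation can be sketched for the Weibull reliability function mentioned above. In the Python fragment below, the partial derivatives of R with respect to β and η are written out analytically, and the first-order terms are combined with a parameter covariance matrix. The function names and all numeric values (parameter estimates, variances and covariance) are hypothetical and serve only to show the arithmetic.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))

def reliability_variance(t, beta, eta, var_beta, var_eta, cov_beta_eta):
    """First-order variance of R(t) from the parameter variances and covariance."""
    r = weibull_reliability(t, beta, eta)
    u = (t / eta) ** beta
    # Partial derivatives of R = exp(-u), with u = (t/eta)^beta
    dr_dbeta = -r * u * math.log(t / eta)
    dr_deta = r * u * beta / eta
    return (dr_dbeta ** 2 * var_beta
            + dr_deta ** 2 * var_eta
            + 2.0 * dr_dbeta * dr_deta * cov_beta_eta)

# Hypothetical estimates and covariance terms, for illustration only
var_r = reliability_variance(t=500.0, beta=1.5, eta=1000.0,
                             var_beta=0.04, var_eta=2500.0, cov_beta_eta=-1.2)
print(var_r)
```

Note that the covariance term uses the product of the two first derivatives, not their squares, so its sign depends on the signs of the derivatives and of the covariance.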

Parameter Variance and Covariance Determination

The determination of the variance and covariance of the parameters is accomplished via the use of the Fisher information matrix. For a two-parameter distribution, and using maximum likelihood estimates (MLE), the log-likelihood function for censored data is given by:


 * $$\begin{align} \ln [L]= & \Lambda =\underset{i=1}{\overset{R}{\mathop \sum }}\,\ln [f({{T}_{i}};{{\theta }_{1}},{{\theta }_{2}})] \\ & \text{ }+\underset{j=1}{\overset{M}{\mathop \sum }}\,\ln [1-F({{S}_{j}};{{\theta }_{1}},{{\theta }_{2}})] \\ & \text{ }+\underset{l=1}{\overset{P}{\mathop \sum }}\,\ln \left\{ F({{I}_{{{l}_{U}}}};{{\theta }_{1}},{{\theta }_{2}})-F({{I}_{{{l}_{L}}}};{{\theta }_{1}},{{\theta }_{2}}) \right\} \end{align}$$

In the equation above, the first summation is for complete data, the second summation is for right censored data and the third summation is for interval or left censored data.
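
The three summations can be written out directly in code. The sketch below assumes a two-parameter Weibull distribution (the likelihood above is general, so this choice is an assumption), represents each interval as a (lower, upper) pair, and treats a left censored observation as an interval whose lower end is zero. The function names and the sample data are hypothetical.

```python
import math

def weibull_cdf(t, beta, eta):
    """F(t) = 1 - exp(-(t/eta)^beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_log_pdf(t, beta, eta):
    """ln f(t) for the two-parameter Weibull distribution."""
    return (math.log(beta / eta) + (beta - 1.0) * math.log(t / eta)
            - (t / eta) ** beta)

def log_likelihood(beta, eta, failures=(), suspensions=(), intervals=()):
    # First summation: exact failure times (complete data)
    ll = sum(weibull_log_pdf(t, beta, eta) for t in failures)
    # Second summation: right censored data, ln[1 - F(S)] = -(S/eta)^beta
    ll += sum(-((s / eta) ** beta) for s in suspensions)
    # Third summation: interval data, ln[F(upper) - F(lower)];
    # a left censored point is passed as (0.0, upper)
    ll += sum(math.log(weibull_cdf(hi, beta, eta) - weibull_cdf(lo, beta, eta))
              for lo, hi in intervals)
    return ll

# Hypothetical data set: three failures, two suspensions, one interval
example_ll = log_likelihood(1.5, 1000.0,
                            failures=[420.0, 710.0, 1150.0],
                            suspensions=[900.0, 1300.0],
                            intervals=[(200.0, 400.0)])
print(example_ll)
```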

Then the Fisher information matrix is given by:


 * $${{F}_{0}}=\left[ \begin{matrix}

{{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} \right]}_{0}} & {} & {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \right]}_{0}} \\ {} & {} & {} \\   {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} \right]}_{0}} & {} & {{E}_{0}}{{\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}} \right]}_{0}}  \\ \end{matrix} \right]$$

The subscript 0 indicates that the quantity is evaluated at $${{\theta }_{1}}={{\theta }_{{{1}_{0}}}}$$ and $${{\theta }_{2}}={{\theta }_{{{2}_{0}}}},$$ the true values of the parameters. So for a sample of N units where R units have failed, M have been suspended, and P have failed within a time interval, and N = R + M + P, one could obtain the sample local information matrix by:


 * $$F=\left[ \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ {} & {} & {} \\   -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}}  \\ \end{matrix} \right]$$

Substituting the values of the estimated parameters, in this case $${{\widehat{\theta }}_{1}}$$ and $${{\widehat{\theta }}_{2}}$$, and then inverting the matrix, one can then obtain the local estimate of the covariance matrix or:


 * $$\left[ \begin{matrix}

\widehat{Var}\left( {{\widehat{\theta }}_{1}} \right) & {} & \widehat{Cov}\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \\ {} & {} & {} \\   \widehat{Cov}\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) & {} & \widehat{Var}\left( {{\widehat{\theta }}_{2}} \right)  \\ \end{matrix} \right]={{\left[ \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ {} & {} & {} \\   -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & {} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}}  \\ \end{matrix} \right]}^{-1}}$$
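
The substitute-and-invert step can be sketched numerically. To keep the example short and checkable, the fragment below assumes a complete (uncensored) sample from a normal distribution, whose maximum likelihood estimates have a closed form; the same two helpers would apply unchanged to a censored log-likelihood once its estimates have been found. The second derivatives are approximated by finite differences, and the helper names, step size and data are hypothetical.

```python
import math

def normal_log_likelihood(mu, sigma, data):
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(sigma) - ss / (2.0 * sigma ** 2)
            - 0.5 * n * math.log(2.0 * math.pi))

def local_information(f, a, b, h=1e-3):
    """Negated finite-difference Hessian of f at (a, b)."""
    f0 = f(a, b)
    d_aa = (f(a + h, b) - 2.0 * f0 + f(a - h, b)) / h ** 2
    d_bb = (f(a, b + h) - 2.0 * f0 + f(a, b - h)) / h ** 2
    d_ab = (f(a + h, b + h) - f(a + h, b - h)
            - f(a - h, b + h) + f(a - h, b - h)) / (4.0 * h ** 2)
    return [[-d_aa, -d_ab], [-d_ab, -d_bb]]

def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Hypothetical complete sample
data = [12.1, 9.8, 11.4, 10.3, 13.0, 8.9, 10.7, 11.8]
n = len(data)
mu_hat = sum(data) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

# Local information matrix at the estimates, then its inverse
info = local_information(lambda m, s: normal_log_likelihood(m, s, data),
                         mu_hat, sigma_hat)
cov = invert_2x2(info)  # [[Var(mu), Cov], [Cov, Var(sigma)]]
print(cov)
```

For this particular model the result can be compared with the known closed forms, Var(μ̂) = σ̂² / n and Var(σ̂) = σ̂² / (2n), with zero covariance, which gives a convenient check on the numerical Hessian.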

Then the variance of a function, Var(G), can be estimated using the variance equation given earlier for the one- or two-parameter case. Values for the variance and covariance of the parameters are obtained from the inverted local Fisher matrix. Once they have been obtained, the approximate confidence bounds on the function are given as:


 * $$C{{B}_{R}}=E(G)\pm {{z}_{\alpha }}\sqrt{Var(G)}$$

which is the estimated value plus or minus a certain number of standard deviations. We address finding zα next.

Approximate Confidence Intervals on the Parameters
In general, maximum likelihood estimates of the parameters are asymptotically normal, meaning that for large sample sizes, a distribution of parameter estimates from the same population would be very close to the normal distribution. Thus if $$\widehat{\theta }$$ is the maximum likelihood estimator for θ, in the case of a single parameter distribution estimated from a large sample of n units, then:


 * $$z\equiv \frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}$$

approximately follows a standard normal distribution. That is:


 * $$P\left( x\le z \right)\to \Phi \left( z \right)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{z}{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt$$

for large n. We now place confidence bounds on θ, at some confidence level δ, bounded by the two end points C1 and C2 where:


 * $$P\left( {{C}_{1}}<\theta <{{C}_{2}} \right)=\delta $$

From the above equation:


 * $$P\left( -{{K}_{\tfrac{1-\delta }{2}}}<\frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}<{{K}_{\tfrac{1-\delta }{2}}} \right)\simeq \delta $$

where Kα is defined by:


 * $$\alpha =\frac{1}{\sqrt{2\pi }}\int_{{{K}_{\alpha }}}^{\infty }{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi \left( {{K}_{\alpha }} \right)$$
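
Since α = 1 − Φ(Kα), the value Kα is the standard normal quantile at probability 1 − α. A minimal Python sketch, using the standard library's `statistics.NormalDist` (available from Python 3.8), is shown below; the function name `k_alpha` and the 90% confidence level are illustrative assumptions.

```python
from statistics import NormalDist

def k_alpha(alpha):
    """Solve alpha = 1 - Phi(K) for K using the standard normal inverse CDF."""
    return NormalDist().inv_cdf(1.0 - alpha)

delta = 0.90  # hypothetical confidence level
k_two_sided = k_alpha((1.0 - delta) / 2.0)  # about 1.645
k_one_sided = k_alpha(1.0 - delta)          # about 1.282
print(k_two_sided, k_one_sided)
```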

Now by simplifying the equation for the confidence level, one can obtain the approximate two-sided confidence bounds on the parameter θ, at a confidence level δ, or:


 * $$\left( \widehat{\theta }-{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)}<\theta <\widehat{\theta }+{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)} \right)$$

The upper one-sided bounds are given by:


 * $$\theta <\widehat{\theta }+{{K}_{1-\delta }}\sqrt{Var(\widehat{\theta })}$$

while the lower one-sided bounds are given by:


 * $$\theta >\widehat{\theta }-{{K}_{1-\delta }}\sqrt{Var(\widehat{\theta })}$$
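
The two-sided and one-sided bounds above differ only in the subscript of K. The following sketch returns both end points for either case; for a one-sided bound, only the relevant end point is used. The function name and the example estimate and variance are hypothetical.

```python
import math
from statistics import NormalDist

def normal_bounds(theta_hat, var_theta, delta, two_sided=True):
    """Normal-approximation bounds on theta at confidence level delta."""
    # K subscript: (1 - delta)/2 for two-sided, 1 - delta for one-sided
    alpha = (1.0 - delta) / 2.0 if two_sided else 1.0 - delta
    k = NormalDist().inv_cdf(1.0 - alpha)
    half_width = k * math.sqrt(var_theta)
    return theta_hat - half_width, theta_hat + half_width

# Hypothetical estimate and variance
lower_2s, upper_2s = normal_bounds(2.0, 0.25, 0.90)
lower_1s, upper_1s = normal_bounds(2.0, 0.25, 0.90, two_sided=False)
print(lower_2s, upper_2s, lower_1s, upper_1s)
```

At the same confidence level, the one-sided end points lie closer to the estimate than the two-sided ones, because K at 1 − δ is smaller than K at (1 − δ) / 2.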

If $$\widehat{\theta }$$ must be positive, then $$\ln \widehat{\theta }$$ is treated as normally distributed. The two-sided approximate confidence bounds on the parameter θ, at confidence level δ , then become:


 * $$\begin{align} & {{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}\text{ (Two-sided upper)} \\ & {{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}}\text{ (Two-sided lower)} \end{align}$$

The one-sided approximate confidence bounds on the parameter θ, at confidence level δ, can be found from:


 * $$\begin{align} & {{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}\text{ (One-sided upper)} \\ & {{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}}\text{ (One-sided lower)} \end{align}$$
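
A short sketch of the bounds for a positive parameter follows. Treating ln θ̂ as normally distributed means the estimate is multiplied and divided by the same factor, so the lower bound stays positive even when the estimate minus K standard deviations would be negative. The function name and the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

def log_normal_bounds(theta_hat, var_theta, delta, two_sided=True):
    """Bounds on a positive parameter, treating ln(theta_hat) as normal."""
    alpha = (1.0 - delta) / 2.0 if two_sided else 1.0 - delta
    k = NormalDist().inv_cdf(1.0 - alpha)
    # Multiplicative factor exp(K * sqrt(Var) / theta_hat)
    factor = math.exp(k * math.sqrt(var_theta) / theta_hat)
    return theta_hat / factor, theta_hat * factor

# Hypothetical estimate with a large variance relative to its size
theta_l, theta_u = log_normal_bounds(0.5, 0.25, 0.90)
print(theta_l, theta_u)
```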

The same procedure can be extended to distributions with two or more parameters. Lloyd and Lipow [24] further elaborate on this procedure.

Confidence Bounds on Time (Type 1)
Type 1 confidence bounds are confidence bounds around time for a given reliability. For example, when using the one-parameter exponential distribution, the corresponding time for a given exponential percentile (i.e., y-ordinate or unreliability, Q = 1 − R) is determined by solving the unreliability function for the time, T, or:


 * $$\widehat{T}(Q)=-\frac{1}{\widehat{\lambda }}\ln (1-Q)=-\frac{1}{\widehat{\lambda }}\ln (R)$$

Bounds on time (Type 1) return the confidence bounds around this time value by determining the confidence intervals around $$\widehat{\lambda }$$ and substituting these values into the above equation. The bounds on $$\widehat{\lambda }$$ are determined using the method for the bounds on parameters, with its variance obtained from the Fisher Matrix. Note that the procedure is slightly more complicated for distributions with more than one parameter.
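
The Type 1 procedure for the one-parameter exponential distribution can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the data contain only exact failures and right censored suspensions, so that the estimate is the number of failures r divided by the total time on test; the variance is taken as λ̂² / r, which follows from the second derivative of the exponential log-likelihood; and the bounds on λ̂ use the positive-parameter (logarithmic) form. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist

def exponential_time_bounds(failures, suspensions, reliability, delta):
    """Two-sided Type 1 bounds on the time at which R(T) equals `reliability`."""
    r = len(failures)
    lam_hat = r / (sum(failures) + sum(suspensions))
    var_lam = lam_hat ** 2 / r  # inverse of -d2(Lambda)/d(lambda)2 = r / lambda^2
    k = NormalDist().inv_cdf(1.0 - (1.0 - delta) / 2.0)
    factor = math.exp(k * math.sqrt(var_lam) / lam_hat)
    lam_lower, lam_upper = lam_hat / factor, lam_hat * factor
    c = -math.log(reliability)
    # Time decreases as lambda increases, so the bounds swap roles
    return c / lam_upper, c / lam_hat, c / lam_lower

# Hypothetical data: three failures and one suspension
t_lower, t_hat, t_upper = exponential_time_bounds(
    [100.0, 200.0, 300.0], [400.0], reliability=0.90, delta=0.90)
print(t_lower, t_hat, t_upper)
```

Because the time is inversely proportional to λ̂, the upper bound on the failure rate produces the lower bound on time, and vice versa.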

Confidence Bounds on Reliability (Type 2)
Type 2 confidence bounds are confidence bounds around reliability. For example, when using the one-parameter exponential distribution, the reliability function is:


 * $$\widehat{R}(T)={{e}^{-\widehat{\lambda }\cdot T}}$$

Reliability bounds (Type 2) return the confidence bounds by determining the confidence intervals around $$\widehat{\lambda }$$ and substituting these values into the above equation. The bounds on $$\widehat{\lambda }$$ are determined using the method for the bounds on parameters, with its variance obtained from the Fisher Matrix. Once again, the procedure is more complicated for distributions with more than one parameter.
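
A corresponding sketch for Type 2 bounds with the exponential reliability function above is given below. It takes the estimate of λ and its variance as inputs and, as an assumption, uses the positive-parameter (logarithmic) form of the bounds on λ̂. The function name and numeric values are hypothetical.

```python
import math
from statistics import NormalDist

def exponential_reliability_bounds(lam_hat, var_lam, t, delta):
    """Two-sided Type 2 bounds on R(t) = exp(-lambda * t)."""
    k = NormalDist().inv_cdf(1.0 - (1.0 - delta) / 2.0)
    factor = math.exp(k * math.sqrt(var_lam) / lam_hat)
    lam_lower, lam_upper = lam_hat / factor, lam_hat * factor
    # Reliability decreases as lambda increases, so the bounds swap roles
    return (math.exp(-lam_upper * t),
            math.exp(-lam_hat * t),
            math.exp(-lam_lower * t))

# Hypothetical estimate based on three failures, with Var = lambda^2 / r
lam_hat = 0.003
var_lam = lam_hat ** 2 / 3
r_lower, r_hat, r_upper = exponential_reliability_bounds(lam_hat, var_lam, 100.0, 0.90)
print(r_lower, r_hat, r_upper)
```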