Appendix A: Brief Statistical Background

In this appendix we attempt to provide a brief elementary introduction to the most common and fundamental statistical equations and definitions used in reliability engineering and life data analysis. The equations and concepts presented in this appendix are used extensively throughout this reference.

=Basic Statistical Definitions=

=Confidence Intervals (or Bounds)=

One of the most confusing concepts to an engineer new to the field is the concept of putting a probability on a probability. In life data analysis, this concept is referred to as confidence intervals or confidence bounds. In this section, we will try to briefly present the concept in less than statistical terms, but based on solid common sense.

The Black and White Marbles


To illustrate, imagine a situation in which there are millions of black and white marbles in a rather large swimming pool, and our job is to estimate the percentage of black marbles. One way to do this (other than counting all the marbles!) is to estimate the percentage of black marbles by taking a sample and then counting the number of black marbles in the sample.

Taking a Small Sample of Marbles


First, let's pick out a small sample of marbles and count the black ones. Say you picked out 10 marbles and counted 4 black marbles. Based on this, your estimate would be that 40% of the marbles are black.

If you put the 10 marbles back into the pool and repeated the experiment, you might get 5 black marbles, changing your estimate to 50% black marbles.



Which of the two estimates is correct? Both estimates are correct! As you repeat this experiment over and over again, you might find that this estimate is usually between $${{X}_{1}}\%$$ and $${{X}_{2}}\%$$, or maybe that 90% of the time this estimate is between $${{X}_{1}}\%$$ and $${{X}_{2}}\%$$.

Taking a Larger Sample of Marbles
If we now repeat the experiment and pick out 1,000 marbles, we might get results such as 545, 570, 530, etc. black marbles in each trial. Note that the range of estimates in this case will be much narrower than before. For example, let's say that 90% of the time the percentage of black marbles will be from $${{Y}_{1}}\%$$ to $${{Y}_{2}}\%$$, where $${{X}_{1}}\%<{{Y}_{1}}\%$$ and $${{X}_{2}}\%>{{Y}_{2}}\%$$, thus giving us a narrower interval. In general, the larger the sample size, the narrower the confidence intervals.
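
To see this narrowing effect concretely, the following minimal Python sketch (our illustration, not part of ALTA; the pool's true fraction of 55% black marbles and the number of repetitions are arbitrary assumptions) repeats the sampling experiment many times and reports the range containing the middle 90% of the estimates:

```python
import random

# Hypothetical pool: the true (unknown) fraction of black marbles is 0.55.
TRUE_FRACTION = 0.55
TRIALS = 1000

def estimate_range(sample_size, trials=TRIALS):
    """Repeat the sampling experiment and return the central 90% range of estimates."""
    estimates = sorted(
        sum(random.random() < TRUE_FRACTION for _ in range(sample_size)) / sample_size
        for _ in range(trials)
    )
    # The 5th and 95th percentiles of the estimates bracket the middle 90%.
    return estimates[int(0.05 * trials)], estimates[int(0.95 * trials)]

print("n = 10:  ", estimate_range(10))    # wide range, e.g. roughly 30% to 80%
print("n = 1000:", estimate_range(1000))  # much narrower, e.g. roughly 52% to 58%
```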

Back to Reliability
Returning to the subject at hand, our task is to determine the probability of failure, or the reliability, of all of our units. However, until all of the units fail, we will never know the exact value. Our task is to estimate the reliability based on a sample, much like estimating the percentage of black marbles in the pool. If we perform ten different reliability tests on our units and estimate the parameters using ALTA, we will obtain slightly different parameters for the distribution each time, and thus slightly different reliability results. However, when employing confidence bounds, we obtain a range within which these values are likely to occur $$X$$ percent of the time. Remember that each parameter is only an estimate of the true parameter, which remains unknown to us.

One-Sided and Two-Sided Confidence Bounds
Confidence bounds (or intervals) are generally described as one-sided or two-sided.

Two-Sided Bounds


When we use two-sided confidence bounds (or intervals), we are looking at a closed interval where a certain percentage of the population is likely to lie. For example, when using 90% two-sided confidence bounds, we are saying that 90% of the population lies between $$X$$ and $$Y$$, with 5% less than $$X$$ and 5% greater than $$Y$$.

One-Sided Bounds
When using one-sided intervals, we are looking at the percentage of units that are greater than or less than a certain point, $$X$$ (lower and upper bounds, respectively).



For example, 95% one-sided confidence bounds would indicate that 95% of the population is greater than $$X$$ (if $$X$$ is a 95% lower confidence bound), or that 95% of the population is less than $$X$$ (if $$X$$ is a 95% upper confidence bound).

In ALTA, we use upper to mean the numerically higher limit and lower to mean the numerically lower limit, regardless of which quantity is being bounded. So, for example, when returning the confidence bounds on the reliability, we term the lower value of reliability the lower limit and the higher value of reliability the upper limit. When returning the confidence bounds on the probability of failure, we again term the lower numeric value of the probability of failure the lower limit and the higher value the upper limit.
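
The relationship between one-sided and two-sided levels can be checked numerically. The following sketch, assuming Python with scipy (our illustration, not how ALTA computes its bounds), obtains the standard normal quantile $$K$$ for each case:

```python
from scipy.stats import norm

delta = 0.90  # confidence level

# Two-sided bounds: split 1 - delta evenly between the two tails (5% each).
K_two_sided = norm.ppf(1 - (1 - delta) / 2)  # ~1.645

# One-sided bound: put all of 1 - delta in a single tail (10%).
K_one_sided = norm.ppf(delta)                # ~1.282

# Consequently, a 90% two-sided bound uses the same K as a 95% one-sided bound.
print(K_two_sided, K_one_sided, norm.ppf(0.95))
```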

=Confidence Limits Determination=

This section presents an overview of the theory of obtaining approximate confidence bounds on suspended (multiply censored) data. The methodology used is the so-called Fisher Matrix Bounds, described in Nelson [27] and Lloyd and Lipow [24].

Suggested References
This section presents a brief introduction to how the confidence intervals are calculated by ALTA. By no means do we intend to cover the full theory behind this methodology. More complete details on confidence intervals can be found in the following books:

•	Nelson, Wayne, Applied Life Data Analysis, John Wiley & Sons, New York, 1982.

•	Nelson, Wayne, Accelerated Testing: Statistical Models, Test Plans, and Data Analyses, John Wiley & Sons, New York, 1990.

•	Lloyd, David K. and Lipow, Myron, Reliability: Management, Methods, and Mathematics, Prentice Hall, Englewood Cliffs, New Jersey, 1962.

•	Cramér, Harald, Mathematical Methods of Statistics, Princeton University Press, Princeton, New Jersey, 1946.

Single Parameter Case
For simplicity, consider a one-parameter distribution represented by a general function $$G$$, which is a function of a single parameter estimator, say $$G(\widehat{\theta })$$. Then, in general, the expected value of $$G\left( \widehat{\theta } \right)$$ can be found by:


 * $$E\left( G\left( \widehat{\theta } \right) \right)=G(\theta )+O\left( \frac{1}{n} \right)$$

where $$G(\theta )$$ is some function of $$\theta $$, such as the reliability function, and $$\theta $$ is the population moment, or parameter, such that $$E\left( \widehat{\theta } \right)=\theta $$ as $$n\to \infty $$. The term $$O\left( \tfrac{1}{n} \right)$$ is a function of $$n$$, the sample size, and tends to zero as fast as $$\tfrac{1}{n}$$ as $$n\to \infty $$. For example, if $$\widehat{\theta }=\overline{x}$$ and $$G(x)={{x}^{2}}$$, then $$E(G(\overline{x}))={{\mu }^{2}}+O\left( \tfrac{1}{n} \right)$$, where $$O\left( \tfrac{1}{n} \right)=\tfrac{{{\sigma }^{2}}}{n}$$; thus as $$n\to \infty $$, $$E(G(\overline{x}))={{\mu }^{2}}$$ ($$\mu $$ and $$\sigma $$ are the mean and standard deviation, respectively). Using the same one-parameter distribution, the variance of the function $$G\left( \widehat{\theta } \right)$$ can then be estimated by:


 * $$Var\left( G\left( \widehat{\theta } \right) \right)=\left( \frac{\partial G}{\partial \widehat{\theta }} \right)_{\widehat{\theta }=\theta }^{2}Var\left( \widehat{\theta } \right)+O\left( \frac{1}{{{n}^{3/2}}} \right)$$
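
As a numerical check of these two expansions, the following Python sketch (our illustration, with arbitrary $$\mu $$, $$\sigma $$ and $$n$$) simulates the $$G(\overline{x})={{\overline{x}}^{2}}$$ example above and compares the simulated mean and variance against $${{\mu }^{2}}+{{\sigma }^{2}}/n$$ and $${{\left( 2\mu  \right)}^{2}}{{\sigma }^{2}}/n$$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 5.0, 2.0, 200, 20_000

# Draw many samples of size n and form G(theta_hat) = xbar**2 for each one.
xbar = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
G = xbar ** 2

# E(G(xbar)) = mu^2 + sigma^2/n, i.e., the O(1/n) term here is sigma^2/n.
print(G.mean(), mu**2 + sigma**2 / n)            # both ~25.02

# Delta method: Var(G(xbar)) ~ (dG/dtheta)^2 * Var(xbar) = (2*mu)^2 * sigma^2 / n.
print(G.var(), (2 * mu) ** 2 * sigma**2 / n)     # both ~2.0
```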

Two Parameter Case
Repeating the previous method for the case of a two-parameter distribution, it is generally true for a function $$G$$, which is a function of two parameter estimators, say $$G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right)$$, that:
 * $$E\left( G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \right)=G\left( {{\theta }_{1}},{{\theta }_{2}} \right)+O\left( \frac{1}{n} \right)$$

and:


 * $$\begin{align}
Var\left( G\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \right)= & \left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}^{2}Var\left( {{\widehat{\theta }}_{1}} \right)+\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}^{2}Var\left( {{\widehat{\theta }}_{2}} \right) \\
 & +2{{\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{1}}} \right)}_{{{\widehat{\theta }}_{1}}={{\theta }_{1}}}}{{\left( \frac{\partial G}{\partial {{\widehat{\theta }}_{2}}} \right)}_{{{\widehat{\theta }}_{2}}={{\theta }_{2}}}}Cov\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right)+O\left( \frac{1}{{{n}^{3/2}}} \right)
\end{align}$$

Note that the derivatives of $$G$$ are evaluated at $${{\widehat{\theta }}_{1}}={{\theta }_{1}}$$ and $${{\widehat{\theta }}_{2}}={{\theta }_{2}}$$, where $$E\left( {{\widehat{\theta }}_{1}} \right)\simeq {{\theta }_{1}}$$ and $$E\left( {{\widehat{\theta }}_{2}} \right)\simeq {{\theta }_{2}}$$.
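
In matrix form, this variance is the quadratic form $$Var(G)={{\nabla }^{T}}\Sigma \nabla $$, where $$\nabla $$ is the gradient of $$G$$ with respect to the parameters and $$\Sigma $$ is their covariance matrix. A minimal sketch with purely hypothetical numbers:

```python
import numpy as np

# Hypothetical values for illustration only: the gradient of G with respect
# to (theta1, theta2) at the estimates, and the covariance matrix of the
# estimators (obtained below from the Fisher information matrix).
grad = np.array([0.8, -0.3])        # (dG/dtheta1, dG/dtheta2)
cov = np.array([[0.04, 0.01],       # [[Var(th1),      Cov(th1, th2)],
                [0.01, 0.09]])      #  [Cov(th1, th2), Var(th2)     ]]

# Var(G) = grad' * Cov * grad; expanding it reproduces the three terms above.
var_G = grad @ cov @ grad
print(var_G)  # 0.8^2*0.04 + (-0.3)^2*0.09 + 2*0.8*(-0.3)*0.01 = 0.0289
```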

Variance and Covariance Determination of the Parameters
The determination of the variance and covariance of the parameters is accomplished via the use of the Fisher information matrix. For a two-parameter distribution, and using maximum likelihood estimation, the log-likelihood function for censored data (without the constant coefficient) is given by:


 * $$\begin{align}
\ln [L]=\Lambda = & \underset{i=1}{\overset{R}{\mathop \sum }}\,\ln [f({{T}_{i}};{{\theta }_{1}},{{\theta }_{2}})] \\
 & +\underset{j=1}{\overset{M}{\mathop \sum }}\,\ln [1-F({{S}_{j}};{{\theta }_{1}},{{\theta }_{2}})] \\
 & +\underset{k=1}{\overset{P}{\mathop \sum }}\,\ln \{F({{I}_{k}};{{\theta }_{1}},{{\theta }_{2}})-F({{I}_{k-1}};{{\theta }_{1}},{{\theta }_{2}})\}
\end{align}$$
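
To illustrate how the three terms of this log-likelihood are assembled, the following sketch implements it for a Weibull distribution with $$\theta =(\beta ,\eta )$$; the choice of distribution and all data values are assumptions made for illustration, not ALTA's internals:

```python
import numpy as np

def log_likelihood(theta, failures, suspensions, intervals):
    """Censored log-likelihood above, sketched for a Weibull distribution
    with theta = (beta, eta). The distribution and data are assumptions
    made for illustration only."""
    beta, eta = theta

    def F(t):       # Weibull cumulative distribution function
        return 1.0 - np.exp(-(np.asarray(t) / eta) ** beta)

    def log_f(t):   # log of the Weibull density
        t = np.asarray(t)
        return np.log(beta / eta) + (beta - 1) * np.log(t / eta) - (t / eta) ** beta

    ll = np.sum(log_f(failures))                  # R exact failure times T_i
    ll += np.sum(np.log(1.0 - F(suspensions)))    # M suspension times S_j
    for lo, hi in intervals:                      # P interval observations
        ll += np.log(F(hi) - F(lo))
    return ll

print(log_likelihood((1.5, 100.0), failures=[20.0, 40.0, 90.0],
                     suspensions=[100.0, 100.0], intervals=[(50.0, 60.0)]))
```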

Then the Fisher information matrix is given by:


 * $${{F}_{0}}=\left[ \begin{matrix} {{E}_{0}}\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} \right] & {{E}_{0}}\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \right] \\ {{E}_{0}}\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} \right] & {{E}_{0}}\left[ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}} \right] \end{matrix} \right]$$

where the subscript $$0$$ indicates that the quantities are evaluated at the true values of the parameters, $${{\theta }_{1}}={{\theta }_{1,0}}$$ and $${{\theta }_{2}}={{\theta }_{2,0}}$$.

So, for a sample of $$N$$ units where $$R$$ units have failed, $$M$$ have been suspended, and $$P$$ have failed within a time interval, with $$N=R+M+P$$, one could obtain the sample local information matrix by:


 * $$F=\left[ \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}} \end{matrix} \right]$$

By substituting in the values of the estimated parameters, in this case $${{\widehat{\theta }}_{1}}$$ and $${{\widehat{\theta }}_{2}}$$, and inverting the matrix, one can then obtain the local estimate of the covariance matrix, or:


 * $${{F}^{-1}}={{\left[ \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{1}^{2}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{1}}\partial {{\theta }_{2}}} \\ -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\theta }_{2}}\partial {{\theta }_{1}}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \theta _{2}^{2}} \end{matrix} \right]}^{-1}}=\left[ \begin{matrix} Var\left( {{\widehat{\theta }}_{1}} \right) & Cov\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) \\ Cov\left( {{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}} \right) & Var\left( {{\widehat{\theta }}_{2}} \right) \end{matrix} \right]$$

Then the variance of a function ($$Var(G)$$) can be estimated using the Fisher information matrix. The values for the variances and covariance of the parameters are obtained from the inverted matrix. Once these are obtained, the approximate confidence bounds on the function are given as:


 * $$C{{B}_{R}}=E(G)\pm {{z}_{\alpha }}\sqrt{Var(G)}$$
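
Putting the pieces together, the following sketch (again a Weibull illustration with made-up data; a numerical Hessian and a numerical gradient stand in for the analytic partial derivatives) computes the local Fisher matrix, inverts it to get the covariance matrix, and applies the bound above to the reliability at a chosen time:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up complete failure data; beta and eta are found by minimizing the
# negative Weibull log-likelihood (a stand-in for ALTA's own estimation).
failures = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])

def neg_ll(theta):
    beta, eta = theta
    if beta <= 0 or eta <= 0:
        return np.inf
    z = failures / eta
    return -np.sum(np.log(beta / eta) + (beta - 1) * np.log(z) - z ** beta)

theta_hat = minimize(neg_ll, x0=[1.0, 70.0], method="Nelder-Mead").x

def hessian(f, x):
    """Central finite-difference Hessian; stands in for the analytic
    second partial derivatives of the log-likelihood."""
    h = 1e-4 * np.maximum(1.0, np.abs(x))
    H = np.zeros((len(x), len(x)))
    for i in range(len(x)):
        for j in range(len(x)):
            ei = np.zeros(len(x)); ei[i] = h[i]
            ej = np.zeros(len(x)); ej[j] = h[j]
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h[i] * h[j])
    return H

cov = np.linalg.inv(hessian(neg_ll, theta_hat))  # local Var/Cov estimate

# Bound G = R(t) = exp(-(t/eta)^beta) at t = 50 via CB = E(G) +/- z*sqrt(Var(G)).
t, z90 = 50.0, 1.645
def G(th): return np.exp(-(t / th[1]) ** th[0])

eps = 1e-6
grad = np.array([(G(theta_hat + [eps, 0]) - G(theta_hat)) / eps,
                 (G(theta_hat + [0, eps]) - G(theta_hat)) / eps])
var_G = grad @ cov @ grad
print(G(theta_hat) - z90 * np.sqrt(var_G), G(theta_hat) + z90 * np.sqrt(var_G))
```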

Approximate Confidence Intervals on the Parameters
In general, maximum likelihood estimates of the parameters are asymptotically normal. Thus, in the case of a single-parameter distribution, if $$\widehat{\theta }$$ is the MLE estimator for $$\theta $$, estimated from a sample of $$n$$ units, and if:


 * $$z\equiv \frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}$$

then:


 * $$P\left( z\le x \right)\to \Phi \left( x \right)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{x}{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt$$

for large $$n$$. If one now wishes to place confidence bounds on $$\theta ,$$  at some confidence level  $$\delta $$, bounded by the two end points  $${{C}_{1}}$$  and  $${{C}_{2}}$$ , and where:


 * $$P\left( {{C}_{1}}<\theta <{{C}_{2}} \right)=\delta $$

then:


 * $$P\left( -{{K}_{\tfrac{1-\delta }{2}}}<\frac{\widehat{\theta }-\theta }{\sqrt{Var\left( \widehat{\theta } \right)}}<{{K}_{\tfrac{1-\delta }{2}}} \right)\simeq \delta $$

where $${{K}_{\alpha }}$$  is defined by:


 * $$\alpha =\frac{1}{\sqrt{2\pi }}\int_{{{K}_{\alpha }}}^{\infty }{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi \left( {{K}_{\alpha }} \right)$$

Now, by rearranging the above equation, one can obtain the approximate two-sided confidence bounds on the parameter $$\theta $$, at a confidence level $$\delta $$, or:


 * $$\left( \widehat{\theta }-{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)}<\theta <\widehat{\theta }+{{K}_{\tfrac{1-\delta }{2}}}\cdot \sqrt{Var\left( \widehat{\theta } \right)} \right)$$

If $$\widehat{\theta }$$  must be positive, then  $$\ln \widehat{\theta }$$  is treated as normally distributed. The two-sided approximate confidence bounds on the parameter $$\theta $$, at confidence level  $$\delta $$ , then become:


 * $$\begin{align}
{{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}} & \text{(Two-sided Upper)} \\
{{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{\tfrac{1-\delta }{2}}}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}} & \text{(Two-sided Lower)}
\end{align}$$

The one-sided approximate confidence bounds on the parameter $$\theta $$, at confidence level  $$\delta $$  can be found from:


 * $$\begin{align}
{{\theta }_{U}}= & \widehat{\theta }\cdot {{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}} & \text{(One-sided Upper)} \\
{{\theta }_{L}}= & \frac{\widehat{\theta }}{{{e}^{\tfrac{{{K}_{1-\delta }}\sqrt{Var\left( \widehat{\theta } \right)}}{\widehat{\theta }}}}} & \text{(One-sided Lower)}
\end{align}$$
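
As a small numeric illustration of both sets of bounds, assume a hypothetical $$\widehat{\theta }$$ and $$Var(\widehat{\theta })$$ at $$\delta =0.95$$:

```python
import math

theta_hat = 1.2        # hypothetical MLE of a positive parameter
var_theta = 0.04       # hypothetical Var(theta_hat) from the Fisher matrix
K_two = 1.960          # K at (1 - delta)/2 = 0.025
K_one = 1.645          # K at 1 - delta = 0.05

w = math.exp(K_two * math.sqrt(var_theta) / theta_hat)
print("two-sided:", theta_hat / w, theta_hat * w)    # ~ (0.866, 1.664)

w1 = math.exp(K_one * math.sqrt(var_theta) / theta_hat)
print("one-sided:", theta_hat / w1, theta_hat * w1)  # lower and upper, each used alone
```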

The same procedure can be repeated for the case of a two or more parameter distribution. Lloyd and Lipow [24] elaborate on this procedure.

Percentile Confidence Bounds (Type 1 in ALTA)
Percentile confidence bounds are confidence bounds around time. For example, when using the 1-parameter exponential distribution, the corresponding time for a given exponential percentile (i.e., y-ordinate or unreliability, $$Q=1-R$$) is determined by solving the unreliability function for the time, $$T$$, or:


 * $$T(Q)=-\frac{1}{\widehat{\lambda }}\ln (1-Q)=-\frac{1}{\widehat{\lambda }}\ln (R)$$

Percentile bounds (Type 1) return the confidence bounds by determining the confidence intervals around $$\widehat{\lambda }$$  and substituting into the above equation.
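
For instance, with a hypothetical $$\widehat{\lambda }$$ and hypothetical confidence bounds on it, the time bounds follow directly; note that the upper bound on $$\lambda $$ produces the lower bound on time:

```python
import math

lam_hat = 0.002                     # hypothetical MLE of the failure rate
lam_low, lam_up = 0.0015, 0.0027    # hypothetical confidence bounds on lambda
Q = 0.10                            # unreliability of interest (10th percentile)

T_hat = -math.log(1 - Q) / lam_hat
# A larger failure rate reaches the percentile sooner, so the bounds swap:
T_low = -math.log(1 - Q) / lam_up
T_up = -math.log(1 - Q) / lam_low
print(T_low, T_hat, T_up)   # ~39.0, ~52.7, ~70.2
```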

Reliability Confidence Bounds (Type 2 in ALTA)
Type 2 bounds in ALTA are confidence bounds around reliability. For example, when using the 1-parameter exponential distribution, the reliability function is:


 * $$R(T)={{e}^{-\widehat{\lambda }\cdot T}}$$

Reliability bounds (Type 2) return the confidence bounds by determining the confidence intervals around $$\widehat{\lambda }$$  and substituting into the above equation.
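
Analogously, with the same hypothetical bounds on $$\widehat{\lambda }$$, the reliability bounds at a mission time $$T$$ are:

```python
import math

lam_hat = 0.002                     # hypothetical MLE of the failure rate
lam_low, lam_up = 0.0015, 0.0027    # hypothetical confidence bounds on lambda
T = 500.0                           # mission time

R_hat = math.exp(-lam_hat * T)
# A larger failure rate gives lower reliability, so the bounds swap again:
R_low = math.exp(-lam_up * T)
R_up = math.exp(-lam_low * T)
print(R_low, R_hat, R_up)   # ~0.259, ~0.368, ~0.472
```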