The Normal Distribution
The normal distribution, also known as the Gaussian distribution, is the most widely-used general purpose distribution. It is for this reason that it is included among the lifetime distributions commonly used for reliability and life data analysis. There are some who argue that the normal distribution is inappropriate for modeling lifetime data because the left-hand limit of the distribution extends to negative infinity. This could conceivably result in modeling negative times-to-failure. However, provided that the distribution in question has a relatively high mean and a relatively small standard deviation, the issue of negative failure times should not present itself as a problem. Nevertheless, the normal distribution has been shown to be useful for modeling the lifetimes of consumable items, such as printer toner cartridges.
Normal Probability Density Function
The [math]\displaystyle{ pdf }[/math] of the normal distribution is given by:
- [math]\displaystyle{ f(T)=\frac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{T-\mu }{{{\sigma }_{T}}} \right)}^{2}}}} }[/math]
where:
[math]\displaystyle{ \mu =\text{mean of the normal times-to-failure, also denoted as }\bar{T} }[/math]
[math]\displaystyle{ {{\sigma }_{T}}=\text{standard deviation of the times-to-failure} }[/math]
It is a two-parameter distribution with parameters [math]\displaystyle{ \mu }[/math] (or [math]\displaystyle{ \bar{T} }[/math] ) and [math]\displaystyle{ {{\sigma }_{T}} }[/math] , i.e. the mean and the standard deviation, respectively.
Normal Statistical Properties
The Normal Mean, Median and Mode
The normal mean or MTTF is actually one of the parameters of the distribution, usually denoted as [math]\displaystyle{ \mu . }[/math] Since the normal distribution is symmetrical, the median and the mode are always equal to the mean,
- [math]\displaystyle{ \mu =\tilde{T}=\breve{T} }[/math]
The Normal Standard Deviation
As with the mean, the standard deviation for the normal distribution is actually one of the parameters, usually denoted as [math]\displaystyle{ {{\sigma }_{T}}. }[/math]
The Normal Reliability Function
The reliability for a mission of time [math]\displaystyle{ T }[/math] for the normal distribution is determined by:
- [math]\displaystyle{ R(T)=\int_{T}^{\infty }f(t)dt=\int_{T}^{\infty }\frac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}dt }[/math]
There is no closed-form solution for the normal reliability function. Solutions can be obtained via the use of standard normal tables. Since the application automatically solves for the reliability, we will not discuss manual solution methods. For interested readers, full explanations can be found in the references.
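Although there is no closed-form expression, the reliability integral is simply the standard normal survival function evaluated at the standardized mission time, so it is easy to compute numerically. As a minimal sketch (assuming SciPy is available; the parameter and mission-time values are placeholders):

<syntaxhighlight lang="python">
from scipy.stats import norm

mu, sigma = 100.0, 12.0   # placeholder mean and standard deviation (hours)
T = 110.0                 # placeholder mission time (hours)

# R(T) = P(failure time > T) = 1 - Phi((T - mu)/sigma), i.e. the normal survival function
R = norm.sf(T, loc=mu, scale=sigma)
print(R)                  # approximately 0.20 for these placeholder values
</syntaxhighlight>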
The Normal Conditional Reliability Function
The normal conditional reliability function is given by:
- [math]\displaystyle{ R(t|T)=\frac{R(T+t)}{R(T)}=\frac{\int_{T+t}^{\infty }\tfrac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}dt}{\int_{T}^{\infty }\tfrac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}dt} }[/math]
Once again, the use of standard normal tables for the calculation of the normal conditional reliability is necessary, as there is no closed form solution.
The Normal Reliable Life
Since there is no closed-form solution for the normal reliability function, there will also be no closed-form solution for the normal reliable life. To determine the normal reliable life, one must solve:
- [math]\displaystyle{ R(T)=\int_{T}^{\infty }\frac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}dt }[/math]
for [math]\displaystyle{ T }[/math] .
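Because the reliability depends on time only through the standardized value, this inversion reduces to [math]\displaystyle{ T=\mu +{{\sigma }_{T}}\cdot {{\Phi }^{-1}}(1-R) }[/math] (the same relation used later in this chapter), which can be evaluated directly, for example with SciPy's inverse survival function. A minimal sketch with placeholder parameter values:

<syntaxhighlight lang="python">
from scipy.stats import norm

mu, sigma = 100.0, 12.0   # placeholder parameters (hours)
R = 0.90                  # required reliability

# Solve R(T) = R for T; for the normal distribution T = mu + sigma * Phi^-1(1 - R)
T_R = norm.isf(R, loc=mu, scale=sigma)
print(T_R)                # approximately 84.6 hours for these placeholder values
</syntaxhighlight>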
The Normal Failure Rate Function
The instantaneous normal failure rate is given by:
- [math]\displaystyle{ \lambda (T)=\frac{f(T)}{R(T)}=\frac{\tfrac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{T-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}}{\int_{T}^{\infty }\tfrac{1}{{{\sigma }_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\mu }{{{\sigma }_{T}}} \right)}^{2}}}}dt} }[/math]
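The failure rate is likewise straightforward to evaluate numerically as the ratio of the pdf to the reliability function. A short sketch (again assuming SciPy, with placeholder values):

<syntaxhighlight lang="python">
from scipy.stats import norm

mu, sigma = 100.0, 12.0   # placeholder parameters (hours)
T = 110.0                 # placeholder mission time (hours)

# lambda(T) = f(T) / R(T)
failure_rate = norm.pdf(T, loc=mu, scale=sigma) / norm.sf(T, loc=mu, scale=sigma)
print(failure_rate)       # instantaneous failure rate at time T (per hour)
</syntaxhighlight>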
Characteristics of the Normal Distribution
Some of the specific characteristics of the normal distribution are the following:
• The normal [math]\displaystyle{ pdf }[/math] has a mean, [math]\displaystyle{ \bar{T} }[/math] , which is equal to the median, [math]\displaystyle{ \breve{T} }[/math] , and also equal to the mode, [math]\displaystyle{ \tilde{T} }[/math] , or [math]\displaystyle{ \bar{T}=\breve{T}=\tilde{T} }[/math] . This is because the normal distribution is symmetrical about its mean.
• The mean, [math]\displaystyle{ \mu }[/math] , or the mean life or the [math]\displaystyle{ MTTF }[/math] , is also the location parameter of the normal [math]\displaystyle{ pdf }[/math] , as it locates the [math]\displaystyle{ pdf }[/math] along the abscissa. It can assume values of [math]\displaystyle{ -\infty \lt \bar{T}\lt \infty }[/math] .
• The normal [math]\displaystyle{ pdf }[/math] has no shape parameter. This means that the normal [math]\displaystyle{ pdf }[/math] has only one shape, the bell shape, and this shape does not change.
• The standard deviation, [math]\displaystyle{ {{\sigma }_{T}} }[/math] , is the scale parameter of the normal [math]\displaystyle{ pdf }[/math] .
- - As [math]\displaystyle{ {{\sigma }_{T}} }[/math] decreases, the [math]\displaystyle{ pdf }[/math] gets pushed toward the mean, or it becomes narrower and taller.
- - As [math]\displaystyle{ {{\sigma }_{T}} }[/math] increases, the [math]\displaystyle{ pdf }[/math] spreads out away from the mean, or it becomes broader and shallower.
- - The standard deviation can assume values of [math]\displaystyle{ 0\lt {{\sigma }_{T}}\lt \infty }[/math] .
- - The greater the variability, the larger the value of [math]\displaystyle{ {{\sigma }_{T}} }[/math] , and vice versa.
- - The standard deviation is also the distance between the mean and the point of inflection of the [math]\displaystyle{ pdf }[/math] , on each side of the mean. The point of inflection is that point of the [math]\displaystyle{ pdf }[/math] where the slope changes its value from a decreasing to an increasing one, or where the second derivative of the [math]\displaystyle{ pdf }[/math] has a value of zero.
• The normal [math]\displaystyle{ pdf }[/math] starts at [math]\displaystyle{ T=-\infty }[/math] with an [math]\displaystyle{ f(T)=0 }[/math] . As [math]\displaystyle{ T }[/math] increases, [math]\displaystyle{ f(T) }[/math] also increases, goes through its point of inflection and reaches its maximum value at [math]\displaystyle{ T=\bar{T} }[/math] . Thereafter, [math]\displaystyle{ f(T) }[/math] decreases, goes through its point of inflection, and assumes a value of [math]\displaystyle{ f(T)=0 }[/math] at [math]\displaystyle{ T=+\infty }[/math] .
Weibull++ Notes on Negative Time Values
One of the disadvantages of using the normal distribution for reliability calculations is the fact that the normal distribution starts at negative infinity. This can result in negative values for some of the results. Negative values for time are not accepted in most of the components of Weibull++, nor are they implemented. Certain components of the application reserve negative values for suspensions, or will not return negative results. For example, the Quick Calculation Pad will return a null value (zero) if the result is negative. Only the Free-Form (Probit) data sheet can accept negative values for the random variable (x-axis values).
Estimation of the Parameters
Probability Plotting
As described before, probability plotting involves plotting the failure times and associated unreliability estimates on specially constructed probability plotting paper. The form of this paper is based on a linearization of the [math]\displaystyle{ cdf }[/math] of the specific distribution. For the normal distribution, the cumulative distribution function can be written as:
- [math]\displaystyle{ F(T)=\Phi \left( \frac{T-\mu }{{{\sigma }_{T}}} \right) }[/math]
or:
- [math]\displaystyle{ {{\Phi }^{-1}}\left[ F(T) \right]=-\frac{\mu}{\sigma}+\frac{1}{\sigma}T }[/math]
where:
- [math]\displaystyle{ \Phi (x)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{x}{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt }[/math]
Now, let:
- [math]\displaystyle{ y={{\Phi }^{-1}}\left[ F(T) \right] }[/math]
- [math]\displaystyle{ a=-\frac{\mu }{\sigma } }[/math]
and:
- [math]\displaystyle{ b=\frac{1}{\sigma } }[/math]
which results in the linear equation of:
- [math]\displaystyle{ y=a+bT }[/math]
The normal probability paper resulting from this linearized [math]\displaystyle{ cdf }[/math] function is shown next.
Since the normal distribution is symmetrical, the area under the [math]\displaystyle{ pdf }[/math] curve from [math]\displaystyle{ -\infty }[/math] to [math]\displaystyle{ \mu }[/math] is [math]\displaystyle{ 0.5 }[/math] , as is the area from [math]\displaystyle{ \mu }[/math] to [math]\displaystyle{ +\infty }[/math] . Consequently, the value of [math]\displaystyle{ \mu }[/math] is said to be the point where [math]\displaystyle{ R(t)=Q(t)=50% }[/math] . This means that the estimate of [math]\displaystyle{ \mu }[/math] can be read from the point where the plotted line crosses the 50% unreliability line.
To determine the value of [math]\displaystyle{ \sigma }[/math] from the probability plot, it is first necessary to understand that the area under the [math]\displaystyle{ pdf }[/math] curve that lies within one standard deviation of the mean, in either direction (a band two standard deviations wide), represents 68.3% of the area under the curve. This is represented graphically in the following figure.
Consequently, the interval between [math]\displaystyle{ Q(t)=84.15% }[/math] and [math]\displaystyle{ Q(t)=15.85% }[/math] represents two standard deviations, since this is an interval of 68.3% ( [math]\displaystyle{ 84.15-15.85=68.3 }[/math] ), and is centered on the mean at 50%. As a result, the standard deviation can be estimated from:
- [math]\displaystyle{ \widehat{\sigma }=\frac{t(Q=84.15%)-t(Q=15.85%)}{2} }[/math]
That is: the value of [math]\displaystyle{ \widehat{\sigma } }[/math] is obtained by subtracting the time value where the plotted line crosses the 15.85% unreliability line from the time value where it crosses the 84.15% unreliability line and dividing the result by two. This process is illustrated in the following example.
Example 1
Seven units are put on a life test and run until failure. The failure times are 85, 90, 95, 100, 105, 110, and 115 hours. Assuming a normal distribution, estimate the parameters using probability plotting.
Solution to Example 1
In order to plot the points for the probability plot, the appropriate unreliability estimate values must be obtained. These will be estimated through the use of median ranks, which can be obtained from statistical tables or the Quick Statistical Reference in Weibull++. The following table shows the times-to-failure and the appropriate median rank values for this example:
- [math]\displaystyle{ \begin{matrix} \text{Time-to-} & \text{Median} \\ \text{Failure (hr)} & \text{Rank ( }\!\!%\!\!\text{ )} \\ \text{85} & \text{ 9}\text{.43 }\!\!%\!\!\text{ } \\ \text{90} & \text{22}\text{.85 }\!\!%\!\!\text{ } \\ \text{95} & \text{36}\text{.41 }\!\!%\!\!\text{ } \\ \text{100} & \text{50}\text{.00 }\!\!%\!\!\text{ } \\ \text{105} & \text{63}\text{.59 }\!\!%\!\!\text{ } \\ \text{110} & \text{77}\text{.15 }\!\!%\!\!\text{ } \\ \text{115} & \text{90}\text{.57 }\!\!%\!\!\text{ } \\ \end{matrix} }[/math]
These points can now be plotted on normal probability plotting paper as shown in the next figure.
Draw the best possible line through the plot points. The time values where this line intersects the 15.85%, 50%, and 84.15% unreliability values should be projected down to the abscissa, as shown in the following plot.
The estimate of [math]\displaystyle{ \mu }[/math] is determined from the time value at the 50% unreliability level, which in this case is 100 hours. The value of the estimator of [math]\displaystyle{ \sigma }[/math] is determined from the equation for [math]\displaystyle{ \widehat{\sigma } }[/math] given above:
- [math]\displaystyle{ \begin{align} \widehat{\sigma }= & \frac{t(Q=84.15%)-t(Q=15.85%)}{2} \\ \widehat{\sigma }= & \frac{112-88}{2}=\frac{24}{2} \\ \widehat{\sigma }= & 12\text{ hours} \end{align} }[/math]
Alternately, [math]\displaystyle{ \widehat{\sigma } }[/math] could be determined by measuring the distance from [math]\displaystyle{ t(Q=15.85%) }[/math] to [math]\displaystyle{ t(Q=50%) }[/math] , or [math]\displaystyle{ t(Q=50%) }[/math] to [math]\displaystyle{ t(Q=84.15%) }[/math] , as either of these two distances is equal to the value of one standard deviation.
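The graphical estimates can also be checked numerically. The sketch below (assuming NumPy and SciPy) computes the median ranks as medians of the corresponding beta distributions, linearizes them with [math]\displaystyle{ {{\Phi }^{-1}} }[/math], fits a straight line by least squares and recovers [math]\displaystyle{ \widehat{\mu } }[/math] and [math]\displaystyle{ \widehat{\sigma } }[/math] from the intercept and slope; this mirrors the plotting procedure and gives values close to the 100 and 12 hours read from the plot.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import beta, norm

t = np.array([85, 90, 95, 100, 105, 110, 115.])
N = len(t)
i = np.arange(1, N + 1)

F = beta.ppf(0.5, i, N - i + 1)   # median ranks (median of the i-th order statistic)
y = norm.ppf(F)                   # linearized unreliability, y = Phi^-1[F(T)]

b, a = np.polyfit(t, y, 1)        # fit y = a + b*t
sigma_hat = 1.0 / b               # slope = 1/sigma
mu_hat = -a * sigma_hat           # intercept = -mu/sigma
print(mu_hat, sigma_hat)          # roughly 100 hours and 12 hours
</syntaxhighlight>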
Rank Regression on Y
Performing rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized.
The least squares parameter estimation method (regression analysis) was discussed in Chapter 3 and the following equations for regression on Y were derived:
- [math]\displaystyle{ \begin{align}\hat{a}= & \bar{y}-\hat{b}\bar{x} \\ =& \frac{\sum_{i=1}^N y_{i}}{N}-\hat{b}\frac{\sum_{i=1}^{N}x_{i}}{N}\\ \end{align} }[/math]
and:
- [math]\displaystyle{ \hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,x_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}} \right)}^{2}}}{N}} }[/math]
In the case of the normal distribution, the equations for [math]\displaystyle{ {{y}_{i}} }[/math] and [math]\displaystyle{ {{x}_{i}} }[/math] are:
- [math]\displaystyle{ {{y}_{i}}={{\Phi }^{-1}}\left[ F({{T}_{i}}) \right] }[/math]
and:
- [math]\displaystyle{ {{x}_{i}}={{T}_{i}} }[/math]
where the values for [math]\displaystyle{ F({{T}_{i}}) }[/math] are estimated from the median ranks. Once [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{b} }[/math] are obtained, [math]\displaystyle{ \widehat{\sigma } }[/math] and [math]\displaystyle{ \widehat{\mu } }[/math] can easily be obtained from the relations [math]\displaystyle{ a=-\tfrac{\mu }{\sigma } }[/math] and [math]\displaystyle{ b=\tfrac{1}{\sigma } }[/math] established by the linearization above.
The Correlation Coefficient
The estimator of the sample correlation coefficient, [math]\displaystyle{ \hat{\rho } }[/math] , is given by:
- [math]\displaystyle{ \hat{\rho }=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,({{x}_{i}}-\overline{x})({{y}_{i}}-\overline{y})}{\sqrt{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{({{x}_{i}}-\overline{x})}^{2}}\cdot \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{({{y}_{i}}-\overline{y})}^{2}}}} }[/math]
Example 2
Fourteen units were reliability tested and the following life test data were obtained:
Data point index | Time-to-failure |
1 | 5 |
2 | 10 |
3 | 15 |
4 | 20 |
5 | 25 |
6 | 30 |
7 | 35 |
8 | 40 |
9 | 50 |
10 | 60 |
11 | 70 |
12 | 80 |
13 | 90 |
14 | 100 |
Assuming the data follow a normal distribution, estimate the parameters and determine the correlation coefficient, [math]\displaystyle{ \rho }[/math] , using rank regression on Y.
Solution to Example 2
Construct a table like the one shown next.
Table 8.2 - Least Squares Analysis
- [math]\displaystyle{ \begin{matrix} \text{N} & \text{T}_{i} & \text{F(T}_{i}\text{)} & \text{y}_{i} & \text{T}_{i}^{2} & \text{y}_{i}^{2} & \text{T}_{i}\text{ y}_{i} \\ \text{1} & \text{5} & \text{0}\text{.0483} & \text{-1}\text{.6619} & \text{25} & \text{2}\text{.7619} & \text{-8}\text{.3095} \\ \text{2} & \text{10} & \text{0}\text{.1170} & \text{-1}\text{.1901} & \text{100} & \text{1}\text{.4163} & \text{-11}\text{.9010} \\ \text{3} & \text{15} & \text{0}\text{.1865} & \text{-0}\text{.8908} & \text{225} & \text{0}\text{.7935} & \text{-13}\text{.3620} \\ \text{4} & \text{20} & \text{0}\text{.2561} & \text{-0}\text{.6552} & \text{400} & \text{0}\text{.4292} & \text{-13}\text{.1030} \\ \text{5} & \text{25} & \text{0}\text{.3258} & \text{-0}\text{.4512} & \text{625} & \text{0}\text{.2036} & \text{-11}\text{.2800} \\ \text{6} & \text{30} & \text{0}\text{.3954} & \text{-0}\text{.2647} & \text{900} & \text{0}\text{.0701} & \text{-7}\text{.9422} \\ \text{7} & \text{35} & \text{0}\text{.4651} & \text{-0}\text{.0873} & \text{1225} & \text{0}\text{.0076} & \text{-3}\text{.0542} \\ \text{8} & \text{40} & \text{0}\text{.5349} & \text{0}\text{.0873} & \text{1600} & \text{0}\text{.0076} & \text{3}\text{.4905} \\ \text{9} & \text{50} & \text{0}\text{.6046} & \text{0}\text{.2647} & \text{2500} & \text{0}\text{.0701} & \text{13}\text{.2370} \\ \text{10} & \text{60} & \text{0}\text{.6742} & \text{0}\text{.4512} & \text{3600} & \text{0}\text{.2036} & \text{27}\text{.0720} \\ \text{11} & \text{70} & \text{0}\text{.7439} & \text{0}\text{.6552} & \text{4900} & \text{0}\text{.4292} & \text{45}\text{.8605} \\ \text{12} & \text{80} & \text{0}\text{.8135} & \text{0}\text{.8908} & \text{6400} & \text{0}\text{.7935} & \text{71}\text{.2640} \\ \text{13} & \text{90} & \text{0}\text{.8830} & \text{1}\text{.1901} & \text{8100} & \text{1}\text{.4163} & \text{107}\text{.1090} \\ \text{14} & \text{100} & \text{0}\text{.9517} & \text{1}\text{.6619} & \text{10000} & \text{2}\text{.7619} & \text{166}\text{.1900} \\ \mathop{}_{}^{} & \text{630} & {} & \text{0} & \text{40600} & \text{11}\text{.3646} & \text{365}\text{.2711} \\ \end{matrix} }[/math]
• The median rank values ( [math]\displaystyle{ F({{T}_{i}}) }[/math] ) can be found in rank tables, available in many statistical texts, or they can be estimated by using the Quick Statistical Reference in Weibull++.
• The [math]\displaystyle{ {{y}_{i}} }[/math] values were obtained from standard normal distribution area tables by entering the median rank value for [math]\displaystyle{ F(z) }[/math] and reading the corresponding [math]\displaystyle{ z }[/math] value ( [math]\displaystyle{ {{y}_{i}} }[/math] ). As with the median rank values, these standard normal values can be obtained with the Quick Statistical Reference. Given the values in Table 8.2, calculate [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{b} }[/math] using the regression on Y equations given above:
- [math]\displaystyle{ \begin{align} & \widehat{b}= & \frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}}{{y}_{i}}-(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}})(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}})/14}{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,T_{i}^{2}-{{(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}})}^{2}}/14} \\ & & \\ & \widehat{b}= & \frac{365.2711-(630)(0)/14}{40,600-{{(630)}^{2}}/14}=0.02982 \end{align} }[/math]
and:
- [math]\displaystyle{ \widehat{a}=\overline{y}-\widehat{b}\overline{T}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}-\widehat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{T}_{i}}}{N} }[/math]
or:
- [math]\displaystyle{ \widehat{a}=\frac{0}{14}-(0.02982)\frac{630}{14}=-1.3419 }[/math]
Therefore, since [math]\displaystyle{ b=\tfrac{1}{\sigma } }[/math]:
- [math]\displaystyle{ \widehat{\sigma}=\frac{1}{\hat{b}}=\frac{1}{0.02982}=33.5367 }[/math]
and, since [math]\displaystyle{ a=-\tfrac{\mu }{\sigma } }[/math]:
- [math]\displaystyle{ \widehat{\mu }=-\widehat{a}\cdot \widehat{\sigma }=-(-1.3419)\cdot 33.5367\simeq 45 }[/math]
or [math]\displaystyle{ \widehat{\mu }=45 }[/math] hours. The correlation coefficient can be estimated using the equation for [math]\displaystyle{ \hat{\rho } }[/math] given above:
- [math]\displaystyle{ \widehat{\rho }=0.979 }[/math]
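For readers who want to reproduce this calculation outside of Weibull++, the following sketch (assuming NumPy and SciPy) carries out the same rank regression on Y, from the median ranks through to [math]\displaystyle{ \widehat{\mu } }[/math], [math]\displaystyle{ \widehat{\sigma } }[/math] and [math]\displaystyle{ \hat{\rho } }[/math]:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import beta, norm

T = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100.])
N = len(T)
i = np.arange(1, N + 1)

F = beta.ppf(0.5, i, N - i + 1)   # median ranks, as listed in Table 8.2
y = norm.ppf(F)                   # y_i = Phi^-1[F(T_i)]

# regression on Y: minimize the vertical deviations
b_hat = (np.sum(T * y) - T.sum() * y.sum() / N) / (np.sum(T ** 2) - T.sum() ** 2 / N)
a_hat = y.mean() - b_hat * T.mean()

sigma_hat = 1.0 / b_hat           # about 33.5 hours
mu_hat = -a_hat * sigma_hat       # about 45 hours
rho = np.corrcoef(T, y)[0, 1]     # about 0.979
print(mu_hat, sigma_hat, rho)
</syntaxhighlight>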
The preceding example can be repeated using Weibull++.
• Create a new folio for Times-to-Failure data, and enter the data given above.
• Choose Normal from the Distributions list.
• Go to the Analysis page and select Rank Regression on Y (RRY).
• Click the Calculate icon located on the Main page.
The probability plot is shown next.
Rank Regression on X
As was mentioned previously, performing a rank regression on X requires that a straight line be fitted to a set of data points such that the sum of the squares of the horizontal deviations from the points to the fitted line is minimized.
Again, the first task is to bring our function, the normal [math]\displaystyle{ cdf }[/math], into a linear form. This step is exactly the same as in the regression on Y analysis, and the same linearization equations apply here as they did for the regression on Y. The deviation from the previous analysis begins at the least squares fit step, where in this case we treat [math]\displaystyle{ x }[/math] as the dependent variable and [math]\displaystyle{ y }[/math] as the independent variable. The best-fitting straight line for the data, for regression on X, is the straight line:
- [math]\displaystyle{ x=\widehat{a}+\widehat{b}y }[/math]
The corresponding equations for [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{b} }[/math] are:
- [math]\displaystyle{ \hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}}{N}-\hat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N} }[/math]
and:
- [math]\displaystyle{ \hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{N}} }[/math]
where:
- [math]\displaystyle{ {{y}_{i}}={{\Phi }^{-1}}\left[ F({{T}_{i}}) \right] }[/math]
and:
- [math]\displaystyle{ {{x}_{i}}={{T}_{i}} }[/math]
and the [math]\displaystyle{ F({{T}_{i}}) }[/math] values are estimated from the median ranks. Once [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{b} }[/math] are obtained, solve the linear equation above for [math]\displaystyle{ y }[/math], which corresponds to:
- [math]\displaystyle{ y=-\frac{\widehat{a}}{\widehat{b}}+\frac{1}{\widehat{b}}x }[/math]
Solving for the parameters, using the linearized [math]\displaystyle{ cdf }[/math] relations given earlier, we get:
- [math]\displaystyle{ a=-\frac{\widehat{a}}{\widehat{b}}=-\frac{\mu }{\sigma }\Rightarrow \mu =\widehat{a} }[/math]
and:
- [math]\displaystyle{ b=\frac{1}{\widehat{b}}=\frac{1}{\sigma }\Rightarrow \sigma =\widehat{b} }[/math]
The correlation coefficient is evaluated as before.
Example 3
Using the data of Example 2 and assuming a normal distribution, estimate the parameters and determine the correlation coefficient, [math]\displaystyle{ \rho }[/math] , using rank regression on X.
Solution to Example 3
Table 8.2 constructed in Example 2 applies to this example also. Using the values on this table, we get:
- [math]\displaystyle{ \begin{align} \hat{b}= & \frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}}\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}}}{14}}{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{14}} \\ \widehat{b}= & \frac{365.2711-(630)(0)/14}{11.3646-{{(0)}^{2}}/14}=32.1411 \end{align} }[/math]
and:
- [math]\displaystyle{ \hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{T}_{i}}}{14}-\widehat{b}\frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}}}{14} }[/math]
or:
- [math]\displaystyle{ \widehat{a}=\frac{630}{14}-(32.1411)\frac{(0)}{14}=45 }[/math]
Therefore:
- [math]\displaystyle{ \widehat{\sigma }=\widehat{b}=32.1411 }[/math]
and:
- [math]\displaystyle{ \widehat{\mu }=\widehat{a}=45\text{ hours} }[/math]
The correlation coefficient is found as before:
- [math]\displaystyle{ \widehat{\rho }=0.979 }[/math]
Note that the results for regression on X are not necessarily the same as the results for regression on Y. The only time when the two regressions are the same (i.e. will yield the same equation for a line) is when the data lie perfectly on a straight line.
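The regression on X calculation can be sketched in the same way as the regression on Y sketch given earlier; only the roles of the variables in the least squares fit change (NumPy and SciPy assumed):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import beta, norm

T = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100.])
N = len(T)
i = np.arange(1, N + 1)
y = norm.ppf(beta.ppf(0.5, i, N - i + 1))   # same y_i values as in Table 8.2

# regression on X: minimize the horizontal deviations, i.e. fit T = a + b*y
b_hat = (np.sum(T * y) - T.sum() * y.sum() / N) / (np.sum(y ** 2) - y.sum() ** 2 / N)
a_hat = T.mean() - b_hat * y.mean()

print(a_hat, b_hat)   # mu_hat is about 45 hours, sigma_hat is about 32.1 hours
</syntaxhighlight>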
Using Weibull++, Rank Regression on X (RRX) can be selected from the Analysis page.
The plot of the solution for this example is shown next.
Maximum Likelihood Estimation
As outlined in Chapter 3, maximum likelihood estimation works by developing a likelihood function based on the available data and finding the values of the parameter estimates that maximize it. This can be achieved by using iterative methods to determine the parameter estimate values that maximize the likelihood function, which can be rather difficult and time-consuming, particularly when dealing with the three-parameter distribution. Another method of finding the parameter estimates involves taking the partial derivatives of the likelihood function with respect to the parameters, setting the resulting equations equal to zero, and solving simultaneously to determine the values of the parameter estimates. The log-likelihood functions and associated partial derivatives used to determine maximum likelihood estimates for the normal distribution are covered in Appendix C.
Special Note About Bias
Estimators (i.e. parameter estimates) have properties such as unbiasedness, minimum variance, sufficiency, consistency, squared error consistency, efficiency and completeness [7][5]. Numerous books and papers deal with these properties, and this coverage is beyond the scope of this reference.
However, we would like to briefly address one of these properties, unbiasedness. An estimator is said to be unbiased if the estimator [math]\displaystyle{ \widehat{\theta }=d({{X}_{1}},{{X}_{2}},...,{{X}_{n}}) }[/math] satisfies the condition [math]\displaystyle{ E\left[ \widehat{\theta } \right]=\theta }[/math] for all [math]\displaystyle{ \theta \in \Omega . }[/math] Note that [math]\displaystyle{ E\left[ X \right] }[/math] denotes the expected value of X and is defined (for continuous distributions) by:
- [math]\displaystyle{ E\left[ X \right]=\int_{\varpi }x\cdot f(x)dx,\ \ X\in \varpi }[/math]
It can be shown [7][5] that the MLE estimator for the mean of the normal (and lognormal) distribution does satisfy the unbiasedness criteria, or [math]\displaystyle{ E\left[ \widehat{\mu } \right] }[/math] [math]\displaystyle{ =\mu . }[/math] The same is not true for the estimate of the variance [math]\displaystyle{ \hat{\sigma }_{T}^{2} }[/math] . The maximum likelihood estimate for the variance for the normal distribution is given by:
- [math]\displaystyle{ \hat{\sigma }_{T}^{2}=\frac{1}{N}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}} }[/math]
with a standard deviation of:
- [math]\displaystyle{ {{\hat{\sigma }}_{T}}=\sqrt{\frac{1}{N}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}}} }[/math]
These estimates, however, have been shown to be biased. It can be shown [7][5] that the unbiased estimates of the variance and standard deviation for complete data are given by:
- [math]\displaystyle{ \begin{align} \hat{\sigma }_{T}^{2}= & \left[ \frac{N}{N-1} \right]\cdot \left[ \frac{1}{N}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}} \right]=\frac{1}{N-1}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}} \\ {{{\hat{\sigma }}}_{T}}= & \sqrt{\left[ \frac{N}{N-1} \right]\cdot \left[ \frac{1}{N}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}} \right]} \\ = & \sqrt{\frac{1}{N-1}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{({{T}_{i}}-\bar{T})}^{2}}} \end{align} }[/math]
Note that for larger values of [math]\displaystyle{ N }[/math] , [math]\displaystyle{ \sqrt{\left[ N/(N-1) \right]} }[/math] tends to 1.
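The difference between the two forms is simply whether the sum of squared deviations is divided by [math]\displaystyle{ N }[/math] or by [math]\displaystyle{ N-1 }[/math], which can be seen directly with NumPy (the data values below are only an illustration):

<syntaxhighlight lang="python">
import numpy as np

T = np.array([12, 24, 28, 34, 46.])   # illustrative complete data set

sigma_mle = np.std(T, ddof=0)         # biased MLE form: divides by N
sigma_unbiased = np.std(T, ddof=1)    # the N-1 form given above for complete data
print(sigma_mle, sigma_unbiased)
</syntaxhighlight>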
Weibull++ by default returns the standard deviation as defined by the biased (maximum likelihood) equation above. The Use Unbiased Std on Normal Data option in the User Setup under the Calculations tab allows the bias to be corrected when estimating the parameters.
When this option is selected, Weibull++ returns the standard deviation as defined by the unbiased equation above. This is only true for complete data sets. For all other data types, Weibull++ returns the biased standard deviation regardless of the selection status of this option. The next figure shows this setting in Weibull++.
Confidence Bounds
The method used by the application in estimating the different types of confidence bounds for normally distributed data is presented in this section. The complete derivations were presented in detail (for a general function) in Chapter 5.
Exact Confidence Bounds
There are closed-form solutions for exact confidence bounds for both the normal and lognormal distributions. However, these closed-form solutions apply only to complete data. To achieve consistent application across all possible data types, Weibull++ always uses the Fisher matrix method or the likelihood ratio method in computing confidence intervals.
Fisher Matrix Confidence Bounds
Bounds on the Parameters
The lower and upper bounds on the mean, [math]\displaystyle{ \widehat{\mu } }[/math] , are estimated from:
- [math]\displaystyle{ \begin{align} & {{\mu }_{U}}= & \widehat{\mu }+{{K}_{\alpha }}\sqrt{Var(\widehat{\mu })}\text{ (upper bound),} \\ & {{\mu }_{L}}= & \widehat{\mu }-{{K}_{\alpha }}\sqrt{Var(\widehat{\mu })}\text{ (lower bound)}\text{.} \end{align} }[/math]
Since the standard deviation, [math]\displaystyle{ {{\widehat{\sigma }}_{T}} }[/math] , must be positive, [math]\displaystyle{ \ln ({{\widehat{\sigma }}_{T}}) }[/math] is treated as normally distributed, and the bounds are estimated from:
- [math]\displaystyle{ \begin{align} & {{\sigma }_{U}}= & {{\widehat{\sigma }}_{T}}\cdot {{e}^{\tfrac{{{K}_{\alpha }}\sqrt{Var({{\widehat{\sigma }}_{T}})}}{{{\widehat{\sigma }}_{T}}}}}\text{ (upper bound),} \\ & {{\sigma }_{L}}= & \frac{{{\widehat{\sigma }}_{T}}}{{{e}^{\tfrac{{{K}_{\alpha }}\sqrt{Var({{\widehat{\sigma }}_{T}})}}{{{\widehat{\sigma }}_{T}}}}}}\text{ (lower bound),} \end{align} }[/math]
where [math]\displaystyle{ {{K}_{\alpha }} }[/math] is defined by:
- [math]\displaystyle{ \alpha =\frac{1}{\sqrt{2\pi }}\int_{{{K}_{\alpha }}}^{\infty }{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi ({{K}_{\alpha }}) }[/math]
If [math]\displaystyle{ \delta }[/math] is the confidence level, then [math]\displaystyle{ \alpha =\tfrac{1-\delta }{2} }[/math] for the two-sided bounds and [math]\displaystyle{ \alpha =1-\delta }[/math] for the one-sided bounds.
The variances and covariances of [math]\displaystyle{ \widehat{\mu } }[/math] and [math]\displaystyle{ {{\widehat{\sigma }}_{T}} }[/math] are estimated from the Fisher matrix, as follows:
- [math]\displaystyle{ \left( \begin{matrix} \widehat{Var}\left( \widehat{\mu } \right) & \widehat{Cov}\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) \\ \widehat{Cov}\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) & \widehat{Var}\left( {{\widehat{\sigma }}_{T}} \right) \\ \end{matrix} \right)=\left( \begin{matrix} -\tfrac{{{\partial }^{2}}\Lambda }{\partial {{\mu }^{2}}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \mu \partial {{\sigma }_{T}}} \\ {} & {} \\ -\tfrac{{{\partial }^{2}}\Lambda }{\partial \mu \partial {{\sigma }_{T}}} & -\tfrac{{{\partial }^{2}}\Lambda }{\partial \sigma _{T}^{2}} \\ \end{matrix} \right)_{\mu =\widehat{\mu },\sigma =\widehat{\sigma }}^{-1} }[/math]
[math]\displaystyle{ \Lambda }[/math] is the log-likelihood function of the normal distribution, described in Chapter 3 and Appendix C.
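As an illustration of how these bound equations are applied, the sketch below (SciPy assumed) computes two-sided bounds on [math]\displaystyle{ \mu }[/math] and [math]\displaystyle{ {{\sigma }_{T}} }[/math] from a given set of estimates and Fisher-matrix variances. The numerical inputs are placeholders (they match the MLE results of Example 4 later in this chapter); in practice they come from the analysis itself.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

mu_hat, sigma_hat = 45.0, 29.58    # illustrative MLE estimates (see Example 4)
var_mu, var_sigma = 62.5, 31.25    # illustrative variances from the inverted Fisher matrix

CL = 0.90                          # two-sided confidence level (delta)
K = norm.isf((1 - CL) / 2)         # K_alpha, with alpha = (1 - delta)/2

mu_U = mu_hat + K * np.sqrt(var_mu)
mu_L = mu_hat - K * np.sqrt(var_mu)

# ln(sigma_hat) is treated as normally distributed, so the sigma bounds are multiplicative
w = np.exp(K * np.sqrt(var_sigma) / sigma_hat)
sigma_U, sigma_L = sigma_hat * w, sigma_hat / w
print(mu_L, mu_U, sigma_L, sigma_U)
</syntaxhighlight>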
Bounds on Reliability
The reliability of the normal distribution is:
- [math]\displaystyle{ \widehat{R}(T;\hat{\mu },{{\hat{\sigma }}_{T}})=\int_{T}^{\infty }\frac{1}{{{\widehat{\sigma }}_{T}}\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{\left( \tfrac{t-\widehat{\mu }}{{{\widehat{\sigma }}_{T}}} \right)}^{2}}}}dt }[/math]
Let [math]\displaystyle{ \widehat{z}(t;\hat{\mu },{{\hat{\sigma }}_{T}})=\tfrac{t-\widehat{\mu }}{{{\widehat{\sigma }}_{T}}}, }[/math] then [math]\displaystyle{ \tfrac{dz}{dt}=\tfrac{1}{{{\widehat{\sigma }}_{T}}}. }[/math] For [math]\displaystyle{ t=T }[/math] , [math]\displaystyle{ \widehat{z}=\tfrac{T-\widehat{\mu }}{{{\widehat{\sigma }}_{T}}} }[/math] , and for [math]\displaystyle{ t=\infty , }[/math] [math]\displaystyle{ \widehat{z}=\infty . }[/math] The above equation then becomes:
- [math]\displaystyle{ \hat{R}(\widehat{z})=\int_{\widehat{z}(T)}^{\infty }\frac{1}{\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{z}^{2}}}}dz }[/math]
The bounds on [math]\displaystyle{ z }[/math] are estimated from:
- [math]\displaystyle{ \begin{align} & {{z}_{U}}= & \widehat{z}+{{K}_{\alpha }}\sqrt{Var(\widehat{z})} \\ & {{z}_{L}}= & \widehat{z}-{{K}_{\alpha }}\sqrt{Var(\widehat{z})} \end{align} }[/math]
where:
- [math]\displaystyle{ Var(\widehat{z})={{\left( \frac{\partial z}{\partial \mu } \right)}^{2}}Var(\widehat{\mu })+{{\left( \frac{\partial z}{\partial {{\sigma }_{T}}} \right)}^{2}}Var({{\widehat{\sigma }}_{T}})+2\left( \frac{\partial z}{\partial \mu } \right)\left( \frac{\partial z}{\partial {{\sigma }_{T}}} \right)Cov\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) }[/math]
or:
- [math]\displaystyle{ Var(\widehat{z})=\frac{1}{\widehat{\sigma }_{T}^{2}}\left[ Var(\widehat{\mu })+{{\widehat{z}}^{2}}Var({{\widehat{\sigma }}_{T}})+2\cdot \widehat{z}\cdot Cov\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) \right] }[/math]
The upper and lower bounds on reliability are:
- [math]\displaystyle{ \begin{align} & {{R}_{U}}= & \int_{{{z}_{L}}}^{\infty }\frac{1}{\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{z}^{2}}}}dz\text{ (upper bound)} \\ & {{R}_{L}}= & \int_{{{z}_{U}}}^{\infty }\frac{1}{\sqrt{2\pi }}{{e}^{-\tfrac{1}{2}{{z}^{2}}}}dz\text{ (lower bound)} \end{align} }[/math]
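Putting these steps together, a short sketch of the reliability bound calculation (SciPy assumed, with the same illustrative estimates and Fisher-matrix terms used in the parameter-bound sketch above):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

mu_hat, sigma_hat = 45.0, 29.58             # illustrative estimates
var_mu, var_sigma, cov = 62.5, 31.25, 0.0   # illustrative Fisher-matrix terms
T, CL = 30.0, 0.90                          # mission time and confidence level
K = norm.isf((1 - CL) / 2)

z_hat = (T - mu_hat) / sigma_hat
var_z = (var_mu + z_hat ** 2 * var_sigma + 2 * z_hat * cov) / sigma_hat ** 2

z_L = z_hat - K * np.sqrt(var_z)
z_U = z_hat + K * np.sqrt(var_z)
R_U = norm.sf(z_L)                          # upper reliability bound uses z_L
R_L = norm.sf(z_U)                          # lower reliability bound uses z_U
print(R_L, R_U)
</syntaxhighlight>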
Bounds on Time
The bounds around time for a given normal percentile (unreliability) are estimated by first solving the reliability equation with respect to time, as follows:
- [math]\displaystyle{ \hat{T}(\widehat{\mu },{{\widehat{\sigma }}_{T}})=\widehat{\mu }+z\cdot {{\widehat{\sigma }}_{T}} }[/math]
where:
- [math]\displaystyle{ z={{\Phi }^{-1}}\left[ F(T) \right] }[/math]
and:
- [math]\displaystyle{ \Phi (z)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{z(T)}{{e}^{-\tfrac{1}{2}{{z}^{2}}}}dz }[/math]
The next step is to calculate the variance of [math]\displaystyle{ \hat{T}(\widehat{\mu },{{\widehat{\sigma }}_{T}}) }[/math] or:
- [math]\displaystyle{ \begin{align} Var(\hat{T})= & {{\left( \frac{\partial T}{\partial \mu } \right)}^{2}}Var(\widehat{\mu })+{{\left( \frac{\partial T}{\partial {{\sigma }_{T}}} \right)}^{2}}Var({{\widehat{\sigma }}_{T}}) \\ & +2\left( \frac{\partial T}{\partial \mu } \right)\left( \frac{\partial T}{\partial {{\sigma }_{T}}} \right)Cov\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) \\ Var(\hat{T})= & Var(\widehat{\mu })+{{\widehat{z}}^{2}}Var({{\widehat{\sigma }}_{T}})+2\cdot z\cdot Cov\left( \widehat{\mu },{{\widehat{\sigma }}_{T}} \right) \end{align} }[/math]
The upper and lower bounds are then found by:
- [math]\displaystyle{ \begin{align} & {{T}_{U}}= & \hat{T}+{{K}_{\alpha }}\sqrt{Var(\hat{T})}\text{ (upper bound)} \\ & {{T}_{L}}= & \hat{T}-{{K}_{\alpha }}\sqrt{Var(\hat{T})}\text{ (lower bound)} \end{align} }[/math]
Example 4
Using the data of Example 2 and assuming a normal distribution, estimate the parameters using the MLE method.
Solution to Example 4
In this example we have non-grouped data without suspensions and without interval data. The partial derivatives of the normal log-likelihood function, [math]\displaystyle{ \Lambda , }[/math] are given by:
- [math]\displaystyle{ \begin{align} \frac{\partial \Lambda }{\partial \mu }= & \frac{1}{{{\sigma }^{2}}}\underset{i=1}{\overset{14}{\mathop \sum }}\,({{T}_{i}}-\mu )=0 \\ \frac{\partial \Lambda }{\partial \sigma }= & \underset{i=1}{\overset{14}{\mathop \sum }}\,\left( \frac{{{({{T}_{i}}-\mu )}^{2}}}{{{\sigma }^{3}}}-\frac{1}{\sigma } \right)=0 \end{align} }[/math]
(The derivations of these equations are presented in Appendix C.) Substituting the values of [math]\displaystyle{ {{T}_{i}} }[/math] and solving the above system simultaneously, we get [math]\displaystyle{ \widehat{\mu }=45 }[/math] hours and [math]\displaystyle{ \widehat{\sigma }=29.58 }[/math] hours.
The Fisher matrix is:
- [math]\displaystyle{ \left[ \begin{matrix} \widehat{Var}\left( \widehat{\mu } \right)=62.5000 & {} & \widehat{Cov}\left( \widehat{\mu },\widehat{\sigma } \right)=0.0000 \\ {} & {} & {} \\ \widehat{Cov}\left( \widehat{\mu },\widehat{\sigma } \right)=0.0000 & {} & \widehat{Var}\left( \widehat{\sigma } \right)=31.2500 \\ \end{matrix} \right] }[/math]
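For complete data the normal MLE and the local Fisher matrix have simple closed forms, so these results can be cross-checked with a few lines of NumPy:

<syntaxhighlight lang="python">
import numpy as np

T = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100.])
N = len(T)

mu_hat = T.mean()                                # 45 hours
sigma_hat = np.sqrt(np.mean((T - mu_hat) ** 2))  # 29.58 hours (MLE form, divides by N)

# For complete data the inverted Fisher matrix evaluated at the MLE is diagonal:
var_mu = sigma_hat ** 2 / N                      # 62.5
var_sigma = sigma_hat ** 2 / (2 * N)             # 31.25
print(mu_hat, sigma_hat, var_mu, var_sigma)
</syntaxhighlight>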
Using Weibull++, the MLE method can be selected from the Set Analysis page.
The plot of the solution for this example is shown next.
Likelihood Ratio Confidence Bounds
Bounds on Parameters
As covered in Chapter 5, the likelihood confidence bounds are calculated by finding values for [math]\displaystyle{ {{\theta }_{1}} }[/math] and [math]\displaystyle{ {{\theta }_{2}} }[/math] that satisfy:
- [math]\displaystyle{ -2\cdot \text{ln}\left( \frac{L({{\theta }_{1}},{{\theta }_{2}})}{L({{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}})} \right)=\chi _{\alpha ;1}^{2} }[/math]
This equation can be rewritten as:
- [math]\displaystyle{ L({{\theta }_{1}},{{\theta }_{2}})=L({{\widehat{\theta }}_{1}},{{\widehat{\theta }}_{2}})\cdot {{e}^{\tfrac{-\chi _{\alpha ;1}^{2}}{2}}} }[/math]
For complete data, the likelihood formula for the normal distribution is given by:
- [math]\displaystyle{ L(\mu ,\sigma )=\underset{i=1}{\overset{N}{\mathop \prod }}\,f({{x}_{i}};\mu ,\sigma )=\underset{i=1}{\overset{N}{\mathop \prod }}\,\frac{1}{\sigma \cdot \sqrt{2\pi }}\cdot {{e}^{-\tfrac{1}{2}{{\left( \tfrac{{{x}_{i}}-\mu }{\sigma } \right)}^{2}}}} }[/math]
where the [math]\displaystyle{ {{x}_{i}} }[/math] values represent the original time-to-failure data. For a given value of [math]\displaystyle{ \alpha }[/math] , values for [math]\displaystyle{ \mu }[/math] and [math]\displaystyle{ \sigma }[/math] can be found which represent the maximum and minimum values that satisfy the likelihood ratio equation above. These represent the confidence bounds for the parameters at a confidence level [math]\displaystyle{ \delta , }[/math] where [math]\displaystyle{ \alpha =\delta }[/math] for two-sided bounds and [math]\displaystyle{ \alpha =2\delta -1 }[/math] for one-sided.
Example 5
Five units are put on a reliability test and experience failures at 12, 24, 28, 34, and 46 hours. Assuming a normal distribution, the MLE parameter estimates are calculated to be [math]\displaystyle{ \widehat{\mu }=28.8 }[/math] and [math]\displaystyle{ \widehat{\sigma }=11.2143. }[/math] Calculate the two-sided 80% confidence bounds on these parameters using the likelihood ratio method.
Solution to Example 5
The first step is to calculate the likelihood function for the parameter estimates:
- [math]\displaystyle{ \begin{align} L(\widehat{\mu },\widehat{\sigma })= & \underset{i=1}{\overset{N}{\mathop \prod }}\,f({{x}_{i}};\widehat{\mu },\widehat{\sigma })=\underset{i=1}{\overset{5}{\mathop \prod }}\,\frac{1}{\widehat{\sigma }\cdot \sqrt{2\pi }}\cdot {{e}^{-\tfrac{1}{2}{{\left( \tfrac{{{x}_{i}}-\widehat{\mu }}{\widehat{\sigma }} \right)}^{2}}}} \\ L(\widehat{\mu },\widehat{\sigma })= & \underset{i=1}{\overset{5}{\mathop \prod }}\,\frac{1}{11.2143\cdot \sqrt{2\pi }}\cdot {{e}^{-\tfrac{1}{2}{{\left( \tfrac{{{x}_{i}}-28.8}{11.2143} \right)}^{2}}}} \\ L(\widehat{\mu },\widehat{\sigma })= & 4.676897\times {{10}^{-9}} \end{align} }[/math]
where [math]\displaystyle{ {{x}_{i}} }[/math] are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:
- [math]\displaystyle{ L(\mu ,\sigma )-L(\widehat{\mu },\widehat{\sigma })\cdot {{e}^{\tfrac{-\chi _{\alpha ;1}^{2}}{2}}}=0 }[/math]
Since our specified confidence level, [math]\displaystyle{ \delta }[/math] , is 80%, we can calculate the value of the chi-squared statistic, [math]\displaystyle{ \chi _{0.8;1}^{2}=1.642374. }[/math] We can now substitute this information into the equation:
- [math]\displaystyle{ \begin{align} L(\mu ,\sigma )-L(\widehat{\mu },\widehat{\sigma })\cdot {{e}^{\tfrac{-\chi _{\alpha ;1}^{2}}{2}}}= & 0, \\ \\ L(\mu ,\sigma )-4.676897\times {{10}^{-9}}\cdot {{e}^{\tfrac{-1.642374}{2}}}= & 0, \\ \\ L(\mu ,\sigma )-2.057410\times {{10}^{-9}}= & 0. \end{align} }[/math]
It now remains to find the values of [math]\displaystyle{ \mu }[/math] and [math]\displaystyle{ \sigma }[/math] which satisfy this equation. This is an iterative process that requires setting the value of [math]\displaystyle{ \mu }[/math] and finding the appropriate values of [math]\displaystyle{ \sigma }[/math] , and vice versa.
The following table gives the values of [math]\displaystyle{ \sigma }[/math] based on given values of [math]\displaystyle{ \mu }[/math] .
- [math]\displaystyle{ \begin{matrix} \text{ }\!\!\mu\!\!\text{ } & {{\text{ }\!\!\sigma\!\!\text{ }}_{\text{1}}} & {{\text{ }\!\!\sigma\!\!\text{ }}_{\text{2}}} & \text{ }\!\!\mu\!\!\text{ } & {{\text{ }\!\!\sigma\!\!\text{ }}_{\text{1}}} & {{\text{ }\!\!\sigma\!\!\text{ }}_{\text{2}}} \\ \text{22}\text{.0} & \text{12}\text{.045} & \text{14}\text{.354} & \text{29}\text{.0} & \text{7}\text{.849} & \text{17}\text{.909} \\ \text{22}\text{.5} & \text{11}\text{.004} & \text{15}\text{.310} & \text{29}\text{.5} & \text{7}\text{.876} & \text{17}\text{.889} \\ \text{23}\text{.0} & \text{10}\text{.341} & \text{15}\text{.894} & \text{30}\text{.0} & \text{7}\text{.935} & \text{17}\text{.844} \\ \text{23}\text{.5} & \text{9}\text{.832} & \text{16}\text{.328} & \text{30}\text{.5} & \text{8}\text{.025} & \text{17}\text{.776} \\ \text{24}\text{.0} & \text{9}\text{.418} & \text{16}\text{.673} & \text{31}\text{.0} & \text{8}\text{.147} & \text{17}\text{.683} \\ \text{24}\text{.5} & \text{9}\text{.074} & \text{16}\text{.954} & \text{31}\text{.5} & \text{8}\text{.304} & \text{17}\text{.562} \\ \text{25}\text{.0} & \text{8}\text{.784} & \text{17}\text{.186} & \text{32}\text{.0} & \text{8}\text{.498} & \text{17}\text{.411} \\ \text{25}\text{.5} & \text{8}\text{.542} & \text{17}\text{.377} & \text{32}\text{.5} & \text{8}\text{.732} & \text{17}\text{.227} \\ \text{26}\text{.0} & \text{8}\text{.340} & \text{17}\text{.534} & \text{33}\text{.0} & \text{9}\text{.012} & \text{17}\text{.004} \\ \text{26}\text{.5} & \text{8}\text{.176} & \text{17}\text{.661} & \text{33}\text{.5} & \text{9}\text{.344} & \text{16}\text{.734} \\ \text{27}\text{.0} & \text{8}\text{.047} & \text{17}\text{.760} & \text{34}\text{.0} & \text{9}\text{.742} & \text{16}\text{.403} \\ \text{27}\text{.5} & \text{7}\text{.950} & \text{17}\text{.833} & \text{34}\text{.5} & \text{10}\text{.229} & \text{15}\text{.990} \\ \text{28}\text{.0} & \text{7}\text{.885} & \text{17}\text{.882} & \text{35}\text{.0} & \text{10}\text{.854} & \text{15}\text{.444} \\ \text{28}\text{.5} & \text{7}\text{.852} & \text{17}\text{.907} & \text{35}\text{.5} & \text{11}\text{.772} & \text{14}\text{.609} \\ \end{matrix} }[/math]
This data set is represented graphically in the following contour plot:
(Note that this plot is generated with degrees of freedom [math]\displaystyle{ k=1 }[/math] , as we are only determining bounds on one parameter. The contour plots generated in Weibull++ are done with degrees of freedom [math]\displaystyle{ k=2 }[/math] , for use in comparing both parameters simultaneously.) As can be determined from the table, the lowest calculated value for [math]\displaystyle{ \sigma }[/math] is 7.849, while the highest is 17.909. These represent the two-sided 80% confidence limits on this parameter. Since solutions for the equation do not exist for values of [math]\displaystyle{ \mu }[/math] below 22 or above 35.5, these can be considered the two-sided 80% confidence limits for this parameter. In order to obtain more accurate values for the confidence limits on [math]\displaystyle{ \mu }[/math] , we can perform the same procedure as before, but finding the two values of [math]\displaystyle{ \mu }[/math] that correspond with a given value of [math]\displaystyle{ \sigma . }[/math] Using this method, we find that the two-sided 80% confidence limits on [math]\displaystyle{ \mu }[/math] are 21.807 and 35.793, which are close to the initial estimates of 22 and 35.5.
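The values in the table can be reproduced numerically. For a fixed [math]\displaystyle{ \mu }[/math], the two [math]\displaystyle{ \sigma }[/math] values are the roots of the rearranged likelihood ratio equation, which can be bracketed on either side of the [math]\displaystyle{ \sigma }[/math] that maximizes the likelihood for that [math]\displaystyle{ \mu }[/math]. A sketch (assuming SciPy) for a few of the tabulated [math]\displaystyle{ \mu }[/math] values:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm, chi2
from scipy.optimize import brentq

data = np.array([12, 24, 28, 34, 46.])
mu_hat, sigma_hat = 28.8, 11.2143
target = norm.logpdf(data, mu_hat, sigma_hat).sum() - chi2.ppf(0.80, 1) / 2

def g(sigma, mu):
    # log-likelihood at (mu, sigma) minus the chi-squared cutoff
    return norm.logpdf(data, mu, sigma).sum() - target

for mu in (24.0, 28.5, 34.0):                     # a few of the tabulated mu values
    s_peak = np.sqrt(np.mean((data - mu) ** 2))   # sigma that maximizes L for this mu
    s_lo = brentq(g, 0.01, s_peak, args=(mu,))
    s_hi = brentq(g, s_peak, 100.0, args=(mu,))
    print(mu, round(s_lo, 3), round(s_hi, 3))     # compare with the table above
</syntaxhighlight>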
Bounds on Time and Reliability
In order to calculate the bounds on a time estimate for a given reliability, or on a reliability estimate for a given time, the likelihood function needs to be rewritten in terms of one parameter and time/reliability, so that the maximum and minimum values of the time can be observed as the parameter is varied. This can be accomplished by substituting a form of the normal reliability equation into the likelihood function. The normal reliability equation can be written as:
- [math]\displaystyle{ R=1-\Phi \left( \frac{t-\mu }{\sigma } \right) }[/math]
This can be rearranged to the form:
- [math]\displaystyle{ \mu =t-\sigma \cdot {{\Phi }^{-1}}(1-R) }[/math]
where [math]\displaystyle{ {{\Phi }^{-1}} }[/math] is the inverse standard normal. This equation can now be substituted into the likelihood function above to produce a likelihood equation in terms of [math]\displaystyle{ \sigma }[/math] , [math]\displaystyle{ t }[/math] and [math]\displaystyle{ R }[/math] :
- [math]\displaystyle{ L(\sigma ,t/R)=\underset{i=1}{\overset{N}{\mathop \prod }}\,\frac{1}{\sigma \cdot \sqrt{2\pi }}\cdot {{e}^{-\tfrac{1}{2}{{\left( \tfrac{{{x}_{i}}-\left[ t-\sigma \cdot {{\Phi }^{-1}}(1-R) \right]}{\sigma } \right)}^{2}}}} }[/math]
The unknown parameter [math]\displaystyle{ t/R }[/math] depends on what type of bounds are being determined. If one is trying to determine the bounds on time for a given reliability, then [math]\displaystyle{ R }[/math] is a known constant and [math]\displaystyle{ t }[/math] is the unknown parameter. Conversely, if one is trying to determine the bounds on reliability for a given time, then [math]\displaystyle{ t }[/math] is a known constant and [math]\displaystyle{ R }[/math] is the unknown parameter. Either way, this likelihood function can be used to solve the likelihood ratio equation for the values of interest.
Example 6
For the data given in Example 5, determine the two-sided 80% confidence bounds on the time estimate for a reliability of 40%. The ML estimate for the time at [math]\displaystyle{ R(t)=40% }[/math] is 31.637.
Solution to Example 6
In this example, we are trying to determine the two-sided 80% confidence bounds on the time estimate of 31.637. This is accomplished by substituting [math]\displaystyle{ R=0.40 }[/math] and [math]\displaystyle{ \alpha =0.8 }[/math] into the likelihood function above, and varying [math]\displaystyle{ \sigma }[/math] until the maximum and minimum values of [math]\displaystyle{ t }[/math] are found. The following table gives the values of [math]\displaystyle{ t }[/math] based on given values of [math]\displaystyle{ \sigma }[/math] .
This data set is represented graphically in the following contour plot:
As can be determined from the table, the lowest calculated value for [math]\displaystyle{ t }[/math] is 25.046, while the highest is 39.250. These represent the 80% confidence limits on the time at which reliability is equal to 40%.
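The same result can be approximated with a simple two-dimensional scan over [math]\displaystyle{ (\sigma ,t) }[/math]: every grid point whose likelihood is at least [math]\displaystyle{ L(\widehat{\mu },\widehat{\sigma })\cdot {{e}^{-\chi _{\alpha ;1}^{2}/2}} }[/math] lies inside the contour, and the smallest and largest [math]\displaystyle{ t }[/math] among those points approximate the bounds. A sketch (SciPy assumed):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm, chi2

data = np.array([12, 24, 28, 34, 46.])
mu_hat, sigma_hat = 28.8, 11.2143
R, CL = 0.40, 0.80
z_R = norm.ppf(1 - R)                         # Phi^-1(1 - R)

log_target = norm.logpdf(data, mu_hat, sigma_hat).sum() - chi2.ppf(CL, 1) / 2

# scan a (sigma, t) grid; mu = t - sigma * Phi^-1(1 - R) from the rearranged reliability eqn
sig = np.linspace(5, 40, 700)
t = np.linspace(15, 55, 800)
S, T = np.meshgrid(sig, t)
loglik = norm.logpdf(data[:, None, None], T - S * z_R, S).sum(axis=0)

inside = loglik >= log_target
print(T[inside].min(), T[inside].max())       # roughly 25.0 and 39.3 hours
</syntaxhighlight>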
Example 7
For the data given in Example 5, determine the two-sided 80% confidence bounds on the reliability estimate for [math]\displaystyle{ t=30 }[/math] . The ML estimate for the reliability at [math]\displaystyle{ t=30 }[/math] is 45.739%.
Solution to Example 7
In this example, we are trying to determine the two-sided 80% confidence bounds on the reliability estimate of 45.739%. This is accomplished by substituting [math]\displaystyle{ t=30 }[/math] and [math]\displaystyle{ \alpha =0.8 }[/math] into the likelihood function above, and varying [math]\displaystyle{ \sigma }[/math] until the maximum and minimum values of [math]\displaystyle{ R }[/math] are found. The following table gives the values of [math]\displaystyle{ R }[/math] based on given values of [math]\displaystyle{ \sigma }[/math] .
This data set is represented graphically in the following contour plot:
As can be determined from the table, the lowest calculated value for [math]\displaystyle{ R }[/math] is 24.776%, while the highest is 68.000%. These represent the 80% two-sided confidence limits on the reliability at [math]\displaystyle{ t=30 }[/math] .
Bayesian Confidence Bounds
Bounds on Parameters
From Chapter 5, we know that the marginal posterior distribution of [math]\displaystyle{ \mu }[/math] can be written as:
- [math]\displaystyle{ \begin{align} f(\mu |Data)= & \int_{0}^{\infty }f(\mu ,\sigma |Data)d\sigma \\ = & \frac{\int_{0}^{\infty }L(Data|\mu ,\sigma )\varphi (\mu )\varphi (\sigma )d\sigma }{\int_{0}^{\infty }\int_{-\infty }^{\infty }L(Data|\mu ,\sigma )\varphi (\mu )\varphi (\sigma )d\mu d\sigma } \end{align} }[/math]
where:
[math]\displaystyle{ \varphi (\sigma )=\tfrac{1}{\sigma } }[/math] is the non-informative prior of [math]\displaystyle{ \sigma }[/math] .
- [math]\displaystyle{ \varphi (\mu ) }[/math] is a uniform distribution from [math]\displaystyle{ -\infty }[/math] to [math]\displaystyle{ +\infty }[/math] , the non-informative prior of [math]\displaystyle{ \mu . }[/math]
Using the above prior distributions, [math]\displaystyle{ f(\mu |Data) }[/math] can be rewritten as:
- [math]\displaystyle{ f(\mu |Data)=\frac{\int_{0}^{\infty }L(Data|\mu ,\sigma )\tfrac{1}{\sigma }d\sigma }{\int_{0}^{\infty }\int_{-\infty }^{\infty }L(Data|\mu ,\sigma )\tfrac{1}{\sigma }d\mu d\sigma } }[/math]
The one-sided upper bound of [math]\displaystyle{ \mu }[/math] is:
- [math]\displaystyle{ CL=P(\mu \le {{\mu }_{U}})=\int_{-\infty }^{{{\mu }_{U}}}f(\mu |Data)d\mu }[/math]
The one-sided lower bound of [math]\displaystyle{ \mu }[/math] is:
- [math]\displaystyle{ 1-CL=P(\mu \le {{\mu }_{L}})=\int_{-\infty }^{{{\mu }_{L}}}f(\mu |Data)d\mu }[/math]
The two-sided bounds of [math]\displaystyle{ \mu }[/math] are:
- [math]\displaystyle{ CL=P({{\mu }_{L}}\le \mu \le {{\mu }_{U}})=\int_{{{\mu }_{L}}}^{{{\mu }_{U}}}f(\mu |Data)d\mu }[/math]
The same method can be used to obtain the bounds of [math]\displaystyle{ \sigma }[/math].
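These integrals generally have to be evaluated numerically. As a rough illustration of the mechanics (not the method used inside Weibull++), the sketch below approximates the marginal posterior of [math]\displaystyle{ \mu }[/math] on a grid, using the non-informative priors above and, purely for illustration, the failure times of Example 5 with a 90% one-sided confidence level:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

data = np.array([12, 24, 28, 34, 46.])       # reusing the Example 5 failure times
CL = 0.90                                    # illustrative one-sided confidence level

mus = np.linspace(-20, 80, 801)
sigmas = np.linspace(0.5, 80, 800)
M, S = np.meshgrid(mus, sigmas, indexing="ij")

# un-normalized joint posterior: L(Data | mu, sigma) * (1/sigma)
logpost = norm.logpdf(data[:, None, None], M, S).sum(axis=0) - np.log(S)
post = np.exp(logpost - logpost.max())

marg = post.sum(axis=1)                      # marginal posterior of mu (grid approximation)
cdf = np.cumsum(marg) / marg.sum()
mu_U = np.interp(CL, cdf, mus)               # one-sided upper bound on mu
print(round(mu_U, 2))
</syntaxhighlight>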
Bounds on Time (Type 1)
The reliable life for the normal distribution is:
- [math]\displaystyle{ T=\mu +\sigma {{\Phi }^{-1}}(1-R) }[/math]
The one-sided upper bound on time is:
- [math]\displaystyle{ CL=\underset{}{\overset{}{\mathop{\Pr }}}\,(T\le {{T}_{U}})=\underset{}{\overset{}{\mathop{\Pr }}}\,(\mu +\sigma {{\Phi }^{-1}}(1-R)\le {{T}_{U}}) }[/math]
The above expression can be rewritten in terms of [math]\displaystyle{ \mu }[/math] as:
- [math]\displaystyle{ CL=\underset{}{\overset{}{\mathop{\Pr }}}\,(\mu \le {{T}_{U}}-\sigma {{\Phi }^{-1}}(1-R)) }[/math]
From the posterior distribution of [math]\displaystyle{ \mu \ \ : }[/math]
- [math]\displaystyle{ CL=\frac{\int_{0}^{\infty }\int_{-\infty }^{{{T}_{U}}-\sigma {{\Phi }^{-1}}(1-R)}L(\sigma ,\mu )\tfrac{1}{\sigma }d\mu d\sigma }{\int_{0}^{\infty }\int_{-\infty }^{\infty }L(\sigma ,\mu )\tfrac{1}{\sigma }d\mu d\sigma } }[/math]
The same method can be applied for one-sided lower bounds and two-sided bounds on time.
Bounds on Reliability (Type 2)
The one-sided upper bound on reliability is:
- [math]\displaystyle{ CL=\underset{}{\overset{}{\mathop{\Pr }}}\,(R\le {{R}_{U}})=\underset{}{\overset{}{\mathop{\Pr }}}\,(\mu \le T-\sigma {{\Phi }^{-1}}(1-{{R}_{U}})) }[/math]
From the posterior distribution of [math]\displaystyle{ \mu \ \ : }[/math]
- [math]\displaystyle{ CL=\frac{\int_{0}^{\infty }\int_{-\infty }^{T-\sigma {{\Phi }^{-1}}(1-{{R}_{U}})}L(\sigma ,\mu )\tfrac{1}{\sigma }d\mu d\sigma }{\int_{0}^{\infty }\int_{-\infty }^{\infty }L(\sigma ,\mu )\tfrac{1}{\sigma }d\mu d\sigma } }[/math]
The same method can be used to calculate the one-sided lower bounds and the two-sided bounds on reliability.
General Examples
Example 8
Six units are tested to failure with the following hours-to-failure data obtained: 12125, 11260, 12080, 12825, 13550 and 14670 hours. Assuming the data are normally distributed, do the following:
8-1. Find the parameters for the data. (Use Rank Regression on X to duplicate the results shown in this example.)
8-2. Obtain the probability plot for the data with 90%, two-sided Type 1 confidence bounds.
8-3. Obtain the [math]\displaystyle{ pdf }[/math] plot for these data.
Solutions to Example 8
8-1. The next figure shows the data as entered in Weibull++, as well as the calculated parameters.
8-2. Obtain the probability plot as before. To plot confidence bounds, from the Plot Options menu choose Confidence Bounds and then Show Confidence Bounds. On the Type and Settings page of the Confidence Bounds window, select Two Sided Bounds, make sure Type 1 is selected, and then enter 90 in the Confidence level, % box, and click OK, as shown next.
The following plot should appear on your screen:
8-3. From the Special Plot Type menu choose Pdf Plot. The following plot should appear on your screen.
Example 9
Using the data and results from the previous example (and RRX), do the following:
9-1. Using the Quick Calculation Pad, determine the reliability for a mission of 11,000 hours, as well as the upper and lower two-sided 90% confidence limits on this reliability.
9-2. Using the Quick Calculation Pad, determine the MTTF, as well as the upper and lower two-sided 90% confidence limits on this MTTF.
Solutions to Example 9
Both of these results are easily obtained from the QCP. The QCP with results for both cases is shown in the next two figures.
Example 10
Using the data from Example 8, and using the rank regression on X analysis method (RRX), obtain tabulated values for the failure rate for 10 different mission end times. The mission end times are 1,000 to 10,000 hours, using increments of 1,000 hours.
Solution to Example 10
This can be easily accomplished via the use of the Function Wizard, available in either Weibull++'s Reports or the General Spreadsheet. (For more information on these features, please refer to the Weibull++ User's Guide.) We will illustrate this using the General Spreadsheet.
First, click on Insert General Spread Sheet from the Folio menu.
Type Time in cell A1 and Failure Rate in cell B1. Then enter 1000 through 10000 in cells A2 to A11. Finally, place the cursor into cell B2, as shown next.
Open the Function Wizard by selecting Function Wizard from the Data menu or by clicking the Function Wizard icon.
Select FAILURERATE from the list of functions. Enter A2 for Time; this indicates that the time input for the equation will be obtained from the specified cell in the worksheet. To specify the existing Weibull++ analysis that the function result will be based on, click Select... to open the Select Folio/Data Sheet window and then navigate to the desired sheet.
Click OK to close the window and return to the Function Wizard. Click Add to Equation to update the box at the bottom of the window with the function code that will be inserted into the spreadsheet.
Click Insert to close the window and insert the function code into the active cell in the General Spreadsheet. Copy the function into cells B3 through B11. One way to do this is to position the mouse over the bottom right corner of cell B2 and when the cursor turns into a plus symbol (+), click and drag the mouse to cell B11. By selecting one of the cells that you copied the function into, you can see that the cell reference was updated to match the current row, as shown next with cell B11 selected. The results are as follows:
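The same tabulation can also be produced outside the application from the fitted parameters, since the failure rate is just the ratio of the pdf to the reliability function. In the sketch below (SciPy assumed), the parameter values are placeholders; replace them with the RRX estimates obtained in Example 8.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

mu_hat, sigma_hat = 12900.0, 1150.0   # placeholders; use the RRX estimates from Example 8

times = np.arange(1000, 10001, 1000)  # mission end times, 1,000 to 10,000 hours
failure_rate = norm.pdf(times, mu_hat, sigma_hat) / norm.sf(times, mu_hat, sigma_hat)

for t, fr in zip(times, failure_rate):
    print(t, fr)
</syntaxhighlight>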
Example 11
Eight units are being reliability tested and the following is a table of their times-to-failure:
Data point index | Last Inspected | State End Time |
1 | 30 | 32 |
2 | 32 | 35 |
3 | 35 | 37 |
4 | 37 | 40 |
5 | 42 | 42 |
6 | 45 | 45 |
7 | 50 | 50 |
8 | 55 | 55 |
Solution to Example 11
This is a sequence of interval times-to-failure. This data set can be entered into Weibull++ by creating a data sheet that can be used to analyze times-to-failure data with interval and left censored data.
The computed parameters for maximum likelihood are:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 41.40 \\ & {{{\hat{\sigma }}}_{T}}= & 7.740. \end{align} }[/math]
For rank regression on x:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 41.40 \\ & {{{\hat{\sigma }}}_{T}}= & 9.03. \end{align} }[/math]
For rank regression on y:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 41.39 \\ & {{{\hat{\sigma }}}_{T}}= & 9.25. \end{align} }[/math]
A plot of the MLE solution is shown next.
Example 12
Eight units are being reliability tested and the following is a table of their times-to-failure:
Data point index | State F or S | State End Time |
1 | F | 2 |
2 | F | 5 |
3 | F | 11 |
4 | F | 23 |
5 | F | 29 |
6 | F | 37 |
7 | F | 43 |
8 | F | 59 |
Solution to Example 12
This data set can be entered into Weibull++ by creating a Data Sheet appropriate for the entry of non-grouped times-to-failure data. The computed parameters for maximum likelihood are:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 26.13 \\ & {{{\hat{\sigma }}}_{T}}= & 18.57 \end{align} }[/math]
For rank regression on x:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 26.13 \\ & {{{\hat{\sigma }}}_{T}}= & 21.64 \end{align} }[/math]
For rank regression on y:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 26.13 \\ & {{{\hat{\sigma }}}_{T}}= & 22.28. \end{align} }[/math]
Example 13
Nineteen units are being reliability tested and the following is a table of their times-to-failure and suspensions.
Data point index | State F or S | State End Time |
1 | F | 2 |
2 | S | 3 |
3 | F | 5 |
4 | S | 7 |
5 | F | 11 |
6 | S | 13 |
7 | S | 17 |
8 | S | 19 |
9 | F | 23 |
10 | F | 29 |
11 | S | 31 |
12 | F | 37 |
13 | S | 41 |
14 | F | 43 |
15 | S | 47 |
16 | S | 53 |
17 | F | 59 |
18 | S | 61 |
19 | S | 67 |
Solution to Example 13
This augments the previous example by adding eleven suspensions to the data set. This data set can be entered into Weibull++ by selecting the data sheet for Times to Failure and with Right Censored Data (Suspensions). The parameters using maximum likelihood are:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 48.07 \\ & {{{\hat{\sigma }}}_{T}}= & 28.41. \end{align} }[/math]
For rank regression on x:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 46.40 \\ & {{{\hat{\sigma }}}_{T}}= & 28.64. \end{align} }[/math]
For rank regression on y:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 47.34 \\ & {{{\hat{\sigma }}}_{T}}= & 29.96. \end{align} }[/math]
Example 14
Suppose our data set includes left and right censored, interval censored and complete data as shown in the following table.
Solution to Example 14
This data set can be entered into Weibull++ by selecting the data type Times to Failure, with Right Censored Data (Suspensions), with Interval and Left Censored Data and with Grouped Observations.
The computed parameters using maximum likelihood are:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 48.11 \\ & {{{\hat{\sigma }}}_{T}}= & 26.42. \end{align} }[/math]
For rank regression on x:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 49.99 \\ & {{{\hat{\sigma }}}_{T}}= & 30.17. \end{align} }[/math]
For rank regression on y:
- [math]\displaystyle{ \begin{align} & \widehat{\mu }= & 51.61 \\ & {{{\hat{\sigma }}}_{T}}= & 33.07. \end{align} }[/math]