Weibull Parameter Estimation

Estimation of the Weibull Parameters
The estimates of the parameters of the Weibull distribution can be found graphically via probability plotting paper, or analytically, using either least squares (rank regression) or maximum likelihood estimation (MLE).

Probability Plotting
One method of calculating the parameters of the Weibull distribution is by using probability plotting. To better illustrate this procedure, consider the following example from Kececioglu [20].

Assume that six identical units are being reliability tested at the same application and operation stress levels. All of these units fail during the test after operating the following number of hours: 93, 34, 16, 120, 53 and 75. Estimate the values of the parameters for a 2-parameter Weibull distribution and determine the reliability of the units at a time of 15 hours.

Solution

The steps for determining the parameters of the Weibull representing the data, using probability plotting, are outlined in the following instructions. First, rank the times-to-failure in ascending order as shown next.

Obtain their median rank plotting positions. Median rank positions are used instead of other ranking methods because median ranks are at a specific confidence level (50%). Median ranks can be found tabulated in many reliability books. They can also be estimated using the following equation:


 * $$ MR \sim { \frac{i-0.3}{N+0.4}}\cdot 100 \,\!$$

where $$i\,\!$$ is the failure order number and $$N\,\!$$ is the total sample size. The exact median ranks are found in Weibull++ by solving:


 * $$\sum_{k=i}^N{\binom{N}{k}}{MR^k}{(1-MR)^{N-k}}=0.5=50\% \,\!$$

for $$MR\,\!$$, where $$N\,\!$$ is the sample size and $$i\,\!$$ the order number. The times-to-failure, with their corresponding median ranks, are shown next.
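The two ways of obtaining median ranks described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the routine used by Weibull++; the function names are hypothetical, and the exact ranks are found by simple bisection on the cumulative binomial equation.

```python
from math import comb

def benard_median_rank(i, n):
    """Approximate median rank (i - 0.3) / (n + 0.4), returned as a fraction."""
    return (i - 0.3) / (n + 0.4)

def exact_median_rank(i, n, tol=1e-12):
    """Solve sum_{k=i}^{n} C(n,k) MR^k (1-MR)^(n-k) = 0.5 for MR by bisection."""
    def tail(p):
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(i, n + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tail(mid) < 0.5:      # the tail probability increases with MR
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Times-to-failure from the example, ranked in ascending order
times = sorted([93, 34, 16, 120, 53, 75])
n = len(times)
print(" i  time  exact MR  approx MR")
for i, t in enumerate(times, start=1):
    exact = 100 * exact_median_rank(i, n)
    approx = 100 * benard_median_rank(i, n)
    print(f"{i:2d}  {t:4d}  {exact:7.2f}%  {approx:8.2f}%")
```

For six units the approximation agrees with the exact ranks to within a few hundredths of a percentage point, which is usually well inside the accuracy of a hand-drawn plot.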

On a Weibull probability paper, plot the times and their corresponding ranks. A sample of a Weibull probability paper is given in the following figure.

The points of the data in the example are shown in the figure below. Draw the best possible straight line through these points, as shown below, then obtain the slope of this line by drawing a line, parallel to the one just obtained, through the slope indicator. This value is the estimate of the shape parameter $$ \hat{\beta } \,\!$$, in this case $$ \hat{\beta }=1.4 \,\!$$.



At the $$ Q(t)=63.2\%\,\!$$ ordinate point, draw a straight horizontal line until this line intersects the fitted straight line. Draw a vertical line through this intersection until it crosses the abscissa. The value at the intersection of the abscissa is the estimate of $$ \hat{\eta } \,\!$$. For this case, $$ \hat{\eta }=76 \,\!$$ hours. The ordinate is always 63.2% because, at $$ t=\eta \,\!$$:


 * $$ Q(t)=1-e^{-(\frac{t}{\eta })^{\beta }}=1-e^{-1}=0.632=63.2\% \,\!$$

Now the reliability for any mission time $$t\,\!$$ can be obtained, either from the plot or analytically. For example, to obtain the reliability for a mission of 15 hours from the plot, draw a vertical line from the abscissa, at $$t=15\,\!$$ hours, to the fitted line. Draw a horizontal line from this intersection to the ordinate and read $$ Q(t)\,\!$$, in this case $$ Q(t)=9.8\%\,\!$$. Thus, $$ R(t)=1-Q(t)=90.2\%\,\!$$. This can also be obtained analytically from the Weibull reliability function, since the estimates of both parameters are known:


 * $$ R(t=15)=e^{-\left( \frac{15}{\eta }\right) ^{\beta }}=e^{-\left( \frac{15}{76 }\right) ^{1.4}}=90.2\% \,\!$$
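The analytical check is easy to reproduce. The short Python sketch below evaluates the 2-parameter Weibull reliability function with the plot estimates; the function name is hypothetical.

```python
from math import exp

def weibull_reliability(t, beta, eta):
    """2-parameter Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return exp(-((t / eta) ** beta))

# Estimates read from the probability plot in the example
r15 = weibull_reliability(15, beta=1.4, eta=76)
print(f"R(15) = {r15:.3f}, Q(15) = {1 - r15:.3f}")
```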

Probability Plotting for the Location Parameter, Gamma
The third parameter of the Weibull distribution is utilized when the data do not fall on a straight line, but fall on either a concave up or down curve. The following statements can be made regarding the value of $$\gamma \,\!$$:


 * Case 1: If the curve for MR versus $${{t}_{j}}\,\!$$ is concave down and the curve for MR versus $${({t}_{j}-{t}_{1})}\,\!$$ is concave up, then there exists a $$\gamma \,\!$$ such that $$0< \gamma < t_{1}\,\!$$, or $$\gamma \,\!$$ has a positive value.


 * Case 2: If the curves for MR versus $${{t}_{j}}\,\!$$ and MR versus $${({t}_{j}-{t}_{1})}\,\!$$ are both concave up, then there exists a negative $$\gamma \,\!$$ which will straighten out the curve of MR versus $${{t}_{j}}\,\!$$.


 * Case 3: If neither one of the previous two cases prevails, then either reject the Weibull as one capable of representing the data, or proceed with the multiple population (mixed Weibull) analysis.

To obtain the location parameter, $$\gamma \,\!$$:


 * Subtract the same arbitrary value, $$\gamma \,\!$$, from all the times to failure and replot the data.
 * If the initial curve is concave up, subtract a negative $$\gamma \,\!$$ from each failure time.
 * If the initial curve is concave down, subtract a positive $$\gamma \,\!$$ from each failure time.
 * Repeat until the data plots on an acceptable straight line.
 * The value of $$\gamma \,\!$$ is the subtracted (positive or negative) value that places the points in an acceptable straight line.

The other two parameters are then obtained using the techniques previously described. Note that the procedure speaks of subtracting a positive or negative gamma; subtracting a negative gamma is equivalent to adding its magnitude. Note also that when adjusting for gamma, the x-axis scale for the straight line becomes $${({t}-\gamma)}\,\!$$.
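The trial-and-error search for $$\gamma \,\!$$ can be mimicked numerically. The sketch below is an illustration only: it assumes that "an acceptable straight line" can be judged by the correlation coefficient of the linearized points, which is a convenient stand-in for visual inspection and not a method prescribed here. The function names and the demonstration data (built with a shift of 20, scale 50 and shape 2) are hypothetical.

```python
from math import exp, log, sqrt

def benard_y(n):
    """Plotting ordinates ln(-ln(1 - MR)) using the approximate median ranks."""
    return [log(-log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]

def straightness(times, gamma):
    """Correlation between ln(t - gamma) and the plotting ordinates."""
    ts = sorted(times)
    xs = [log(t - gamma) for t in ts]
    ys = benard_y(len(ts))
    n = len(ts)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def search_gamma(times, candidates):
    """Return the candidate gamma (below the first failure) giving the straightest plot."""
    feasible = [g for g in candidates if g < min(times)]
    return max(feasible, key=lambda g: straightness(times, g))

# Hypothetical data lying exactly on a 3-parameter Weibull line with gamma = 20
demo = [20 + 50 * exp(y / 2) for y in benard_y(6)]
best = search_gamma(demo, range(-30, 31, 5))
print("best gamma on the grid:", best)
```

A coarse grid like this only brackets the value; in practice the grid would be refined around the best candidate, just as the manual procedure repeats until the line is acceptable.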

Rank Regression on Y
Performing rank regression on Y requires that a straight line mathematically be fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized. This is in essence the same methodology as the probability plotting method, except that we use the principle of least squares to determine the line through the points, as opposed to just eyeballing it. The first step is to bring our function into a linear form. For the two-parameter Weibull distribution, the cdf (cumulative distribution function) is:


 * $$ F(t)=1-e^{-\left( \frac{t}{\eta }\right) ^{\beta }} \,\!$$

Rearranging and taking the natural logarithm of both sides of the equation yields:


 * $$\ln[ 1-F(t)] =-( \frac{t}{\eta }) ^{\beta } \,\!$$

Negating both sides and taking the natural logarithm once more gives:


 * $$ \ln \{ -\ln[ 1-F(t)]\} =\beta \ln ( \frac{t}{ \eta }) \,\!$$

or:


 * $$\begin{align}

\ln \{ -\ln[ 1-F(t)]\} =-\beta \ln (\eta )+\beta \ln (t) \end{align}\,\!$$

Now let:


 * $$\begin{align}

y = \ln \{ -\ln[ 1-F(t)]\} \end{align}\,\!$$


 * $$\begin{align}

a = - \beta \ln(\eta) \end{align}\,\!$$

and:


 * $$\begin{align}

b= \beta \end{align}\,\!$$

which, with $$x=\ln (t)\,\!$$, results in the linear equation of:


 * $$\begin{align}

y=a+bx \end{align}\,\!$$

The least squares parameter estimation method (also known as regression analysis) was discussed in Parameter Estimation, and the following equations for regression on Y were derived:


 * $$ \hat{a}=\frac{\sum\limits_{i=1}^{N}y_{i}}{N}-\hat{b}\frac{ \sum\limits_{i=1}^{N}x_{i}}{N}=\bar{y}-\hat{b}\bar{x} \,\!$$

and:


 * $$ \hat{b}={\frac{\sum\limits_{i=1}^{N}x_{i}y_{i}-\frac{\sum \limits_{i=1}^{N}x_{i}\sum\limits_{i=1}^{N}y_{i}}{N}}{\sum \limits_{i=1}^{N}x_{i}^{2}-\frac{\left( \sum\limits_{i=1}^{N}x_{i}\right) ^{2}}{N}}} \,\!$$

In this case the equations for $${{y}_{i}}\,\!$$ and $${{x}_{i}}\,\!$$ are:


 * $$ y_{i}=\ln \left\{ -\ln [1-F(t_{i})]\right\} \,\!$$

and:
 * $$\begin{align}

x_{i}=\ln(t_{i}) \end{align}\,\!$$

The $$ F(t_{i})\,\!$$ values are estimated from the median ranks.

Once $$ \hat{a} \,\!$$ and $$ \hat{b} \,\!$$ are obtained, then $$ \hat{\beta } \,\!$$ and $$ \hat{\eta } \,\!$$ can easily be obtained from previous equations.

The Correlation Coefficient

The correlation coefficient is defined as follows:


 * $$ \rho ={\frac{\sigma _{xy}}{\sigma _{x}\sigma _{y}}} \,\!$$

where $$\sigma_{xy}\,\!$$ = covariance of $$x\,\!$$ and $$y\,\!$$, $$\sigma_{x}\,\!$$ = standard deviation of $$x\,\!$$, and $$\sigma_{y}\,\!$$ = standard deviation of $$y\,\!$$. The estimator of $$\rho\,\!$$ is the sample correlation coefficient, $$ \hat{\rho} \,\!$$, given by:


 * $$ \hat{\rho}=\frac{\sum\limits_{i=1}^{N}(x_{i}-\overline{x})(y_{i}-\overline{y} )}{\sqrt{\sum\limits_{i=1}^{N}(x_{i}-\overline{x})^{2}\cdot \sum\limits_{i=1}^{N}(y_{i}-\overline{y})^{2}}}\,\!$$

RRY Example
Consider the same data set from the probability plotting example given above (with six failures at 16, 34, 53, 75, 93 and 120 hours). Estimate the parameters and the correlation coefficient using rank regression on Y, assuming that the data follow the 2-parameter Weibull distribution.

Solution

Construct a table as shown next.
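The table can be regenerated from the raw data. The Python sketch below is an illustration that assumes exact median ranks (found by bisection) for $$F(t_{i})\,\!$$; the helper name `median_rank` is hypothetical. It lists, for each failure, $$\ln t_{i}\,\!$$, $$F(t_{i})\,\!$$, $$y_{i}\,\!$$ and the squares and cross products needed for the regression, followed by the column sums.

```python
from math import comb, log

def median_rank(i, n, tol=1e-12):
    """Exact median rank from the cumulative binomial equation, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(i, n + 1))
        lo, hi = (p, hi) if tail < 0.5 else (lo, p)
    return (lo + hi) / 2

times = [16, 34, 53, 75, 93, 120]
n = len(times)
rows = []
for i, t in enumerate(times, start=1):
    f = median_rank(i, n)
    x, y = log(t), log(-log(1 - f))
    rows.append((i, t, x, f, y, x * x, y * y, x * y))

print(" i    t   ln(t)    F(t)       y  ln(t)^2     y^2  ln(t)*y")
for r in rows:
    print("{:2d} {:4d} {:7.4f} {:7.4f} {:7.4f} {:8.4f} {:7.4f} {:8.4f}".format(*r))

# Column sums: ln(t), y, ln(t)^2, y^2, ln(t)*y
sums = [sum(r[c] for r in rows) for c in (2, 4, 5, 6, 7)]
print("sums: ln(t)={:.4f}  y={:.4f}  ln(t)^2={:.4f}  y^2={:.4f}  ln(t)*y={:.4f}".format(*sums))
```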

Utilizing the values from the table, calculate $$ \hat{a} \,\!$$ and $$ \hat{b} \,\!$$ using the following equations:
 * $$ \hat{b} =\frac{\sum\limits_{i=1}^{6}(\ln t_{i})y_{i}-(\sum\limits_{i=1}^{6}\ln t_{i})(\sum\limits_{i=1}^{6}y_{i})/6}{ \sum\limits_{i=1}^{6}(\ln t_{i})^{2}-(\sum\limits_{i=1}^{6}\ln t_{i})^{2}/6} \,\!$$


 * $$ \hat{b}=\frac{-8.0699-(23.9068)(-3.0070)/6}{97.9909-(23.9068)^{2}/6} \,\!$$

or:


 * $$ \hat{b}=1.4301 \,\!$$

and:


 * $$ \hat{a}=\overline{y}-\hat{b}\overline{x}=\frac{\sum \limits_{i=1}^{N}y_{i}}{N}-\hat{b}\frac{\sum\limits_{i=1}^{N}\ln t_{i}}{N } \,\!$$

or:


 * $$ \hat{a}=\frac{(-3.0070)}{6}-(1.4301)\frac{23.9068}{6}=-6.19935 \,\!$$

Therefore:


 * $$ \hat{\beta }=\hat{b}=1.4301 \,\!$$

and:


 * $$ \hat{\eta }=e^{-\frac{\hat{a}}{\hat{b}}}=e^{-\frac{(-6.19935)}{ 1.4301}} \,\!$$

or:


 * $$ \hat{\eta }=76.318\text{ hr} \,\!$$

The correlation coefficient can be estimated as:


 * $$ \hat{\rho }=0.9956 \,\!$$
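The whole RRY calculation can be reproduced from the raw failure times. The following Python sketch is an illustration, not the Weibull++ implementation; it assumes exact median ranks for $$F(t_{i})\,\!$$, and the function names are hypothetical.

```python
from math import comb, exp, log, sqrt

def median_rank(i, n, tol=1e-12):
    """Exact median rank from the cumulative binomial equation, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(i, n + 1))
        lo, hi = (p, hi) if tail < 0.5 else (lo, p)
    return (lo + hi) / 2

def rank_regression_y(times):
    """Least squares fit of y = a + b*x with x = ln(t), y = ln(-ln(1 - F))."""
    ts = sorted(times)
    n = len(ts)
    xs = [log(t) for t in ts]
    ys = [log(-log(1 - median_rank(i, n))) for i in range(1, n + 1)]
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sx * sy / n
    sxx = sum(x * x for x in xs) - sx**2 / n
    syy = sum(y * y for y in ys) - sy**2 / n
    b = sxy / sxx                   # slope, minimizing vertical deviations
    a = sy / n - b * sx / n         # intercept
    beta = b                        # b = beta
    eta = exp(-a / b)               # a = -beta * ln(eta)
    rho = sxy / sqrt(sxx * syy)     # sample correlation coefficient
    return beta, eta, rho

beta, eta, rho = rank_regression_y([16, 34, 53, 75, 93, 120])
print(f"beta = {beta:.4f}, eta = {eta:.3f} hr, rho = {rho:.4f}")
```

Small differences in the last digit relative to the hand calculation come from rounding the table entries to four decimals.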

This example can be repeated in the Weibull++ software. The following plot shows the Weibull probability plot for the data set (with 90% two-sided confidence bounds).



If desired, the Weibull pdf representing the data set can be written as:


 * $$ f(t)={\frac{\beta }{\eta }}\left( {\frac{t}{\eta }}\right) ^{\beta -1}e^{-\left( {\frac{t}{\eta }}\right) ^{\beta }} \,\!$$

or:


 * $$ f(t)={\frac{1.4302}{76.317}}\left( {\frac{t}{76.317}}\right) ^{0.4302}e^{-\left( {\frac{t}{76.317}}\right) ^{1.4302}} \,\!$$

You can also plot this result in Weibull++, as shown next. From this point on, different results, reports and plots can be obtained.



Rank Regression on X
Performing a rank regression on X is similar to the process for rank regression on Y, with the difference being that the horizontal deviations from the points to the line are minimized rather than the vertical. Again, the first task is to bring the reliability function into a linear form. This step is exactly the same as in the regression on Y analysis and all the equations apply in this case too. The derivation departs from the previous analysis at the least squares fit, where in this case we treat $$x\,\!$$ as the dependent variable and $$y\,\!$$ as the independent variable. The best-fitting straight line to the data, for regression on X (see Parameter Estimation), is the straight line:


 * $$ x= \hat{a}+\hat{b}y \,\!$$

The corresponding equations for $$ \hat{a} \,\!$$ and $$ \hat{b} \,\!$$ are:


 * $$ \hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\sum\limits_{i=1}^{N}x_{i}}{N} -\hat{b}\frac{\sum\limits_{i=1}^{N}y_{i}}{N} \,\!$$

and:


 * $$ \hat{b}={\frac{\sum\limits_{i=1}^{N}x_{i}y_{i}-\frac{\sum \limits_{i=1}^{N}x_{i}\sum\limits_{i=1}^{N}y_{i}}{N}}{\sum \limits_{i=1}^{N}y_{i}^{2}-\frac{\left( \sum\limits_{i=1}^{N}y_{i}\right) ^{2}}{N}}} \,\!$$

where:


 * $$ y_{i}=\ln \left\{ -\ln [1-F(t_{i})]\right\} \,\!$$

and:


 * $$\begin{align}

x_{i}=\ln (t_{i}) \end{align}\,\!$$

and the $$F({{t}_{i}})\,\!$$ values are again obtained from the median ranks.

Once $$ \hat{a} \,\!$$ and $$ \hat{b} \,\!$$ are obtained, solve the linear equation for $$y\,\!$$, which corresponds to:


 * $$ y=-\frac{\hat{a}}{\hat{b}}+\frac{1}{\hat{b}}x \,\!$$

Solving for the parameters from the above equations, we get:


 * $$ a=-\frac{\hat{a}}{\hat{b}}=-\beta \ln (\eta )\,\!$$

and


 * $$ b=\frac{1}{\hat{b}}=\beta\,\!$$

The correlation coefficient is evaluated as before.

RRX Example
Again using the same data set from the probability plotting and RRY examples (with six failures at 16, 34, 53, 75, 93 and 120 hours), calculate the parameters using rank regression on X.

Solution

The same table constructed above for the RRY example can also be applied for RRX.

Using the values from this table we get:


 * $$ \hat{b} ={\frac{\sum\limits_{i=1}^{6}(\ln T_{i})y_{i}-\frac{ \sum\limits_{i=1}^{6}\ln T_{i}\sum\limits_{i=1}^{6}y_{i}}{6}}{ \sum\limits_{i=1}^{6}y_{i}^{2}-\frac{\left( \sum\limits_{i=1}^{6}y_{i}\right) ^{2}}{6}}} \,\!$$


 * $$\hat{b} =\frac{-8.0699-(23.9068)(-3.0070)/6}{7.1502-(-3.0070)^{2}/6} \,\!$$

or:


 * $$ \hat{b}=0.6931 \,\!$$

and:


 * $$ \hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\sum\limits_{i=1}^{6}\ln T_{i} }{6}-\hat{b}\frac{\sum\limits_{i=1}^{6}y_{i}}{6} \,\!$$

or:


 * $$ \hat{a}=\frac{23.9068}{6}-(0.6931)\frac{(-3.0070)}{6}=4.3318 \,\!$$

Therefore:


 * $$ \hat{\beta }=\frac{1}{\hat{b}}=\frac{1}{0.6931}=1.4428 \,\!$$

and:


 * $$ \hat{\eta }=e^{\frac{\hat{a}}{\hat{b}}\cdot \frac{1}{\hat{ \beta }}}=e^{\frac{4.3318}{0.6931}\cdot \frac{1}{1.4428}}=76.0811\text{ hr} \,\!$$

The correlation coefficient is:


 * $$ \hat{\rho }=0.9956 \,\!$$
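As a quick check of the arithmetic, the RRX estimates can be recomputed directly from the column sums quoted in the two regression examples. The Python sketch below is only an illustration; the variable names are hypothetical, and because it starts from sums rounded to four decimals its results agree with the values above only to about that precision.

```python
from math import exp, sqrt

n = 6
# Column sums quoted in the RRY and RRX examples
sum_x, sum_y = 23.9068, -3.0070             # sum of ln(t_i), sum of y_i
sum_xy, sum_xx, sum_yy = -8.0699, 97.9909, 7.1502

# Corrected sums of squares and cross products
sxy = sum_xy - sum_x * sum_y / n
sxx = sum_xx - sum_x**2 / n
syy = sum_yy - sum_y**2 / n

# Regression on X: x = a_hat + b_hat * y (horizontal deviations minimized)
b_hat = sxy / syy
a_hat = sum_x / n - b_hat * sum_y / n

beta = 1 / b_hat                        # b = 1 / b_hat = beta
eta = exp(a_hat / (b_hat * beta))       # from -a_hat / b_hat = -beta * ln(eta)
rho = sxy / sqrt(sxx * syy)             # same value for RRX and RRY

print(f"b_hat = {b_hat:.4f}, a_hat = {a_hat:.4f}")
print(f"beta = {beta:.4f}, eta = {eta:.2f} hr, rho = {rho:.4f}")
```

Since $$\hat{b}\hat{\beta }=1\,\!$$, the expression for $$\hat{\eta }\,\!$$ reduces to $$e^{\hat{a}}\,\!$$; the longer form is kept in the code to mirror the equation above.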

The results and the associated graph using Weibull++ are shown next. Note that the slight variation in the results is due to the number of significant figures used in the estimation of the median ranks. Weibull++ by default uses double precision accuracy when computing the median ranks.



3-Parameter Weibull Regression
When the MR versus $${{t}_{j}}\,\!$$ points plotted on the Weibull probability paper do not fall on a satisfactory straight line and the points fall on a curve, then a location parameter, $$\gamma\,\!$$, might exist which may straighten out these points. The goal in this case is to fit a curve, instead of a line, through the data points using nonlinear regression. The Gauss-Newton method can be used to solve for the parameters, $$\beta\,\!$$, $$\eta\,\!$$ and $$\gamma\,\!$$, by performing a Taylor series expansion on $$F(t{_{i}};\beta ,\eta, \gamma )\,\!$$. Then the nonlinear model is approximated with linear terms and ordinary least squares are employed to estimate the parameters. This procedure is iterated until a satisfactory solution is reached.

(Note that other shapes, particularly S shapes, might suggest the existence of more than one population. In these cases, the multiple population mixed Weibull distribution may be more appropriate.)

When you use the 3-parameter Weibull distribution, Weibull++ calculates the value of $$\gamma\,\!$$ by utilizing an optimized Nelder-Mead algorithm and adjusts the points by this value of $$\gamma\,\!$$ such that they fall on a straight line, and then plots both the adjusted and the original unadjusted points. To draw a curve through the original unadjusted points, if so desired, select Weibull 3P Line Unadjusted for Gamma from the Show Plot Line submenu under the Plot Options menu. The returned estimations of the parameters are the same when selecting RRX or RRY. To display the unadjusted data points and line along with the adjusted data points and line, select Show/Hide Items under the Plot Options menu and include the unadjusted data points and line as follows:





The results and the associated graph for the previous example using the 3-parameter Weibull case are shown next:



Maximum Likelihood Estimation
As outlined in Parameter Estimation, maximum likelihood estimation works by developing a likelihood function based on the available data and finding the values of the parameter estimates that maximize the likelihood function. This can be achieved by using iterative methods to determine the parameter estimate values that maximize the likelihood function, but this can be rather difficult and time-consuming, particularly when dealing with the three-parameter distribution. Another method of finding the parameter estimates involves taking the partial derivatives of the likelihood function with respect to the parameters, setting the resulting equations equal to zero and solving simultaneously to determine the values of the parameter estimates. (Note that MLE asymptotic properties do not hold when estimating $$\gamma\,\!$$ using MLE, as discussed in Meeker and Escobar [27].) The log-likelihood functions and associated partial derivatives used to determine maximum likelihood estimates for the Weibull distribution are covered in Appendix D.

MLE Example
One last time, use the same data set from the probability plotting, RRY and RRX examples (with six failures at 16, 34, 53, 75, 93 and 120 hours) and calculate the parameters using MLE.

Solution

In this case, we have non-grouped data with no suspensions or intervals (i.e., complete data). The equations for the partial derivatives of the log-likelihood function are derived in an appendix and given next:
 * $$ \frac{\partial \Lambda }{\partial \beta }=\frac{6}{\beta } +\sum_{i=1}^{6}\ln \left( \frac{T_{i}}{\eta }\right) -\sum_{i=1}^{6}\left( \frac{T_{i}}{\eta }\right) ^{\beta }\ln \left( \frac{T_{i}}{\eta }\right) =0 \,\!$$

And:


 * $$ \frac{\partial \Lambda }{\partial \eta }=\frac{-\beta }{\eta }\cdot 6+\frac{ \beta }{\eta }\sum\limits_{i=1}^{6}\left( \frac{T_{i}}{\eta }\right) ^{\beta }=0 \,\!$$

Solving the above equations simultaneously we get:


 * $$ \hat{\beta }=1.933,\,\!$$ $$\hat{\eta }=73.526 \,\!$$
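For complete data the two equations can be reduced to a single equation in $$\beta \,\!$$: the second equation gives $$\eta ^{\beta }=\frac{1}{N}\sum T_{i}^{\beta }\,\!$$, and substituting this into the first leaves $$\frac{\sum T_{i}^{\beta }\ln T_{i}}{\sum T_{i}^{\beta }}-\frac{1}{\beta }-\frac{1}{N}\sum \ln T_{i}=0\,\!$$. The Python sketch below solves that equation by bisection. It is an illustration under these assumptions (complete data only, hypothetical function name, arbitrary search bracket), not the solver used by Weibull++.

```python
from math import log

def weibull_mle_complete(times, lo=0.01, hi=20.0, tol=1e-10):
    """MLE of (beta, eta) for complete (uncensored, ungrouped) data."""
    n = len(times)
    logs = [log(t) for t in times]
    mean_log = sum(logs) / n

    def g(beta):
        # Profile likelihood equation in beta; increasing in beta
        w = [t ** beta for t in times]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / beta - mean_log

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2
    eta = (sum(t ** beta for t in times) / n) ** (1.0 / beta)
    return beta, eta

beta, eta = weibull_mle_complete([16, 34, 53, 75, 93, 120])
print(f"beta = {beta:.3f}, eta = {eta:.3f}")
```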

The variance/covariance matrix is found to be:


 * $$ \left[ \begin{array}{cc} \hat{Var}\left( \hat{\beta }\right) =0.4211 & \hat{Cov}( \hat{\beta },\hat{\eta })=3.272 \\ \hat{Cov}(\hat{\beta },\hat{\eta })=3.272 & \hat{Var} \left( \hat{\eta }\right) =266.646 \end{array} \right] \,\!$$
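The matrix can be approximated by inverting the local Fisher information matrix, that is, the negative of the matrix of second partial derivatives of the log-likelihood evaluated at the estimates. The Python sketch below is an illustration for complete data; the second derivatives were obtained by differentiating the two first-derivative equations above, the function name is hypothetical, and the rounded estimates quoted in the text are used as inputs, so the output matches the matrix only to a few significant figures.

```python
from math import log

def weibull_covariance(times, beta, eta):
    """Var(beta), Cov(beta, eta), Var(eta) from the inverse local Fisher matrix."""
    n = len(times)
    z = [(t / eta) ** beta for t in times]
    u = [log(t / eta) for t in times]
    sz = sum(z)
    szu = sum(zi * ui for zi, ui in zip(z, u))
    szu2 = sum(zi * ui * ui for zi, ui in zip(z, u))

    # Second partial derivatives of the log-likelihood
    l_bb = -n / beta**2 - szu2
    l_ee = n * beta / eta**2 - beta * (beta + 1) * sz / eta**2
    l_be = -n / eta + sz / eta + beta * szu / eta

    # Fisher matrix is the negative Hessian; invert the 2x2 matrix directly
    f_bb, f_ee, f_be = -l_bb, -l_ee, -l_be
    det = f_bb * f_ee - f_be**2
    return f_ee / det, -f_be / det, f_bb / det

var_beta, cov, var_eta = weibull_covariance([16, 34, 53, 75, 93, 120], 1.933, 73.526)
print(f"Var(beta) = {var_beta:.4f}, Cov = {cov:.3f}, Var(eta) = {var_eta:.2f}")
```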

The results and the associated plot using Weibull++ (MLE) are shown next.



You can view the variance/covariance matrix directly by clicking the Analysis Summary table in the control panel. Note that the decimal accuracy displayed and used is based on your individual Application Setup.



Unbiased MLE $$\beta \,\!$$
It is well known that the MLE $$\beta \,\!$$ is biased. The bias affects the accuracy of reliability predictions, especially when the number of failures is small. Weibull++ provides a simple way to correct the bias of MLE $$\beta \,\!$$.

When there are no right censored observations in the data, the following equation provided by Hirose [39] is used to calculate the unbiased $$\beta \,\!$$.

$${{\beta }_{U}}=\frac{\beta }{1.0115+\frac{1.278}{r}+\frac{2.001}{r^{2}}+\frac{20.35}{r^{3}}-\frac{46.98}{r^{4}}}$$

where $$r\,\!$$ is the number of failures.

When there are right censored observations in the data, the following equation provided by Ross [40] is used to calculate the unbiased $$\beta\,\!$$.

$${{\beta }_{U}}=\frac{\beta }{1+\frac{1.37}{r-1.92}\sqrt{\frac{n}{r}}}$$

where $$n\,\!$$ is the number of observations.

The software will use the above equations only when there are more than two failures in the data set.
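The two corrections can be sketched as small Python functions. This is an illustration only: the function names are hypothetical, and the no-censoring formula is written on the assumption that its correction terms are divided by successive powers of $$r\,\!$$ ($$r\,\!$$, $$r^{2}\,\!$$, $$r^{3}\,\!$$ and $$r^{4}\,\!$$). Following the rule stated above, $$\beta \,\!$$ is returned unchanged when there are two or fewer failures. The censored case in the usage lines uses made-up counts (6 failures out of 10 observations) purely for demonstration.

```python
from math import sqrt

def unbiased_beta_complete(beta, r):
    """Bias-corrected beta when there are no right censored observations.

    ASSUMPTION: the denominators are r, r^2, r^3 and r^4.
    """
    if r <= 2:
        return beta          # correction applied only for more than two failures
    denom = 1.0115 + 1.278 / r + 2.001 / r**2 + 20.35 / r**3 - 46.98 / r**4
    return beta / denom

def unbiased_beta_censored(beta, r, n):
    """Bias-corrected beta with right censoring: r failures among n observations."""
    if r <= 2:
        return beta
    return beta / (1 + 1.37 / (r - 1.92) * sqrt(n / r))

# MLE example above: six failures, no suspensions
b_complete = unbiased_beta_complete(1.933, r=6)
# Hypothetical censored case: same beta, 6 failures among 10 units
b_censored = unbiased_beta_censored(1.933, r=6, n=10)
print(f"complete: {b_complete:.4f}, censored (hypothetical): {b_censored:.4f}")
```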