Lognormal Parameter Estimation

Probability Plotting
As described before, probability plotting involves plotting the failure times and associated unreliability estimates on specially constructed probability plotting paper. The form of this paper is based on a linearization of the cdf of the specific distribution. For the lognormal distribution, the cumulative distribution function can be written as:


 * $$F({t}')=\Phi \left( \frac{{t}'-{\mu }'}{{\sigma'}} \right)\,\!$$

or:


 * $${{\Phi }^{-1}}\left[ F({t}') \right]=-\frac{{\mu }'}{{\sigma'}}+\frac{1}{{\sigma'}}\cdot {t}'\,\!$$

where:


 * $$\Phi (x)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{x}{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt\,\!$$

Now, let:


 * $$y={{\Phi }^{-1}}\left[ F({t}') \right]\,\!$$


 * $$a=-\frac{{\mu }'}{{\sigma'}}\,\!$$

and:


 * $$b=\frac{1}{{\sigma'}}\,\!$$

which results in the linear equation of:


 * $$\begin{align}
y=a+b{t}' \end{align}\,\!$$
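As a rough illustration of this linearization, the following Python sketch maps a time and an unreliability estimate to the plotting coordinates, and recovers $${\mu }'\,\!$$ and $${\sigma'}\,\!$$ from an intercept and slope. The function names are illustrative assumptions and not part of any package; `statistics.NormalDist` from the Python standard library supplies $${{\Phi }^{-1}}\,\!$$.

```python
import math
from statistics import NormalDist

def plot_coordinates(t, F):
    """Return (t', y) with t' = ln(t) and y = Phi^{-1}(F)."""
    return math.log(t), NormalDist().inv_cdf(F)

def params_from_line(a, b):
    """Invert a = -mu'/sigma' and b = 1/sigma' (hypothetical helper)."""
    sigma_p = 1.0 / b
    mu_p = -a * sigma_p
    return mu_p, sigma_p

# An unreliability of 0.5 maps to y = 0 on the linearized scale.
x, y = plot_coordinates(20.0, 0.5)

# Arbitrary intercept and slope, chosen only to show the inversion.
mu_p, sigma_p = params_from_line(a=-3.0, b=1.5)
```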

The lognormal probability paper resulting from this linearized cdf function is shown next.

The process for reading the parameter estimate values from the lognormal probability plot is very similar to the method employed for the normal distribution (see The Normal Distribution). However, since the lognormal distribution models the natural logarithms of the times-to-failure, the parameter estimates must be read and calculated on a logarithmic scale rather than on the linear time scale used for the normal distribution. This parameter scale appears at the top of the lognormal probability plot.

The process of lognormal probability plotting is illustrated in the following example.

Rank Regression on Y
Performing a rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized.

The least squares parameter estimation method, or regression analysis, was discussed in Parameter Estimation and the following equations for regression on Y were derived, and are again applicable:


 * $$\hat{a}=\bar{y}-\hat{b}\bar{x}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}-\hat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}}{N}\,\!$$

and:


 * $$\hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,x_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}} \right)}^{2}}}{N}}\,\!$$

In our case the equations for $${{y}_{i}}\,\!$$ and $$x_{i}\,\!$$ are:


 * $${{y}_{i}}={{\Phi }^{-1}}\left[ F(t_{i}^{\prime }) \right]\,\!$$

and:


 * $${{x}_{i}}=t_{i}^{\prime }\,\!$$

where $$F(t_{i}^{\prime })\,\!$$ is estimated from the median ranks. Once $$\widehat{a}\,\!$$ and $$\widehat{b}\,\!$$ are obtained, the parameter estimates follow from the definitions of $$a\,\!$$ and $$b\,\!$$: $${\sigma'}=1/\widehat{b}\,\!$$ and $${\mu }'=-\widehat{a}\cdot {\sigma'}\,\!$$.
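The regression-on-Y formulas above can be sketched in a few lines of Python. This is a minimal illustration under the stated equations, with a hypothetical function name and made-up points that lie on a known line so the result is easy to check; it is not the implementation used by any particular software.

```python
def rank_regression_on_y(x, y):
    """Least squares fit of y = a + b*x, minimizing vertical deviations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    b = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
    a = sy / n - b * sx / n
    return a, b

# Illustrative points lying on y = -2 + 0.5*x (not data from this article).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [-2.0 + 0.5 * xi for xi in xs]
a_hat, b_hat = rank_regression_on_y(xs, ys)

# Lognormal parameters implied by the fitted line.
sigma_p = 1.0 / b_hat       # sigma' = 1/b
mu_p = -a_hat * sigma_p     # mu' = -a * sigma'
```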

RRY Example
Lognormal Distribution RRY Example

14 units were reliability tested and the following life test data were obtained:

Assuming the data follow a lognormal distribution, estimate the parameters and the correlation coefficient, $$\rho \,\!$$, using rank regression on Y.

Solution

Construct a table like the one shown next.

$$\text{Least Squares Analysis}\,\!$$

$$\begin{matrix}
i & t_{i} & F(t_{i}) & {t_{i}}' & y_{i} & {{t_{i}}'}^{2} & y_{i}^{2} & {t_{i}}' y_{i} \\
1 & 5 & 0.0483 & 1.6094 & -1.6619 & 2.5903 & 2.7619 & -2.6747 \\
2 & 10 & 0.1170 & 2.3026 & -1.1901 & 5.3019 & 1.4163 & -2.7403 \\
3 & 15 & 0.1865 & 2.7080 & -0.8908 & 7.3335 & 0.7935 & -2.4123 \\
4 & 20 & 0.2561 & 2.9957 & -0.6552 & 8.9744 & 0.4292 & -1.9627 \\
5 & 25 & 0.3258 & 3.2189 & -0.4512 & 10.3612 & 0.2036 & -1.4524 \\
6 & 30 & 0.3954 & 3.4012 & -0.2647 & 11.5681 & 0.0701 & -0.9004 \\
7 & 35 & 0.4651 & 3.5553 & -0.0873 & 12.6405 & 0.0076 & -0.3102 \\
8 & 40 & 0.5349 & 3.6889 & 0.0873 & 13.6078 & 0.0076 & 0.3219 \\
9 & 50 & 0.6046 & 3.9120 & 0.2647 & 15.3039 & 0.0701 & 1.0357 \\
10 & 60 & 0.6742 & 4.0943 & 0.4512 & 16.7637 & 0.2036 & 1.8474 \\
11 & 70 & 0.7439 & 4.2485 & 0.6552 & 18.0497 & 0.4292 & 2.7834 \\
12 & 80 & 0.8135 & 4.3820 & 0.8908 & 19.2022 & 0.7935 & 3.9035 \\
13 & 90 & 0.8830 & 4.4998 & 1.1901 & 20.2483 & 1.4163 & 5.3552 \\
14 & 100 & 0.9517 & 4.6052 & 1.6619 & 21.2076 & 2.7619 & 7.6533 \\
\sum_{}^{} & {} & {} & 49.2220 & 0 & 183.1531 & 11.3646 & 10.4473 \\
\end{matrix}\,\!$$

The median rank values ( $$F({{t}_{i}})\,\!$$ ) can be found in rank tables or by using the Quick Statistical Reference in Weibull++.
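For readers without rank tables at hand, one common definition of the median rank for the $$i\,\!$$-th failure out of $$N\,\!$$ is the value $$p\,\!$$ at which the upper tail of the binomial distribution equals 0.5. The sketch below assumes that definition and solves it by bisection; the function name is illustrative, and this is not a description of how Weibull++ or any rank table computes its values.

```python
from math import comb

def median_rank(i, n, tol=1e-12):
    """Solve sum_{k=i}^{n} C(n,k) p^k (1-p)^(n-k) = 0.5 for p by bisection."""
    def upper_tail(p):
        return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
                   for k in range(i, n + 1))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The upper tail increases with p, so move toward the 0.5 crossing.
        if upper_tail(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Median ranks for a sample of 14 failures, as in the example.
ranks = [median_rank(i, 14) for i in range(1, 15)]
```

Rounded to four decimals, the first and last values agree with the 0.0483 and 0.9517 listed in the table.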

The $${{y}_{i}}\,\!$$ values were obtained from the standardized normal distribution's area tables by entering the table with the median rank as $$F(z)\,\!$$ and reading off the corresponding $$z\,\!$$ value ( $${{y}_{i}}\,\!$$ ).
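The table lookup can also be done numerically. The following sketch applies the inverse standard normal cdf from Python's standard library to the median ranks listed in the table. Because the ranks are entered here to only four decimals, the results can differ from the tabulated $${{y}_{i}}\,\!$$ values in the fourth decimal place.

```python
from statistics import NormalDist

# Median ranks F(t_i) copied from the least squares table.
median_ranks = [0.0483, 0.1170, 0.1865, 0.2561, 0.3258, 0.3954, 0.4651,
                0.5349, 0.6046, 0.6742, 0.7439, 0.8135, 0.8830, 0.9517]

# y_i = Phi^{-1}[F(t_i)], the standard normal quantile of each median rank.
y = [NormalDist().inv_cdf(F) for F in median_ranks]
```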

Given the values in the table above, calculate $$\widehat{a}\,\!$$ and $$\widehat{b}\,\!$$:


 * $$\begin{align}
& \widehat{b}= & \frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime }{{y}_{i}}-(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime })(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}})/14}{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime 2}-{{(\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime })}^{2}}/14} \\ & &  \\  & \widehat{b}= & \frac{10.4473-(49.2220)(0)/14}{183.1531-{{(49.2220)}^{2}}/14} \end{align}\,\!$$

or:


 * $$\widehat{b}=1.0349\,\!$$

and:


 * $$\widehat{a}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}-\widehat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,t_{i}^{\prime }}{N}\,\!$$

or:


 * $$\widehat{a}=\frac{0}{14}-(1.0349)\frac{49.2220}{14}=-3.6386\,\!$$

Therefore:


 * $${\sigma'}=\frac{1}{\widehat{b}}=\frac{1}{1.0349}=0.9663\,\!$$

and:


 * $${\mu }'=-\widehat{a}\cdot {\sigma'}=-(-3.6386)\cdot 0.9663\,\!$$

or:


 * $$\begin{align}
{\mu }'=3.516 \end{align}\,\!$$
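The arithmetic of this example can be checked with a short script that starts from the column sums of the least squares table. The variable names are illustrative. Because the sums are rounded to four decimals, the results agree with the worked values to roughly three decimal places rather than exactly.

```python
# Column sums from the least squares table (N = 14).
n = 14
sum_x = 49.2220     # sum of t'_i = ln(t_i)
sum_y = 0.0         # sum of y_i
sum_xy = 10.4473    # sum of t'_i * y_i
sum_xx = 183.1531   # sum of t'_i squared

# Regression on Y: slope and intercept of y = a + b*t'.
b_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
a_hat = sum_y / n - b_hat * sum_x / n

# Lognormal parameters from the fitted line.
sigma_p = 1.0 / b_hat
mu_p = -a_hat * sigma_p

print(b_hat, a_hat, sigma_p, mu_p)
```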

The mean and the standard deviation of the lognormal distribution are obtained using equations in the Lognormal Distribution Functions section above:


 * $$\overline{T}=\mu ={{e}^{3.516+\tfrac{1}{2}{{0.9663}^{2}}}}=53.6707\text{ hours}\,\!$$

and:


 * $${\sigma}=\sqrt{({{e}^{2\cdot 3.516+{{0.9663}^{2}}}})({{e}^{{{0.9663}^{2}}}}-1)}=66.69\text{ hours}\,\!$$
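The two conversions from the log-scale parameters to the mean and standard deviation of the times-to-failure are easy to wrap in a helper. The sketch below uses a hypothetical function name and simply evaluates the two expressions above for $${\mu }'=3.516\,\!$$ and $${\sigma'}=0.9663\,\!$$.

```python
import math

def lognormal_mean_std(mu_p, sigma_p):
    """Mean and standard deviation of T when ln(T) ~ Normal(mu', sigma')."""
    mean = math.exp(mu_p + 0.5 * sigma_p ** 2)
    std = math.sqrt(math.exp(2 * mu_p + sigma_p ** 2)
                    * (math.exp(sigma_p ** 2) - 1))
    return mean, std

mean, std = lognormal_mean_std(3.516, 0.9663)
print(mean, std)
```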

The correlation coefficient can be estimated as:


 * $$\widehat{\rho }=0.9754\,\!$$
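The value above is not derived step by step in this section. Assuming the usual sample (Pearson) correlation coefficient between $$t_{i}^{\prime }\,\!$$ and $${{y}_{i}}\,\!$$, it can be reproduced from the column sums of the table as sketched below. The variable names are illustrative.

```python
import math

# Column sums from the least squares table (N = 14).
n = 14
sum_x, sum_y = 49.2220, 0.0
sum_xy, sum_xx, sum_yy = 10.4473, 183.1531, 11.3646

# Corrected sums of squares and cross products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n

# Sample correlation coefficient between t' and y.
rho_hat = s_xy / math.sqrt(s_xx * s_yy)
print(rho_hat)
```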

The above example can be repeated using Weibull++, using RRY.



The mean can be obtained from the QCP and both the mean and the standard deviation can be obtained from the Function Wizard.

Rank Regression on X
Performing a rank regression on X requires that a straight line be fitted to a set of data points such that the sum of the squares of the horizontal deviations from the points to the line is minimized.

Again, the first task is to bring our cdf function into a linear form. This step is exactly the same as in regression on Y analysis and all the equations apply in this case too. The deviation from the previous analysis begins on the least squares fit part, where in this case we treat $$x\,\!$$ as the dependent variable and $$y\,\!$$ as the independent variable. The best-fitting straight line to the data, for regression on X (see Parameter Estimation), is the straight line:


 * $$x=\widehat{a}+\widehat{b}y\,\!$$

The corresponding equations for $$\widehat{a}\,\!$$ and $$\widehat{b}\,\!$$ are:


 * $$\hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}}{N}-\hat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}\,\!$$

and:


 * $$\hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{N}}\,\!$$

where:


 * $${{y}_{i}}={{\Phi }^{-1}}\left[ F(t_{i}^{\prime }) \right]\,\!$$

and:


 * $${{x}_{i}}=t_{i}^{\prime }\,\!$$

and the $$F(t_{i}^{\prime })\,\!$$ is estimated from the median ranks. Once $$\widehat{a}\,\!$$ and $$\widehat{b}\,\!$$ are obtained, solve the linear equation for the unknown $$y\,\!$$, which corresponds to:


 * $$y=-\frac{\widehat{a}}{\widehat{b}}+\frac{1}{\widehat{b}}x\,\!$$

Solving for the parameters we get:


 * $$a=-\frac{\widehat{a}}{\widehat{b}}=-\frac{{\mu }'}{\sigma'}\,\!$$

and:


 * $$b=\frac{1}{\widehat{b}}=\frac{1}{\sigma'}\,\!$$

The correlation coefficient is evaluated as before, using the equation in the previous section.
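A minimal sketch of the regression-on-X fit is given below. Since the fitted line is $$x=\widehat{a}+\widehat{b}y\,\!$$, the relations above reduce to $${\sigma'}=\widehat{b}\,\!$$ and $${\mu }'=\widehat{a}\,\!$$. The function name and the three test points are illustrative assumptions, chosen to lie exactly on a known line.

```python
def rank_regression_on_x(x, y):
    """Least squares fit of x = a + b*y, minimizing horizontal deviations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    syy = sum(yi * yi for yi in y)
    b = (sxy - sx * sy / n) / (syy - sy ** 2 / n)
    a = sx / n - b * sy / n
    return a, b

# Illustrative points on x = 2 + 0.5*y (not data from this article).
ys = [-1.0, 0.0, 1.0]
xs = [2.0 + 0.5 * yi for yi in ys]
a_hat, b_hat = rank_regression_on_x(xs, ys)

sigma_p = b_hat   # sigma' equals the slope of x on y
mu_p = a_hat      # mu' equals the intercept of x on y
```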

RRX Example
Lognormal Distribution RRX Example

Using the same data set from the RRY example given above, and assuming a lognormal distribution, estimate the parameters and estimate the correlation coefficient, $$\rho \,\!$$, using rank regression on X.

Solution

The table constructed for the RRY example also applies to this example. Using the values in this table, we get:


 * $$\begin{align}
& \hat{b}= & \frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime }{{y}_{i}}-\tfrac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime }\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}}}{14}}{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{14}} \\ & &  \\  & \widehat{b}= & \frac{10.4473-(49.2220)(0)/14}{11.3646-{{(0)}^{2}}/14} \end{align}\,\!$$

or:


 * $$\widehat{b}=0.9193\,\!$$

and:


 * $$\hat{a}=\overline{x}-\hat{b}\overline{y}=\frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,t_{i}^{\prime }}{14}-\widehat{b}\frac{\underset{i=1}{\overset{14}{\mathop{\sum }}}\,{{y}_{i}}}{14}\,\!$$

or:


 * $$\widehat{a}=\frac{49.2220}{14}-(0.9193)\frac{(0)}{14}=3.5159\,\!$$

Therefore:


 * $${\sigma'}=\widehat{b}=0.9193\,\!$$

and:


 * $${\mu }'=\frac{\widehat{a}}{\widehat{b}}{\sigma'}=\frac{3.5159}{0.9193}\cdot 0.9193=3.5159\,\!$$

Using the equations for the mean and standard deviation from the Lognormal Distribution Functions section, we get:


 * $$\overline{T}=\mu =51.3393\text{ hours}\,\!$$

and:


 * $$\begin{align}
{\sigma}=59.1682\text{ hours}. \end{align}\,\!$$
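The RRX numbers can be reproduced from the same column sums used in the RRY example. The sketch below is only a check of the arithmetic, with illustrative variable names. Because the sums are rounded, the mean and standard deviation match the values above to about two decimal places.

```python
import math

# Column sums from the least squares table (N = 14).
n = 14
sum_x, sum_y = 49.2220, 0.0
sum_xy, sum_yy = 10.4473, 11.3646

# Regression on X: slope and intercept of t' = a + b*y.
b_hat = (sum_xy - sum_x * sum_y / n) / (sum_yy - sum_y ** 2 / n)
a_hat = sum_x / n - b_hat * sum_y / n

sigma_p = b_hat   # sigma' = b
mu_p = a_hat      # mu' = (a/b) * sigma' = a

# Mean and standard deviation of the times-to-failure.
mean = math.exp(mu_p + 0.5 * sigma_p ** 2)
# Equivalent to sqrt(exp(2*mu' + sigma'^2) * (exp(sigma'^2) - 1)).
std = mean * math.sqrt(math.exp(sigma_p ** 2) - 1)
print(sigma_p, mu_p, mean, std)
```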

The correlation coefficient is found using the equation in the previous section:


 * $$\widehat{\rho }=0.9754.\,\!$$

Note that the regression on Y analysis is not necessarily the same as the regression on X. The only time when the results of the two regression types are the same (i.e., will yield the same equation for a line) is when the data lie perfectly on a line.
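To make the difference concrete, the sketch below fits both lines to the example data and expresses each as a slope of $$y\,\!$$ on $${t}'\,\!$$. It uses the failure times and $${{y}_{i}}\,\!$$ values from the table in the RRY example, with illustrative variable names. For least squares lines in general, the ratio of the two slopes equals $${{\rho }^{2}}\,\!$$, so they coincide only when $$\left| \rho  \right|=1\,\!$$, that is, when the points lie exactly on a line.

```python
import math

# Failure times and y_i values taken from the least squares table.
times = [5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100]
y = [-1.6619, -1.1901, -0.8908, -0.6552, -0.4512, -0.2647, -0.0873,
     0.0873, 0.2647, 0.4512, 0.6552, 0.8908, 1.1901, 1.6619]
x = [math.log(t) for t in times]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)
s_yy = sum((yi - ybar) ** 2 for yi in y)

slope_rry = s_xy / s_xx    # RRY: slope of y on t'
slope_rrx = s_yy / s_xy    # RRX line x = a + b*y rewritten as y on t' (1/b)
rho_sq = s_xy ** 2 / (s_xx * s_yy)

print(slope_rry, slope_rrx, slope_rry / slope_rrx, rho_sq)
```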

Using Weibull++, with the Rank Regression on X option, the results are:



Maximum Likelihood Estimation
As outlined in Parameter Estimation, maximum likelihood estimation works by developing a likelihood function based on the available data and finding the values of the parameter estimates that maximize the likelihood function. This can be achieved by using iterative methods to determine the parameter estimate values that maximize the likelihood function. However, this can be rather difficult and time-consuming, particularly when dealing with the three-parameter distribution. Another method of finding the parameter estimates involves taking the partial derivatives of the log-likelihood function with respect to the parameters, setting the resulting equations equal to zero, and solving them simultaneously to determine the values of the parameter estimates. The log-likelihood functions and associated partial derivatives used to determine maximum likelihood estimates for the lognormal distribution are covered in Appendix D.

Note About Bias

See the discussion regarding bias with the normal distribution for information regarding parameter bias in the lognormal distribution.

MLE Example
Lognormal Distribution MLE Example

Using the same data set from the RRY and RRX examples given above and assuming a lognormal distribution, estimate the parameters using the MLE method.

Solution

In this example we have only complete data. Thus, the partials reduce to:


 * $$\begin{align}
& \frac{\partial \Lambda }{\partial {\mu }'}= & \frac{1}{\sigma'^{2}}\cdot \underset{i=1}{\overset{14}{\mathop \sum }}\,\left( \ln ({{t}_{i}})-{\mu }' \right)=0 \\ & \frac{\partial \Lambda }{\partial {\sigma'}}= & \underset{i=1}{\overset{14}{\mathop \sum }}\,\left( \frac{{{\left( \ln ({{t}_{i}})-{\mu }' \right)}^{2}}}{\sigma'^{3}}-\frac{1}{\sigma'} \right)=0 \end{align}\,\!$$

Substituting the values of $${{t}_{i}}\,\!$$ and solving the above system simultaneously, we get:


 * $$\begin{align}
& {{{\hat{\sigma }}}^{\prime }}= & 0.849 \\ & {{{\hat{\mu }}}^{\prime }}= & 3.516 \end{align}\,\!$$
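For complete data, setting the two partial derivatives of the lognormal log-likelihood to zero has a closed-form solution: $${\mu }'\,\!$$ is the average of the $$\ln ({{t}_{i}})\,\!$$ values and $${\sigma'}\,\!$$ is the square root of their average squared deviation (dividing by $$N\,\!$$, not $$N-1\,\!$$). The sketch below applies this to the failure times listed in the least squares table of the RRY example and checks that both partials vanish. Variable names are illustrative.

```python
import math

# Failure times from the least squares table in the RRY example.
times = [5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100]
logs = [math.log(t) for t in times]
n = len(logs)

# Closed-form MLE for complete data.
mu_p = sum(logs) / n
sigma_p = math.sqrt(sum((v - mu_p) ** 2 for v in logs) / n)

# Both partial derivatives of the log-likelihood should vanish here.
d_mu = sum(v - mu_p for v in logs) / sigma_p ** 2
d_sigma = sum((v - mu_p) ** 2 / sigma_p ** 3 - 1 / sigma_p for v in logs)

print(mu_p, sigma_p)
```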

Using the equation for mean and standard deviation in the Lognormal Distribution Functions section above, we get:


 * $$\overline{T}=\hat{\mu }=48.25\text{ hours}\,\!$$

and:


 * $${{\hat{\sigma }}}=49.61\text{ hours}.\,\!$$

The variance/covariance matrix is given by:


 * $$\left[ \begin{matrix}
\widehat{Var}\left( {{{\hat{\mu }}}^{\prime }} \right)=0.0515 & {} & \widehat{Cov}\left( {{{\hat{\mu }}}^{\prime }},{{{\hat{\sigma }}}^{\prime }} \right)=0.0000 \\ {} & {} & {} \\ \widehat{Cov}\left( {{{\hat{\mu }}}^{\prime }},{{{\hat{\sigma }}}^{\prime }} \right)=0.0000 & {} & \widehat{Var}\left( {{{\hat{\sigma }}}^{\prime }} \right)=0.0258 \\ \end{matrix} \right]\,\!$$
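The entries of this matrix can be checked under the assumption that it is the inverse of the Fisher information matrix evaluated at the MLE. For complete data that inverse has a closed form: the variance of the $${\mu }'\,\!$$ estimate is $$\sigma'^{2}/N\,\!$$, the variance of the $${\sigma'}\,\!$$ estimate is $$\sigma'^{2}/(2N)\,\!$$, and the covariance is zero. The sketch below evaluates these with the rounded estimate 0.849, so the last digit can differ slightly from the values shown.

```python
# MLE of sigma' from the example (rounded) and the sample size.
sigma_p = 0.849
n = 14

# Closed-form inverse Fisher information for complete lognormal data.
var_mu = sigma_p ** 2 / n
var_sigma = sigma_p ** 2 / (2 * n)
cov_mu_sigma = 0.0

matrix = [[var_mu, cov_mu_sigma],
          [cov_mu_sigma, var_sigma]]
print(matrix)
```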