
MLE (Maximum Likelihood) Parameter Estimation for Complete Data
From a statistical point of view, the method of maximum likelihood estimation is, with some exceptions, considered to be the most robust of the parameter estimation techniques discussed here. This method is presented in this section for complete data, that is, data consisting only of single times-to-failure.

Background on Theory
The basic idea behind MLE is to obtain the most likely values of the parameters, for a given distribution, that will best describe the data. As an example, consider the following data (-3, 0, 4) and assume that you are trying to estimate the mean of the data. Now, if you have to choose the most likely value for the mean from -5, 1 and 10, which one would you choose? In this case, the most likely value is 1 (given your limit on choices). Similarly, under MLE, one determines the most likely values for the parameters of the assumed distribution.
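The small example above can be made concrete. The sketch below (illustrative only; the unit-variance normal model is an assumption for the demonstration) scores each candidate mean by its log-likelihood and picks the winner:

```python
import math

# Hypothetical data and candidate means from the example above
data = [-3.0, 0.0, 4.0]
candidates = [-5.0, 1.0, 10.0]

def log_likelihood(mu):
    """Log-likelihood of the data under a normal model with unit variance.
    The variance is fixed only for illustration; for a fixed variance,
    the ranking of candidate means does not depend on its value."""
    return sum(-0.5 * (x - mu) ** 2 - 0.5 * math.log(2.0 * math.pi) for x in data)

most_likely = max(candidates, key=log_likelihood)
print(most_likely)  # 1.0 -- the candidate closest to the sample mean of 1/3
```

Among the three choices, 1 minimizes the sum of squared deviations from the data, so it maximizes the likelihood, matching the intuition in the text.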

It is mathematically formulated as follows:

If $$x$$ is a continuous random variable with pdf:


 * $$f(x;{{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}})$$

where $${{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}$$ are $$k$$ unknown parameters that need to be estimated from $$R$$ independent observations, $${{x}_{1}},{{x}_{2}},...,{{x}_{R}}$$, which correspond in the case of life data analysis to failure times. The likelihood function is given by:


 * $$L({{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}|{{x}_{1}},{{x}_{2}},...,{{x}_{R}})=L=\underset{i=1}{\overset{R}{\mathop \prod }}\,f({{x}_{i}};{{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}})$$


 * $$i=1,2,...,R$$

The logarithmic likelihood function is given by:

 * $$\Lambda =\ln L=\underset{i=1}{\overset{R}{\mathop \sum }}\,\ln f({{x}_{i}};{{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}})$$

The maximum likelihood estimators (or parameter values) of $${{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}$$ are obtained by maximizing $$L$$ or $$\Lambda .$$

By maximizing $$\Lambda ,$$ which is much easier to work with than $$L$$, the maximum likelihood estimators (MLE) of $${{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}$$ are the simultaneous solutions of $$k$$ equations such that:


 * $$\frac{\partial \Lambda }{\partial {{\theta }_{j}}}=0,\text{ }j=1,2,...,k$$
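For a concrete case of these equations (an illustration, not part of the original text), take the exponential distribution with pdf $$f(x;\lambda )=\lambda {{e}^{-\lambda x}}$$. Then $$\Lambda =R\ln \lambda -\lambda \underset{i=1}{\overset{R}{\mathop \sum }}\,{{x}_{i}}$$, and setting $$\partial \Lambda /\partial \lambda =0$$ yields the closed-form estimator $$\hat{\lambda }=R/\underset{i=1}{\overset{R}{\mathop \sum }}\,{{x}_{i}}$$. A sketch in Python, with hypothetical failure times:

```python
import math

failure_times = [12.0, 25.0, 31.0, 47.0, 60.0]  # hypothetical times-to-failure
R = len(failure_times)
total = sum(failure_times)

# Closed-form MLE from dLambda/dlambda = R/lambda - sum(x_i) = 0
lambda_hat = R / total

def log_likelihood(lam):
    """Lambda(lambda) = R ln(lambda) - lambda * sum(x_i) for the exponential pdf."""
    return R * math.log(lam) - lam * total

# Cross-check: the log-likelihood peaks at lambda_hat on a grid from 0.5x to 1.5x
grid = [lambda_hat * (50 + k) / 100.0 for k in range(101)]
best = max(grid, key=log_likelihood)
print(lambda_hat, best)  # both equal 1/35 = 0.02857...
```

The grid search is redundant here because the exponential case has a closed form, but it mirrors what numerical MLE routines do for distributions whose score equations have no closed-form solution.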

Even though it is common practice to plot the MLE solutions using median ranks (points are plotted according to median ranks and the line according to the MLE solutions), this is not completely representative. As can be seen from the equations above, the MLE method is independent of any kind of ranks. For this reason, the MLE solution often appears not to track the data on the probability plot. This is perfectly acceptable since the two methods are independent of each other, and in no way suggests that the solution is wrong.

Comments on the MLE Method
The MLE method has many large sample properties that make it attractive for use. It is asymptotically consistent, which means that as the sample size gets larger, the estimates converge to the right values. It is asymptotically efficient, which means that for large samples, it produces the most precise estimates. It is asymptotically unbiased, which means that for large samples one expects to get the right value on average. The distribution of the estimates themselves is normal, if the sample is large enough, and this is the basis for the usual Fisher Matrix confidence bounds discussed later. These are all excellent large sample properties.

Unfortunately, the size of the sample necessary to achieve these properties can be quite large: anywhere from thirty to fifty to more than a hundred exact failure times, depending on the application. With fewer points, the method can be badly biased. It is known, for example, that MLE estimates of the shape parameter for the Weibull distribution are badly biased for small sample sizes, and the effect is compounded by the amount of censoring. This bias can cause major discrepancies in analysis. There are also pathological situations in which the asymptotic properties of the MLE do not apply. One of these is estimating the location parameter for the three-parameter Weibull distribution when the shape parameter has a value close to 1. These problems, too, can cause major discrepancies.
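The small-sample bias of the Weibull shape estimate can be seen in a quick simulation. The sketch below (illustrative only; the true parameters, sample size, and seed are assumptions, not from the original text) draws many complete samples of size 10 from a two-parameter Weibull with shape 1.5, fits the shape by MLE, and averages the estimates:

```python
import math
import random

def weibull_shape_mle(x):
    """Solve the profile MLE equation for the Weibull shape parameter by bisection:
    sum(x^b ln x) / sum(x^b) - 1/b - mean(ln x) = 0, with the scale profiled out."""
    lnx = [math.log(v) for v in x]
    mean_ln = sum(lnx) / len(x)
    def g(b):
        xb = [v ** b for v in x]
        return sum(w * l for w, l in zip(xb, lnx)) / sum(xb) - 1.0 / b - mean_ln
    lo, hi = 0.05, 50.0  # g is increasing in b, so bisection brackets the root
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(1)  # fixed seed so the run is repeatable
true_beta, eta, n, reps = 1.5, 100.0, 10, 500
estimates = []
for _ in range(reps):
    # Inverse-CDF sampling: x = eta * (-ln U)^(1/beta), U uniform on (0, 1)
    sample = [eta * (-math.log(random.random() or 1e-12)) ** (1.0 / true_beta)
              for _ in range(n)]
    estimates.append(weibull_shape_mle(sample))

mean_estimate = sum(estimates) / reps
print(mean_estimate)  # typically noticeably above the true value of 1.5
```

Because the bias is positive for small complete samples, the average estimate lands above the true shape; increasing the sample size shrinks the gap, consistent with the asymptotic properties discussed above.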

However, MLE can handle suspensions and interval data better than rank regression, particularly when dealing with a heavily censored data set with few exact failure times or when the censoring times are unevenly distributed. It can also provide estimates with one or no observed failures, which rank regression cannot do. As a rule of thumb, our recommendation is to use rank regression techniques when the sample sizes are small and without heavy censoring (censoring is discussed in Chapter 4). When heavy or uneven censoring is present, when a high proportion of interval data is present and/or when the sample size is sufficient, MLE should be preferred.