Probability Plotting
The least mathematically intensive method for parameter estimation is the method of probability plotting. As the term implies, probability plotting involves a physical plot of the data on specially constructed probability plotting paper. This method is easily implemented by hand, given that one can obtain the appropriate probability plotting paper.

Illustrating the method for the 2-parameter Weibull distribution
The method of probability plotting takes the cdf of the distribution and attempts to linearize it by employing a specially constructed paper. This is best illustrated using the 2-parameter Weibull distribution.

In the case of the two-parameter Weibull distribution, the cdf (also the unreliability Q(t)) is given by:
 * $$F(t)=Q(t)=1-{e^{-\left(\tfrac{t}{\eta}\right)^{\beta}}}$$

Linearizing the Weibull unreliability function
This function can then be linearized (i.e., put in the common form $$y=mx+b$$) as follows:
 * $$\begin{align}
Q(t)= & 1-{e^{-\left(\tfrac{t}{\eta}\right)^{\beta}}}  \\ \ln (1-Q(t))= & \ln \left[ {e^{-\left(\tfrac{t}{\eta}\right)^{\beta}}} \right] \\ \ln (1-Q(t))=& -\left(\tfrac{t}{\eta}\right)^{\beta} \\ \ln ( -\ln (1-Q(t)))= & \beta \ln \left( \frac{t}{\eta }\right) \\ \ln \left( \ln \left( \frac{1}{1-Q(t)}\right) \right) = & \beta \ln \left( t \right) -\beta \ln \left( \eta \right) \\ \end{align}$$

Then by setting
 * $$y=\ln \left( \ln \left( \frac{1}{1-Q(t)} \right) \right)$$

and
 * $$x=\ln \left( t \right)$$

the equation can then be rewritten as,
 * $$y=\beta x-\beta \ln \left( \eta \right)$$

which is now a linear equation with a slope of
 * Slope$$=m=\beta $$

and an intercept of
 * Intercept$$=b=-\beta \ln \left( \eta \right)$$
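The linearization above can be checked numerically. The following Python sketch generates unreliability values from a Weibull distribution, applies the x and y transformations, and recovers $$\beta$$ and $$\eta$$ from the slope and intercept of a straight line. The parameter values (β = 1.5, η = 100), the sample times and all function names are illustrative assumptions, and an ordinary least-squares fit stands in for the line that would be drawn by eye on paper.

```python
import math

def weibull_unreliability(t, beta, eta):
    """Q(t) = 1 - exp(-(t/eta)^beta) for the 2-parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def linearize(t, q):
    """Map a (time, unreliability) pair to the linear (x, y) scale."""
    x = math.log(t)
    y = math.log(math.log(1.0 / (1.0 - q)))
    return x, y

def fit_line(points):
    """Ordinary least-squares slope and intercept through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Assumed "true" parameters, used only to generate example data.
true_beta, true_eta = 1.5, 100.0
times = [20.0, 50.0, 100.0, 200.0]

points = [linearize(t, weibull_unreliability(t, true_beta, true_eta)) for t in times]
slope, intercept = fit_line(points)

# Slope = beta, intercept = -beta * ln(eta)  =>  eta = exp(-intercept / beta)
beta_hat = slope
eta_hat = math.exp(-intercept / beta_hat)
print(beta_hat, eta_hat)
```

Because the example data lie exactly on a Weibull cdf, the transformed points fall on a straight line and the recovered values match the assumed parameters up to rounding error. Real test data would scatter around the line.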

Constructing the paper
The next task is to construct the Weibull probability plotting paper with the appropriate y- and x-axes. The x-axis transformation is simply logarithmic. The y-axis is a bit more complex, requiring a double log reciprocal transformation, or


 * $$y=\ln \left(\ln \left( \frac{1}{1-Q(t)} \right) \right)$$

where Q(t) is the unreliability.
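To see how these two transformations shape the paper, the following Python sketch computes where a few gridlines would fall on each axis. The chosen percentages and times, and the function names, are illustrative assumptions and do not reproduce any particular vendor's paper.

```python
import math

def y_axis_position(q):
    """Double log reciprocal transformation of an unreliability 0 < q < 1."""
    return math.log(math.log(1.0 / (1.0 - q)))

def x_axis_position(t):
    """Logarithmic transformation of a time t > 0."""
    return math.log(t)

# Hypothetical gridline choices for the two axes.
percent_ticks = [1, 5, 10, 50, 90, 99]
time_ticks = [1, 10, 100, 1000]

y_ticks = {p: y_axis_position(p / 100.0) for p in percent_ticks}
x_ticks = {t: x_axis_position(t) for t in time_ticks}

for p in percent_ticks:
    print(f"Q = {p:>2}%  ->  y = {y_ticks[p]:+.3f}")
for t in time_ticks:
    print(f"t = {t:>4}  ->  x = {x_ticks[t]:.3f}")
```

Each decade of time is equally spaced on the x-axis, while the unreliability gridlines are unevenly spaced: the low percentages are spread out and the high percentages are compressed.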

Such papers have been created by different vendors and are called probability plotting papers. Weibull.com has different plotting papers available for download.



To illustrate, consider the following probability plot on a slightly different type of Weibull probability paper.



This paper is constructed based on the y- and x-transformations mentioned above, where the y-axis represents unreliability and the x-axis represents time. Both of these values must be known for each time-to-failure point we want to plot.

Then, given the y and x values for each point, the points can easily be put on the plot. Once the points have been placed on the plot, the best possible straight line is drawn through them. Once the line has been drawn, its slope can be obtained (some probability papers include a slope indicator to simplify this calculation). The slope is the value of the parameter $$\beta.$$ To determine the scale parameter, $$\eta $$ (also called the characteristic life), one reads the time from the x-axis corresponding to Q(t)=63.2%.

Note that, from the unreliability function given earlier, at $$t=\eta$$:
 * $$\begin{align}
Q(t=\eta)= & 1-{{e}^{-{{\left( \tfrac{\eta }{\eta } \right)}^{\beta }}}} \\ = & 1-{{e}^{-1}} \\ = & 0.632 \\  = & 63.2\%  \end{align}$$

Thus, if we enter the y-axis at Q(t)=63.2%, the corresponding value of t will be equal to $$\eta.$$ Using this simple methodology, the parameters of the Weibull distribution can be estimated.
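The 63.2% result does not depend on the shape parameter. A short Python sketch can confirm this by evaluating the unreliability at $$t=\eta$$ for several values of $$\beta$$; the value η = 100 and the list of β values are arbitrary assumptions chosen for illustration.

```python
import math

def weibull_unreliability(t, beta, eta):
    """Q(t) = 1 - exp(-(t/eta)^beta) for the 2-parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

eta = 100.0                      # assumed scale parameter
betas = [0.5, 1.0, 2.0, 3.5]     # assumed shape parameters

# Unreliability at t = eta for each beta; all should equal 1 - 1/e.
q_at_eta = [weibull_unreliability(eta, beta, eta) for beta in betas]

# On the transformed y-axis, Q = 1 - 1/e corresponds to y = ln(ln(e)) = 0.
y_at_eta = math.log(math.log(1.0 / (1.0 - q_at_eta[0])))

print(q_at_eta)
print(y_at_eta)
```

In terms of the linearized equation, the 63.2% line is where y = 0, so the characteristic life is the time at which the fitted line crosses that level.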

Determining the x and y Position of the Plot Points
The points on the plot represent our data or, more specifically, our times-to-failure data. If, for example, we tested four units that failed at 10, 20, 30 and 40 hours, we would use these times as our x values or time values.

Determining the appropriate y plotting positions, or unreliability values, is a little more complex. To determine the y plotting positions, we must first determine a value indicating the corresponding unreliability for each failure. In other words, we need to obtain the cumulative percent failed for each time-to-failure. In this example, the cumulative percent failed is 25% by 10 hours, 50% by 20 hours, and so forth. This is a simple method that illustrates the idea. Its problem is that the 100% point is not defined on most probability plots, so an alternative and more robust approach must be used. The most widely used method of determining this value is to obtain the median rank for each failure. This is discussed next.
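The difficulty with the simple cumulative percent method can be seen in a short Python sketch using the four failure times from the example. The function names are illustrative assumptions; the y transformation is the double log reciprocal used for the Weibull paper, which has no finite value at an unreliability of 100%.

```python
import math

def naive_unreliability(j, n):
    """Cumulative fraction failed at the j-th of n ordered failures."""
    return j / n

def weibull_y(q):
    """y = ln(ln(1/(1-q))); returns None where the transform is undefined."""
    if not 0.0 < q < 1.0:
        return None
    return math.log(math.log(1.0 / (1.0 - q)))

times = [10, 20, 30, 40]  # hours, from the example above
n = len(times)

# Pair each failure time with its naive cumulative fraction failed.
positions = [(t, naive_unreliability(j, n)) for j, t in enumerate(times, start=1)]
ys = [weibull_y(q) for _, q in positions]

for (t, q), y in zip(positions, ys):
    print(t, q, y)
```

The first three failures map to finite y values, but the last failure is assigned 100% and cannot be placed on the plot.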

Beta and F Distributions Approach
A more straightforward method of estimating median ranks is to apply two transformations to the cumulative binomial equation, first to the beta distribution and then to the F distribution, resulting in [12, 13]:
 * $$\begin{array}{*{35}{l}}
MR & = & \tfrac{1}{1+\tfrac{N-j+1}{j}{{F}_{0.50;m;n}}} \\ m & = & 2(N-j+1) \\ n & = & 2j \\ \end{array}$$

where $${F_{0.50;m;n}}$$ denotes the 0.50 point (the median) of the F distribution with m and n degrees of freedom, for failure j out of N units.
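Values obtained from the F distribution formula can be cross-checked by solving the cumulative binomial equation directly. The Python sketch below does not evaluate the F distribution; instead it assumes the usual definition of the median rank as the value of MR for which the probability of at least j failures out of N equals 0.50, and finds that value by bisection. The function names and the tolerance are illustrative assumptions.

```python
import math

def cumulative_binomial(p, j, n):
    """Probability of at least j failures out of n, each with probability p."""
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(j, n + 1)
    )

def median_rank(j, n, tol=1e-12):
    """Solve cumulative_binomial(MR, j, n) = 0.50 for MR by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The cumulative binomial increases with p, so move toward 0.50.
        if cumulative_binomial(mid, j, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Median ranks for four units, matching the example failure times
# of 10, 20, 30 and 40 hours.
ranks = [median_rank(j, 4) for j in range(1, 5)]
print([round(r, 4) for r in ranks])
```

For four units this gives plotting positions of roughly 15.9%, 38.6%, 61.4% and 84.1%, none of which falls on the undefined 100% point.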