Basic Statistical Background

This section provides a brief elementary introduction to the most common and fundamental statistical equations and definitions used in reliability engineering and life data analysis.

Random Variables
In general, most problems in reliability engineering deal with quantitative measures, such as the time-to-failure of a component, or with qualitative judgments, such as whether the component fails or does not fail. In judging a component to be defective or non-defective, only two outcomes are possible. We can then denote a random variable X as representing these possible outcomes (i.e. defective or non-defective). In this case, X is a random variable that can take on only these values.

In the case of times-to-failure, our random variable X can take on the time-to-failure (or time to an event of interest) of the product or component and can be in a range from 0 to infinity (since we do not know the exact time a priori).

In the first case, where the random variable can take on only two discrete values (let's say defective = 0 and non-defective = 1), the variable is said to be a discrete random variable. In the second case, our product can be found failed at any time after time 0, i.e. at 12.4 hours or at 100.12 miles and so forth, thus X can take on any value in this range. In this case, our random variable X is said to be a continuous random variable.

Designations
From probability and statistics, given a continuous random variable, we denote:


 * The probability density function, pdf, as f(x).


 * The cumulative distribution function, cdf, as F(x).

The pdf and cdf give a complete description of the probability distribution of a random variable.

Definitions
If $$X$$ is a continuous random variable, then the probability density function, $$pdf$$, of $$X$$, is a function $$f(x)$$ such that for two numbers, $$a$$ and $$b$$ with $$a\le b$$:


 * $$P(a \le X \le b)=\int_a^b f(x)dx$$  and  $$f(x)\ge 0 $$ for all x

That is, the probability that $$X$$ takes on a value in the interval [a,b] is the area under the density function from $$a$$ to $$b$$. The cumulative distribution function, $$cdf$$, is a function $$F(x)$$ of a random variable, $$X$$, and is defined for a number $$x$$ by:


 * $$F(x)=P(X\le x)=\int_{-\infty}^{x} f(s)ds $$

That is, for a given value $$x$$, $$F(x)$$ is the probability that the observed value of $$X$$ will be at most $$x$$. Note that the limits of integration depend on the domain of $$f(x)$$. For example, for all the distributions considered in this reference, this domain would be $$[0,+\infty]$$,  $$[-\infty ,+\infty]$$ or $$[\gamma ,+\infty]$$. In the case of $$[\gamma ,+\infty ]$$, we use the constant $$\gamma $$ to denote an arbitrary non-zero point (or a location that indicates the starting point for the distribution). Figure 3-1 illustrates the relationship between the probability density function and the cumulative distribution function.

Mathematical Relationship Between the $$pdf$$ and $$cdf$$
The mathematical relationship between the $$pdf$$ and $$cdf$$ is given by:


 * $$F(x)=\int_{-\infty }^x f(s)ds$$

Conversely:


 * $$f(x)=\frac{d(F(x))}{dx}$$

In plain English, the value of the $$cdf$$ at $$x$$ is the area under the probability density function up to $$x$$. It should also be pointed out that the total area under the $$pdf$$ is always equal to 1, or mathematically:

$$\int_{-\infty }^{\infty }f(x)dx=1$$

An example of a probability density function is the well-known normal distribution, whose $$pdf$$ is given by:


 * $$f(t)=\frac{1}{\sigma \sqrt{2\pi }}e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2}$$

where $$\mu $$ is the mean and $$\sigma $$ is the standard deviation. The normal distribution is a two-parameter distribution, i.e. with two parameters $$\mu $$ and $$\sigma $$. Another two-parameter distribution is the lognormal distribution, whose $$pdf$$  is given by:


 * $$f(t)=\frac{1}{t\cdot {{\sigma }^{\prime }}\sqrt{2\pi }}{e}^{-\tfrac{1}{2}(\tfrac{t^{\prime}-{\mu^{\prime}}}{\sigma^{\prime}})^2}$$

where $$ t^{\prime}$$ is the natural logarithm of the times-to-failure, $$\mu^{\prime}$$ is the mean of the natural logarithms of the times-to-failure and $$\sigma^{\prime}$$ is the standard deviation of the natural logarithms of the times-to-failure, $$ t^{\prime }$$.
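
To make these relationships concrete, here is a minimal numerical sketch in Python. It assumes NumPy and SciPy are available (an assumption of this illustration, not something prescribed by the text), and the mean and standard deviation values are hypothetical; it simply verifies, for the normal distribution above, that the cdf is the integral of the pdf, that the pdf is the derivative of the cdf, and that the total area under the pdf equals 1.

```python
# Numerical check of the pdf/cdf relationships, using a normal distribution
# with hypothetical parameters.
import numpy as np
from scipy import stats
from scipy.integrate import quad

mu, sigma = 100.0, 15.0           # hypothetical mean and standard deviation
dist = stats.norm(mu, sigma)

x = 110.0
F_by_integration, _ = quad(dist.pdf, -np.inf, x)   # F(x) = area under f up to x
print(F_by_integration, dist.cdf(x))               # both ~0.7475

h = 1e-5
f_by_derivative = (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)  # f = dF/dx
print(f_by_derivative, dist.pdf(x))

total_area, _ = quad(dist.pdf, -np.inf, np.inf)    # total area under the pdf
print(total_area)                                  # ~1.0
```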

The Reliability Function
The reliability function can be derived using the previous definition of the cumulative distribution function. The probability of an event happening by time $$t$$ is given by:

$$F(t)=\int_{0,\gamma}^{t}f(s)ds$$

In particular, this represents the probability of a unit failing by time $$t$$. From this, we obtain the most commonly used function in reliability engineering, the reliability function, which represents the probability of success of a unit in undertaking a mission of a prescribed duration. To mathematically show this, we first define the unreliability function, $$Q(t)$$, which is the probability of failure, or the probability that our time-to-failure is in the interval from $$0$$ (or $$\gamma $$) to $$t$$. So, from the definition of the $$cdf$$:

$$F(t)=Q(t)=\int_{0,\gamma}^{t}f(s)ds$$

In this situation, there are only two states that can occur: success or failure. These two states are also mutually exclusive. Since reliability and unreliability are the probabilities of these two mutually exclusive states, the sum of these probabilities is always equal to unity. So then:


$$\begin{align} Q(t)+R(t)& = 1 \\ R(t) & = 1-Q(t) \\ R(t) & = 1-\int_{0,\gamma}^{t}f(s)ds \\ R(t) & = \int_{t}^{\infty }f(s)ds \end{align}$$

Conversely:

$$f(t)=-\frac{d(R(t))}{dt}$$
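
The following sketch checks these relationships numerically for a two-parameter Weibull distribution with hypothetical parameters, again assuming SciPy; SciPy's survival function `sf` corresponds to the reliability function $$R(t)$$.

```python
# Numerical illustration of R(t) = integral of f from t to infinity,
# and f(t) = -dR/dt, for a hypothetical two-parameter Weibull.
import numpy as np
from scipy import stats
from scipy.integrate import quad

beta, eta = 1.5, 1000.0                      # hypothetical shape and scale
dist = stats.weibull_min(c=beta, scale=eta)  # SciPy's Weibull parameterization

t = 500.0
R_from_sf = dist.sf(t)                        # survival function = reliability
R_from_integral, _ = quad(dist.pdf, t, np.inf)
print(R_from_sf, R_from_integral)             # both ~0.7022

h = 1e-3
f_from_R = -(dist.sf(t + h) - dist.sf(t - h)) / (2 * h)  # f(t) = -dR/dt
print(f_from_R, dist.pdf(t))
```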

The Conditional Reliability Function
The reliability function discussed previously assumes that the unit is starting the mission with no accumulated time, i.e. brand new. Conditional reliability calculations allow one to calculate the probability of a unit successfully completing a mission of a particular duration given that it has already successfully completed a mission of a certain duration. In this respect, the conditional reliability function could be considered to be the reliability of used equipment.

The conditional reliability function is given by the equation,


 * $$R(t|T)=\frac{R(T+t)}{R(T)}$$

where:
 * $$t$$ is the duration of the new mission, and
 * $$T$$ is the duration of the successfully completed previous mission.

In other words, the fact that the equipment has already successfully completed a mission of duration $$T$$ tells us that the product successfully traversed the failure rate path for the period from $$0\to T$$, and it will now be failing according to the failure rate from $$T\to T+t$$. This is used when analyzing warranty data (Chapter 11).
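
A short sketch of this calculation, with hypothetical Weibull parameters and mission durations (assuming SciPy):

```python
# Conditional reliability R(t|T) = R(T+t)/R(T) for a unit that has already
# survived T hours; all numbers are hypothetical.
from scipy import stats

beta, eta = 1.5, 1000.0
dist = stats.weibull_min(c=beta, scale=eta)

T = 200.0   # duration already survived
t = 300.0   # duration of the new mission
R_conditional = dist.sf(T + t) / dist.sf(T)
print(R_conditional)   # reliability for the new mission, given T survived
print(dist.sf(t))      # compare: reliability of a brand new unit for t hours
```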

The Failure Rate Function
The failure rate function enables the determination of the number of failures occurring per unit time. Omitting the derivation (see [19; Chapter 4]), the failure rate is mathematically given as,

$$\begin{align} \lambda(t) & = \frac{f(t)}{1-\int_{0,\gamma}^{t}f(s)ds} \\ & = \frac{f(t)}{R(t)} \\ \end{align}$$

The failure rate function has the units of failures per unit time among surviving units, e.g. 1 failure per month.
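
As an illustration, the sketch below evaluates $$\lambda(t)=f(t)/R(t)$$ at a few times for a hypothetical Weibull; since $$\beta>1$$ here, the failure rate increases with time.

```python
# Failure rate lambda(t) = f(t)/R(t) for a hypothetical Weibull distribution.
from scipy import stats

beta, eta = 1.5, 1000.0
dist = stats.weibull_min(c=beta, scale=eta)

for t in (100.0, 500.0, 1000.0):
    failure_rate = dist.pdf(t) / dist.sf(t)
    print(t, failure_rate)   # failures per unit time among surviving units
```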

The Mean Life Function
The mean life function, which provides a measure of the average time of operation to failure, is given by:

$$\mu = m =\int_{0,\gamma}^{\infty}t\cdot f(t)dt$$

It should be noted that this is the expected or average time-to-failure and is denoted as the $$MTBF$$ (Mean-Time-Before-Failure) and is also called $$MTTF$$ (Mean-Time-To-Failure) by many authors.
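
A numerical sketch of this integral for the same hypothetical Weibull, checked against SciPy's built-in mean:

```python
# Mean life (MTTF) as the integral of t*f(t) from 0 to infinity.
import numpy as np
from scipy import stats
from scipy.integrate import quad

beta, eta = 1.5, 1000.0
dist = stats.weibull_min(c=beta, scale=eta)

mttf, _ = quad(lambda t: t * dist.pdf(t), 0, np.inf)
print(mttf, dist.mean())   # both ~902.7 (eta * Gamma(1 + 1/beta))
```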

The Median Life Function
Median life, $$\breve{T}$$, is the value of the random variable that has exactly one-half of the area under the $$pdf$$ to its left and one-half to its right. The median is obtained from:

$$\int_{-\infty}^{\breve{T}}f(t)dt=0.5$$

For sample data, e.g. 12, 20, 21, the median is the midpoint value, or 20 in this case.
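
In code, the distribution median is the 0.5 quantile (the inverse cdf evaluated at 0.5), and the sample median is the midpoint value; a minimal sketch with the same hypothetical Weibull:

```python
# Median life: solves F(T) = 0.5 for T, plus the sample median example above.
import numpy as np
from scipy import stats

beta, eta = 1.5, 1000.0
dist = stats.weibull_min(c=beta, scale=eta)
print(dist.ppf(0.5))             # distribution median, ~783.2 here

print(np.median([12, 20, 21]))   # sample median = 20
```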

The Mode Function
The modal (or mode) life is the value of $$T$$ that maximizes the $$pdf$$; that is, it satisfies:

$$\frac{d\left[ f(t) \right]}{dt}=0$$

For a continuous distribution, the mode is that value of the variate which corresponds to the maximum probability density (the value where the $$pdf$$ has its maximum value).
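
The mode can also be found numerically by maximizing the pdf; a sketch with the same hypothetical parameters:

```python
# Mode: the time at which the pdf attains its maximum, found by minimizing
# the negative pdf over a bounded interval.
from scipy import stats
from scipy.optimize import minimize_scalar

beta, eta = 1.5, 1000.0
dist = stats.weibull_min(c=beta, scale=eta)

res = minimize_scalar(lambda t: -dist.pdf(t),
                      bounds=(1e-6, 5 * eta), method="bounded")
print(res.x)   # ~480.7, matching eta * ((beta - 1) / beta)**(1 / beta)
```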

Distributions
A statistical distribution is fully described by its $$pdf$$ (or probability density function). In the previous sections, we used the definition of the $$pdf$$ to show how all other functions most commonly used in reliability engineering and life data analysis can be derived, namely the reliability function, failure rate function, mean time function and median life function, etc. All of these can be determined directly from the $$pdf$$ definition, or $$f(t)$$. Different distributions exist, such as the normal, exponential, etc., and each one of them has a predefined form of $$f(t).$$ These distribution definitions can be found in many references. In fact, entire texts have been dedicated to defining families of statistical distributions. These distributions were formulated by statisticians, mathematicians and engineers to mathematically model or represent certain behavior. For example, the Weibull distribution was formulated by Waloddi Weibull and thus it bears his name. Some distributions tend to better represent life data and are most commonly called lifetime distributions. One of the simplest and most commonly used distributions (and often erroneously overused due to its simplicity) is the exponential distribution. The $$pdf$$ of the exponential distribution is mathematically defined as:

$$f(t)=\lambda e^{-\lambda t}$$

In this definition, note that $$t$$ is our random variable which represents time and the Greek letter $$\lambda $$ (lambda) represents what is commonly referred to as the parameter of the distribution. Depending on the value of $$\lambda ,$$  $$f(t)$$ will be scaled differently. For any distribution, the parameter or parameters of the distribution are estimated from the data. For example, the most well-known distribution, the normal (or Gaussian) distribution, is given by:


 * $$f(t)=\frac{1}{\sigma \sqrt{2\pi }}{e}^{-\frac{1}{2}(\frac{t-\mu}{\sigma})^2}$$

$$\mu$$, the mean, and $$\sigma ,$$ the standard deviation, are its parameters. Both of these parameters are estimated from the data, i.e. the mean and standard deviation of the data. Once these parameters have been estimated, our function $$f(t)$$ is fully defined and we can obtain any value for $$f(t)$$ given any value of $$t$$. Given the mathematical representation of a distribution, we can also derive all of the functions needed for life data analysis, which again will depend only on the value of $$t$$ after the value of the distribution parameter or parameters have been estimated from data. For example, we know that the exponential distribution $$pdf$$ is given by:

$$f(t)=\lambda e^{-\lambda t}$$

Thus, the exponential reliability function can be derived to be:

$$\begin{align} R(t)= & 1-\int_{0}^{t}\lambda {{e}^{-\lambda s}}ds \\ = & 1-[ 1-{{e}^{-\lambda \cdot t}}] \\ = & {{e}^{-\lambda \cdot t}} \\ \end{align}$$

The exponential failure rate function is:

$$\begin{align} \lambda (t) =&  \frac{f(t)}{R(t)} \\ =& \frac{\lambda {e}^{-\lambda t}}{e^{-\lambda t}} \\ =& \lambda \end{align}$$

The exponential Mean-Time-To-Failure (MTTF) is given by:

$$\begin{align} \mu = & \int_{0}^{\infty} t\cdot f(t)dt \\ = & \int_{0}^{\infty}{t \cdot {\lambda} \cdot e^{-\lambda t}}dt \\ = & \frac{1}{\lambda } \end{align}$$

This exact same methodology can be applied to any distribution given its $$pdf$$, with various degrees of difficulty depending on the complexity of $$f(t)$$.
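
The sketch below checks the three exponential results derived above numerically, for a hypothetical failure rate (note that SciPy parameterizes the exponential by scale = 1/λ):

```python
# Exponential distribution: verify R(t), lambda(t) and MTTF numerically.
import numpy as np
from scipy import stats
from scipy.integrate import quad

lam = 0.002                          # hypothetical failure rate
dist = stats.expon(scale=1.0 / lam)  # SciPy uses scale = 1/lambda

t = 400.0
print(dist.sf(t), np.exp(-lam * t))        # R(t) = e^(-lambda*t)
print(dist.pdf(t) / dist.sf(t), lam)       # constant failure rate = lambda
mttf, _ = quad(lambda s: s * dist.pdf(s), 0, np.inf)
print(mttf, 1.0 / lam)                     # MTTF = 1/lambda = 500
```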

Parameter Types
Distributions can have any number of parameters. Do note that as the number of parameters increases, so does the amount of data required for a proper fit. In general, the lifetime distributions used for reliability and life data analysis are limited to a maximum of three parameters. These three parameters are usually known as the scale parameter, the shape parameter and the location parameter.

Scale Parameter
The scale parameter is the most common type of parameter. All distributions in this reference have a scale parameter. In the case of one-parameter distributions, the sole parameter is the scale parameter. The scale parameter defines where the bulk of the distribution lies, or how stretched out the distribution is. In the case of the normal distribution, the scale parameter is the standard deviation.

Shape Parameter
The shape parameter, as the name implies, helps define the shape of a distribution. Some distributions, such as the exponential or normal, do not have a shape parameter since they have a predefined shape that does not change. In the case of the normal distribution, the shape is always the familiar bell shape. The effect of the shape parameter on a distribution is reflected in the shapes of the $$pdf$$, the reliability function and the failure rate function.

Location Parameter
The location parameter is used to shift a distribution in one direction or another. The location parameter, usually denoted as $$\gamma $$, defines the location of the origin of a distribution and can be either positive or negative. In terms of lifetime distributions, the location parameter represents a time shift. This means that the inclusion of a location parameter for a distribution whose domain is normally $$[0,\infty]$$ will change the domain to $$[\gamma ,\infty]$$, where $$\gamma $$ can be either positive or negative. This can have some profound effects in terms of reliability. For a positive location parameter, this indicates that the reliability for that particular distribution is always 100% up to that point. In other words, a failure cannot occur before this time $$\gamma $$. Many engineers feel uncomfortable in saying that a failure will absolutely not happen before any given time. On the other hand, the argument can be made that almost all life distributions have a location parameter, although many of them may be negligibly small. Similarly, many people are uncomfortable with the concept of a negative location parameter, which states that failures theoretically occur before time zero. Realistically, the calculation of a negative location parameter is indicative of quiescent failures (failures that occur before a product is used for the first time) or of problems with the manufacturing, packaging or shipping processes. More attention will be given to the concept of the location parameter in subsequent discussions of the exponential and Weibull distributions, which are the lifetime distributions that most frequently employ the location parameter.

Most Commonly Used Distributions
There are many different lifetime distributions that can be used to model reliability data. Leemis [22] and others present a good overview of many of these distributions. In this reference, we will concentrate on the most commonly used and most widely applicable distributions for life data analysis, as outlined in the following sections.

The Weibull Distribution
The Weibull distribution is a general purpose reliability distribution used to model material strength, times-to-failure of electronic and mechanical components, equipment or systems. In its most general case, the three-parameter Weibull $$pdf$$ is defined by:

$$f(t)=\frac{\beta}{\eta }( \frac{t-\gamma }{\eta } )^{\beta -1}{e}^{-(\tfrac{t-\gamma }{\eta }) ^{\beta}}$$

with three parameters $$\beta $$,  $$\eta $$  and  $$\gamma ,$$  where  $$\beta =$$  shape parameter,  $$\eta =$$  scale parameter and $$\gamma =$$ location parameter. If the location parameter, $$\gamma $$, is assumed to be zero, the distribution then becomes the two-parameter Weibull or:

$$f(t)=\frac{\beta}{\eta }( \frac{t }{\eta } )^{\beta -1}{e}^{-(\tfrac{t }{\eta }) ^{\beta}}$$

One additional form is the one-parameter Weibull distribution, which assumes that the location parameter, $$\gamma ,$$ is zero, and the shape parameter is a known constant, or $$\beta =$$ constant $$=C$$, so:

$$f(t)=\frac{C}{\eta}(\frac{t}{\eta})^{C-1}e^{-(\frac{t}{\eta})^C} $$

Chapter 6 of this reference fully details the Weibull distribution and presents many examples of its use in Weibull++.

The Weibull-Bayesian Distribution
Another approach is the Weibull-Bayesian model, which assumes that the analyst has some prior knowledge about the distribution of the shape parameter ( $$\beta )$$ of the Weibull distribution. There are many practical applications for this model, particularly when dealing with small sample sizes and/or when some prior knowledge for the shape parameter is available. For example, when a test is performed, there is often a good understanding about the behavior of the failure mode under investigation, primarily through historical data or physics-of-failure. Note that this is not the same as the so-called WeiBayes model, which is really a one-parameter Weibull distribution: it assumes a fixed value (constant) for the shape parameter and solves for the scale parameter. The Weibull-Bayesian model in Weibull++ 7 is a true Bayesian model and offers an alternative to the one-parameter Weibull by including the variation and uncertainty that is present in the prior estimation of the shape parameter. The Weibull-Bayesian distribution and its characteristics are presented in more detail in Chapter 6.

The Exponential Distribution
The exponential distribution is commonly used for components or systems exhibiting a constant failure rate and is defined in its most general case by:

$$f(t)=\lambda {e}^{-\lambda(t-\gamma )}$$

(also known as the two-parameter exponential in this form) with two parameters, namely $$\lambda $$ and  $$\gamma .$$ If the location parameter, $$\gamma $$, is assumed to be zero, the distribution then becomes the one-parameter exponential or,

$$f(t)=\lambda {{e}^{-\lambda t}}$$

The exponential distribution and its characteristics are presented in more detail in Chapter 7.

The Normal Distribution
The normal distribution is commonly used for general reliability analysis, times-to-failure of simple electronic and mechanical components, equipment or systems. The $$pdf$$ of the normal distribution is given by:

$$\begin{align} f(t)= \frac{1}{\sigma \sqrt{2\pi }}{e^{-\tfrac{1}{2}(\tfrac{t-\mu }{\sigma })^2}} \end{align}$$

where,

$$\begin{align} \mu = & \text{mean of the normal times to failure} \\ \sigma = & \text{standard deviation of the times to failure} \end{align}$$

The normal distribution and its characteristics are presented in more detail in Chapter 8.

The Lognormal Distribution
The lognormal distribution is commonly used for general reliability analysis, cycles-to-failure in fatigue, material strengths and loading variables in probabilistic design. When the natural logarithms of the times-to-failure are normally distributed, then we say that the data follow the lognormal distribution. The $$pdf$$ of the lognormal distribution is given by:

$$\begin{align} & f(t)=\frac{1}{t{\sigma}_{T'}\sqrt{2\pi}}e^{-\tfrac{1}{2}(\tfrac{T'-{\mu'}}{\sigma_{T'}})^2}\\ & f(t)\ge 0,t>0,{{\sigma }_{T'}}>0 \\ & {T'}= \ln (t) \end{align} $$

Where,

$$\begin{align} & {\mu'}= \text{mean of the natural logarithms of the times-to-failure} \\ & {\sigma_{T'}}= \text{standard deviation of the natural logarithms of the times to failure} \end{align}$$

The lognormal distribution and its characteristics are presented in more detail in Chapter 9.

Other Distributions
In addition to the distributions mentioned, the following additional distributions, even though not as frequently used in Life Data Analysis, have a variety of applications and can be found in many statistical references. They are included in Weibull++ as well as discussed in this reference.

The Mixed Weibull Distribution
The mixed Weibull distribution is commonly used for modeling the behavior of components or systems exhibiting multiple failure modes (mixed populations). It gives the global picture of the life of a product by mixing different Weibull distributions for different stages of the product’s life and is defined by:

$$f_{S}(t)=\sum_{i=1}^{S}p_{i}\frac{\beta_{i}}{\eta_{i}}(\frac{t}{\eta_{i}})^{\beta_{i}-1}e^{-(\frac{t}{\eta_{i}})^{\beta_{i}}} $$

where the value of $$S$$ is equal to the number of subpopulations. In other words, each subpopulation has a portion or mixing weight, $$p_{i}$$, a shape parameter, $$\beta_{i}$$, and a scale parameter, $$\eta_{i}$$, for the $${{i}^{th}}$$ population, giving $$3\cdot S$$ parameters in all. The number of independent parameters is reduced to $$(3\cdot S-1)$$, given the fact that the following condition can also be used:

$$\sum_{i=1}^{S}p_{i}=1$$
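
A direct translation of the mixture pdf into code, for $$S=2$$ hypothetical subpopulations (one early-failure mode, one wear-out mode); this only evaluates the formula and is not the estimation procedure used by Weibull++:

```python
# Mixed Weibull pdf: weighted sum of S Weibull pdfs with weights summing to 1.
import numpy as np

def mixed_weibull_pdf(t, p, beta, eta):
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for p_i, b_i, e_i in zip(p, beta, eta):
        total += p_i * (b_i / e_i) * (t / e_i) ** (b_i - 1) \
                 * np.exp(-((t / e_i) ** b_i))
    return total

p = [0.4, 0.6]        # p_1 + p_2 = 1, so 2*3 - 1 = 5 independent parameters
beta = [0.8, 3.0]     # hypothetical: early failures mixed with wear-out
eta = [200.0, 1500.0]
print(mixed_weibull_pdf([100.0, 1000.0], p, beta, eta))
```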

The mixed Weibull distribution and its characteristics are presented in more detail in Chapter 10.

The Generalized Gamma Distribution
While not as frequently used for modeling life data as the distributions discussed previously, the generalized gamma distribution does have the ability to mimic the attributes of other distributions, such as the Weibull or lognormal, based on the values of the distribution’s parameters, and also offers a compromise between two lifetime distributions. The generalized gamma distribution is a three-parameter distribution with parameters $$\mu$$, $$\sigma$$ and $$\lambda$$. The $$pdf$$ of the distribution is given by:

$$ f(t)=\begin{cases} \frac{|\lambda|}{\sigma \cdot t}\cdot \tfrac{1}{\Gamma( \tfrac{1}{\lambda^2})}\cdot {e^{\tfrac{\lambda \cdot{\tfrac{\ln(t)-\mu}{\sigma}}+\ln( \tfrac{1}{{\lambda}^2})-e^{\lambda \cdot {\tfrac{\ln(t)-\mu}{\sigma}}}}{{\lambda}^2}}}, &\text{if } \lambda \ne 0 \\ \frac{1}{t\cdot \sigma \sqrt{2\pi }} e^{-\tfrac{1}{2}{(\tfrac{\ln(t)-\mu}{\sigma })^2}}, & \text{if } \lambda =0 \end{cases} $$

where $$\Gamma(x)$$ is the gamma function, defined by:

$$\Gamma (x)=\int_{0}^{\infty}{s}^{x-1}{e^{-s}}ds$$

This distribution behaves as do other distributions based on the values of the parameters. For example, if $$\lambda = 1$$, the distribution is identical to the Weibull distribution. If both $$\lambda = 1$$ and $$\sigma = 1$$, the distribution is identical to the exponential distribution, and for $$\lambda = 0$$ it is identical to the lognormal distribution. While the generalized gamma distribution is not often used to model life data by itself, its ability to behave like other more commonly-used life distributions is sometimes used to determine which of those life distributions should be used to model a particular set of data.

The Gamma Distribution
The gamma distribution is a flexible distribution that may offer a good fit to some sets of life data. Sometimes called the Erlang distribution, the gamma distribution has applications in Bayesian analysis as a prior distribution and is also commonly used in queuing theory. The $$pdf$$ of the gamma distribution is given by:

$$\begin{align} f(t)= & \frac{e^{kz-{e^{z}}}}{t\Gamma(k)} \\ z= & \ln{t}-\mu \end{align}$$

where:

$$\begin{align} \mu = & \text{scale parameter} \\ k= & \text{shape parameter} \end{align}$$

where $$0<t<\infty$$, $$-\infty <\mu <\infty$$ and $$k>0$$.

The gamma distribution and its characteristics are presented in more detail in Chapter 10.

The Logistic Distribution
The logistic distribution has a shape very similar to the normal distribution (i.e. bell shaped), but with heavier tails. Since the logistic distribution has closed form solutions for the reliability, $$cdf$$ and failure rate functions, it is sometimes preferred over the normal distribution, where these functions can only be obtained numerically. The $$pdf$$ of the logistic distribution is given by:

$$\begin{align} f(t)= & \frac{e^z}{\sigma {(1+{e^z})^{2}}} \\ z= & \frac{t-\mu }{\sigma } \\ \sigma > & 0 \end{align}$$

where:

$$ \mu = \text{location parameter, also denoted as } \overline{T}$$

$$ \sigma=\text{scale parameter} $$

The logistic distribution and its characteristics are presented in more detail in Chapter 10.

The Loglogistic Distribution
As may be surmised from the name, the loglogistic distribution is similar to the logistic distribution. Specifically, the data follow a loglogistic distribution when the natural logarithms of the times-to-failure follow a logistic distribution. Accordingly, the loglogistic and lognormal distributions also share many similarities. The $$pdf$$ of the loglogistic distribution is given by:

$$ \begin{align} f(t)= & \frac{e^z}{\sigma t{(1+{e^z})^2}} \\ z= & \frac{T'-{\mu }'}{\sigma } \\ f(t)\ge & 0,t>0,\sigma >0, \\ {T}'= & \ln(t) \end{align}$$

where,

$$\begin{align} \mu'= & \text{scale parameter} \\ \sigma =& \text{shape parameter} \end{align}$$

The loglogistic distribution and its characteristics are presented in more detail in Chapter 10.

The Gumbel Distribution
The Gumbel distribution is also referred to as the Smallest Extreme Value (SEV) distribution or the Smallest Extreme Value (Type I) distribution. The Gumbel distribution is appropriate for modeling strength, which is sometimes skewed to the left (few weak units fail under low stress, while the rest fail at higher stresses). The Gumbel distribution could also be appropriate for modeling the life of products that experience very quick wear-out after reaching a certain age. The $$pdf$$ of the Gumbel distribution is given by:

$$\begin{align} f(t)= & \frac{1}{\sigma }{{e}^{z-{e^z}}} \\ z= &\frac{t-\mu }{\sigma } \\ f(t)\ge & 0,\sigma >0 \end{align}$$

where,

$$\begin{align} \mu = & \text{location parameter} \\ \sigma = & \text{scale parameter} \end{align}$$

The Gumbel distribution and its characteristics are presented in more detail in Chapter 10.

Parameter Estimation
Once a distribution has been selected, the parameters of the distribution need to be estimated. Several parameter estimation methods are available. This section will present an overview of these methods, starting with the relatively simple method of probability plotting and continuing with the more sophisticated least squares and maximum likelihood methods.

Probability Plotting
The least mathematically intensive method for parameter estimation is the method of probability plotting. As the term implies, probability plotting involves a physical plot of the data on specially constructed probability plotting paper. This method is easily implemented by hand, given that one can obtain the appropriate probability plotting paper. The method of probability plotting takes the $$cdf$$ of the distribution and attempts to linearize it by employing a specially constructed paper. For example, in the case of the two-parameter Weibull distribution, the $$cdf$$, or unreliability $$Q(T)$$, can be shown to be:

$$F(T)=Q(T)=1-{e^{-(\tfrac{T}{\eta})^{\beta}}}$$

This function can then be linearized (i.e. put in the common $$y=ax+b$$ form) as follows:

$$\begin{align} Q(T)= & 1-{e^{-(\tfrac{T}{\eta})^{\beta}}}  \\ \ln (1-Q(T))= & \ln {e^{-(\tfrac{T}{\eta})^{\beta}}} \\ \ln (1-Q(T))=& -(\tfrac{T}{\eta})^{\beta} \\ \ln ( -\ln (1-Q(T)))= & \beta \ln( \frac{T}{\eta }) \\ \ln \left( \ln\left( \frac{1}{1-Q(T)}\right)\right) = & \beta\ln T -\beta\ln \eta  \\ \end{align}$$

Then setting:

$$y=\ln \left( \ln \left( \frac{1}{1-Q(T)} \right) \right)$$

and:

$$x=\ln \left( T \right)$$

the equation can be rewritten as,

$$y=\beta x-\beta \ln \left( \eta \right)$$

which is now a linear equation with a slope of $$\beta $$ and an intercept of $$-\beta \ln(\eta)$$. The next task is to construct a paper with the appropriate $$y$$ - and $$x$$ -axes. The x-axis calculation is easy since it is simply logarithmic. The y-axis, however, has to represent,

$$y=\ln \left(\ln \left( \frac{1}{1-Q(T)} \right)\right)$$

where $$Q(T)$$ is the unreliability (or a double log reciprocal scale). Such papers have been created by different vendors and are called probability plotting papers. To illustrate, consider the following probability plot on a Weibull probability paper.

This paper is constructed based on the mentioned $$y$$ - and $$x$$ -transformations, where the y-axis represents unreliability and the x-axis represents time. Both of these values must be known for each time-to-failure point we want to plot. Then, given the $$y$$ and $$x$$ value for each point, the points can easily be put on the plot. Once the points have been placed on the plot, the best possible straight line is drawn through these points. Once the line has been drawn, the slope of the line can be obtained (some probability papers include a slope indicator to simplify this calculation). This is the parameter $$\beta,$$ which is the value of the slope. To determine the scale parameter, $$\eta $$ (also called the characteristic life by some authors), one must simply set $$t = \eta $$ in the $$cdf$$ equation. Note that from before:

$$Q(T)=1-{e^{-(\tfrac{T}{\eta})^{\beta}}}$$

so at $$T=\eta$$:

$$\begin{align} Q(T)= & 1-{{e}^{-{{\left( \tfrac{\eta }{\eta } \right)}^{\beta }}}} \\ = & 1-{{e}^{-1}} \\ = & 0.632 \\  = & 63.2\%  \end{align}$$

Thus, if we enter the $$y$$ axis at $$Q(T)=63.2\%,$$ the corresponding value of $$T$$ will be equal to $$\eta .$$ Using this simple but rather time-consuming methodology, the parameters of the Weibull distribution can be estimated.
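
The sketch below demonstrates the linearization numerically: exact Weibull cdf points (with hypothetical parameters) are transformed to the $$(x,y)$$ coordinates above, a straight line is fitted, and $$\beta$$ and $$\eta$$ are recovered from the slope and intercept.

```python
# Weibull probability-plot transformation: the cdf becomes a straight line,
# from which beta (slope) and eta (via the intercept) are recovered.
import numpy as np

beta_true, eta_true = 2.0, 500.0                 # hypothetical parameters
T = np.array([100.0, 250.0, 500.0, 900.0])
Q = 1 - np.exp(-((T / eta_true) ** beta_true))   # exact unreliability values

x = np.log(T)
y = np.log(np.log(1.0 / (1.0 - Q)))
slope, intercept = np.polyfit(x, y, 1)

beta_hat = slope
eta_hat = np.exp(-intercept / slope)             # intercept = -beta*ln(eta)
print(beta_hat, eta_hat)                         # recovers 2.0 and 500.0
```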

Determining the $$X$$ and $$Y$$ Position of the Plot Points
The points on the plot represent our data or, more specifically, our times-to-failure data. If, for example, we tested four units that failed at 10, 20, 30 and 40 hours, we would use these times as our $$x$$ values or time values. Determining what the appropriate $$y$$ plotting positions, or the unreliability values, should be is a little more complex. To determine the $$y$$ plotting positions, we must first determine a value indicating the corresponding unreliability for that failure. In other words, we need to obtain the cumulative percent failed for each time-to-failure. In this example, by 10 hours the cumulative percent failed is 25%, by 20 hours 50%, and so forth. This is a simple method illustrating the idea. The problem with this simple method is the fact that the 100% point is not defined on most probability plots, thus an alternative and more robust approach must be used. The most widely used method of determining this value is the method of obtaining the median rank for each failure. This method is discussed next.

Median Ranks
Median ranks are used to obtain an estimate of the unreliability, $$Q({T_j}),$$ for each failure. It is the value that the true probability of failure, $$Q({{T}_{j}}),$$ should have at the $${{j}^{th}}$$ failure out of a sample of $$N$$ units at a 50% confidence level. This essentially means that this is our best estimate for the unreliability. Half of the time the true value will be greater than the 50% confidence estimate, the other half of the time the true value will be less than the estimate. This estimate is based on a solution of the binomial equation. The rank can be found for any percentage point, $$P$$, greater than zero and less than one, by solving the cumulative binomial equation for $$Z$$. This represents the rank, or unreliability estimate, for the $${{j}^{th}}$$ failure [15; 16] in the following equation for the cumulative binomial:

$$P=\underset{k=j}{\overset{N}{\mathop \sum }}\,\left( \begin{matrix}  N  \\   k  \\ \end{matrix} \right){{Z}^{k}}{{\left( 1-Z \right)}^{N-k}}$$

where $$N$$ is the sample size and $$j$$ the order number. The median rank is obtained by solving this equation for $$Z$$ at $$P=0.50,$$

$$0.50=\underset{k=j}{\overset{N}{\mathop \sum }}\,\left( \begin{matrix}  N  \\   k  \\ \end{matrix} \right){{Z}^{k}}{{\left( 1-Z \right)}^{N-k}}$$

For example, if $$N=4$$ and we have four failures, we would solve the median rank equation four times; once for each failure with $$j=$$ 1, 2, 3 and 4, for the value of $$Z.$$ This result can then be used as the unreliability estimate for each failure or the $$y$$ plotting position. (The Weibull distribution chapter, Chapter 6, presents a step-by-step example for this method.) The solution of the median rank equation for $$Z$$ requires the use of numerical methods. A more straightforward and easier method of estimating median ranks is by applying two transformations to the cumulative binomial equation, first to the beta distribution and then to the F distribution, resulting in [12; 13]:

$$\begin{array}{*{35}{l}} MR & = & \tfrac{1}{1+\tfrac{N-j+1}{j}{{F}_{0.50;m;n}}} \\ m & = & 2(N-j+1) \\ n & = & 2j \\ \end{array}$$

$${F_{0.50;m;n}}$$ denotes the F distribution at the 0.50 point, with $$m$$ and $$n$$ degrees of freedom, for the $${{j}^{th}}$$ failure out of $$N$$ units. Weibull++ uses this formulation when determining the median ranks. Another quick, and less accurate, approximation of the median ranks is also given by [15]:

$$MR = \frac{j-0.3}{N+0.4}$$

This approximation of the median ranks is also known as Benard's approximation.
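
Because of the beta-distribution transformation mentioned above, the exact median rank is also the median of a beta distribution with parameters $$(j, N-j+1)$$, which gives a convenient one-line computation; the sketch below (assuming SciPy) compares it with Benard's approximation for a hypothetical sample of four:

```python
# Median ranks: exact value (median of a beta distribution) vs. Benard's
# approximation, for each order number j in a sample of N units.
from scipy.stats import beta

N = 4
for j in range(1, N + 1):
    exact = beta.ppf(0.5, j, N - j + 1)   # cumulative binomial solved at P=0.5
    benard = (j - 0.3) / (N + 0.4)
    print(j, round(exact, 4), round(benard, 4))
# e.g., j=1: both ~0.1591; j=4: both ~0.8409
```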

Kaplan-Meier
The Kaplan-Meier estimator is used as an alternative to the median ranks method for calculating the estimates of the unreliability for probability plotting purposes. The equation of the estimator is given by,

$$\widehat{F}({{t}_{i}})=1-\underset{j=1}{\overset{i}{\mathop \prod }}\,\frac{{{n}_{j}}-{{r}_{j}}}{{{n}_{j}}},\text{ }i=1,...,m$$

where,

$$\begin{align} m = & \text{total number of data points} \\ n = & \text{total number of units} \\ {n_i} = & n - \sum_{j = 0}^{i - 1}{s_j} - \sum_{j = 0}^{i - 1}{r_j}, \text{ } i = 1,...,m \\ {r_j} = & \text{number of failures in the } {j^{th}} \text{ data group, and} \\ {s_j} = & \text{number of surviving units in the } {j^{th}} \text{ data group} \end{align} $$

Weibull++ provides the option to select whether the median ranks or the Kaplan-Meier estimator is used for the unreliability estimates for probability plotting and regression. By default, the median ranks are used.
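
A minimal sketch of the estimator above for grouped data; the failure/suspension counts are hypothetical, and this is not a description of Weibull++'s internal implementation:

```python
# Kaplan-Meier unreliability estimates for grouped failure/suspension data.
def kaplan_meier(groups, n_total):
    """groups: list of (failures r_j, suspensions s_j) per time group,
    in increasing time order. Returns F(t_i) estimates."""
    estimates = []
    survivors = n_total          # n_i entering the current group
    R = 1.0
    for r_j, s_j in groups:
        R *= (survivors - r_j) / survivors   # conditional survival this group
        estimates.append(1.0 - R)
        survivors -= r_j + s_j               # remove failures and suspensions
    return estimates

# 10 units: 2 failures, then 1 suspension, then 3 failures
print(kaplan_meier([(2, 0), (0, 1), (3, 0)], n_total=10))  # [0.2, 0.2, ~0.543]
```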

Probability Plots for Other Distributions
This same methodology can be applied to other distributions which have $$cdf$$ equations that can be linearized. Different probability papers exist for each distribution, since different distributions have different $$cdf$$ equations. Weibull++ automatically creates these plots for you when choosing a probability plot for a particular distribution. Special scales on these plots allow the parameter estimates to be derived directly from the plots, similar to the way $$\beta $$ and $$\eta $$ were obtained from the Weibull probability plot. These will be discussed in subsequent chapters on the individual distributions.

Some Shortfalls of Manual Probability Plotting
Besides the most obvious drawback to probability plotting, which is the amount of effort required, manual probability plotting is not always consistent in the results. Two people plotting a straight line through a set of points will not always draw this line the same way, and thus will come up with slightly different results. This method was used primarily before the widespread use of computers that could easily perform the calculations for more complicated parameter estimation methods, such as the least squares and maximum likelihood methods.

Least Squares Parameter Estimation (Regression Analysis)
Using the idea of probability plotting, regression analysis mathematically fits the best straight line to a set of points, in an attempt to estimate the parameters. Essentially, this is a mathematically based version of the probability plotting method discussed previously.

Background Theory
The method of linear least squares is used for all regression analysis performed by Weibull++, except for the cases of the three-parameter Weibull, mixed Weibull, gamma and generalized gamma distributions, where a non-linear regression technique is employed. The terms linear regression and least squares are used synonymously in this reference. The term rank regression is used instead of least squares, or linear regression, because the regression is performed on the rank values, more specifically, the median rank values (represented on the y-axis). The method of least squares requires that a straight line be fitted to a set of data points, such that the sum of the squares of the distance of the points to the fitted line is minimized. This minimization can be performed in either the vertical or horizontal direction. If the regression is on $$X$$, then the line is fitted so that the horizontal deviations from the points to the line are minimized. If the regression is on $$Y$$, then the line is fitted so that the vertical deviations from the points to the line are minimized. This is illustrated in the following figure.

Rank Regression on $$Y$$
Assume that a set of data pairs $$({x_1},{y_1})$$, $$({{x}_{2}},{{y}_{2}})$$,..., $$({{x}_{N}},{{y}_{N}})$$ were obtained and plotted, and that the $$x$$ -values are known exactly. Then, according to the least squares principle, which minimizes the vertical distance between the data points and the straight line fitted to the data, the best fitting straight line to these data is the straight line $$y=\hat{a}+\hat{b}x$$ (where the recently introduced $$(\hat{ })$$ symbol indicates that this value is an estimate) such that:

$$\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(\hat{a}+\hat{b}{{x}_{i}}-{{y}_{i}})}^{2}}=\underset{a,b}{\min}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(a+b{{x}_{i}}-{{y}_{i}})}^{2}}$$

where $$\hat{a}$$ and $$\hat b$$ are the least squares estimates of $$a$$ and $$b$$, and $$N$$ is the number of data points. These equations are minimized by estimates of $$\widehat a$$ and $$\widehat{b}$$ such that:

$$\hat{a}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}-\hat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}}{N}=\bar{y}-\hat{b}\bar{x}$$

and:

$$\hat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,x_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}} \right)}^{2}}}{N}}$$

Rank Regression on X
Assume that a set of data pairs $$({x_1},{y_1})$$, $$({x_2},{y_2})$$,..., $$({x_N},{y_N})$$ were obtained and plotted, and that the y-values are known exactly. The same least squares principle is applied, this time minimizing the horizontal distance between the data points and the straight line fitted to the data. The best fitting straight line to these data is the straight line $$x=\widehat{a}+\widehat{b}y$$  such that:

$$\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(\widehat{a}+\widehat{b}{{y}_{i}}-{{x}_{i}})}^{2}}=\underset{a,b}{\min}\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(a+b{{y}_{i}}-{{x}_{i}})}^{2}}$$

Again, $$\widehat{a}$$ and $$\widehat b$$ are the least squares estimates of $$a$$ and $$b,$$ and $$N$$ is the number of data points. These equations are minimized by estimates of $$\widehat a$$ and $$\widehat{b}$$ such that:

$$\hat{a}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}}{N}-\hat{b}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}=\bar{x}-\hat{b}\bar{y}$$

and:

$$\widehat{b}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{N}}$$

The corresponding relations for determining the parameters for specific distributions (i.e., Weibull, exponential, etc.), are presented in the chapters covering that distribution.

The Correlation Coefficient
The correlation coefficient is a measure of how well the linear regression model fits the data and is usually denoted by $$\rho $$. In the case of life data analysis, it is a measure for the strength of the linear relation (correlation) between the median ranks and the data. The population correlation coefficient is defined as follows:

$$\rho =\frac{{{\sigma }_{xy}}}{{{\sigma }_{x}}{{\sigma }_{y}}}$$

where $${{\sigma }_{xy}}=$$ covariance of $$x$$ and $$y$$,  $${{\sigma }_{x}}=$$  standard deviation of  $$x$$ , and  $${\sigma _y} = $$ standard deviation of $$y$$. The estimator of $$\rho $$ is the sample correlation coefficient, $$\hat{\rho }$$, given by,

$$\hat{\rho }=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}{{y}_{i}}-\tfrac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}}}{N}}{\sqrt{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,x_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{x}_{i}} \right)}^{2}}}{N} \right)\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,y_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{y}_{i}} \right)}^{2}}}{N} \right)}}$$

The range of $$\hat \rho $$  is  $$-1\le \hat{\rho }\le 1.$$

The closer the value is to $$\pm 1$$, the better the linear fit. Note that +1 indicates a perfect fit (the paired values ( $${{x}_{i}},{{y}_{i}}$$ ) lie on a straight line) with a positive slope, while -1 indicates a perfect fit with a negative slope. A correlation coefficient value of zero would indicate that the data are randomly scattered and have no pattern or correlation in relation to the regression line model.
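
Putting the pieces together, the sketch below performs rank regression on Y for a two-parameter Weibull: Benard's median ranks give the $$y$$ positions, the least squares formulas above give the line, and the sample correlation coefficient measures the fit. The failure times are hypothetical, and this is a sketch of the method rather than Weibull++'s implementation.

```python
# Rank regression on Y for a two-parameter Weibull.
import numpy as np

times = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])  # hypothetical failures
N = len(times)
j = np.arange(1, N + 1)

MR = (j - 0.3) / (N + 0.4)                # Benard's median ranks (y positions)
x = np.log(times)
y = np.log(np.log(1.0 / (1.0 - MR)))

b_hat = (np.sum(x * y) - np.sum(x) * np.sum(y) / N) \
        / (np.sum(x ** 2) - np.sum(x) ** 2 / N)
a_hat = np.mean(y) - b_hat * np.mean(x)

beta_hat = b_hat                          # slope = beta
eta_hat = np.exp(-a_hat / b_hat)          # intercept = -beta*ln(eta)
rho_hat = np.corrcoef(x, y)[0, 1]         # sample correlation coefficient
print(beta_hat, eta_hat, rho_hat)
```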

Comments on the Least Squares Method
The least squares estimation method is quite good for functions that can be linearized. For these distributions, the calculations are relatively easy and straightforward, having closed-form solutions which can readily yield an answer without having to resort to numerical techniques or tables. Further, this technique provides a good measure of the goodness-of-fit of the chosen distribution in the correlation coefficient. Least squares is generally best used with data sets containing complete data, that is, data consisting only of single times-to-failure with no censored or interval data. Chapter 4 details the different data types, including complete, left censored, right censored (or suspended) and interval data.

MLE (Maximum Likelihood) Parameter Estimation for Complete Data
From a statistical point of view, the method of maximum likelihood estimation is, with some exceptions, considered to be the most robust of the parameter estimation techniques discussed here. This method is presented in this section for complete data, that is, data consisting only of single times-to-failure.

Background on Theory
The basic idea behind MLE is to obtain the most likely values of the parameters, for a given distribution, that will best describe the data. As an example, consider the following data (-3, 0, 4) and assume that you are trying to estimate the mean of the data. Now, if you have to choose the most likely value for the mean from -5, 1 and 10, which one would you choose? In this case, the most likely value is 1 (given your limit on choices). Similarly, under MLE, one determines the most likely values for the parameters of the assumed distribution. It is mathematically formulated as follows: If $$x$$ is a continuous random variable with $$pdf:$$

$$f(x;{{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}})$$

where $${\theta _1},{\theta _2},...,{\theta _k}$$ are $$k$$ unknown parameters which need to be estimated, with $$R$$ independent observations, $${x_1},{x_2},...,{x_R}$$, which correspond in the case of life data analysis to failure times. The likelihood function is given by:

$$L({{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}|{{x}_{1}},{{x}_{2}},...,{{x}_{R}})=L=\underset{i=1}{\overset{R}{\mathop \prod }}\,f({{x}_{i}};{{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}) $$

$$i=1,2,...,R$$

The logarithmic likelihood function is given by:

$$\Lambda = \ln L =\sum_{i = 1}^R \ln f({x_i};{\theta _1},{\theta _2},...,{\theta _k}) $$

The maximum likelihood estimators (or parameter values) of $${{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}},$$ are obtained by maximizing $$L$$ or $$\Lambda .$$ By maximizing $$\Lambda ,$$ which is much easier to work with than $$L$$, the maximum likelihood estimators (MLE) of $${{\theta }_{1}},{{\theta }_{2}},...,{{\theta }_{k}}$$ are the simultaneous solutions of $$k$$ equations such that:

$$\frac{\partial{\Lambda}}{\partial{\theta_j}}=0, \text{ } j=1,2,...,k $$
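
In practice, this system of equations is usually solved numerically. As a minimal sketch (not Weibull++'s implementation), the code below maximizes the two-parameter Weibull log-likelihood for complete, hypothetical data by minimizing its negative with a general-purpose optimizer:

```python
# MLE for a two-parameter Weibull with complete data.
import numpy as np
from scipy.optimize import minimize

times = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])  # hypothetical failures

def neg_log_likelihood(params):
    beta, eta = params
    if beta <= 0 or eta <= 0:
        return np.inf                     # keep the search in the valid region
    z = times / eta
    # ln f(t) = ln(beta/eta) + (beta - 1)*ln(t/eta) - (t/eta)^beta
    return -np.sum(np.log(beta / eta) + (beta - 1) * np.log(z) - z ** beta)

result = minimize(neg_log_likelihood, x0=[1.0, float(np.mean(times))],
                  method="Nelder-Mead")
beta_hat, eta_hat = result.x
print(beta_hat, eta_hat)
```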

Even though it is common practice to plot the MLE solutions using median ranks (points are plotted according to median ranks and the line according to the MLE solutions), this is not completely representative. As can be seen from the equations above, the MLE method is independent of any kind of ranks. For this reason, the MLE solution often appears not to track the data on the probability plot. This is perfectly acceptable since the two methods are independent of each other, and in no way suggests that the solution is wrong.

Comments on the MLE Method
The MLE method has many large sample properties that make it attractive for use. It is asymptotically consistent, which means that as the sample size gets larger, the estimates converge to the right values. It is asymptotically efficient, which means that for large samples, it produces the most precise estimates. It is asymptotically unbiased, which means that for large samples one expects to get the right value on average. The distribution of the estimates themselves is normal, if the sample is large enough, and this is the basis for the usual Fisher Matrix confidence bounds discussed later. These are all excellent large sample properties. Unfortunately, the size of the sample necessary to achieve these properties can be quite large: thirty to fifty to more than a hundred exact failure times, depending on the application. With fewer points, the methods can be badly biased. It is known, for example, that MLE estimates of the shape parameter for the Weibull distribution are badly biased for small sample sizes, and the effect can be increased depending on the amount of censoring. This bias can cause major discrepancies in analysis. There are also pathological situations when the asymptotic properties of the MLE do not apply. One of these is estimating the location parameter for the three-parameter Weibull distribution when the shape parameter has a value close to 1. These problems, too, can cause major discrepancies. However, MLE can handle suspensions and interval data better than rank regression, particularly when dealing with a heavily censored data set with few exact failure times or when the censoring times are unevenly distributed. It can also provide estimates with one or no observed failures, which rank regression cannot do. As a rule of thumb, our recommendation is to use rank regression techniques when the sample sizes are small and without heavy censoring (censoring is discussed in Chapter 4). When heavy or uneven censoring is present, when a high proportion of interval data is present and/or when the sample size is sufficient, MLE should be preferred.

Bayesian Statistics
Up to this point, we have dealt exclusively with what is commonly referred to as classical statistics. In this section, another school of thought in statistical analysis will be introduced, namely Bayesian statistics. The premise of Bayesian statistics (within the context of life data analysis) is to incorporate prior knowledge, along with a given set of current observations, in order to make statistical inferences. The prior information could come from operational or observational data, from previous comparable experiments or from engineering knowledge. This type of analysis can be particularly useful when there is limited test data for a given design or failure mode but there is a strong prior understanding of the failure rate behavior for that design or mode. By incorporating prior information about the parameter(s), a posterior distribution for the parameter(s) can be obtained and inferences on the model parameters and their functions can be made. This section is intended to give a quick and elementary overview of Bayesian methods, focused primarily on the material necessary for understanding the Bayesian analysis methods available in Weibull++. Extensive coverage of the subject can be found in numerous books dealing with Bayesian statistics.

Bayes’s Rule
Bayes’s rule provides the framework for combining prior information with sample data. In this reference, we apply Bayes’s rule for combining prior information on the assumed distribution's parameter(s) with sample data in order to make inferences based on the model. The prior knowledge about the parameter(s) is expressed in terms of a distribution, $$\varphi (\theta ),$$ called the prior distribution. The posterior distribution of $$\theta $$ given the sample data, using Bayes’s rule, provides the updated information about the parameters $$\theta $$. This is expressed with the following posterior $$pdf$$:

$$ f(\theta |Data) = \frac{L(Data|\theta )\varphi (\theta )}{\int_{\zeta}^{} L(Data|\theta )\varphi(\theta )d (\theta)} $$

where:

$$\theta $$ is a vector of the parameters of the chosen distribution,

$$\zeta$$ is the range of $$\theta$$ ,

$$ L(Data|\theta)$$ is the likelihood function based on the chosen distribution and data, and

$$\varphi(\theta )$$ is the prior distribution for each of the parameters.

The integral in the denominator of the posterior $$pdf$$ is often referred to as the marginal probability; it can be interpreted as the probability of obtaining the sample data given a prior distribution, and it is a constant number. Generally, this integral does not have a closed form solution and numerical methods are needed for its solution. As can be seen from the posterior $$pdf$$ equation, there is a significant difference between classical and Bayesian statistics. First, the idea of prior information does not exist in classical statistics. All inferences in classical statistics are based on the sample data. On the other hand, in the Bayesian framework, prior information constitutes the basis of the theory. Another difference is in the overall approach of making inferences and their interpretation. For example, in Bayesian analysis the parameters of the distribution to be fitted are the random variables. In reality, there is no distribution fitted to the data in the Bayesian case. For instance, consider the case where data is obtained from a reliability test. Based on prior experience on a similar product, the analyst believes that the shape parameter of the Weibull distribution has a value between $${\beta _1}$$ and $${{\beta }_{2}}$$ and wants to utilize this information. This can be achieved by using Bayes’s theorem. At this point, the analyst is automatically forcing the Weibull distribution as a model for the data, with a shape parameter between $${\beta _1}$$ and $${\beta _2}$$. In this example, the range of values for the shape parameter is the prior distribution, which in this case is uniform. By applying Bayes’s rule, the posterior distribution of the shape parameter will be obtained. Thus, we end up with a distribution for the parameter rather than an estimate of the parameter, as in classical statistics. To better illustrate the example, assume that a set of failure data was provided along with a distribution for the shape parameter (i.e. a uniform prior) of the Weibull (automatically assuming that the data are Weibull distributed). Based on that, a new distribution (the posterior) for that parameter is then obtained using Bayes’s rule. This posterior distribution of the parameter may or may not resemble in form the assumed prior distribution. In other words, in this example the prior distribution of $$\beta $$ was assumed to be uniform, but the posterior is most likely not a uniform distribution. The question now becomes: what is the value of the shape parameter? What about the reliability and other results of interest? In order to answer these questions, we have to remember that in the Bayesian framework all of these metrics are random variables. Therefore, in order to obtain an estimate, a probability needs to be specified, or we can use the expected value of the posterior distribution. In order to demonstrate the procedure of obtaining results from the posterior distribution, we will rewrite the posterior $$pdf$$ for a single parameter $${\theta _1}$$:

$$ f(\theta_1 |Data) = \frac{L(Data|\theta_1 )\varphi (\theta_1 )}{\int_{\zeta}^{} L(Data|\theta_1 )\varphi(\theta_1 )d\theta_1} $$

The expected value (or mean value) of the parameter $${{\theta }_{1}}$$ can be obtained by combining the definition of the mean given earlier with this single-parameter posterior $$pdf$$:

$$E({\theta _1}) = m_{\theta_1} = \int_{\zeta}^{}{\theta _1} \cdot f({\theta _1}|Data)d{\theta _1}$$

An alternative result for $${\theta _1}$$ would be the median value. Combining the definition of the median with the posterior $$pdf$$:

$$\int_{-\infty ,0}^{{\theta }_{0.5}}f({{\theta }_{1}}|Data)d{{\theta }_{1}}=0.5$$

This equation is solved for $${\theta _{0.5}}$$, the median value of $${\theta _1}$$. Similarly, any other percentile of the posterior $$pdf$$ can be calculated and reported. For example, one could calculate the $$90th$$ percentile of $${\theta _1}$$’s posterior $$pdf$$:

$$\int_{-\infty ,0}^{{\theta }_{0.9}}f({{\theta }_{1}}|Data)d{{\theta }_{1}}=0.9$$

This calculation will be used in Chapter 5 for obtaining confidence bounds on the parameter(s). The next step will be to make inferences on the reliability. Since the parameter $${\theta _1}$$ is a random variable described by the posterior $$pdf,$$ all subsequent functions of $${{\theta }_{1}}$$ are distributed random variables as well and entirely based on the posterior $$pdf$$ of $${{\theta }_{1}}$$. Therefore, expected value, median or other percentile values will also need to be calculated. For example, the expected reliability at time $$T$$ is:

$$E[R(T|Data)] = \int_{\zeta}^{} R(T)f(\theta |Data)d{\theta}$$

In other words, at a given time $$T$$, there is a distribution that governs the reliability value at that time, and by using the equation above, the expected (or mean) value of the reliability is obtained. Other percentiles of this distribution can also be obtained. A similar procedure is followed for other functions of $${\theta _1}$$, such as failure rate, reliable life, etc.
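
To make the whole procedure tangible, the sketch below applies Bayes’s rule on a discrete grid for a single parameter: exponential failure data (hypothetical) with a uniform prior on $$\lambda$$. The grid integration stands in for the marginal-probability integral; this is only an illustration, not the algorithm used by Weibull++.

```python
# Grid-based Bayes's rule for one parameter: exponential data, uniform prior.
import numpy as np

times = np.array([120.0, 350.0, 410.0, 800.0])    # hypothetical failure times
lam = np.linspace(1e-5, 0.02, 2000)               # grid over lambda
prior = np.ones_like(lam)                         # uniform (non-informative)

# likelihood of complete exponential data: prod of lam * exp(-lam * t_i)
log_like = len(times) * np.log(lam) - lam * np.sum(times)
posterior = np.exp(log_like - log_like.max()) * prior
posterior /= np.trapz(posterior, lam)             # normalize (marginal prob.)

mean_lam = np.trapz(lam * posterior, lam)         # expected value of lambda
cdf = np.cumsum(posterior) * (lam[1] - lam[0])
median_lam = lam[np.searchsorted(cdf, 0.5)]       # posterior median

T = 500.0
expected_R = np.trapz(np.exp(-lam * T) * posterior, lam)  # E[R(T|Data)]
print(mean_lam, median_lam, expected_R)
```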

Prior Distributions
Prior distributions play a very important role in Bayesian statistics; they are essentially the basis of Bayesian analysis. Different types of prior distributions exist, namely informative and non-informative. Non-informative prior distributions (a.k.a. vague, flat or diffuse) are distributions that have no population basis and play a minimal role in the posterior distribution. The idea behind the use of non-informative prior distributions is to make inferences that are not greatly affected by external information, or to be used when external information is not available. The uniform distribution is frequently used as a non-informative prior. On the other hand, informative priors have a stronger influence on the posterior distribution. The influence of the prior distribution on the posterior is related to the sample size of the data and the form of the prior. Generally speaking, large sample sizes are required to modify strong priors, while weak priors are overwhelmed by even relatively small sample sizes. Informative priors are typically obtained from past data.