Bayesian Parameter Estimation Methods
Up to this point, we have dealt exclusively with what is commonly referred to as classical statistics. In this section, another school of thought in statistical analysis will be introduced, namely Bayesian statistics. The premise of Bayesian statistics (within the context of life data analysis) is to incorporate prior knowledge, along with a given set of current observations, in order to make statistical inferences. The prior information could come from operational or observational data, from previous comparable experiments or from engineering knowledge. This type of analysis can be particularly useful when there is limited test data for a given design or failure mode but there is a strong prior understanding of the failure rate behavior for that design or mode. By incorporating prior information about the parameter(s), a posterior distribution for the parameter(s) can be obtained and inferences on the model parameters and their functions can be made. This section is intended to give a quick and elementary overview of Bayesian methods, focused primarily on the material necessary for understanding the Bayesian analysis methods available in Weibull++. Extensive coverage of the subject can be found in numerous books dealing with Bayesian statistics.

Bayes’s Rule
Bayes’s rule provides the framework for combining prior information with sample data. In this reference, we apply Bayes’s rule to combine prior information on the assumed distribution's parameter(s) with sample data in order to make inferences based on the model. The prior knowledge about the parameter(s) is expressed in terms of a $$pdf$$, $$\varphi (\theta )$$, called the prior distribution. The posterior distribution of $$\theta $$ given the sample data, obtained using Bayes’s rule, provides the updated information about the parameters $$\theta $$. This is expressed with the following posterior $$pdf$$:


 * $$ f(\theta |Data) = \frac{L(Data|\theta )\varphi (\theta )}{\int_{\zeta} L(Data|\theta )\varphi(\theta )\,d\theta} $$


 * where:


 * $$\theta $$ is a vector of the parameters of the chosen distribution,
 * $$\zeta$$ is the range of $$\theta$$,
 * $$ L(Data|\theta)$$ is the likelihood function based on the chosen distribution and data, and
 * $$\varphi(\theta )$$ is the prior distribution for each of the parameters.

The integral in the Bayes' rule equation is often referred to as the marginal probability, which is a constant number that can be interpreted as the probability of obtaining the sample data given a prior distribution. Generally, the integral in the Bayes' rule equation does not have a closed form solution and numerical methods are needed for its solution.

As can be seen from the Bayes' rule equation, there are significant differences between classical and Bayesian statistics. First, the idea of prior information does not exist in classical statistics. All inferences in classical statistics are based on the sample data. In the Bayesian framework, on the other hand, prior information constitutes the basis of the theory. Another difference is in the overall approach to making inferences and interpreting them. For example, in Bayesian analysis, the parameters of the distribution to be fitted are the random variables. In reality, there is no distribution fitted to the data in the Bayesian case.

For instance, consider the case where data is obtained from a reliability test. Based on prior experience with a similar product, the analyst believes that the shape parameter of the Weibull distribution has a value between $${\beta _1}$$ and $${{\beta }_{2}}$$ and wants to utilize this information. This can be achieved by using Bayes' theorem. At this point, the analyst is automatically imposing the Weibull distribution as the model for the data, with a shape parameter between $${\beta _1}$$ and $${\beta _2}$$. In this example, the range of values for the shape parameter defines the prior distribution, which in this case is uniform. By applying Bayes' rule, the posterior distribution of the shape parameter is obtained. Thus, we end up with a distribution for the parameter rather than an estimate of the parameter, as in classical statistics.

To better illustrate the example, assume that a set of failure data was provided along with a distribution for the shape parameter (i.e., uniform prior) of the Weibull (automatically assuming that the data are Weibull distributed). Based on that, a new distribution (the posterior) for that parameter is then obtained using Bayes' rule. This posterior distribution of the parameter may or may not resemble in form the assumed prior distribution. In other words, in this example the prior distribution of $$\beta $$ was assumed to be uniform but the posterior is most likely not a uniform distribution.
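The uniform-prior example above can be sketched numerically. The following is a minimal illustration, not the method used in Weibull++: it assumes complete (uncensored) failure times, treats the Weibull scale parameter `eta` as known so that only $$\beta $$ is uncertain, and approximates the integral in the denominator of Bayes' rule with the trapezoidal rule on a grid. The failure times, the value of `eta`, the prior range and all function names are hypothetical.

```python
import math

def weibull_log_likelihood(beta, eta, times):
    """Log-likelihood of complete (uncensored) failure times for Weibull(beta, eta)."""
    return sum(
        math.log(beta / eta) + (beta - 1.0) * math.log(t / eta) - (t / eta) ** beta
        for t in times
    )

def trapezoid(xs, ys):
    """Trapezoidal-rule approximation of the integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1))

def shape_posterior(times, eta, beta_lo, beta_hi, n_points=401):
    """Return (betas, posterior pdf values) for a uniform prior on [beta_lo, beta_hi]."""
    betas = [beta_lo + (beta_hi - beta_lo) * i / (n_points - 1) for i in range(n_points)]
    prior = 1.0 / (beta_hi - beta_lo)               # uniform prior density
    log_l = [weibull_log_likelihood(b, eta, times) for b in betas]
    shift = max(log_l)                              # subtract the max to avoid underflow
    numer = [math.exp(v - shift) * prior for v in log_l]
    marginal = trapezoid(betas, numer)              # denominator, up to the exp(shift) factor
    return betas, [v / marginal for v in numer]

# Hypothetical data and settings (assumptions for illustration only).
times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
betas, post = shape_posterior(times, eta=80.0, beta_lo=1.0, beta_hi=3.0)

peak = betas[post.index(max(post))]
print(f"posterior area = {trapezoid(betas, post):.6f}, highest density near beta = {peak:.3f}")
```

Although the prior is flat on the chosen range, the resulting grid of posterior values is not, which is the point made in the paragraph above.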

The question now becomes: what is the value of the shape parameter? What about the reliability and other results of interest? In order to answer these questions, we have to remember that in the Bayesian framework all of these metrics are random variables. Therefore, in order to obtain an estimate, a probability needs to be specified or we can use the expected value of the posterior distribution.

To demonstrate the procedure for obtaining results from the posterior distribution, we rewrite the Bayes' rule equation for a single parameter $${\theta _1}$$:


 * $$ f(\theta_1 |Data) = \frac{L(Data|\theta_1 )\varphi (\theta_1 )}{\int_{\zeta} L(Data|\theta_1 )\varphi(\theta_1 )\,d\theta_1} $$

The expected value (or mean value) of the parameter $${{\theta }_{1}}$$ can be obtained using the equation for the mean and the Bayes' rule equation for a single parameter:


 * $$E({\theta _1}) = {m_{{\theta _1}}} = \int_{\zeta}{\theta _1} \cdot f({\theta _1}|Data)d{\theta _1}$$

An alternative result for $${\theta _1}$$ would be the median value. Using the equation for the median and the Bayes' rule equation for a single parameter:


 * $$\int_{-\infty ,0}^{{\theta }_{0.5}}f({{\theta }_{1}}|Data)d{{\theta }_{1}}=0.5$$

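The equation for the median is solved for $${\theta _{0.5}}$$, the median value of $${\theta _1}$$. The lower limit of integration is $$-\infty $$ or $$0$$, depending on the range of $${\theta _1}$$.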

Similarly, any other percentile of the posterior $$pdf$$ can be calculated and reported. For example, one could calculate the $$90th$$ percentile of $${\theta _1}$$’s posterior $$pdf$$:


 * $$\int_{-\infty ,0}^{{\theta }_{0.9}}f({{\theta }_{1}}|Data)d{{\theta }_{1}}=0.9$$

This calculation will be used in Chapter 5 for obtaining confidence bounds on the parameter(s).
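The mean, median and 90th percentile defined above can be approximated from a posterior evaluated on a grid. The sketch below is one possible approach under stated assumptions: the posterior is for a Weibull shape parameter with a uniform prior on [1, 3], the scale `eta` is treated as known, the data are complete, and percentiles are found by accumulating trapezoidal areas and interpolating linearly. All data values and function names are hypothetical.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal-rule approximation of the integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1))

def posterior_mean(xs, pdf):
    """E(theta_1): integral of theta_1 * f(theta_1 | Data) over the grid."""
    return trapezoid(xs, [x * p for x, p in zip(xs, pdf)])

def posterior_percentile(xs, pdf, p):
    """Solve integral_{lower}^{theta_p} f(theta_1 | Data) d(theta_1) = p for theta_p."""
    cdf_prev = 0.0
    for i in range(1, len(xs)):
        cdf_next = cdf_prev + 0.5 * (pdf[i - 1] + pdf[i]) * (xs[i] - xs[i - 1])
        if cdf_next >= p:
            # Linear interpolation inside the grid cell where the CDF crosses p.
            frac = (p - cdf_prev) / (cdf_next - cdf_prev)
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
        cdf_prev = cdf_next
    return xs[-1]

# Hypothetical gridded posterior for beta (assumed known eta, uniform prior on [1, 3]).
times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
eta = 80.0
betas = [1.0 + 2.0 * i / 400 for i in range(401)]
log_l = [
    sum(math.log(b / eta) + (b - 1.0) * math.log(t / eta) - (t / eta) ** b for t in times)
    for b in betas
]
shift = max(log_l)
numer = [math.exp(v - shift) for v in log_l]    # the constant uniform prior cancels
area = trapezoid(betas, numer)
post = [v / area for v in numer]

mean_beta = posterior_mean(betas, post)
median_beta = posterior_percentile(betas, post, 0.5)
p90_beta = posterior_percentile(betas, post, 0.9)
print(f"mean = {mean_beta:.3f}, median = {median_beta:.3f}, 90th percentile = {p90_beta:.3f}")
```

The grid's lower end plays the role of the lower integration limit, since the uniform prior assigns zero density outside its range.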

The next step is to make inferences on the reliability. Since the parameter $${\theta _1}$$ is a random variable described by the posterior $$pdf$$, all functions of $${{\theta }_{1}}$$ are random variables as well, and their distributions are entirely based on the posterior $$pdf$$ of $${{\theta }_{1}}$$. Therefore, the expected value, median or other percentile values also need to be calculated for these functions. For example, the expected reliability at time $$T$$ is:


 * $$E[R(T|Data)] = \int_{\zeta} R(T)f(\theta |Data)d{\theta}$$

In other words, at a given time $$T$$, there is a distribution that governs the reliability value at that time, and by using Bayes' rule, the expected (or mean) value of the reliability is obtained. Other percentiles of this distribution can also be obtained. A similar procedure is followed for other functions of $${\theta _1}$$, such as failure rate, reliable life, etc.
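As a rough illustration of the expected reliability integral, the sketch below weights the Weibull reliability function by a gridded posterior of the shape parameter. It assumes complete data, a known scale `eta`, a uniform prior for $$\beta $$ on [1, 3] and trapezoidal integration. These choices, the failure times and the function names are assumptions made for the example, not part of the method described in this reference.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal-rule approximation of the integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1))

def shape_posterior(times, eta, betas):
    """Posterior pdf of beta on the grid, for a uniform prior over the grid's range."""
    log_l = [
        sum(math.log(b / eta) + (b - 1.0) * math.log(t / eta) - (t / eta) ** b for t in times)
        for b in betas
    ]
    shift = max(log_l)                              # guard against underflow
    numer = [math.exp(v - shift) for v in log_l]    # the constant uniform prior cancels
    area = trapezoid(betas, numer)
    return [v / area for v in numer]

def expected_reliability(T, eta, betas, post):
    """E[R(T | Data)]: Weibull reliability at T averaged over the posterior of beta."""
    reliability = [math.exp(-((T / eta) ** b)) for b in betas]
    return trapezoid(betas, [r * p for r, p in zip(reliability, post)])

# Hypothetical data and settings (assumptions for illustration only).
times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
eta = 80.0
betas = [1.0 + 2.0 * i / 400 for i in range(401)]
post = shape_posterior(times, eta, betas)

for T in (20.0, 60.0):
    print(f"expected reliability at T = {T:.0f}: {expected_reliability(T, eta, betas, post):.4f}")
```

A useful sanity check on this setup is that at $$T = \eta $$ the Weibull reliability equals $$e^{-1}$$ for every $$\beta $$, so the expected reliability there must also equal $$e^{-1}$$ regardless of the posterior.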

Prior Distributions
Prior distributions play a very important role in Bayesian statistics. They are essentially the basis of Bayesian analysis. Two types of prior distributions exist, namely informative and non-informative. Non-informative prior distributions (also called vague, flat or diffuse priors) are distributions that have no population basis and play a minimal role in the posterior distribution. The idea behind the use of non-informative prior distributions is to make inferences that are not greatly affected by external information, or to proceed when external information is not available. The uniform distribution is frequently used as a non-informative prior.

On the other hand, informative priors have a stronger influence on the posterior distribution. The influence of the prior distribution on the posterior is related to the sample size of the data and the form of the prior. Generally speaking, large sample sizes are required to modify strong priors, whereas weak priors are overwhelmed by even relatively small sample sizes. Informative priors are typically obtained from past data.
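The interplay between prior strength and sample size can be seen in a small numerical sketch. The example below swaps in a simpler model than the Weibull case discussed earlier, purely because it has a closed-form posterior: an exponential failure rate with a gamma prior, for which the posterior given complete data is Gamma(a + n, b + total time) with mean (a + n) / (b + total time). The prior settings and data are hypothetical.

```python
def gamma_exponential_posterior_mean(a, b, n_failures, total_time):
    """Posterior mean of an exponential failure rate under a Gamma(a, b) prior (b is a rate)."""
    return (a + n_failures) / (b + total_time)

# Two hypothetical priors with the same mean (a / b = 0.01) but different strength.
strong_prior = (50.0, 5000.0)   # tightly concentrated around 0.01
weak_prior = (0.5, 50.0)        # widely spread around 0.01

# Hypothetical small sample: 10 failures in 500 hours, so the data alone suggest 0.02.
n_small, t_small = 10, 500.0
mle_small = n_small / t_small
strong_small = gamma_exponential_posterior_mean(*strong_prior, n_small, t_small)
weak_small = gamma_exponential_posterior_mean(*weak_prior, n_small, t_small)

# Hypothetical large sample with the same observed rate: 1000 failures in 50000 hours.
n_large, t_large = 1000, 50000.0
strong_large = gamma_exponential_posterior_mean(*strong_prior, n_large, t_large)

print(f"small sample, strong prior: {strong_small:.5f}")
print(f"small sample, weak prior:   {weak_small:.5f}")
print(f"large sample, strong prior: {strong_large:.5f}")
```

With the small sample, the weak prior yields a posterior mean close to the rate suggested by the data, while the strong prior keeps it near the prior mean. Only the larger sample moves the strong-prior result most of the way toward the data.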