Reliability Test Design

This chapter discusses several methods for designing reliability tests, including:


 * Reliability Demonstration Tests (RDT): Often used to demonstrate whether the product reliability meets the requirement. For this type of test design, four methods are supported in Weibull++:
 * Parametric Binomial: Used when the test duration is different from the time of the required reliability. An underlying distribution should be assumed.
 * Non-Parametric Binomial: No distribution assumption is needed for this test design method. It can be used for one shot devices.
 * Exponential Chi-Squared: Designed for exponential failure time.
 * Non-Parametric Bayesian: Integrated Bayesian theory with the traditional non-parametric binomial method.
 * Expected Failure Times Plot: Can help the engineer determine the expected test duration when the total sample size is known and the allowed number of failures is given.
 * Difference Detection Matrix: Can help the engineer design a test to compare the BX% life or mean life of two different designs/products.
 * Simulation: Can help the engineer determine the sample size, test duration or expected number of failures in a test. To determine these variables, analytical methods must make assumptions, such as the distribution of the model parameters. The simulation method does not need these assumptions and can therefore be more accurate than the analytical methods, especially when the sample size is small.

Readers may also be interested in test design methods for quantitative accelerated life tests. That topic is discussed in the Accelerated Life Testing Reference.

=Reliability Demonstration Tests=

Frequently, a manufacturer will have to demonstrate that a certain product has met a goal of a certain reliability at a given time with a specific confidence. Several methods have been designed to help engineers: Cumulative Binomial, Non-Parametric Binomial, Exponential Chi-Squared and Non-Parametric Bayesian. They are discussed in the following sections.

Cumulative Binomial
This methodology requires the use of the cumulative binomial distribution in addition to the assumed distribution of the product's lifetimes. Not only does the life distribution of the product need to be assumed beforehand, but a reasonable assumption of the distribution's shape parameter must be provided as well. Additional information that must be supplied includes: a) the reliability to be demonstrated, b) the confidence level at which the demonstration takes place, c) the acceptable number of failures and d) either the number of available units or the amount of available test time. The output of this analysis can be the amount of time required to test the available units or the required number of units that need to be tested during the available test time. Usually the engineer designing the test will have to study the financial trade-offs between the number of units and the amount of test time needed to demonstrate the desired goal. In cases like this, it is useful to have a "carpet plot" that shows the possibilities of how a certain specification can be met.

Test to Demonstrate Reliability
Frequently, the entire purpose of designing a test with few or no failures is to demonstrate a certain reliability, $${{R}_{DEMO}}\,\!$$, at a certain time. With the exception of the exponential distribution (and ignoring the location parameter for the time being), this reliability is going to be a function of time, a shape parameter and a scale parameter.


 * $${{R}_{DEMO}}=g({{t}_{DEMO}};\theta ,\phi )$$

where:


 * $${{t}_{DEMO}}\,\!$$ is the time at which the demonstrated reliability is specified.
 * $$\theta\,\!$$ is the shape parameter.
 * $$\phi\,\!$$ is the scale parameter.

Since required inputs to the process include $${{R}_{DEMO}}\,\!$$, $${{t}_{DEMO}}\,\!$$  and  $$\theta\,\!$$, the value of the scale parameter can be backed out of the reliability equation of the assumed distribution, and will be used in the calculation of another reliability value,  $${{R}_{TEST}}\,\!$$, which is the reliability that is going to be incorporated into the actual test calculation. How this calculation is performed depends on whether one is attempting to solve for the number of units to be tested in an available amount of time, or attempting to determine how long to test an available number of test units.

Determining Units for Available Test Time

If one knows that the test is to last a certain amount of time, $${{t}_{TEST}}\,\!$$, the number of units that must be tested to demonstrate the specification must be determined. The first step in accomplishing this involves calculating the $${{R}_{TEST}}\,\!$$  value.

This should be a simple procedure since:


 * $${{R}_{TEST}}=g({{t}_{TEST}};\theta ,\phi )$$

and $${{t}_{TEST}}\,\!$$,  $$\theta \,\!$$  and  $$\phi \,\!$$  are already known, so it is just a matter of plugging these values into the appropriate reliability equation.
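The two steps described so far (backing out the scale parameter and then evaluating the reliability at the test time) can be sketched as follows. This is a minimal illustration that assumes a 2-parameter Weibull life distribution, so that the shape parameter $$\theta\,\!$$ plays the role of the Weibull $$\beta\,\!$$ and the scale parameter $$\phi\,\!$$ plays the role of $$\eta\,\!$$. The function names and the numerical inputs are hypothetical and are not taken from Weibull++.

```python
import math


def weibull_scale_from_reliability(r_demo, t_demo, beta):
    """Scale parameter eta such that R(t_demo) = r_demo (assumed 2-parameter Weibull)."""
    return t_demo / (-math.log(r_demo)) ** (1.0 / beta)


def weibull_reliability(t, beta, eta):
    """Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


# Hypothetical inputs: demonstrate R = 0.90 at 100 hours, assumed shape 1.5,
# with only 48 hours of test time available.
r_demo, t_demo, beta, t_test = 0.90, 100.0, 1.5, 48.0

eta = weibull_scale_from_reliability(r_demo, t_demo, beta)
r_test = weibull_reliability(t_test, beta, eta)
print(f"eta = {eta:.2f}, R_TEST = {r_test:.4f}")
```

Because the test time in this example is shorter than the demonstration time, the resulting $${{R}_{TEST}}\,\!$$ is higher than $${{R}_{DEMO}}\,\!$$, which in turn increases the number of units required.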

We now incorporate a form of the cumulative binomial distribution in order to solve for the required number of units. This form of the cumulative binomial appears as:


 * $$1-CL=\underset{i=0}{\overset{f}{\mathop \sum }}\,\frac{n!}{i!\cdot (n-i)!}\cdot {{(1-{{R}_{TEST}})}^{i}}\cdot R_{TEST}^{(n-i)}$$

where:


 * $$CL\,\!$$ = the required confidence level
 * $$f\,\!$$ = the allowable number of failures
 * $$n\,\!$$ = the total number of units on test
 * $${{R}_{TEST}}\,\!$$ = the reliability on test

Since $$CL\,\!$$  and  $$f\,\!$$  are required inputs to the process and  $${{R}_{TEST}}\,\!$$  has already been calculated, it merely remains to solve the cumulative binomial equation for  $$n\,\!$$, the number of units that need to be tested.
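One straightforward way to solve the cumulative binomial equation for $$n\,\!$$ is to increase $$n\,\!$$ until the right-hand side drops to $$1-CL\,\!$$ or below. The sketch below takes that approach; it is an illustration rather than the algorithm used by any particular software, and the function names are hypothetical.

```python
from math import comb


def binomial_cdf(f, n, r_test):
    """P(at most f failures among n units), each failing with probability 1 - r_test."""
    q = 1.0 - r_test
    return sum(comb(n, i) * q**i * r_test ** (n - i) for i in range(f + 1))


def required_sample_size(r_test, cl, f, n_max=100_000):
    """Smallest n for which the cumulative binomial sum is <= 1 - CL."""
    for n in range(f + 1, n_max + 1):
        if binomial_cdf(f, n, r_test) <= 1.0 - cl:
            return n
    raise ValueError("no sample size up to n_max satisfies the requirement")


# Hypothetical requirement: R_TEST = 0.90 at 90% confidence.
n_zero_failures = required_sample_size(0.90, 0.90, f=0)
n_one_failure = required_sample_size(0.90, 0.90, f=1)
print(n_zero_failures, n_one_failure)
```

Allowing more failures always increases the required sample size, which is one of the trade-offs the test designer has to weigh.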

Determining Test Time for Available Units

The way that one determines the test time for the available number of test units is quite similar to the process described previously. In this case, one knows beforehand the number of units, $$n\,\!$$, the number of allowable failures, $$f\,\!$$, and the confidence level,  $$CL\,\!$$. With this information, the next step involves solving the binomial equation for $${{R}_{TEST}}\,\!$$. With this value known, one can use the appropriate reliability equation to back out the value of $${{t}_{TEST}}\,\!$$, since  $${{R}_{TEST}}=g({{t}_{TEST}};\theta ,\phi )\,\!$$, and  $${{R}_{TEST}}\,\!$$,  $$\theta\,\!$$  and  $$\phi\,\!$$  have already been calculated or specified.
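A sketch of this reverse calculation is given below. It solves the binomial equation for $${{R}_{TEST}}\,\!$$ by bisection and then inverts an assumed 2-parameter Weibull reliability function to obtain $${{t}_{TEST}}\,\!$$. The Weibull assumption, the function names and the inputs are illustrative choices, not part of the original method description.

```python
import math
from math import comb


def binomial_cdf(f, n, r_test):
    """P(at most f failures among n units), each failing with probability 1 - r_test."""
    q = 1.0 - r_test
    return sum(comb(n, i) * q**i * r_test ** (n - i) for i in range(f + 1))


def solve_r_test(n, f, cl, tol=1e-12):
    """Bisection for R_TEST; the binomial sum increases with R_TEST (requires f < n)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binomial_cdf(f, n, mid) < 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weibull_time_for_reliability(r, beta, eta):
    """Time t at which an assumed Weibull(beta, eta) has reliability r."""
    return eta * (-math.log(r)) ** (1.0 / beta)


# Hypothetical inputs: 20 units, zero failures allowed, 90% confidence,
# demonstrating R = 0.90 at 100 hours with an assumed shape of 1.5.
n, f, cl = 20, 0, 0.90
beta, r_demo, t_demo = 1.5, 0.90, 100.0

eta = t_demo / (-math.log(r_demo)) ** (1.0 / beta)
r_test = solve_r_test(n, f, cl)
t_test = weibull_time_for_reliability(r_test, beta, eta)
print(f"R_TEST = {r_test:.4f}, t_TEST = {t_test:.1f}")
```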

Test to Demonstrate MTTF
Designing a test to demonstrate a certain value of the $$MTTF$$  is identical to designing a reliability demonstration test, with the exception of how the value of the scale parameter  $$\phi \,\!$$  is determined. Given the value of the $$MTTF$$  and the value of the shape parameter  $$\theta \,\!$$, the value of the scale parameter  $$\phi \,\!$$  can be calculated. With this, the analysis can proceed as with the reliability demonstration methodology.
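For example, if the assumed distribution is a 2-parameter Weibull (an assumption made here for illustration), the mean is $$\eta \cdot \Gamma (1+1/\beta )\,\!$$, so the scale parameter follows directly from the $$MTTF$$ and the shape parameter. The function name and inputs below are hypothetical.

```python
import math


def weibull_scale_from_mttf(mttf, beta):
    """Scale parameter eta of an assumed Weibull, from MTTF = eta * Gamma(1 + 1/beta)."""
    return mttf / math.gamma(1.0 + 1.0 / beta)


# Hypothetical inputs: MTTF of 500 hours.
eta_exponential = weibull_scale_from_mttf(500.0, beta=1.0)  # beta = 1 is the exponential case
eta_shape_two = weibull_scale_from_mttf(500.0, beta=2.0)
print(eta_exponential, eta_shape_two)
```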

Non-Parametric Binomial
The binomial equation can also be used for non-parametric demonstration test design. There is no time value associated with this methodology, so one must assume that the value of $${{R}_{TEST}}\,\!$$  is associated with the amount of time for which the units were tested.

In other words, in cases where the available test time is equal to the demonstration time, the following non-parametric binomial equation is widely used in practice:


 * $$1-CL=\sum_{i=0}^{f}\binom{n}{i}(1-{{R}_{TEST}})^{i}{{R}_{TEST}}^{n-i}$$

where $$CL\,\!$$ is the confidence level, $$f\,\!$$ is the number of failures, $$n\,\!$$ is the sample size and $${{R}_{TEST}}\,\!$$ is the demonstrated reliability. Given any three of them, the remaining one can be solved for.

Non-parametric demonstration test design is also often used for one shot devices where the reliability is not related to time. In this case, $${{R}_{TEST}}\,\!$$ can be simply written as $${R}\,\!$$.
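As an illustration, consider the zero-failure case ($$f=0\,\!$$), where the sum collapses to a single term and the sample size has a closed form. The values $$R=0.9\,\!$$ and $$CL=0.9\,\!$$ used here are hypothetical and chosen only for the example:

 * $$1-CL={{R}^{n}}\quad \Rightarrow \quad n=\frac{\ln (1-CL)}{\ln (R)}=\frac{\ln (0.1)}{\ln (0.9)}\approx 21.85$$

Rounding up, 22 units would have to be tested with no failures.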

Constant Failure Rate/Chi-Squared
Another method for designing tests for products that have an assumed constant failure rate, or exponential life distribution, draws on the chi-squared distribution. These represent the true exponential distribution confidence bounds referred to in The Exponential Distribution. This method only returns the necessary accumulated test time for a demonstrated reliability or $$MTTF\,\!$$, not a specific combination of test time and number of test units such as the one obtained with the cumulative binomial method described above. The accumulated test time is equal to the total amount of time experienced by all of the units on test. Assuming that the units undergo the same amount of test time, this works out to be:


 * $${{T}_{a}}=n\cdot {{t}_{TEST}}$$

where $$n\,\!$$  is the number of units on test and  $${{t}_{TEST}}\,\!$$  is the test time. The chi-squared equation for test time is:


 * $${{T}_{a}}=\frac{MTTF\cdot \chi _{1-CL;2f+2}^{2}}{2}$$

where:
 * $$\chi _{1-CL;2f+2}^{2}\,\!$$ = the chi-squared value with $$2f+2\,\!$$ degrees of freedom and an upper-tail probability of $$1-CL\,\!$$
 * $${{T}_{a}}\,\!$$ = the necessary accumulated test time
 * $$CL\,\!$$ = the confidence level
 * $$f\,\!$$ = the number of failures

Since this methodology only applies to the exponential distribution, the exponential reliability equation can be rewritten as:


 * $$MTTF=\frac{t}{-ln(R)}$$

and substituted into the chi-squared equation for developing a test that demonstrates reliability at a given time, rather than $$MTTF\,\!$$ :


 * $${{T}_{a}}=\frac{\tfrac{t}{-ln(R)}\cdot \chi _{1-CL;2f+2}^{2}}{2}$$
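The accumulated test time can be evaluated without a statistics library, because a chi-squared distribution with an even number of degrees of freedom is related to the Poisson distribution: the probability that a chi-squared variable with $$2f+2\,\!$$ degrees of freedom exceeds $$x\,\!$$ equals the probability that a Poisson variable with mean $$x/2\,\!$$ is at most $$f\,\!$$. The sketch below uses that identity and assumes that the subscript $$1-CL\,\!$$ denotes the upper-tail probability of the chi-squared value. Function names and inputs are hypothetical.

```python
import math


def poisson_cdf(f, mean):
    """P(N <= f) for a Poisson variable N with the given mean."""
    return math.exp(-mean) * sum(mean**i / math.factorial(i) for i in range(f + 1))


def chi2_value(upper_tail, f):
    """x with P(chi-squared, 2f+2 dof > x) = upper_tail, via the Poisson identity."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(f, hi) > upper_tail:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):  # bisection on the Poisson mean (poisson_cdf decreases with it)
        mid = 0.5 * (lo + hi)
        if poisson_cdf(f, mid) > upper_tail:
            lo = mid
        else:
            hi = mid
    return 2.0 * (0.5 * (lo + hi))  # chi-squared value is twice the Poisson mean


def accumulated_time_for_mttf(mttf, cl, f):
    """T_a = MTTF * chi2 / 2."""
    return mttf * chi2_value(1.0 - cl, f) / 2.0


def accumulated_time_for_reliability(r, t, cl, f):
    """Same equation with MTTF replaced by t / (-ln R)."""
    return accumulated_time_for_mttf(t / (-math.log(r)), cl, f)


# Hypothetical requirement: MTTF of 1000 hours at 90% confidence, zero failures.
t_a = accumulated_time_for_mttf(1000.0, 0.90, f=0)
print(f"T_a = {t_a:.1f} unit-hours")
```

Dividing the accumulated time by the number of units gives the test time per unit when all units are tested for the same duration.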

Bayesian Non-Parametric Test Design
The regular non-parametric analyses performed based on either the binomial or the chi-squared equation were performed with only the direct system test data. However, if prior information regarding system performance is available, it can be incorporated into a Bayesian non-parametric analysis. This subsection will demonstrate how to incorporate prior information about system reliability and also how to incorporate prior information from subsystem tests into system test design.

If we assume the system reliability follows a beta distribution, the values of system reliability, R, confidence level, CL, number of units tested, n, and number of failures, r, are related by the following equation:


 * $$1-CL=\text{Beta}\left(R,\alpha,\beta\right)=\text{Beta}\left(R,n-r+\alpha_{0},r+\beta_{0}\right)$$

where $$Beta\,\!$$ is the incomplete beta function. If $${{\alpha}_{0}} >0\,\!$$ and $${{\beta}_{0}} >0\,\!$$ are known, then any one of the four quantities of interest can be calculated from the remaining three. The next two subsections show how to calculate $${{\alpha}_{0}}\,\!$$ and $${{\beta}_{0}}\,\!$$ depending on the type of prior information available.
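The following sketch solves the beta equation for the demonstrated reliability $$R\,\!$$ when the other quantities are given. It assumes that $$Beta\,\!$$ denotes the regularized incomplete beta function (the beta cumulative distribution function), which is evaluated here with a simple Simpson's rule integration that is only valid when both beta parameters are at least 1. The function names are hypothetical, and a production implementation would normally rely on a numerical library instead.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) by Simpson's rule (assumes a >= 1, b >= 1)."""
    if a < 1.0 or b < 1.0:
        raise ValueError("this simple integrator assumes a >= 1 and b >= 1")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0:
            return math.exp(log_norm) if a == 1.0 else 0.0
        return math.exp(log_norm + (a - 1.0) * math.log(t) + (b - 1.0) * math.log1p(-t))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * pdf(k * h)
    return total * h / 3.0


def demonstrated_reliability(n, r, cl, alpha0, beta0):
    """Solve 1 - CL = I_R(n - r + alpha0, r + beta0) for R by bisection."""
    a, b = n - r + alpha0, r + beta0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical case: 20 units, no failures, 90% confidence, uniform prior (alpha0 = beta0 = 1).
r_lower = demonstrated_reliability(n=20, r=0, cl=0.90, alpha0=1.0, beta0=1.0)
print(f"demonstrated reliability = {r_lower:.4f}")
```

With a uniform prior and no failures the equation reduces to $${{R}^{n+1}}=1-CL\,\!$$, which provides a convenient check on the numerical result.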

Use Prior Expert Opinion on System Reliability
Prior information on system reliability can be exploited to determine $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$. To do so, first approximate the expected value and variance of prior system reliability $$R_{0}\,\!$$. This requires knowledge of the lowest possible reliability, the most likely possible reliability and the highest possible reliability of the system. These quantities will be referred to as a, b and c, respectively. The expected value of the prior system reliability is approximately given as:


 * $$ E\left(R_{0}\right)=\frac{a+4b+c}{6} $$

and the variance is approximately given by:


 * $$Var({{R}_{0}})={{\left( \frac{c-a}{6} \right)}^{2}}$$

These approximate values of the expected value and variance of the prior system reliability can then be used to estimate the values of $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$, assuming that the prior reliability is a beta-distributed random variable. The values of $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$ are calculated as:


 * $$ \alpha_{0}=E\left(R_{0}\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$
 * $$ \beta_{0}=\left(1-E\left(R_{0}\right)\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$

With $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$ known, the above beta distribution equation can now be used to calculate a quantity of interest.
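The calculation of $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$ from the three expert estimates is simple arithmetic, sketched below. The estimates a = 0.80, b = 0.85 and c = 0.97 are hypothetical values used only for illustration, as are the function names.

```python
def prior_moments_from_opinion(a, b, c):
    """Approximate mean and variance of prior reliability from lowest (a),
    most likely (b) and highest (c) estimates."""
    mean = (a + 4.0 * b + c) / 6.0
    var = ((c - a) / 6.0) ** 2
    return mean, var


def beta_parameters(mean, var):
    """Beta parameters (alpha0, beta0) matching the given mean and variance."""
    k = (mean - mean**2) / var - 1.0
    return mean * k, (1.0 - mean) * k


# Hypothetical expert estimates of system reliability.
mean, var = prior_moments_from_opinion(0.80, 0.85, 0.97)
alpha0, beta0 = beta_parameters(mean, var)
print(f"E = {mean:.4f}, Var = {var:.6f}, alpha0 = {alpha0:.2f}, beta0 = {beta0:.2f}")
```

A narrow range between a and c produces a small variance and therefore large values of $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$, meaning that the prior carries more weight relative to the test data.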

Use Prior Information from Subsystem Tests
Prior information from subsystem tests can also be used to determine values of $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$. Information from subsystem tests can be used to calculate the expected value and variance of the reliability of individual components, which can then be used to calculate the expected value and variance of the reliability of the entire system. $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$ are then calculated as before:


 * $$\alpha_{0}=E\left(R_{0}\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$


 * $$\beta_{0}=\left(1-E\left(R_{0}\right)\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right]$$

For each subsystem i, tested with $$n_{i}\,\!$$ units of which $$s_{i}\,\!$$ succeeded, the expected value and the variance of the subsystem’s reliability $$R_{i}\,\!$$ can be calculated from the beta distribution as [38]:


 * $$E\left(R_{i}\right)=\frac{s_{i}}{n_{i}+1}$$


 * $$Var\left(R_{i}\right)=\frac{s_{i}\left(n_{i}+1-s_{i}\right)}{\left(n_{i}+1\right)^{2}\left(n_{i}+2\right)}$$

Assuming that all the subsystems are in a series reliability-wise configuration, the expected value and variance of the system’s reliability $$R_{0}\,\!$$ can then be calculated as [38]:


 * $$E\left(R_{0}\right)=\prod_{i=1}^{k}E\left(R_{i}\right)=E\left(R_{1}\right)\times E\left(R_{2}\right)\ldots E\left(R_{k}\right)$$


 * $$Var\left(R_{0}\right)=\prod_{i=1}^{k}\left[E^{2}\left(R_{i}\right)+Var\left(R_{i}\right)\right]-\prod_{i=1}^{k}\left[E^{2}\left(R_{i}\right)\right]$$

With the above prior information on the expected value and variance of the system reliability, $$\alpha_{0}\,\!$$ and $$\beta_{0}\,\!$$ can be determined and the analysis can proceed as before.
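The subsystem calculation can be sketched as follows, interpreting $$s_{i}\,\!$$ as the number of successes observed in $$n_{i}\,\!$$ subsystem tests (an interpretation assumed here). The subsystem test results and the function names are hypothetical.

```python
from math import prod


def subsystem_moments(s, n):
    """Expected value and variance of a subsystem's reliability from s successes in n tests."""
    mean = s / (n + 1)
    var = s * (n + 1 - s) / ((n + 1) ** 2 * (n + 2))
    return mean, var


def series_system_moments(results):
    """Expected value and variance of the reliability of subsystems in series."""
    moments = [subsystem_moments(s, n) for s, n in results]
    mean = prod(m for m, _ in moments)
    var = prod(m * m + v for m, v in moments) - prod(m * m for m, _ in moments)
    return mean, var


def beta_parameters(mean, var):
    """Beta parameters (alpha0, beta0) matching the given mean and variance."""
    k = (mean - mean**2) / var - 1.0
    return mean * k, (1.0 - mean) * k


# Hypothetical subsystem test results as (successes, units tested).
results = [(20, 20), (30, 30), (99, 100)]
mean, var = series_system_moments(results)
alpha0, beta0 = beta_parameters(mean, var)
print(f"E(R0) = {mean:.4f}, Var(R0) = {var:.6f}, alpha0 = {alpha0:.2f}, beta0 = {beta0:.2f}")
```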

=Expected Failure Times Plots=

Test duration is one of the key factors that should be considered in designing a test. If the expected test duration can be estimated prior to the test, test resources can be better allocated. In this section, we will explain how to estimate the expected test time based on test sample size and the assumed underlying failure distribution.

The binomial equation used in non-parametric demonstration test design is the basis for predicting expected failure times. The equation is:


 * $$1-CL=\underset{i=0}{\overset{r}{\mathop \sum }}\,\frac{n!}{i!\cdot (n-i)!}\cdot {{(1-{{R}_{TEST}})}^{i}}\cdot R_{TEST}^{(n-i)}$$

where:


 * $$CL\,\!$$ = the required confidence level
 * $$r\,\!$$ = the number of failures
 * $$n\,\!$$ = the total number of units on test
 * $${{R}_{TEST}}\,\!$$ = the reliability on test

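If CL, r and n are given, the R value can be solved from the above equation. When CL=0.5, the solved R (or Q, the probability of failure whose value is 1-R) is the so-called median rank for the corresponding failure. Note that the rank of the r-th ordered failure corresponds to taking the sum over i = 0 to r-1, that is, the probability of observing fewer than r failures. (For more information on median ranks, please see Parameter Estimation).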
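The rank calculation can be sketched with a simple bisection on Q. The sketch takes the sum over i = 0 to j-1 for the j-th ordered failure, which is the indexing that reproduces standard median rank tables; this indexing and the function name are assumptions of the example, not a description of how Weibull++ performs the calculation.

```python
from math import comb


def failure_rank(n, j, cl=0.5):
    """Q such that P(fewer than j failures among n units) = 1 - CL.

    With cl = 0.5 this is the median rank of the j-th ordered failure.
    """
    target = 1.0 - cl
    lo, hi = 0.0, 1.0
    for _ in range(100):
        q = 0.5 * (lo + hi)
        p = sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(j))
        if p > target:  # the sum decreases as q grows, so q is still too small
            lo = q
        else:
            hi = q
    return 0.5 * (lo + hi)


# Median ranks for a sample of four units.
median_ranks = [failure_rank(4, j) for j in range(1, 5)]
print([round(q, 6) for q in median_ranks])
```

For a sample of four units, the second value returned is the median rank of the second failure.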

For example, given n = 4, r = 2 and CL = 0.5, the calculated Q is 0.385728. This means, at the time when the second failure occurs, the estimated system probability of failure is 0.385728. The median rank can be calculated in Weibull++ using the Quick Statistical Reference, as shown below:



Similarly, if we set r = 3 for the above example, we can get the probability of failure at the time when the third failure occurs. Using the estimated median rank for each failure and the assumed underlying failure distribution, we can calculate the expected time for each failure. If the failure distribution is assumed to be Weibull, then:


 * $$Q=1-{{e}^{-{{\left( \frac{t}{\eta } \right)}^{\beta }}}}$$

where:
 * $$\beta \,\!$$ is the shape parameter
 * $$\eta\,\!$$ is the scale parameter
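Inverting the Weibull equation gives $$t=\eta \cdot {{\left( -\ln (1-Q) \right)}^{1/\beta }}\,\!$$, so each rank can be converted into a failure time. The sketch below does this for ranks solved at three confidence levels (0.1, 0.5 and 0.9), which yields a lower estimate, a median and an upper estimate of each failure time. The Weibull parameters, the sample size and the function names are hypothetical, and the rank indexing (sum over fewer than j failures) is the same assumption as in standard median rank tables.

```python
import math
from math import comb


def failure_rank(n, j, cl=0.5):
    """Q such that P(fewer than j failures among n units) = 1 - CL."""
    target = 1.0 - cl
    lo, hi = 0.0, 1.0
    for _ in range(100):
        q = 0.5 * (lo + hi)
        p = sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(j))
        if p > target:
            lo = q
        else:
            hi = q
    return 0.5 * (lo + hi)


def weibull_time(q, beta, eta):
    """Time at which an assumed Weibull(beta, eta) reaches unreliability q."""
    return eta * (-math.log(1.0 - q)) ** (1.0 / beta)


# Hypothetical test plan: four units, assumed Weibull with beta = 2 and eta = 500 hours.
n, beta, eta = 4, 2.0, 500.0
expected_times = []
for j in range(1, n + 1):
    lower, median, upper = (
        weibull_time(failure_rank(n, j, cl), beta, eta) for cl in (0.1, 0.5, 0.9)
    )
    expected_times.append((lower, median, upper))
    print(f"failure {j}: {lower:7.1f} {median:7.1f} {upper:7.1f}")
```

The upper estimate for the last failure gives an indication of how long the test may have to run if all units are tested to failure.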

Using the above equation, for a given Q, we can get the corresponding time t. The above calculation gives the median of each failure time for CL = 0.5. If we set CL at different values, the confidence bounds of each failure time can be obtained. For the above example, if we set CL=0.9, from the calculated Q we can get the upper bound of the time for each failure. The calculated Q is given in the next figure:



If we set CL=0.1, from the calculated Q we can get the lower bound of the time for each failure. The calculated Q is given in the figure below:



=Life Difference Detection Matrix=

Engineers often need to design tests for detecting life differences between two or more product designs. The questions are how many samples are needed and how long the test should be conducted in order to detect a certain amount of difference. There are no simple answers. Usually, advanced design of experiments (DOE) techniques should be utilized. For a simple case, such as comparing two designs, the Difference Detection Matrix in Weibull++ can be used. The Difference Detection Matrix graphically indicates the amount of test time required to detect a statistical difference in the lives of two populations.

As discussed in the test design using Expected Failure Times plot, if the sample size is known, the expected failure time of each test unit can be obtained based on the assumed failure distribution. Now let's go one step further. With these failure times, we can then estimate the failure distribution and calculate any reliability metrics. This process is similar to the simulation used in SimuMatic where random failure times are generated from simulation and then used to estimate the failure distribution. This approach is also used by the Difference Detection Matrix.

Assume we want to compare the B10 lives (or mean lives) of two designs. The test is time-terminated and the termination time is set to T. Using the method given in Expected Failure Times Plots, we can generate the failure times. Any failure time greater than T is treated as a suspension with a suspension time of T. For each design, its B10 life and confidence bounds can be estimated from the generated failure/suspension times. If the two estimated confidence intervals overlap, the difference between the two B10 lives cannot be detected from this test, and either the sample size or the test duration has to be increased.
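Two of the steps above, converting generated failure times into failure/suspension data for a time-terminated test and checking whether two confidence intervals overlap, can be sketched as follows. The expected failure times and the B10 bounds are hypothetical numbers; estimating the B10 life and its bounds from the censored data is not shown, and the function names are illustrative.

```python
def censor_at(times, t_end):
    """Split expected failure times into failures and suspensions for a test ending at t_end."""
    failures = sorted(t for t in times if t <= t_end)
    suspensions = [t_end] * sum(1 for t in times if t > t_end)
    return failures, suspensions


def intervals_overlap(first, second):
    """True if two (lower, upper) confidence intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Hypothetical expected failure times (hours) for one design, test terminated at 600 hours.
failures, suspensions = censor_at([310.0, 520.0, 700.0, 905.0], t_end=600.0)

# Hypothetical B10 life bounds estimated for two designs from such data.
b10_design_a = (120.0, 260.0)
b10_design_b = (240.0, 410.0)
difference_detected = not intervals_overlap(b10_design_a, b10_design_b)
print(failures, suspensions, difference_detected)
```

In this made-up case the intervals overlap, so the test as planned would not detect the difference and the sample size or duration would need to be increased.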

=Simulation=

Monte Carlo simulation provides another useful tool for test design. The SimuMatic utility in Weibull++ can be used for this purpose. SimuMatic simulates the outcome of a particular test design that is intended to demonstrate a target reliability. You can specify various factors of the design, such as the test duration (for a time-terminated test), number of failures (for a failure-terminated test) and sample size. By running the simulations, you can assess whether the planned test design can achieve the reliability target. Depending on the results, you can modify the design by adjusting these factors and repeating the simulation process (in effect, simulating a modified test design) until you arrive at a design that is capable of demonstrating the target reliability within the available time and sample size constraints.

Of course, all the design factors mentioned in SimuMatic can also be calculated using the analytical methods discussed in previous sections. However, all of the analytical methods rely on assumptions. When the sample size is small or the test duration is short, these assumptions may not be accurate enough. The simulation method usually does not require such assumptions. For example, the confidence bounds of reliability from SimuMatic are based purely on simulation results, whereas in analytical methods both Fisher bounds and likelihood ratio bounds rely on assumptions. Another advantage of the simulation method is that it is straightforward and its results can be visually displayed in SimuMatic.
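The general idea of simulating a test design can be illustrated with a few lines of Monte Carlo code. The sketch below is not SimuMatic and does not reproduce its calculations; it only estimates how often a time-terminated test with n units would finish with no more than f failures, given an assumed Weibull life distribution. All parameter values and the function name are hypothetical.

```python
import math
import random


def simulate_pass_rate(n, t_test, beta, eta, f=0, runs=20000, seed=1):
    """Fraction of simulated tests with at most f failures before t_test."""
    rng = random.Random(seed)
    passes = 0
    for _ in range(runs):
        # random.weibullvariate takes the scale parameter first, then the shape.
        failures = sum(rng.weibullvariate(eta, beta) <= t_test for _ in range(n))
        if failures <= f:
            passes += 1
    return passes / runs


# Hypothetical design: 10 units tested for 100 hours, assumed Weibull(beta = 1.5, eta = 1000).
pass_rate = simulate_pass_rate(n=10, t_test=100.0, beta=1.5, eta=1000.0)
analytical = math.exp(-((100.0 / 1000.0) ** 1.5)) ** 10  # R_TEST ** n for zero failures
print(f"simulated = {pass_rate:.3f}, analytical = {analytical:.3f}")
```

For this simple zero-failure case an analytical answer exists and serves as a check; the simulation approach becomes more useful when the quantity of interest, such as the width of a confidence bound, has no convenient closed form.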

For details, see the Weibull++ SimuMatic chapter.