Reliability Test Design

Quite often, there is a desire to design reliability demonstration tests that have few or no failures. These tests are often required to demonstrate customer reliability and confidence requirements. While it is desirable to be able to test a large population of units to failure in order to obtain information on a design's reliability, time and resource constraints sometimes make this impossible. In these cases, a test can be run on a specified number of units for a specified amount of time that will demonstrate that the product has met or exceeded a given reliability at a given confidence level. In order to do so without a large amount of cumulative test time or failure data, it is necessary to make assumptions about the failure distribution of the product. In the final analysis, the actual reliability of the units will, of course, remain unknown, but the reliability engineer will be able to state that certain specifications have been met.

Demonstration Test Design
Frequently, a manufacturer will have to demonstrate that a certain product has met a goal of a certain reliability at a given time with a specific confidence. Often, it will be desired to demonstrate that this goal has been met with a zero-failure test. In order to design and conduct such a test, something about the failure behavior of the product will need to be known (i.e., the shape parameter of the product's life distribution). Beyond this, nothing more about the test is known, and usually the engineer designing the test will have to study the financial trade-offs between the number of units and the amount of test time needed to demonstrate the desired goal. In cases like this, it is useful to have a "carpet plot" that shows the possibilities of how a certain specification can be met.

This methodology requires the use of the cumulative binomial distribution in addition to the assumed distribution of the product's lifetimes. Not only does the life distribution of the product need to be assumed beforehand, but a reasonable assumption of the distribution's shape parameter must be provided as well. Additional information that must be supplied includes: a) the reliability to be demonstrated, b) the confidence level at which the demonstration takes place, c) the acceptable number of failures and d) either the number of available units or the amount of available test time. The output of this analysis can be the amount of time required to test the available units or the required number of units that need to be tested during the available test time.

Reliability Demonstration
Frequently, the entire purpose of designing a test with few or no failures is to demonstrate a certain reliability, $${{R}_{DEMO}}$$, at a certain time. With the exception of the exponential distribution (and ignoring the location parameter for the time being), this reliability is going to be a function of time, a shape parameter and a scale parameter.

$${{R}_{DEMO}}=g({{t}_{DEMO}};\theta ,\phi )$$

where:


 * $${{t}_{DEMO}}$$ is the time at which the demonstrated reliability is specified.
 * $$\theta$$ is the shape parameter.
 * $$\phi$$ is the scale parameter.

Since required inputs to the process include $${{R}_{DEMO}}$$, $${{t}_{DEMO}}$$  and  $$\theta$$, the value of the scale parameter can be backed out of the reliability equation of the assumed distribution, and will be used in the calculation of another reliability value,  $${{R}_{TEST}}$$, which is the reliability that is going to be incorporated into the actual test calculation. How this calculation is performed depends on whether one is attempting to solve for the number of units to be tested in an available amount of time, or attempting to determine how long to test an available number of test units.

Determining Units for Available Test Time

If one knows that the test is to last a certain amount of time, $${{t}_{TEST}}$$, the number of units that must be tested to demonstrate the specification must be determined. The first step in accomplishing this involves calculating the $${{R}_{TEST}}$$  value.

This should be a simple procedure since:

$${{R}_{TEST}}=g({{t}_{TEST}};\theta ,\phi )$$

and $${{t}_{TEST}}$$,  $$\theta $$  and  $$\phi $$  are already known, and it is just a matter of plugging these values into the appropriate reliability equation.
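The two steps above (backing out the scale parameter, then evaluating the reliability at the test time) can be sketched as follows. This is a minimal illustration that assumes a two-parameter Weibull life distribution, so that $$\theta$$ is the Weibull shape and $$\phi$$ the Weibull scale; the function names and the numerical inputs are hypothetical and not taken from the text.

```python
from math import exp, log

def weibull_scale(r_demo, t_demo, beta):
    """Back out the scale parameter from R_DEMO = exp(-(t_DEMO / eta) ** beta)."""
    return t_demo / (-log(r_demo)) ** (1.0 / beta)

def weibull_reliability(t, beta, eta):
    """Two-parameter Weibull reliability at time t."""
    return exp(-((t / eta) ** beta))

# Illustrative inputs (assumptions, not values from the text)
R_DEMO, T_DEMO, BETA, T_TEST = 0.90, 100.0, 1.5, 48.0

eta = weibull_scale(R_DEMO, T_DEMO, BETA)        # scale parameter implied by the goal
r_test = weibull_reliability(T_TEST, BETA, eta)  # reliability at the shorter test time
print(f"eta = {eta:.1f}, R_TEST = {r_test:.4f}")
```

Because the test time here is shorter than the demonstration time, the resulting $${{R}_{TEST}}$$ is larger than $${{R}_{DEMO}}$$, which is what drives up the number of units required.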

We now incorporate a form of the cumulative binomial distribution in order to solve for the required number of units. This form of the cumulative binomial appears as:

$$1-CL=\underset{i=0}{\overset{f}{\mathop \sum }}\,\frac{n!}{i!\cdot (n-i)!}\cdot {{(1-{{R}_{TEST}})}^{i}}\cdot R_{TEST}^{(n-i)}$$

where:


 * $$\begin{align} & CL= \text{the required confidence level} \\ & f= \text{the allowable number of failures} \\ & n= \text{the total number of units on test} \\ & {{R}_{TEST}}= \text{the reliability on test} \end{align}$$

Since $$CL$$  and  $$f$$  are required inputs to the process and  $${{R}_{TEST}}$$  has already been calculated, it merely remains to solve the cumulative binomial equation for  $$n$$, the number of units that need to be tested.
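Since $$n$$ must be an integer, one simple way to solve the cumulative binomial equation is to increase $$n$$ until the sum drops to $$1-CL$$ or below. The sketch below does exactly that; the function names and the search limit `n_max` are assumptions made for illustration.

```python
from math import comb

def binom_cdf(f, n, r_test):
    """Probability of at most f failures among n units with per-unit reliability r_test."""
    q = 1.0 - r_test
    return sum(comb(n, i) * q**i * r_test ** (n - i) for i in range(f + 1))

def required_units(cl, f, r_test, n_max=100_000):
    """Smallest n for which the cumulative binomial sum is <= 1 - CL."""
    for n in range(f + 1, n_max + 1):
        if binom_cdf(f, n, r_test) <= 1.0 - cl:
            return n
    raise ValueError("no sample size up to n_max meets the requirement")

# Illustrative inputs: R_TEST = 0.90 at 90% confidence
print(required_units(cl=0.90, f=0, r_test=0.90))  # zero-failure test
print(required_units(cl=0.90, f=1, r_test=0.90))  # one failure allowed
```

Allowing one failure instead of none raises the required sample size noticeably, which is the trade-off the engineer has to weigh.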

Determining Test Time for Available Units

The way that one determines the test time for the available number of test units is quite similar to the process described previously. In this case, one knows beforehand the number of units, $$n$$, the number of allowable failures,  $$f$$, and the confidence level,  $$CL$$. With this information, the next step involves solving the binomial equation for $${{R}_{TEST}}$$. With this value known, one can use the appropriate reliability equation to back out the value of $${{t}_{TEST}}$$, since  $${{R}_{TEST}}=g({{t}_{TEST}};\theta ,\phi )$$, and  $${{R}_{TEST}}$$,  $$\theta$$  and  $$\phi$$  have already been calculated or specified.
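A possible sketch of this procedure is given below. It solves the binomial equation for $${{R}_{TEST}}$$ by bisection (the sum increases monotonically with $${{R}_{TEST}}$$) and then inverts an assumed two-parameter Weibull reliability equation for the test time. The function names and the shape and scale values are hypothetical.

```python
from math import comb, exp, log

def binom_cdf(f, n, r_test):
    """Probability of at most f failures among n units with per-unit reliability r_test."""
    q = 1.0 - r_test
    return sum(comb(n, i) * q**i * r_test ** (n - i) for i in range(f + 1))

def solve_r_test(cl, f, n):
    """Find R_TEST such that the cumulative binomial sum equals 1 - CL (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if binom_cdf(f, n, mid) < 1.0 - cl:
            lo = mid   # reliability too low: the sum is still below 1 - CL
        else:
            hi = mid
    return 0.5 * (lo + hi)

def weibull_test_time(r_test, beta, eta):
    """Invert R = exp(-(t / eta) ** beta) for t (Weibull assumed)."""
    return eta * (-log(r_test)) ** (1.0 / beta)

# Illustrative inputs: 20 units, no failures allowed, 90% confidence
BETA, ETA = 1.5, 448.3           # assumed shape and previously backed-out scale
r_test = solve_r_test(cl=0.90, f=0, n=20)
t_test = weibull_test_time(r_test, BETA, ETA)
print(f"R_TEST = {r_test:.5f}, t_TEST = {t_test:.1f}")
```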

MTTF Demonstration
Designing a test to demonstrate a certain value of the $$MTTF$$  is identical to designing a reliability demonstration test, with the exception of how the value of the scale parameter  $$\phi $$  is determined. Given the value of the $$MTTF$$  and the value of the shape parameter  $$\theta $$, the value of the scale parameter  $$\phi $$  can be calculated. With this, the analysis can proceed as with the reliability demonstration methodology.
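As an illustration of how the scale parameter follows from the $$MTTF$$, the sketch below assumes a two-parameter Weibull distribution, for which the mean is the scale parameter multiplied by $$\Gamma (1+1/\beta )$$. The function name and the inputs are hypothetical; other distributions would need their own mean formula.

```python
from math import gamma

def weibull_scale_from_mttf(mttf, beta):
    """Scale parameter eta of a two-parameter Weibull with the given mean and shape.

    Uses MTTF = eta * Gamma(1 + 1/beta), which holds for the Weibull distribution.
    """
    return mttf / gamma(1.0 + 1.0 / beta)

# Illustrative inputs (assumptions, not values from the text)
eta = weibull_scale_from_mttf(mttf=100.0, beta=2.0)
print(f"eta = {eta:.2f}")
```

With the scale parameter in hand, $${{R}_{TEST}}$$ and the binomial calculation proceed exactly as in the reliability demonstration case.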

Non-Parametric Test Design
The binomial equation can be used for nonparametric demonstration test design. One must merely assume values for three of the inputs of $$CL$$,  $${{R}_{TEST}}$$,  $$n$$, and  $$f$$, and solve for the fourth. Note that there is no time value associated with this methodology, so one must assume that the value of $${{R}_{TEST}}$$  is associated with the amount of time for which the units were tested.
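For the common zero-failure case the equation has a closed-form solution, which may help to make the relationship concrete. With $$f=0$$ only the first term of the sum remains:

$$1-CL=R_{TEST}^{n}\quad \Rightarrow \quad n=\frac{\ln (1-CL)}{\ln ({{R}_{TEST}})}$$

As an illustration with assumed inputs $${{R}_{TEST}}=0.9$$ and $$CL=0.9$$, this gives $$n=\ln (0.1)/\ln (0.9)\approx 21.85$$, which is rounded up to 22 units tested without failure.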

Example 1:

Example 2:

Constant Failure Rate/Chi-Squared Test Design
Another method for designing tests for products that have an assumed constant failure rate, or exponential life distribution, draws on the chi-squared distribution. These represent the true exponential distribution confidence bounds referred to in The Exponential Distribution. This method only returns the necessary accumulated test time for a demonstrated reliability or $$MTTF$$, not a specific time/test unit combination that is obtained using the cumulative binomial method described above. The accumulated test time is equal to the total amount of time experienced by all of the units on test. Assuming that the units undergo the same amount of test time, this works out to be:

$${{T}_{a}}=n\cdot {{t}_{TEST}}$$

where $$n$$  is the number of units on test and  $${{t}_{TEST}}$$  is the test time. The chi-squared equation for test time is:

$${{T}_{a}}=\frac{MTTF\cdot \chi _{1-CL;2f+2}^{2}}{2}$$

where:


 * $$\begin{align} & \chi _{1-CL;2f+2}^{2}= \text{the chi-squared value with } 2f+2 \text{ degrees of freedom and upper-tail probability } 1-CL \\ & {{T}_{a}}= \text{the necessary accumulated test time} \\ & CL= \text{the confidence level} \\ & f=\text{the number of failures} \end{align}$$

Since this methodology only applies to the exponential distribution, the exponential reliability equation can be rewritten as:

$$MTTF=\frac{t}{-ln(R)}$$

and substituted into the chi-squared equation for developing a test that demonstrates reliability at a given time, rather than $$MTTF$$ :

$${{T}_{a}}=\frac{\tfrac{t}{-ln(R)}\cdot \chi _{1-CL;2f+2}^{2}}{2}$$
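The accumulated test time can be computed without statistical tables, because the chi-squared distribution has a closed-form cumulative distribution function when the degrees of freedom are even, as $$2f+2$$ always is. The sketch below reads the subscript $$1-CL$$ as an upper-tail probability, so the chi-squared value is the $$CL$$ quantile; that reading, the function names and the inputs are assumptions made for illustration.

```python
from math import exp, log

def chi2_cdf_even(x, dof):
    """Chi-squared CDF for an even number of degrees of freedom (closed form)."""
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= (x / 2.0) / i
        total += term
    return 1.0 - exp(-x / 2.0) * total

def chi2_value(cl, f):
    """Chi-squared value with 2f+2 degrees of freedom and upper-tail probability 1 - CL."""
    dof = 2 * f + 2
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < cl:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                 # bisection
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def accumulated_time_for_mttf(mttf, cl, f):
    """T_a needed to demonstrate a given MTTF."""
    return mttf * chi2_value(cl, f) / 2.0

def accumulated_time_for_reliability(r, t, cl, f):
    """T_a needed to demonstrate reliability r at time t (MTTF = t / -ln(r))."""
    return accumulated_time_for_mttf(t / -log(r), cl, f)

# Illustrative inputs (assumptions, not values from the text)
print(accumulated_time_for_mttf(mttf=85.0, cl=0.90, f=2))
print(accumulated_time_for_reliability(r=0.90, t=100.0, cl=0.90, f=0))
```

Dividing the returned accumulated time by the number of available units gives the per-unit test time, assuming all units are tested for the same duration.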

Example 3:

Bayesian Non-Parametric Test Design
The regular non-parametric analyses performed based on either the binomial or the chi-squared equation were performed with only the direct system test data. However, if prior information regarding system performance is available, it can be incorporated into a Bayesian non-parametric analysis. This subsection will demonstrate how to incorporate prior information about system reliability and also how to incorporate prior information from subsystem tests into system test design.

Assumption on System Reliability
If we assume the system reliability follows a beta distribution, the values of system reliability, R, confidence level, CL, number of units tested, n, and number of failures, r, are related by the following equation:

$$1-CL=\text{Beta}\left(R,\alpha,\beta\right)=\text{Beta}\left(R,n-r+\alpha_{0},r+\beta_{0}\right)$$

where $$Beta$$ is the incomplete beta function. If $$\alpha_{0}$$ and $$\beta_{0}$$ are known, then any one of the four quantities R, CL, n and r can be calculated from the other three. The next two examples demonstrate how to calculate $$\alpha_{0}$$ and $$\beta_{0}$$ depending on the type of prior information available.
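To illustrate how the beta equation might be solved for the demonstrated reliability, the sketch below evaluates the regularized incomplete beta function by Simpson's rule and then applies bisection. It is a rough numerical illustration that assumes both beta parameters are at least 1; the function names and inputs are hypothetical, and a statistical library routine would normally be preferred.

```python
from math import exp, lgamma

def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta function by Simpson's rule (assumes a >= 1, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    norm = exp(lgamma(a + b) - lgamma(a) - lgamma(b))
    pdf = lambda u: norm * u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)
    h = x / steps                        # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(x)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * pdf(k * h)
    return total * h / 3.0

def demonstrated_reliability(cl, n, r, alpha0, beta0):
    """Solve 1 - CL = Beta(R, n - r + alpha0, r + beta0) for R by bisection."""
    a, b = n - r + alpha0, r + beta0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative inputs: 21 units, no failures, uniform prior (alpha0 = beta0 = 1)
print(demonstrated_reliability(cl=0.90, n=21, r=0, alpha0=1.0, beta0=1.0))
```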

Use Prior Expert Opinion on System Reliability
Prior information on system reliability can be exploited to determine $$\alpha_{0}$$ and $$\beta_{0}$$. To do so, first approximate the expected value and variance of the prior system reliability $$R_{0}$$. This requires knowledge of the lowest possible reliability, the most likely reliability and the highest possible reliability of the system. These quantities will be referred to as a, b and c, respectively. The expected value of the prior system reliability is approximately given as:
 * $$ E\left(R_{0}\right)=\frac{a+4b+c}{6} $$

and the variance is approximately given by:
 * $$ Var\left(R_{0}\right)=\left(\frac{c-a}{6}\right)^{2} $$

These approximate values of the expected value and variance of the prior system reliability can then be used to estimate the values of $$\alpha_{0}$$ and $$\beta_{0}$$, assuming that the prior reliability is a beta-distributed random variable. The values of $$\alpha_{0}$$ and $$\beta_{0}$$ are calculated as:

$$ \alpha_{0}=E\left(R_{0}\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$

$$ \beta_{0}=\left(1-E\left(R_{0}\right)\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$

With $$\alpha_{0}$$ and $$\beta_{0}$$ known, the above beta distribution equation can now be used to calculate a quantity of interest.
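The calculation of the prior parameters from the three expert estimates can be sketched as below. The sketch assumes the usual three-point approximation, in which the standard deviation is one sixth of the range so that the variance is the square of $$(c-a)/6$$. The function names and the numerical inputs are hypothetical.

```python
def three_point_moments(a, b, c):
    """Approximate mean and variance from lowest (a), most likely (b) and highest (c) values.

    Assumes the variance is ((c - a) / 6) ** 2, i.e. the range spans six standard deviations.
    """
    mean = (a + 4.0 * b + c) / 6.0
    var = ((c - a) / 6.0) ** 2
    return mean, var

def beta_prior(mean, var):
    """Method-of-moments beta parameters matching the given mean and variance."""
    common = (mean - mean**2) / var - 1.0
    return mean * common, (1.0 - mean) * common

# Illustrative expert estimates (assumptions, not values from the text)
mean, var = three_point_moments(a=0.80, b=0.85, c=0.97)
alpha0, beta0 = beta_prior(mean, var)
print(f"E = {mean:.4f}, Var = {var:.6f}, alpha0 = {alpha0:.2f}, beta0 = {beta0:.2f}")
```

A narrow range between a and c yields large values of $$\alpha_{0}$$ and $$\beta_{0}$$, meaning the prior carries as much weight as a large number of test units.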

Example 4:

Use Prior Information from Subsystem Tests
Prior information from subsystem tests can also be used to determine the values of $$\alpha_{0}$$ and $$\beta_{0}$$. Information from subsystem tests can be used to calculate the expected value and variance of the reliability of individual components, which can then be used to calculate the expected value and variance of the reliability of the entire system. $$\alpha_{0}$$ and $$\beta_{0}$$ are then calculated as before:

$$ \alpha_{0}=E\left(R_{0}\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$

$$ \beta_{0}=\left(1-E\left(R_{0}\right)\right)\left[\frac{E\left(R_{0}\right)-E^{2}\left(R_{0}\right)}{Var\left(R_{0}\right)}-1\right] $$

For each subsystem i, from the beta distribution, we can calculate the expected value and the variance of the subsystem’s reliability $$R_{i}$$ as [38]:

$$ E\left(R_{i}\right)=\frac{s_{i}}{n_{i}+1} $$

$$ Var\left(R_{i}\right)=\frac{s_{i}\left(n_{i}+1-s_{i}\right)}{\left(n_{i}+1\right)^{2}\left(n_{i}+2\right)} $$

where $$s_{i}$$ is the number of units that survived the test out of the $$n_{i}$$ units tested for subsystem i.

Assuming that all the subsystems are in a series reliability-wise configuration, the expected value and variance of the system’s reliability $$R_{0}$$ can then be calculated as [38]:

$$ E\left(R_{0}\right)=\prod_{i=1}^{k} E\left(R_{i}\right)=E\left(R_{1}\right)\times E\left(R_{2}\right)\ldots E\left(R_{k}\right) $$

$$ Var\left(R_{0}\right)=\prod_{i=1}^{k}\left[E^{2}\left(R_{i}\right)+Var\left(R_{i}\right)\right]-\prod_{i=1}^{k}\left[E^{2}\left(R_{i}\right)\right] $$

With the above prior information on the expected value and variance of the system reliability, all the calculations can now be performed as before.
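The roll-up from subsystem test results to the system-level prior can be sketched as follows. The sketch takes, for each subsystem, the number of surviving units and the number of units tested, and assumes a series configuration as in the text; the function names and the test counts are hypothetical.

```python
def subsystem_moments(s, n):
    """Mean and variance of a subsystem reliability from s survivors out of n tested."""
    mean = s / (n + 1.0)
    var = s * (n + 1.0 - s) / ((n + 1.0) ** 2 * (n + 2.0))
    return mean, var

def series_system_moments(subsystems):
    """Mean and variance of the reliability of subsystems in series.

    `subsystems` is a list of (survivors, units tested) pairs.
    """
    prod_mean, prod_second = 1.0, 1.0
    for s, n in subsystems:
        mean, var = subsystem_moments(s, n)
        prod_mean *= mean                  # product of E(R_i)
        prod_second *= mean**2 + var       # product of E(R_i)^2 + Var(R_i)
    return prod_mean, prod_second - prod_mean**2

def beta_prior(mean, var):
    """Method-of-moments beta parameters matching the given mean and variance."""
    common = (mean - mean**2) / var - 1.0
    return mean * common, (1.0 - mean) * common

# Illustrative subsystem results (assumptions): (survivors, units tested)
tests = [(20, 20), (29, 30), (48, 50)]
sys_mean, sys_var = series_system_moments(tests)
alpha0, beta0 = beta_prior(sys_mean, sys_var)
print(f"E = {sys_mean:.4f}, Var = {sys_var:.6f}, alpha0 = {alpha0:.2f}, beta0 = {beta0:.2f}")
```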

Example 5:

Test Design Using Expected Failure Time Plots
Test duration is one of the key factors that should be considered in designing a test. If the expected test duration can be estimated prior to the test, test resources can be better allocated. In this section, we will explain how to estimate the expected test time based on test sample size and the assumed underlying failure distribution.

The binomial equation used in non-parametric demonstration test design is the base for predicting expected failure times. The equation is:

$$1-CL=\underset{i=0}{\overset{r}{\mathop \sum }}\,\frac{n!}{i!\cdot (n-i)!}\cdot {{(1-{{R}_{TEST}})}^{i}}\cdot R_{TEST}^{(n-i)}$$

where:


 * $$\begin{align} & CL= \text{the required confidence level} \\ & r= \text{the number of failures} \\ & n= \text{the total number of units on test} \\ & {{R}_{TEST}}= \text{the reliability on test} \end{align}$$

If CL, r and n are given, the above equation can be solved for R. When CL=0.5, the solved R (or Q, the probability of failure, whose value is 1-R) is the so-called median rank for the corresponding failure. See Median Ranks.
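A rank of this kind can be computed numerically by bisection, as in the sketch below. One indexing detail is an assumption of the sketch: it follows the usual median-rank convention, in which the rank of the r-th failure is the value of Q for which the probability of observing at most r-1 failures among n units equals 1-CL. The function names are hypothetical.

```python
from math import comb

def prob_at_most(k, n, q):
    """Probability of at most k failures among n units, each failing with probability q."""
    return sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(k + 1))

def rank(cl, r, n):
    """Probability of failure Q at the r-th failure of n units, at confidence level cl.

    Solves P(at most r-1 failures) = 1 - cl for Q; cl = 0.5 gives the median rank.
    """
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if prob_at_most(r - 1, n, mid) > 1.0 - cl:
            lo = mid   # Q too small: few failures are still too likely
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Median ranks for a sample of four units
for j in range(1, 5):
    print(j, round(rank(0.5, j, 4), 6))
```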

For example, given n = 4, r = 2 and CL = 0.5, the calculated Q is 0.385728. This means, at the time when the second failure occurs, the estimated system probability of failure is 0.385728. The median rank can be calculated in Weibull++ using the Quick Statistical Reference, as shown below:



Similarly, if we set r = 3 for the above example, we can get the probability of failure at the time when the third failure occurs. Using the estimated median rank for each failure and the assumed underlying failure distribution, we can calculate the expected time for each failure. Assume the failure distribution is Weibull, then we know:

$$Q=1-{{e}^{-{{\left( \frac{t}{\eta } \right)}^{\beta }}}}$$

where:
 * $$\beta $$ is the shape parameter
 * $$\eta$$ is the scale parameter
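Solving the Weibull equation for the time gives $$t=\eta {{\left( -\ln (1-Q) \right)}^{1/\beta }}$$. The short sketch below applies this inversion; the function name and the shape and scale values are assumptions chosen for illustration.

```python
from math import exp, log

def weibull_time(q, beta, eta):
    """Time at which a two-parameter Weibull reaches probability of failure q."""
    return eta * (-log(1.0 - q)) ** (1.0 / beta)

# Illustrative inputs: the median rank of the second of four failures,
# with an assumed shape of 2 and an assumed scale of 500 hours
BETA, ETA = 2.0, 500.0
t2 = weibull_time(0.385728, BETA, ETA)
print(f"expected time of the second failure: {t2:.1f} hours")
```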

Using the above equation, for a given Q, we can get the corresponding time t. The above calculation gives the median of each failure time for CL = 0.5. If we set CL at different values, the confidence bounds of each failure time can be obtained. For the above example, if we set CL=0.9, from the calculated Q we can get the upper bound of the time for each failure. The calculated Q is given in the next figure:



If we set CL=0.1, from the calculated Q we can get the lower bound of the time for each failure. The calculated Q is given in the figure below:



Example 6:

Test Design Using a Life Difference Detection Matrix
Engineers often need to design tests for detecting life differences between two or more product designs. The questions are how many samples to test and how long the test should run in order to detect a given difference. There are no simple answers. Usually, advanced design of experiments (DOE) techniques should be utilized. For a simple case, such as comparing two designs, the Difference Detection Matrix in Weibull++ can be used. The difference detection matrix graphically indicates the amount of test time required to detect a statistical difference in the lives of two populations.

As discussed in Test Design Using Expected Failure Time Plots, if the sample size is known, the expected failure time of each test unit can be obtained based on the assumed failure distribution. Now let's go one step further. With these failure times, we can then estimate the failure distribution and calculate any reliability metrics. This process is similar to the simulation used in Simumatic, where random failure times are generated from simulation and then used to estimate the failure distribution. This approach is also used by the difference detection matrix.

Assume we want to compare the B10 lives (or mean lives) of two designs. The test is time-terminated and the termination time is set to T. Using the method given in Test Design Using Expected Failure Time Plots, we can generate the failure times. Any generated failure time greater than T is treated as a suspension with a suspension time of T. For each design, its B10 life and confidence bounds can be estimated from the generated failure/suspension times. If the two estimated confidence intervals overlap, the difference between the two B10 lives cannot be detected from this test. We have to either increase the sample size or the test duration.
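The data-generation step described above can be sketched as follows: expected failure times are computed from median ranks for an assumed Weibull distribution and then censored at the termination time. The sketch stops before the estimation of the B10 life and its confidence bounds, which would require a life data analysis routine. The function names, the Weibull parameters and the termination time are hypothetical.

```python
from math import comb, log

def median_rank(j, n):
    """Median rank of the j-th failure of n units (bisection on the binomial sum)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        q = 0.5 * (lo + hi)
        at_most = sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(j))
        if at_most > 0.5:
            lo = q
        else:
            hi = q
    return 0.5 * (lo + hi)

def expected_failure_times(n, beta, eta):
    """Expected (median) time of each of the n failures for an assumed Weibull."""
    return [eta * (-log(1.0 - median_rank(j, n))) ** (1.0 / beta) for j in range(1, n + 1)]

def time_terminated(times, t_end):
    """Turn failure times beyond t_end into suspensions at t_end.

    Returns (time, state) pairs where state is "F" for failure and "S" for suspension.
    """
    return [(t, "F") if t <= t_end else (t_end, "S") for t in times]

# Illustrative designs (assumed Weibull parameters) tested for T = 1000 hours
T = 1000.0
design_a = time_terminated(expected_failure_times(10, beta=2.0, eta=1500.0), T)
design_b = time_terminated(expected_failure_times(10, beta=2.0, eta=3000.0), T)
print(design_a)
print(design_b)
```

The design with the longer life produces more suspensions in the same test duration, which is why a longer test or a larger sample may be needed before the two B10 intervals separate.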

Example 7: