Competing Failure Modes Analysis

Often, a group of products will fail due to more than one failure mode. One can take the view that the products could have failed due to any one of the possible failure modes, but since an item cannot fail more than one time, there can only be one failure mode for each failed product. In this view, the failure modes compete as to which causes the failure for each particular item. This can be viewed as a series system reliability model, with each failure mode composing a block of the series system. Competing failure modes (CFM) analysis segregates the analyses of failure modes and then combines the results to provide an overall model for the product in question.

CFM Analysis Approach
In order to begin analyzing data sets with more than one competing failure mode, one must perform a separate analysis for each failure mode. During each of these analyses, the failure times for all other failure modes not being analyzed are considered to be suspensions. This is because the units under test would have failed at some time in the future due to the failure mode being analyzed, had the unrelated (not analyzed) mode not occurred. Thus, in this case, the information available is that the mode under consideration did not occur and the unit under consideration accumulated test time without a failure due to the mode under consideration (or a suspension due to that mode).
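The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the record layout `(time, mode)`, the use of `None` for a unit that never failed, and the function name `split_for_mode` are assumptions made here, not part of Weibull++.

```python
def split_for_mode(records, mode):
    """Return (failures, suspensions) for the analysis of one failure mode.

    records: iterable of (time, label) pairs, where label is the name of the
    mode that caused the failure, or None for a unit that did not fail.
    """
    # Only failures caused by the mode under analysis count as failures.
    failures = [t for t, m in records if m == mode]
    # Failures from any other mode, and unfailed units, become suspensions.
    suspensions = [t for t, m in records if m != mode]
    return failures, suspensions


# Hypothetical records, for illustration only.
records = [(120, "A"), (150, "B"), (200, "A"), (310, "B"), (400, None)]
fail_a, susp_a = split_for_mode(records, "A")
fail_b, susp_b = split_for_mode(records, "B")
```

Every unit appears in each per-mode analysis, either as a failure or as a suspension, so the number of records is the same in every analysis.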

Once the analysis for each separate failure mode has been completed (using the same principles as before), the resulting reliability equation for all modes is the product of the reliability equation for each mode, or:


 * $$R(t)={{R}_{1}}(t)\cdot {{R}_{2}}(t)\cdot ...\cdot {{R}_{n}}(t)$$

where $$n$$  is the total number of failure modes considered. This is the product rule for the reliability of series systems with statistically independent components, which states that the reliability for a series system is equal to the product of the reliability values of the components comprising the system. Do note that the above equation is the reliability function based on any assumed life distribution. In Weibull++ this life distribution can be either the 2-parameter Weibull, lognormal, normal or the 1-parameter exponential.
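As a worked instance, assume (for illustration only) that every mode follows a 2-parameter Weibull distribution with parameters $${{\beta }_{i}}$$ and $${{\eta }_{i}}$$. The product rule then collapses to a single exponential:

 * $$R(t)=\prod_{i=1}^{n}{{e}^{-{{\left( t/{{\eta }_{i}} \right)}^{{{\beta }_{i}}}}}}=\exp \left( -\sum_{i=1}^{n}{{\left( \frac{t}{{{\eta }_{i}}} \right)}^{{{\beta }_{i}}}} \right)$$

In other words, the cumulative hazards of the modes add. The combined model is therefore not itself a Weibull distribution unless all modes share the same $$\beta $$.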

CFM Example
The following example demonstrates how you can use the reliability equation to determine the overall reliability of a component. (This example has been abstracted from Example 15.6 from the Meeker and Escobar textbook Statistical Methods for Reliability Data [27].)

An electronic component has two competing failure modes. One failure mode is due to random voltage spikes, which cause failure by overloading the system. The other failure mode is due to wearout failures, which usually happen only after the system has run for many cycles. The objective is to determine the overall reliability for the component at 100,000 cycles.

30 units are tested, and the failure times are recorded in the following table. The failures that are due to the random voltage spikes are denoted by a V. The failures that are due to wearout failures are denoted by a W.

$$\begin{matrix} \text{Number in State} & \text{Failure Time*} & \text{Failure Mode} & \text{Number in State} & \text{Failure Time*} & \text{Failure Mode} \\ \text{1} & \text{2} & \text{V} & \text{1} & \text{147} & \text{W} \\ \text{1} & \text{10} & \text{V} & \text{1} & \text{173} & \text{V} \\ \text{1} & \text{13} & \text{V} & \text{1} & \text{181} & \text{W} \\ \text{2} & \text{23} & \text{V} & \text{1} & \text{212} & \text{W} \\ \text{1} & \text{28} & \text{V} & \text{1} & \text{245} & \text{W} \\ \text{1} & \text{30} & \text{V} & \text{1} & \text{247} & \text{V} \\ \text{1} & \text{65} & \text{V} & \text{1} & \text{261} & \text{V} \\ \text{1} & \text{80} & \text{V} & \text{1} & \text{266} & \text{W} \\ \text{1} & \text{88} & \text{V} & \text{1} & \text{275} & \text{W} \\ \text{1} & \text{106} & \text{V} & \text{1} & \text{293} & \text{W} \\ \text{1} & \text{143} & \text{V} & \text{8} & \text{300} & \text{suspended} \\ \end{matrix}$$

*Failure times given are in thousands of cycles.

Solution

To obtain the overall reliability of the component, we will first need to analyze the data set due to each failure mode. For example, to obtain the reliability of the component due to voltage spikes, we must consider all of the failures for the wear-out mode to be suspensions. We do the same for analyzing the wear-out failure mode, counting only the wear-out data as failures and assuming that the voltage spike failures are suspensions. Once we have obtained the reliability of the component due to each mode, we can use the system Reliability Equation to determine the overall component reliability.

The following analysis shows the data set for the voltage spikes. Using the Weibull distribution and the MLE analysis method (recommended due to the number of suspensions in the data), the parameters are $${{\beta }_{V}}=0.671072\,\!$$ and  $${{\eta }_{V}}=449.427230\,\!$$. The reliability for this failure mode at $$t=100\,\!$$  is  $${{R}_{V}}(100)=0.694357\,\!$$.



The following analysis shows the data set for the wearout failure mode. Using the same analysis settings (i.e., Weibull distribution and MLE analysis method), the parameters are $${{\beta }_{W}}=4.337278\,\!$$ and  $${{\eta }_{W}}=340.384242\,\!$$. The reliability for this failure mode at $$t=100\,\!$$  is  $${{R}_{W}}(100)=0.995084\,\!$$.
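The parameter estimates quoted above can be cross-checked outside Weibull++. The sketch below is not Weibull++'s algorithm. It assumes the standard maximum likelihood equations for a 2-parameter Weibull distribution with right-censored data and solves the shape equation by bisection. The function name `weibull_mle` and the bisection bracket are choices made here for illustration.

```python
import math

# Failure times in thousands of cycles, transcribed from the table above.
V = [2, 10, 13, 23, 23, 28, 30, 65, 80, 88, 106, 143, 173, 247, 261]
W = [147, 181, 212, 245, 266, 275, 293]
RUNOUT = [300] * 8  # units still running when the test ended


def weibull_mle(failures, suspensions):
    """Return (beta, eta) MLEs for a 2-parameter Weibull, right-censored data."""
    times = failures + suspensions
    r = len(failures)
    mean_log_fail = sum(math.log(t) for t in failures) / r

    def score(beta):
        # Profile likelihood equation for beta; its root is the MLE.
        s0 = sum(t ** beta for t in times)
        s1 = sum(t ** beta * math.log(t) for t in times)
        return s1 / s0 - 1.0 / beta - mean_log_fail

    lo, hi = 0.01, 20.0  # bracket assumed wide enough for this data set
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    eta = (sum(t ** beta for t in times) / r) ** (1.0 / beta)
    return beta, eta


# Each analysis treats the other mode's failures as suspensions.
beta_v, eta_v = weibull_mle(V, W + RUNOUT)
beta_w, eta_w = weibull_mle(W, V + RUNOUT)
```

Under these assumptions the four estimates should land close to the values reported above for the voltage spike and wearout modes.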



Using the Reliability Equation to obtain the overall component reliability at 100,000 cycles, we get:
 * $$\begin{align}

& {{R}_{sys}}(100)= {{R}_{V}}(100)\cdot {{R}_{W}}(100) \\ & = 0.694357\cdot 0.995084 \\ & = 0.690943 \end{align}$$

That is, the reliability of the unit (or system) under both modes is $${{R}_{sys}}(100)=69.094\%\,\!$$.
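The arithmetic can be reproduced with a few lines of Python. This is a minimal sketch that assumes the 2-parameter Weibull reliability function $$R(t)={{e}^{-{{(t/\eta )}^{\beta }}}}$$ and plugs in the parameter estimates quoted above. The function name `weibull_reliability` is illustrative and not part of Weibull++.

```python
import math


def weibull_reliability(t, beta, eta):
    """Reliability of a 2-parameter Weibull distribution at time t."""
    return math.exp(-((t / eta) ** beta))


# Parameter estimates from the two single-mode analyses (t in thousands of cycles).
r_v = weibull_reliability(100, 0.671072, 449.427230)  # voltage spike mode
r_w = weibull_reliability(100, 4.337278, 340.384242)  # wearout mode

# Product rule for independent competing modes.
r_sys = r_v * r_w
```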

You can also perform this analysis using Weibull++'s built-in CFM analysis options, which allow you to generate a probability plot that contains the combined mode line as well as the individual mode lines.

Confidence Bounds for CFM Analysis
Weibull++ uses the Fisher matrix method to estimate the different types of confidence bounds for competing failure modes analysis. The method is presented in this section.

Variance/Covariance Matrix
The variances and covariances of the parameters are estimated from the inverse local Fisher matrix, as follows:


 * $$\begin{align}

& \left( \begin{matrix}  Var({{{\hat{a}}}_{1}}) & Cov({{{\hat{a}}}_{1}},{{{\hat{b}}}_{1}}) & 0 & 0 & 0 & 0 & 0  \\   Cov({{{\hat{a}}}_{1}},{{{\hat{b}}}_{1}}) & Var({{{\hat{b}}}_{1}}) & 0 & 0 & 0 & 0 & 0  \\   0 & 0 & \cdot  & 0 & 0 & 0 & 0  \\   0 & 0 & 0 & \cdot  & 0 & 0 & 0  \\   0 & 0 & 0 & 0 & \cdot  & 0 & 0  \\   0 & 0 & 0 & 0 & 0 & Var({{{\hat{a}}}_{n}}) & Cov({{{\hat{a}}}_{n}},{{{\hat{b}}}_{n}})  \\   0 & 0 & 0 & 0 & 0 & Cov({{{\hat{a}}}_{n}},{{{\hat{b}}}_{n}}) & Var({{{\hat{b}}}_{n}})  \\ \end{matrix} \right) \\ & ={{\left( \begin{matrix}  -\frac{{{\partial }^{2}}\Lambda }{\partial a_{1}^{2}} & -\frac{{{\partial }^{2}}\Lambda }{\partial {{a}_{1}}\partial {{b}_{1}}} & 0 & 0 & 0 & 0 & 0  \\   -\frac{{{\partial }^{2}}\Lambda }{\partial {{a}_{1}}\partial {{b}_{1}}} & -\frac{{{\partial }^{2}}\Lambda }{\partial b_{1}^{2}} & 0 & 0 & 0 & 0 & 0  \\   0 & 0 & \cdot  & 0 & 0 & 0 & 0  \\   0 & 0 & 0 & \cdot  & 0 & 0 & 0  \\   0 & 0 & 0 & 0 & \cdot  & 0 & 0  \\   0 & 0 & 0 & 0 & 0 & -\frac{{{\partial }^{2}}\Lambda }{\partial a_{n}^{2}} & -\frac{{{\partial }^{2}}\Lambda }{\partial {{a}_{n}}\partial {{b}_{n}}}  \\   0 & 0 & 0 & 0 & 0 & -\frac{{{\partial }^{2}}\Lambda }{\partial {{a}_{n}}\partial {{b}_{n}}} & -\frac{{{\partial }^{2}}\Lambda }{\partial b_{n}^{2}}  \\ \end{matrix} \right)}^{-1}} \end{align}$$

where $$\Lambda $$ is the log-likelihood function of the failure distribution, described in Parameter Estimation.

Bounds on Reliability
The competing failure modes reliability function is given by:


 * $$\widehat{R}=\underset{i=1}{\overset{n}{\mathop \prod }}\,{{\hat{R}}_{i}}$$

where:
 * $${{R}_{i}}\,\!$$ is the reliability of the  $${{i}^{th}}\,\!$$  mode.
 * $$n\,\!$$ is the number of failure modes.

The upper and lower bounds on reliability are estimated using the logit transformation:


 * $$\begin{align}

& {{R}_{U}}= & \frac{\widehat{R}}{\widehat{R}+(1-\widehat{R}){{e}^{-\tfrac{{{K}_{\alpha }}\sqrt{Var(\widehat{R})}}{\widehat{R}(1-\widehat{R})}}}} \\ & {{R}_{L}}= & \frac{\widehat{R}}{\widehat{R}+(1-\widehat{R}){{e}^{\tfrac{{{K}_{\alpha }}\sqrt{Var(\widehat{R})}}{\widehat{R}(1-\widehat{R})}}}} \end{align}$$

where $$\widehat{R}$$  is calculated using the reliability equation for competing failure modes. $${{K}_{\alpha }}$$ is defined by:


 * $$\alpha =\frac{1}{\sqrt{2\pi }}\int_{{{K}_{\alpha }}}^{\infty }{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi ({{K}_{\alpha }})$$

(If $$\delta $$  is the confidence level, then  $$\alpha =\tfrac{1-\delta }{2}$$  for the two-sided bounds, and  $$\alpha =1-\delta $$  for the one-sided bounds.)
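The logit-transformed bounds and the definition of $${{K}_{\alpha }}$$ can be sketched as follows. This is an illustration only: the function names `k_alpha` and `logit_bounds` are made up here, the variance passed in is a placeholder value rather than a result from the example, and `statistics.NormalDist` from the Python standard library stands in for the inverse standard normal function.

```python
import math
from statistics import NormalDist


def k_alpha(confidence, two_sided=True):
    """Standard normal quantile K_alpha for a given confidence level."""
    alpha = (1.0 - confidence) / 2.0 if two_sided else 1.0 - confidence
    # alpha = 1 - Phi(K_alpha), so K_alpha is the (1 - alpha) quantile.
    return NormalDist().inv_cdf(1.0 - alpha)


def logit_bounds(r_hat, var_r, confidence, two_sided=True):
    """Lower and upper reliability bounds using the logit transformation."""
    k = k_alpha(confidence, two_sided)
    w = k * math.sqrt(var_r) / (r_hat * (1.0 - r_hat))
    lower = r_hat / (r_hat + (1.0 - r_hat) * math.exp(w))
    upper = r_hat / (r_hat + (1.0 - r_hat) * math.exp(-w))
    return lower, upper


# Placeholder variance, chosen only to exercise the function.
r_lower, r_upper = logit_bounds(0.690943, 0.0025, 0.90)
```

Because the transformation works on the logit scale, both bounds stay strictly between 0 and 1 for any finite variance.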

The variance of $$\widehat{R}$$  is estimated by:


 * $$Var(\widehat{R})=\underset{i=1}{\overset{n}{\mathop \sum }}\,{{\left( \frac{\partial R}{\partial {{R}_{i}}} \right)}^{2}}Var({{\hat{R}}_{i}})$$


 * $$\frac{\partial R}{\partial {{R}_{i}}}=\underset{j=1,j\ne i}{\overset{n}{\mathop \prod }}\,{{\widehat{R}}_{j}}$$

Thus:


 * $$Var(\widehat{R})=\underset{i=1}{\overset{n}{\mathop \sum }}\,\left( \underset{j=1,j\ne i}{\overset{n}{\mathop \prod }}\,\widehat{R}_{j}^{2} \right)Var({{\hat{R}}_{i}})$$


 * $$Var({{\hat{R}}_{i}})=\underset{i=1}{\overset{n}{\mathop \sum }}\,{{\left( \frac{\partial {{R}_{i}}}{\partial {{a}_{i}}} \right)}^{2}}Var({{\hat{a}}_{i}})$$

where $${{\hat{a}}_{i}}$$  is an element of the model parameter vector.

Therefore, the value of $$Var({{\hat{R}}_{i}})$$  is dependent on the underlying distribution.
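The step that combines the per-mode variances into $$Var(\widehat{R})$$ can be sketched in Python. The function name `var_product` and the numbers used are hypothetical; the per-mode variances are taken as given inputs.

```python
import math


def var_product(r, var_r):
    """Variance of the product of independent mode reliabilities.

    r: list of mode reliability estimates R_i
    var_r: list of the corresponding variances Var(R_i)
    """
    total = 0.0
    for i, v in enumerate(var_r):
        # Partial derivative of the product with respect to R_i:
        # the product of all the other mode reliabilities.
        others = math.prod(rj for j, rj in enumerate(r) if j != i)
        total += others ** 2 * v
    return total


# Hypothetical two-mode case: 0.8^2 * 0.01 + 0.9^2 * 0.04
var_r_hat = var_product([0.9, 0.8], [0.01, 0.04])
```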

For the Weibull distribution:


 * $$Var({{\hat{R}}_{i}})={{\left( {{{\hat{R}}}_{i}}{{e}^{{{{\hat{u}}}_{i}}}} \right)}^{2}}Var({{\hat{u}}_{i}})$$

where:


 * $${{\hat{u}}_{i}}={{\hat{\beta }}_{i}}(\ln (t-{{\hat{\gamma }}_{i}})-\ln {{\hat{\eta }}_{i}})$$

and $$Var({{\hat{u}}_{i}})$$  is given in The Weibull Distribution.
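The multiplier in the Weibull expression can be checked directly by writing the Weibull reliability in terms of $${{\hat{u}}_{i}}$$:

 * $${{\hat{R}}_{i}}={{e}^{-{{e}^{{{{\hat{u}}}_{i}}}}}}\quad \Rightarrow \quad \frac{\partial {{R}_{i}}}{\partial {{u}_{i}}}=-{{\hat{R}}_{i}}{{e}^{{{{\hat{u}}}_{i}}}}$$

Squaring this derivative gives the factor that multiplies the variance of $${{\hat{u}}_{i}}$$.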

For the exponential distribution:


 * $$Var({{\hat{R}}_{i}})={{\left( {{{\hat{R}}}_{i}}(t-{{{\hat{\gamma }}}_{i}}) \right)}^{2}}Var({{\hat{\lambda }}_{i}})$$

where $$Var({{\hat{\lambda }}_{i}})$$  is given in The Exponential Distribution.

For the normal distribution:


 * $$Var({{\hat{R}}_{i}})={{\left( f({{{\hat{z}}}_{i}})\hat{\sigma } \right)}^{2}}Var({{\hat{z}}_{i}})$$


 * $${{\hat{z}}_{i}}=\frac{t-{{{\hat{\mu }}}_{i}}}{{{{\hat{\sigma }}}_{i}}}$$

where $$Var({{\hat{z}}_{i}})$$  is given in The Normal Distribution.

For the lognormal distribution:


 * $$Var({{\hat{R}}_{i}})={{\left( f({{{\hat{z}}}_{i}})\cdot {{{\hat{\sigma }}}^{\prime }} \right)}^{2}}Var({{\hat{z}}_{i}})$$


 * $${{\hat{z}}_{i}}=\frac{\ln \text{(}t)-\hat{\mu }_{i}^{\prime }}{\hat{\sigma }_{i}^{\prime }}$$

where $$Var({{\hat{z}}_{i}})$$  is given in The Lognormal Distribution.

Bounds on Time
The bounds on time are estimated by solving the reliability equation with respect to time. From the reliability equation for competing failure modes, we have that:


 * $$\hat{t}=\varphi (R,{{\hat{a}}_{i}},{{\hat{b}}_{i}})$$


 * $$i=1,...,n$$

where:
 * $$\varphi $$ is the inverse function of the reliability equation for competing failure modes.
 * for the Weibull distribution $${{\hat{a}}_{i}}$$  is  $${{\hat{\beta }}_{i}}$$, and  $${{\hat{b}}_{i}}$$  is  $${{\hat{\eta }}_{i}}$$
 * for the exponential distribution $${{\hat{a}}_{i}}$$  is  $${{\hat{\lambda }}_{i}}$$, and  $${{\hat{b}}_{i}}=0$$
 * for the normal distribution $${{\hat{a}}_{i}}$$  is  $${{\hat{\mu }}_{i}}$$, and  $${{\hat{b}}_{i}}$$  is  $${{\hat{\sigma }}_{i}}$$, and
 * for the lognormal distribution $${{\hat{a}}_{i}}$$  is  $$\hat{\mu }_{i}^{\prime }$$, and  $${{\hat{b}}_{i}}$$  is  $$\hat{\sigma }_{i}^{\prime }$$
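When the modes have different parameters, the inverse function $$\varphi $$ generally has no closed form, so the time estimate has to be found numerically. The sketch below assumes Weibull modes and uses plain bisection; it is an illustration, not a description of how Weibull++ solves the equation. The function names and the search limit `t_hi` are assumptions made here.

```python
import math


def cfm_reliability(t, modes):
    """Competing-modes reliability for Weibull modes given as (beta, eta) pairs."""
    return math.prod(math.exp(-((t / eta) ** beta)) for beta, eta in modes)


def time_at_reliability(target, modes, t_hi=1e6):
    """Solve R(t) = target for t by bisection (R is decreasing in t)."""
    lo, hi = 0.0, t_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cfm_reliability(mid, modes) > target:
            lo = mid  # still too reliable, so the solution lies later
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Parameter estimates from the example earlier in this chapter.
modes = [(0.671072, 449.427230), (4.337278, 340.384242)]
t_hat = time_at_reliability(0.690943, modes)
```

With the example parameters, a target reliability of 0.690943 should return a time close to 100 (thousand cycles), which mirrors the reliability calculation in the example.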

Set:


 * $$\begin{align}

u=\ln (t) \end{align}$$

The bounds on $$u$$  are estimated from:


 * $${{u}_{U}}=\widehat{u}+{{K}_{\alpha }}\sqrt{Var(\widehat{u})}$$

and:


 * $${{u}_{L}}=\widehat{u}-{{K}_{\alpha }}\sqrt{Var(\widehat{u})}$$

Then the upper and lower bounds on time are found by using the equations:


 * $${{t}_{U}}={{e}^{{{u}_{U}}}}$$

and:


 * $${{t}_{L}}={{e}^{{{u}_{L}}}}$$

$${{K}_{\alpha }}$$  is calculated using the inverse standard normal distribution and  $$Var(\widehat{u})$$  is computed as:


 * $$Var(\widehat{u})=\underset{i=1}{\overset{n}{\mathop \sum }}\,\left( {{\left( \frac{\partial u}{\partial {{a}_{i}}} \right)}^{2}}Var({{\hat{a}}_{i}})+{{\left( \frac{\partial u}{\partial {{b}_{i}}} \right)}^{2}}Var({{\hat{b}}_{i}})+2\frac{\partial u}{\partial {{a}_{i}}}\frac{\partial u}{\partial {{b}_{i}}}Cov({{\hat{a}}_{i}},{{\hat{b}}_{i}}) \right)$$
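Once $$\widehat{t}$$ and $$Var(\widehat{u})$$ are available, the last step is mechanical. The sketch below takes both as inputs and applies the log transformation; the function name `time_bounds` and the numbers are hypothetical and chosen only to show the calculation.

```python
import math
from statistics import NormalDist


def time_bounds(t_hat, var_u, confidence, two_sided=True):
    """Lower and upper bounds on time from the variance of u = ln(t)."""
    alpha = (1.0 - confidence) / 2.0 if two_sided else 1.0 - confidence
    k = NormalDist().inv_cdf(1.0 - alpha)  # K_alpha
    u_hat = math.log(t_hat)
    half_width = k * math.sqrt(var_u)
    return math.exp(u_hat - half_width), math.exp(u_hat + half_width)


# Hypothetical inputs: time estimate of 100 and Var(u) = 0.04.
t_lower, t_upper = time_bounds(100.0, 0.04, 0.90)
```

Because the bounds are symmetric in $$u$$, they are multiplicative in $$t$$: the estimate is the geometric mean of the two bounds, and the lower bound can never be negative.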

Complex Failure Modes Analysis
The relationship between competing failure modes need not be a simple series system; it can be more complex. After performing a separate analysis for each failure mode, a diagram that describes how each failure mode can result in a product failure can be used to perform the analysis for the item in question. Such diagrams are usually referred to as Reliability Block Diagrams (RBDs) (for more on RBDs see ReliaSoft's System Analysis Reference and ReliaSoft's BlockSim software).

A reliability block diagram is made of blocks that represent the failure modes and arrows that connect the blocks in different configurations. Note that the blocks can also be used to represent different components or subsystems that make up the product. Weibull++ provides the capability to use a diagram to model series, parallel and k-out-of-n configurations, in addition to any complex combinations of these configurations.

In this analysis, the failure modes are assumed to be statistically independent. (Note: In the context of this reference, statistically independent implies that failure information for one failure mode provides no information about, i.e., does not affect, the other failure modes.) Analysis of dependent modes is more complex. Advanced RBD software such as ReliaSoft's BlockSim can handle and analyze such dependencies, as well as provide more advanced constructs and analyses (see http://www.ReliaSoft.com/BlockSim).

Failure Modes Configurations
Series Configuration

The basic competing failure modes configuration, which has already been discussed, is a series configuration. In a series configuration, the occurrence of any failure mode results in failure for the product.



The equation that describes series configuration is:


 * $$R(t)={{R}_{1}}(t)\cdot {{R}_{2}}(t)\cdot ...\cdot {{R}_{n}}(t)$$

where $$n$$  is the total number of failure modes considered.

Parallel

In a simple parallel configuration, at least one of the failure modes must not occur for the product to continue operation.



The equation that describes the parallel configuration is:


 * $$R(t)=1-\underset{i=1}{\overset{n}{\mathop \prod }}\,(1-{{R}_{i}}(t))$$

where $$n$$  is the total number of failure modes considered.

Combination of Series and Parallel

While many smaller products can be accurately represented by either a simple series or parallel configuration, there may be larger products that involve both series and parallel configurations in the overall model of the product. Such products can be analyzed by calculating the reliabilities for the individual series and parallel sections and then combining them in the appropriate manner.
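As an illustration of combining sections, the sketch below evaluates a hypothetical product in which mode A is in series with a parallel pair of modes B and C. The layout, the reliability values and the function names are all assumptions made for this example, and the modes are taken to be statistically independent.

```python
import math


def series(reliabilities):
    """Series section: every block must survive."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """Parallel section: fails only if every block fails."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical mode reliabilities at some fixed time t.
r_a, r_b, r_c = 0.95, 0.90, 0.80

# Reduce the parallel section first, then treat it as one block in series with A.
r_pair = parallel([r_b, r_c])
r_product = series([r_a, r_pair])
```

Working from the innermost section outward, each reduced section is treated as a single block in the next step.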



k-out-of-n Parallel Configuration

The k-out-of-n configuration is a special case of parallel redundancy. This type of configuration requires that at least $$k$$  failure modes do not happen out of the total  $$n$$  parallel failure modes for the product to succeed. The simplest case of a k-out-of-n configuration is when the failure modes are independent and identical and have the same failure distribution and uncertainties about the parameters (in other words they are derived from the same test data). In this case, the reliability of the product with such a configuration can be evaluated using the binomial distribution, or:


 * $$R(t)=\underset{r=k}{\overset{n}{\mathop \sum }}\,\left( \begin{matrix} n \\ r \end{matrix} \right){{R}^{r}}(t){{(1-R(t))}^{n-r}}$$

In the case where the k-out-of-n failure modes are not identical, other approaches for calculating the reliability must be used (e.g. the event space method). Discussion of these is beyond the scope of this reference. Interested readers can consult the System Analysis Reference book.
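For the identical-mode case, the binomial sum is straightforward to evaluate. The sketch below is an illustration with a made-up function name, `k_out_of_n`, and a hypothetical common reliability value. It covers only independent, identical modes, not the non-identical case mentioned above.

```python
from math import comb


def k_out_of_n(k, n, r):
    """Reliability when at least k of n identical, independent blocks must survive.

    r is the common reliability of each block at the time of interest.
    """
    return sum(comb(n, j) * r ** j * (1.0 - r) ** (n - j) for j in range(k, n + 1))


r = 0.9  # hypothetical common reliability
r_2_of_3 = k_out_of_n(2, 3, r)
```

Two limiting cases serve as a sanity check: with k equal to n the expression reduces to the series formula, and with k equal to 1 it reduces to the simple parallel formula.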

Complex Systems

In many cases, it is not easy to recognize which components are in series and which are in parallel in a complex system.



The previous configuration cannot be broken down into a group of series and parallel configurations. This is primarily due to the fact that failure mode C has two paths leading away from it, whereas B and D have only one. Several methods exist for obtaining the reliability of a complex configuration including the decomposition method, the event space method and the path-tracing method. Discussion of these is beyond the scope of this reference. Interested readers can consult the System Analysis Reference book.