Competing Failure Modes Analysis

New format available! This reference is now available in a new format that offers faster page load, improved display for calculations and images, more targeted search and the latest content available as a PDF. As of September 2023, this Reliawiki page will not continue to be updated. Please update all links and bookmarks to the latest reference at help.reliasoft.com/reference/life_data_analysis

Chapter 18: Competing Failure Modes Analysis

Often, a group of products will fail due to more than one failure mode. One can take the view that the products could have failed due to any one of the possible failure modes, but since an item cannot fail more than one time, there can only be one failure mode for each failed product. In this view, the failure modes compete as to which causes the failure for each particular item. This can be viewed as a series system reliability model, with each failure mode composing a block of the series system. Competing failure modes (CFM) analysis segregates the analyses of failure modes and then combines the results to provide an overall model for the product in question.

CFM Analysis Approach

To analyze a data set with more than one competing failure mode, one must perform a separate analysis for each failure mode. During each of these analyses, the failure times for all other failure modes are treated as suspensions. This is because the units under test would have failed at some time in the future due to the failure mode being analyzed, had the unrelated (not analyzed) mode not occurred. For such a unit, the only information available is that it accumulated test time without failing due to the mode under consideration, which is exactly what a suspension for that mode records.

Once the analysis for each separate failure mode has been completed (using the same principles as before), the resulting reliability equation for all modes is the product of the reliability equation for each mode, or:

[math]\displaystyle{ R(t)={{R}_{1}}(t)\cdot {{R}_{2}}(t)\cdot ...\cdot {{R}_{n}}(t)\,\! }[/math]

where [math]\displaystyle{ n\,\! }[/math] is the total number of failure modes considered. This is the product rule for the reliability of series systems with statistically independent components, which states that the reliability for a series system is equal to the product of the reliability values of the components comprising the system. Do note that the above equation is the reliability function based on any assumed life distribution. In Weibull++ this life distribution can be either the 2-parameter Weibull, lognormal, normal or the 1-parameter exponential.

CFM Example

The following example demonstrates how you can use the reliability equation to determine the overall reliability of a component. (This example has been abstracted from Example 15.6 from the Meeker and Escobar textbook Statistical Methods for Reliability Data [27].)

An electronic component has two competing failure modes. One failure mode is due to random voltage spikes, which cause failure by overloading the system. The other failure mode is due to wearout failures, which usually happen only after the system has run for many cycles. The objective is to determine the overall reliability for the component at 100,000 cycles.

30 units are tested, and the failure times are recorded in the following table. The failures that are due to the random voltage spikes are denoted by a V. The failures that are due to wearout failures are denoted by a W.

Number in State	Failure Time*	Failure Mode	Number in State	Failure Time*	Failure Mode
1	2	V	1	147	W
1	10	V	1	173	V
1	13	V	1	181	W
2	23	V	1	212	W
1	28	V	1	245	W
1	30	V	1	247	V
1	65	V	1	261	V
1	80	V	1	266	W
1	88	V	1	275	W
1	106	V	1	293	W
1	143	V	1	300	suspended

*Failure times given are in thousands of cycles.

Solution

To obtain the overall reliability of the component, we will first need to analyze the data set due to each failure mode. For example, to obtain the reliability of the component due to voltage spikes, we must consider all of the failures for the wear-out mode to be suspensions. We do the same for analyzing the wear-out failure mode, counting only the wear-out data as failures and assuming that the voltage spike failures are suspensions. Once we have obtained the reliability of the component due to each mode, we can use the system Reliability Equation to determine the overall component reliability.
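The suspension bookkeeping described above can be sketched in a few lines of Python. This is only an illustration and is not part of Weibull++: the `records` list transcribes the table as printed, while the state code "S" for the suspended entry and the helper names `view_for_mode` and `count_flag` are assumptions introduced here.

```python
# (number in state, time in thousands of cycles, state), as printed in the table.
# "V" = voltage spike failure, "W" = wearout failure, "S" = suspended (assumed code).
records = [
    (1, 2, "V"), (1, 10, "V"), (1, 13, "V"), (2, 23, "V"), (1, 28, "V"),
    (1, 30, "V"), (1, 65, "V"), (1, 80, "V"), (1, 88, "V"), (1, 106, "V"),
    (1, 143, "V"), (1, 147, "W"), (1, 173, "V"), (1, 181, "W"), (1, 212, "W"),
    (1, 245, "W"), (1, 247, "V"), (1, 261, "V"), (1, 266, "W"), (1, 275, "W"),
    (1, 293, "W"), (1, 300, "S"),
]


def view_for_mode(records, mode):
    """Flag each row as a failure ("F") or a suspension ("S") for one mode's analysis."""
    return [(n, t, "F" if state == mode else "S") for n, t, state in records]


def count_flag(rows, flag):
    """Total number of units carrying the given flag."""
    return sum(n for n, _, f in rows if f == flag)


for mode in ("V", "W"):
    rows = view_for_mode(records, mode)
    print(mode, "failures:", count_flag(rows, "F"),
          "suspensions:", count_flag(rows, "S"))
```

Each mode's view keeps every unit in the data set; only the failure/suspension flag changes from one analysis to the next.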

The following analysis shows the data set for the voltage spikes. Using the Weibull distribution and the MLE analysis method (recommended due to the number of suspensions in the data), the parameters are [math]\displaystyle{ {{\beta }_{V}}=0.671072\,\! }[/math] and [math]\displaystyle{ {{\eta }_{V}}=449.427230\,\! }[/math]. The reliability for this failure mode at [math]\displaystyle{ t=100\,\! }[/math] is [math]\displaystyle{ {{R}_{V}}(100)=0.694357\,\! }[/math].

The following analysis shows the data set for the wearout failure mode. Using the same analysis settings (i.e., Weibull distribution and MLE analysis method), the parameters are [math]\displaystyle{ {{\beta }_{W}}=4.337278\,\! }[/math] and [math]\displaystyle{ {{\eta }_{W}}=340.384242\,\! }[/math]. The reliability for this failure mode at [math]\displaystyle{ t=100\,\! }[/math] is [math]\displaystyle{ {{R}_{W}}(100)=0.995084\,\! }[/math].

Using the Reliability Equation to obtain the overall component reliability at 100,000 cycles, we get:

[math]\displaystyle{ \begin{align} & {{R}_{sys}}(100)= {{R}_{V}}(100)\cdot {{R}_{W}}(100) \\ & = 0.694357\cdot 0.995084 \\ & = 0.690943 \end{align}\,\! }[/math]

That is, the reliability of the unit (or system) under both modes is [math]\displaystyle{ {{R}_{sys}}(100)=69.094\%\,\! }[/math].
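The arithmetic above can be checked with a short Python sketch. It assumes the standard 2-parameter Weibull reliability function, R(t) = exp(-(t/η)^β), and uses the parameter estimates quoted in this example; the function name `weibull_reliability` is introduced here for illustration.

```python
import math


def weibull_reliability(t, beta, eta):
    """2-parameter Weibull reliability: R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


t = 100  # thousands of cycles

# MLE parameter estimates quoted in the example above.
r_v = weibull_reliability(t, beta=0.671072, eta=449.427230)  # voltage spikes
r_w = weibull_reliability(t, beta=4.337278, eta=340.384242)  # wearout

# Product rule for independent competing modes (series system).
r_sys = r_v * r_w

print(f"R_V({t}) = {r_v:.6f}")
print(f"R_W({t}) = {r_w:.6f}")
print(f"R_sys({t}) = {r_sys:.6f}")
```

The printed values should agree with the figures in the example to within rounding of the last digit.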

You can also perform this analysis using Weibull++'s built-in CFM analysis options, which allow you to generate a probability plot that contains the combined mode line as well as the individual mode lines.

Confidence Bounds for CFM Analysis

The method available in Weibull++ for estimating the different types of confidence bounds, for competing failure modes analysis, is the Fisher matrix method, and is presented in this section.

Variance/Covariance Matrix

The variances and covariances of the parameters are estimated from the inverse local Fisher matrix, as follows:

[math]\displaystyle{ \begin{align} & \left( \begin{matrix} Var({{{\hat{a}}}_{1}}) & Cov({{{\hat{a}}}_{1}},{{{\hat{b}}}_{1}}) & 0 & 0 & 0 & 0 & 0 \\ Cov({{{\hat{a}}}_{1}},{{{\hat{b}}}_{1}}) & Var({{{\hat{b}}}_{1}}) & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \cdot & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdot & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdot & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & Var({{{\hat{a}}}_{n}}) & Cov({{{\hat{a}}}_{n}},{{{\hat{b}}}_{n}}) \\ 0 & 0 & 0 & 0 & 0 & Cov({{{\hat{a}}}_{n}},{{{\hat{b}}}_{n}}) & Var({{{\hat{b}}}_{n}}) \\ \end{matrix} \right) \\ & ={\left( \begin{matrix} -\frac{{{\partial }^{2}}\Lambda }{\partial a_{1}^{2}} & -\frac{{{\partial }^{2}}\Lambda }{\partial a_{1}^{{}}\partial {{b}_{1}}} & 0 & 0 & 0 & 0 & 0 \\ -\frac{{{\partial }^{2}}\Lambda }{\partial a_{1}^{{}}\partial {{b}_{1}}} & -\frac{{{\partial }^{2}}\Lambda }{\partial b_{1}^{2}} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \cdot & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdot & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdot & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -\frac{{{\partial }^{2}}\Lambda }{\partial a_{n}^{2}} & -\frac{{{\partial }^{2}}\Lambda }{\partial a_{n}^{{}}\partial {{b}_{n}}} \\ 0 & 0 & 0 & 0 & 0 & -\frac{{{\partial }^{2}}\Lambda }{\partial a_{n}^{{}}\partial {{b}_{n}}} & -\frac{{{\partial }^{2}}\Lambda }{\partial b_{n}^{2}} \\ \end{matrix} \right)}^{-1} \\ \end{align}\,\! }[/math]

where [math]\displaystyle{ \Lambda \,\! }[/math] is the log-likelihood function of the failure distribution, described in Parameter Estimation.

Bounds on Reliability

The competing failure modes reliability function is given by:

[math]\displaystyle{ \widehat{R}=\underset{i=1}{\overset{n}{\mathop \prod }}\,{{\hat{R}}_{i}}\,\! }[/math]

where:

[math]\displaystyle{ {{R}_{i}}\,\! }[/math] is the reliability of the [math]\displaystyle{ {{i}^{th}}\,\! }[/math] mode.
[math]\displaystyle{ n\,\! }[/math] is the number of failure modes.

The upper and lower bounds on reliability are estimated using the logit transformation:

[math]\displaystyle{ \begin{align} & {{R}_{U}}= & \frac{\widehat{R}}{\widehat{R}+(1-\widehat{R}){{e}^{-\tfrac{{{K}_{\alpha }}\sqrt{Var(\widehat{R})}}{\widehat{R}(1-\widehat{R})}}}} \\ & {{R}_{L}}= & \frac{\widehat{R}}{\widehat{R}+(1-\widehat{R}){{e}^{\tfrac{{{K}_{\alpha }}\sqrt{Var(\widehat{R})}}{\widehat{R}(1-\widehat{R})}}}} \end{align}\,\! }[/math]

where [math]\displaystyle{ \widehat{R}\,\! }[/math] is calculated using the reliability equation for competing failure modes. [math]\displaystyle{ {{K}_{\alpha }}\,\! }[/math] is defined by:

[math]\displaystyle{ \alpha =\frac{1}{\sqrt{2\pi }}\underset{{{K}_{\alpha }}}{\overset{\infty }{\mathop \int }}\,{{e}^{-\tfrac{{{t}^{2}}}{2}}}dt=1-\Phi ({{K}_{\alpha }})\,\! }[/math]

(If [math]\displaystyle{ \delta \,\! }[/math] is the confidence level, then [math]\displaystyle{ \alpha =\tfrac{1-\delta }{2}\,\! }[/math] for the two-sided bounds, and [math]\displaystyle{ \alpha =1-\delta \,\! }[/math] for the one-sided bounds.)
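As a minimal sketch of the logit-transformed bounds above, the following Python code computes K_α from the confidence level and then evaluates R_L and R_U. The function names and the numerical inputs (`r_hat = 0.8`, `var_r = 0.0025`) are hypothetical; in practice Var(R̂) comes from the variance expressions in this section.

```python
import math
from statistics import NormalDist


def k_alpha(conf_level, two_sided=True):
    """Standard normal quantile K_alpha such that alpha = 1 - Phi(K_alpha)."""
    alpha = (1 - conf_level) / 2 if two_sided else 1 - conf_level
    return NormalDist().inv_cdf(1 - alpha)


def reliability_bounds(r_hat, var_r, conf_level, two_sided=True):
    """Logit-transformed lower and upper bounds on reliability (0 < r_hat < 1)."""
    k = k_alpha(conf_level, two_sided)
    w = k * math.sqrt(var_r) / (r_hat * (1 - r_hat))
    upper = r_hat / (r_hat + (1 - r_hat) * math.exp(-w))
    lower = r_hat / (r_hat + (1 - r_hat) * math.exp(w))
    return lower, upper


# Hypothetical inputs, for illustration only.
lower, upper = reliability_bounds(r_hat=0.8, var_r=0.0025, conf_level=0.90)
print(f"R_L = {lower:.4f}, R_U = {upper:.4f}")
```

Because the transformation works on the logit scale, the bounds always stay between 0 and 1, which a plain normal approximation on R̂ would not guarantee.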

The variance of [math]\displaystyle{ \widehat{R}\,\! }[/math] is estimated by:

[math]\displaystyle{ Var(\widehat{R})=\underset{i=1}{\overset{n}{\mathop \sum }}\,{{\left( \frac{\partial R}{\partial {{R}_{i}}} \right)}^{2}}Var({{\hat{R}}_{i}})\,\! }[/math]

[math]\displaystyle{ \frac{\partial R}{\partial {{R}_{i}}}=\underset{j=1,j\ne i}{\overset{n}{\mathop \prod }}\,\widehat{{{R}_{j}}}\,\! }[/math]

Thus:

[math]\displaystyle{ Var(\widehat{R})=\underset{i=1}{\overset{n}{\mathop \sum }}\,\left( \underset{j=1,j\ne i}{\overset{n}{\mathop \prod }}\,\widehat{R}_{j}^{2} \right)Var({{\hat{R}}_{i}})\,\! }[/math]

[math]\displaystyle{ Var({{\hat{R}}_{i}})=\underset{i=1}{\overset{n}{\mathop \sum }}\,{{\left( \frac{\partial {{R}_{i}}}{\partial {{a}_{i}}} \right)}^{2}}Var({{\hat{a}}_{i}})\,\! }[/math]

where [math]\displaystyle{ \widehat{{{a}_{i}}}\,\! }[/math] is an element of the model parameter vector.

Therefore, the value of [math]\displaystyle{ Var({{\hat{R}}_{i}})\,\! }[/math] is dependent on the underlying distribution.
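The sum over modes for the variance of the combined reliability, given above, can be sketched in Python as follows. The function name `series_variance` and the input values are hypothetical; the per-mode variances would come from the distribution-specific expressions in this section.

```python
import math


def series_variance(r, var_r):
    """Var(R) = sum_i (prod_{j != i} R_j)^2 * Var(R_i) for independent modes."""
    total = 0.0
    for i in range(len(r)):
        # Partial derivative of the product with respect to R_i.
        others = math.prod(r[j] for j in range(len(r)) if j != i)
        total += others ** 2 * var_r[i]
    return total


# Hypothetical per-mode reliabilities and variances, for illustration only.
r_modes = [0.9, 0.8]
var_modes = [0.01, 0.04]
print(series_variance(r_modes, var_modes))
```

With a single mode the product over the other modes is empty, so the function returns that mode's own variance.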

For the Weibull distribution:

[math]\displaystyle{ Var({{\hat{R}}_{i}})={{\left( {{{\hat{R}}}_{i}}{{e}^{{{{\hat{u}}}_{i}}}} \right)}^{2}}Var({{\hat{u}}_{i}})\,\! }[/math]

where:

[math]\displaystyle{ {{\hat{u}}_{i}}={{\hat{\beta }}_{i}}(\ln (t-{{\hat{\gamma }}_{i}})-\ln {{\hat{\eta }}_{i}})\,\! }[/math]

and [math]\displaystyle{ Var(\widehat{{{u}_{i}}})\,\! }[/math] is given in The Weibull Distribution.
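The factor multiplying the variance in the Weibull expression follows from writing the mode reliability as a function of [math]\displaystyle{ {{\hat{u}}_{i}}\,\! }[/math]. As a short derivation sketch, using the same symbols as above:

[math]\displaystyle{ {{\hat{R}}_{i}}={{e}^{-{{e}^{{{{\hat{u}}}_{i}}}}}}\quad \Rightarrow \quad \frac{\partial {{R}_{i}}}{\partial {{u}_{i}}}=-{{e}^{{{{\hat{u}}}_{i}}}}{{e}^{-{{e}^{{{{\hat{u}}}_{i}}}}}}=-{{\hat{R}}_{i}}{{e}^{{{{\hat{u}}}_{i}}}}\,\! }[/math]

Squaring this derivative and multiplying by the variance of [math]\displaystyle{ {{\hat{u}}_{i}}\,\! }[/math] gives the Weibull expression above.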

For the exponential distribution:

[math]\displaystyle{ Var({{\hat{R}}_{i}})={{\left( {{{\hat{R}}}_{i}}(t-{{{\hat{\gamma }}}_{i}}) \right)}^{2}}Var({{\hat{\lambda }}_{i}})\,\! }[/math]

where [math]\displaystyle{ Var(\widehat{{{\lambda }_{i}}})\,\! }[/math] is given in The Exponential Distribution.

For the normal distribution:

[math]\displaystyle{ Var({{\hat{R}}_{i}})={{\left( f({{{\hat{z}}}_{i}}){{{\hat{\sigma }}}_{i}} \right)}^{2}}Var({{\hat{z}}_{i}})\,\! }[/math]

[math]\displaystyle{ {{\hat{z}}_{i}}=\frac{t-{{{\hat{\mu }}}_{i}}}{{{{\hat{\sigma }}}_{i}}}\,\! }[/math]

where [math]\displaystyle{ Var(\widehat{{{z}_{i}}})\,\! }[/math] is given in The Normal Distribution.

For the lognormal distribution:

[math]\displaystyle{ Var({{\hat{R}}_{i}})={{\left( f({{{\hat{z}}}_{i}})\cdot \hat{\sigma }_{i}^{\prime } \right)}^{2}}Var({{\hat{z}}_{i}})\,\! }[/math]

[math]\displaystyle{ {{\hat{z}}_{i}}=\frac{\ln \text{(}t)-\hat{\mu }_{i}^{\prime }}{\hat{\sigma }_{i}^{\prime }}\,\! }[/math]

where [math]\displaystyle{ Var(\widehat{{{z}_{i}}})\,\! }[/math] is given in The Lognormal Distribution.

Bounds on Time

The bounds on time are estimated by solving the reliability equation with respect to time. From the reliability equation for competing failure modes, we have that:

[math]\displaystyle{ \hat{t}=\varphi (R,{{\hat{a}}_{i}},{{\hat{b}}_{i}})\,\! }[/math]

[math]\displaystyle{ i=1,...,n\,\! }[/math]

where:

• [math]\displaystyle{ \varphi \,\! }[/math] is the inverse function for the reliability equation for competing failure modes.

• for the Weibull distribution [math]\displaystyle{ {{\hat{a}}_{i}}\,\! }[/math] is [math]\displaystyle{ {{\hat{\beta }}_{i}}\,\! }[/math], and [math]\displaystyle{ {{\hat{b}}_{i}}\,\! }[/math] is [math]\displaystyle{ {{\hat{\eta }}_{i}}\,\! }[/math]

• for the exponential distribution [math]\displaystyle{ {{\hat{a}}_{i}}\,\! }[/math] is [math]\displaystyle{ {{\hat{\lambda }}_{i}}\,\! }[/math], and [math]\displaystyle{ {{\hat{b}}_{i}}=0\,\! }[/math]

• for the normal distribution [math]\displaystyle{ {{\hat{a}}_{i}}\,\! }[/math] is [math]\displaystyle{ {{\hat{\mu }}_{i}}\,\! }[/math], and [math]\displaystyle{ {{\hat{b}}_{i}}\,\! }[/math] is [math]\displaystyle{ {{\hat{\sigma }}_{i}}\,\! }[/math], and

• for the lognormal distribution [math]\displaystyle{ {{\hat{a}}_{i}}\,\! }[/math] is [math]\displaystyle{ \hat{\mu }_{i}^{\prime }\,\! }[/math], and [math]\displaystyle{ {{\hat{b}}_{i}}\,\! }[/math] is [math]\displaystyle{ \hat{\sigma }_{i}^{\prime }\,\! }[/math]

Set:

[math]\displaystyle{ \begin{align} u=\ln (t) \end{align}\,\! }[/math]

The bounds on [math]\displaystyle{ u\,\! }[/math] are estimated from:

[math]\displaystyle{ {{u}_{U}}=\widehat{u}+{{K}_{\alpha }}\sqrt{Var(\widehat{u})}\,\! }[/math]

and:

[math]\displaystyle{ {{u}_{L}}=\widehat{u}-{{K}_{\alpha }}\sqrt{Var(\widehat{u})}\,\! }[/math]

Then the upper and lower bounds on time are found by using the equations:

[math]\displaystyle{ {{t}_{U}}={{e}^{{{u}_{U}}}}\,\! }[/math]

and:

[math]\displaystyle{ {{t}_{L}}={{e}^{{{u}_{L}}}}\,\! }[/math]

[math]\displaystyle{ {{K}_{\alpha }}\,\! }[/math] is calculated using the inverse standard normal distribution and [math]\displaystyle{ Var(\widehat{u})\,\! }[/math] is computed as:

[math]\displaystyle{ Var(\widehat{u})=\underset{i=1}{\overset{n}{\mathop \sum }}\,\left( {{\left( \frac{\partial u}{\partial {{a}_{i}}} \right)}^{2}}Var(\widehat{{{a}_{i}}})+{{\left( \frac{\partial u}{\partial {{b}_{i}}} \right)}^{2}}Var(\widehat{{{b}_{i}}})+2\frac{\partial u}{\partial {{a}_{i}}}\frac{\partial u}{\partial {{b}_{i}}}Cov(\widehat{{{a}_{i}}},\widehat{{{b}_{i}}}) \right)\,\! }[/math]
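The combined reliability equation generally has no closed-form inverse, so the time estimate is found numerically. The sketch below assumes two Weibull modes with the parameters from the CFM example earlier in this chapter, solves for the time by bisection, and then applies the log-scale bounds above. The function names and the value `var_u = 0.04` are hypothetical; a real analysis would compute Var(û) from the variance expression above.

```python
import math
from statistics import NormalDist

# (beta, eta) pairs from the CFM example: voltage spikes and wearout.
modes = [(0.671072, 449.427230), (4.337278, 340.384242)]


def r_sys(t, modes):
    """Series (competing modes) reliability for 2-parameter Weibull modes."""
    return math.exp(-sum((t / eta) ** beta for beta, eta in modes))


def time_at_reliability(target, modes, lo=1e-9, hi=1e9, iters=200):
    """Solve r_sys(t) = target by bisection on a log scale (r_sys decreases in t)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if r_sys(mid, modes) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def time_bounds(t_hat, var_u, conf_level, two_sided=True):
    """Bounds on time from u = ln(t): t = exp(u -/+ K_alpha * sqrt(Var(u)))."""
    alpha = (1 - conf_level) / 2 if two_sided else 1 - conf_level
    k = NormalDist().inv_cdf(1 - alpha)
    u_hat = math.log(t_hat)
    half_width = k * math.sqrt(var_u)
    return math.exp(u_hat - half_width), math.exp(u_hat + half_width)


t_hat = time_at_reliability(0.690943, modes)
t_lower, t_upper = time_bounds(t_hat, var_u=0.04, conf_level=0.90)  # var_u is hypothetical
print(f"t_hat = {t_hat:.3f}, bounds = ({t_lower:.3f}, {t_upper:.3f})")
```

Working on ln(t) keeps the lower bound positive and makes the bounds symmetric about the estimate on the log scale.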

Complex Failure Modes Analysis

The relationship between the different competing failure modes is not always a simple series system and can be more complex. After performing a separate analysis for each failure mode, a diagram that describes how each failure mode can result in a product failure can be used to analyze the item in question. Such diagrams are usually referred to as Reliability Block Diagrams (RBDs). For more on RBDs, see the System analysis reference and the BlockSim software.

A reliability block diagram is made of blocks that represent the failure modes and arrows that connect the blocks in different configurations. Note that the blocks can also be used to represent different components or subsystems that make up the product. Weibull++ provides the capability to use a diagram to model series, parallel and k-out-of-n configurations, in addition to any complex combinations of these configurations.

In this analysis, the failure modes are assumed to be statistically independent. (Note: In the context of this reference, statistically independent implies that failure information for one failure mode provides no information about, i.e., does not affect, the other failure modes.) Analysis of dependent modes is more complex. Advanced RBD software such as ReliaSoft's BlockSim can handle and analyze such dependencies, as well as provide more advanced constructs and analyses (see http://www.ReliaSoft.com/BlockSim).

Failure Modes Configurations

Series Configuration

The basic competing failure modes configuration, which has already been discussed, is a series configuration. In a series configuration, the occurrence of any failure mode results in failure for the product.

The equation that describes series configuration is:

[math]\displaystyle{ R(t)={{R}_{1}}(t)\cdot {{R}_{2}}(t)\cdot ...\cdot {{R}_{n}}(t)\,\! }[/math]

where [math]\displaystyle{ n\,\! }[/math] is the total number of failure modes considered.

Parallel

In a simple parallel configuration, at least one of the failure modes must not occur for the product to continue operation.

The equation that describes the parallel configuration is:

[math]\displaystyle{ R(t)=1-\underset{i=1}{\overset{n}{\mathop \prod }}\,(1-{{R}_{i}}(t))\,\! }[/math]

where [math]\displaystyle{ n\,\! }[/math] is the total number of failure modes considered.

Combination of Series and Parallel

While many smaller products can be accurately represented by either a simple series or parallel configuration, there may be larger products that involve both series and parallel configurations in the overall model of the product. Such products can be analyzed by calculating the reliabilities for the individual series and parallel sections and then combining them in the appropriate manner.
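The series and parallel equations, and a combination of the two, can be sketched in Python as follows. The function names and the reliability values are hypothetical and chosen only for illustration; independence of the failure modes is assumed, as stated above.

```python
import math


def series(reliabilities):
    """Series configuration: the product of the block reliabilities."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """Simple parallel configuration: 1 minus the product of the unreliabilities."""
    return 1 - math.prod(1 - r for r in reliabilities)


# Hypothetical block reliabilities at some fixed time t.
print(series([0.9, 0.8]))    # two blocks in series
print(parallel([0.9, 0.8]))  # two blocks in parallel

# Combination: one block in series with a parallel pair.
print(series([0.95, parallel([0.9, 0.8])]))
```

Nesting the two functions mirrors the approach described above: evaluate each series or parallel section first, then combine the section results.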

k-out-of-n Parallel Configuration

The k-out-of-n configuration is a special case of parallel redundancy. This type of configuration requires that at least [math]\displaystyle{ k\,\! }[/math] failure modes do not happen out of the total [math]\displaystyle{ n\,\! }[/math] parallel failure modes for the product to succeed. The simplest case of a k-out-of-n configuration is when the failure modes are independent and identical and have the same failure distribution and uncertainties about the parameters (in other words they are derived from the same test data). In this case, the reliability of the product with such a configuration can be evaluated using the binomial distribution, or:

[math]\displaystyle{ R(t)=\sum_{r=k}^{n}\binom{n}{r}{{R}^{r}}(t){{(1-R(t))}^{n-r}}\,\! }[/math]

In the case where the k-out-of-n failure modes are not identical, other approaches for calculating the reliability must be used (e.g. the event space method). Discussion of these is beyond the scope of this reference. Interested readers can consult the System analysis reference.
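For the identical-modes case, the binomial expression can be evaluated directly. In the sketch below, `r` is the common reliability of a single mode at the time of interest and the function name `k_out_of_n` is introduced here for illustration; the input values are hypothetical.

```python
from math import comb


def k_out_of_n(k, n, r):
    """Reliability when at least k of n identical, independent modes must not occur."""
    return sum(comb(n, i) * r ** i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Hypothetical single-mode reliability.
r = 0.9
print(k_out_of_n(1, 2, r))  # same as a simple parallel pair
print(k_out_of_n(2, 2, r))  # same as a series pair
print(k_out_of_n(2, 3, r))  # 2-out-of-3
```

Setting k = 1 recovers the simple parallel equation and setting k = n recovers the series equation, which is a useful sanity check on any implementation.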

Complex Systems

In many cases, it is not easy to recognize which components are in series and which are in parallel in a complex system.

The previous configuration cannot be broken down into a group of series and parallel configurations. This is primarily due to the fact that failure mode C has two paths leading away from it, whereas B and D have only one. Several methods exist for obtaining the reliability of a complex configuration including the decomposition method, the event space method and the path-tracing method. Discussion of these is beyond the scope of this reference. Interested readers can consult the System analysis reference.

Complex Failure Modes Example

Assume that a product has five independent failure modes: A, B, C, D and E. Furthermore, assume that failure of the product will occur if mode A occurs, modes B and C occur simultaneously or if modes D and E occur simultaneously. The objective is to estimate the reliability of the product at 100 hours, with 90% two-sided confidence bounds.

The product is tested to failure, and the failure times due to each mode are recorded in the following table.

TTF for A	TTF for B	TTF for C	TTF for D	TTF for E
276	23	499	467	67
320	36	545	540	72
323	57	661	716	81
558	89	738	737	108
674	99	987	761	110
829	154	1165	1093	127
878	200	1337	1283	148

Solution

The reliability block diagram (RBD) approach can be used to analyze the reliability of the product. But before creating a diagram, the data sets of the failure modes need to be segregated so that each mode can be represented by a single block in the diagram. Recall that when you analyze a particular mode, the failure times for all other competing modes are considered to be suspensions. This captures the fact that those units operated for a period of time without experiencing the failure mode of interest before they were removed from observation when they failed due to another mode. We can easily perform this step via Weibull++'s Batch Auto Run utility. To do this, enter the data from the table into a single data sheet. Choose the 2P-Weibull distribution and the MLE analysis method, and then click the Batch Auto Run icon on the control panel. When prompted to select the subset IDs, select them all. Click the Processing Preferences tab. In the Extraction Options area, select the second option, as shown next.

This will extract the data sets that are required for the analysis. Select the check box in the Calculation Options area and click OK. The data sets are extracted into separate data sheets in the folio and automatically calculated.
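Outside Weibull++, the same extraction and fit can be approximated with a short script. The sketch below builds the data set for one mode (its own times as failures, all other times as suspensions) and solves the standard right-censored 2-parameter Weibull likelihood equations by bisection. It is an illustration under these assumptions, not the Weibull++ algorithm, and the names `split_for_mode` and `weibull_mle` are introduced here.

```python
import math

# Times to failure (hours) per mode, copied from the table above.
ttf = {
    "A": [276, 320, 323, 558, 674, 829, 878],
    "B": [23, 36, 57, 89, 99, 154, 200],
    "C": [499, 545, 661, 738, 987, 1165, 1337],
    "D": [467, 540, 716, 737, 761, 1093, 1283],
    "E": [67, 72, 81, 108, 110, 127, 148],
}


def split_for_mode(ttf, mode):
    """Failures of `mode`; failure times of every other mode become suspensions."""
    failures = list(ttf[mode])
    suspensions = [t for m, times in ttf.items() if m != mode for t in times]
    return failures, suspensions


def weibull_mle(failures, suspensions, lo=0.01, hi=50.0, iters=200):
    """MLE of (beta, eta) for right-censored data via the profile equation in beta."""
    times = failures + suspensions
    r = len(failures)
    t_max = max(times)  # scale times to avoid overflow in t ** beta
    mean_log_fail = sum(math.log(t) for t in failures) / r

    def score(beta):
        # sum(t^b ln t) / sum(t^b) - 1/b - mean(ln t over failures); increasing in b
        w = [(t / t_max) ** beta for t in times]
        weighted = sum(wi * math.log(t) for wi, t in zip(w, times))
        return weighted / sum(w) - 1.0 / beta - mean_log_fail

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    # eta^beta = sum(t^beta) / r, written in terms of the scaled times
    eta = t_max * (sum((t / t_max) ** beta for t in times) / r) ** (1.0 / beta)
    return beta, eta


for mode in ttf:
    f, s = split_for_mode(ttf, mode)
    beta, eta = weibull_mle(f, s)
    print(f"mode {mode}: beta = {beta:.4f}, eta = {eta:.2f}")
```

Each of the five fitted data sets contains all 35 recorded times, with 7 failures and 28 suspensions, which matches the extraction option described above.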

Next, create a diagram by choosing Insert > Tools > Diagram. Add blocks by right-clicking the diagram and choosing Add Block on the shortcut menu. When prompted to select the data sheet of the failure mode that the block will represent, select the data sheet for mode A. Use the same approach to add the blocks that will represent failure modes B, C, D and E. Add a connector by right-clicking the diagram sheet and choosing Connect Blocks, and then connect the blocks in an appropriate configuration to describe the relationships between the failure modes. To insert a node, which acts as a switch that the diagram paths move through, right-click the diagram and choose Add Node. Specify the number of required paths in the node by double-clicking the node and entering the appropriate number (use 2 in both nodes).

The following figure shows the completed diagram.

Click Analyze to analyze the diagram, and then use the Quick Calculation Pad (QCP) to estimate the reliability. The estimated R(100 hours) and the 90% two-sided confidence bounds are:

[math]\displaystyle{ \begin{matrix} {{{\hat{R}}}_{U}}(100)=0.895940 \\ \hat{R}(100)=0.824397 \\ {{{\hat{R}}}_{L}}(100)=0.719090 \\ \end{matrix}\,\! }[/math]
