Non-Parametric Recurrent Events Data Analysis

=Recurrent Events Data Analysis=

Recurrent Events Data Analysis, also called Recurrence Data Analysis (RDA), can be used in various applied fields such as reliability, medicine, social sciences, economics, business and criminology.

Whereas in life data analysis (LDA) it was assumed that events (failures) were independent and identically distributed (iid), there are many cases where events are dependent and not identically distributed (such as repairable system data) or where the analyst is interested in modeling the number of occurrences of events over time rather than the length of time prior to the first event, as in LDA.

Weibull++ provides both parametric and non-parametric approaches to analyze such data.

•	The non-parametric approach is based on the well-known Mean Cumulative Function (MCF). The Weibull++ module for this type of analysis builds upon the work of Dr. Wayne Nelson, who has written extensively on the calculation and applications of MCF [31].

•	The parametric approach is based on the General Renewal Process (GRP) model, which is particularly useful in understanding the effects of the repairs on the age of a system. Traditionally, the commonly used models for analyzing repairable systems data are perfect renewal processes (PRP), corresponding to perfect repairs, and nonhomogeneous Poisson processes (NHPP), corresponding to minimal repairs. However, most repair activities realistically do not result in either of these extremes but in a complicated intermediate situation (general repair or imperfect repair/maintenance), which is well treated by the GRP model.

Introduction
Non-parametric recurrence data analysis provides a nonparametric graphical estimate of the mean cumulative number or cost of recurrence per unit versus age. In the reliability field, the Mean Cumulative Function (MCF) can be used to: [31]

•	Evaluate whether the population repair (or cost) rate increases or decreases with age (this is useful for product retirement and burn-in decisions).

•	Estimate the average number or cost of repairs per unit during warranty or some time period.

•	Compare two or more sets of data from different designs, production periods, maintenance policies, environments, operating conditions, etc.

•	Predict future numbers and costs of repairs, such as for the next month, quarter, or year.

•	Reveal unexpected information and insight.

The Mean Cumulative Function (MCF) for Recurrence Data
In non-parametric analysis of recurrent events data, each population unit can be described by a cumulative history function for the cumulative number of recurrences. It is a staircase function that depicts the cumulative number of recurrences of a particular event, such as repairs, over time. Figure CHM depicts a unit's cumulative history function.



The nonparametric model for a population of units is described as the population of cumulative history functions (curves). It is the population of all staircase functions of every unit in the population. At age $$t$$, the units have a distribution of their cumulative number of events. That is, a fraction of the population has accumulated 0 recurrences, another fraction has accumulated 1 recurrence, another fraction has accumulated 2 recurrences, etc. This distribution differs at different ages, $$t$$, and has a mean $$M(t)$$ called the mean cumulative function (MCF). $$M(t)$$ is the pointwise average of all population cumulative history functions (see Fig. MCFDemo).

For the case of uncensored data, the mean cumulative function values $$M({{t}_{i}})$$ at different recurrence ages $${{t}_{i}}$$ are estimated by calculating the average of the cumulative number of recurrences of events for each unit in the population at $${{t}_{i}}$$. When the histories are censored, the following steps are applied.
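The uncensored case can be sketched in a few lines of Python. This is only an illustration: the function names and the three recurrence histories are hypothetical and are not taken from the examples in this article.

```python
def cumulative_count(recurrence_ages, t):
    """Staircase cumulative history function: recurrences at or before age t."""
    return sum(1 for age in recurrence_ages if age <= t)


def mcf_uncensored(histories, t):
    """Pointwise average of the units' cumulative history functions at age t."""
    return sum(cumulative_count(h, t) for h in histories) / len(histories)


# Hypothetical recurrence ages for three units observed over the same period
histories = [[2, 7], [4], [3, 5, 9]]

for t in (3, 5, 9):
    print(t, round(mcf_uncensored(histories, t), 3))
```

Because every unit is observed over the whole period, the divisor is always the full number of units; the steps that follow adjust this divisor when units drop out through censoring.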

1st Step - Order All Ages: Order all recurrence and censoring ages from smallest to largest. If a recurrence age for a unit is the same as its censoring (suspension) age, the recurrence age goes first. If multiple units have a common recurrence or censoring age, then these units could be put in a certain order or be sorted randomly.

2nd Step - Calculate the Number, $${{r}_{i}}$$, of Units that Passed Through Age  $${{t}_{i}}$$ :


 * $$\begin{align}

& {{r}_{i}}= & {{r}_{i-1}}\quad \quad \text{if }{{t}_{i}}\text{ is a recurrence age} \\ & {{r}_{i}}= & {{r}_{i-1}}-1\text{  if }{{t}_{i}}\text{ is a censoring age} \end{align}$$

$$N$$ is the total number of units, and $${{r}_{1}}=N$$ at the first observed age, which could be a recurrence or a suspension.

3rd Step - Calculate $$MCF$$  Estimate,  $${{M}^{*}}(t)$$: For each sample recurrence age  $${{t}_{i}},$$  calculate the mean cumulative function estimate as follows:

$${{M}^{*}}({{t}_{i}})=\frac{1}{{{r}_{i}}}+{{M}^{*}}({{t}_{i-1}})$$

where $${{M}^{*}}({{t}_{1}})=\tfrac{1}{{{r}_{1}}}$$ at the earliest observed recurrence age, $${{t}_{1}}$$.

Example 1
A health care company maintains five identical pieces of equipment used by a hospital. When a piece of equipment fails, the company sends a crew to repair it. The following table gives the failure and censoring ages for each machine, where the + sign indicates a censoring age.

$$\begin{matrix} \text{Equipment ID} & \text{Months} \\ \text{1} & \text{5, 10, 15, 17+} \\ \text{2} & \text{6, 13, 17, 19+} \\ \text{3} & \text{12, 20, 25, 26+} \\ \text{4} & \text{13, 15, 24+} \\ \text{5} & \text{16, 22, 25, 28+} \\ \end{matrix}$$

Estimate the MCF values and the 95% confidence limits, ignoring the repair duration.

Solution to Example 1
The MCF estimate is obtained as follows:

$$\begin{matrix} ID & Months ({{t}_{i}}) & State & {{r}_{i}} & 1/{{r}_{i}} & {{M}^{*}}({{t}_{i}}) \\ \text{1} & \text{5} & \text{F} & \text{5} & \text{0}\text{.20} & \text{0}\text{.20} \\ \text{2} & \text{6} & \text{F} & \text{5} & \text{0}\text{.20} & \text{0}\text{.20 + 0}\text{.20 = 0}\text{.40} \\ \text{1} & \text{10} & \text{F} & \text{5} & \text{0}\text{.20} & \text{0}\text{.40 + 0}\text{.20 = 0}\text{.60} \\ \text{3} & \text{12} & \text{F} & \text{5} & \text{0}\text{.20} & \text{0}\text{.60 + 0}\text{.20 = 0}\text{.80} \\ \text{2} & \text{13} & \text{F} & \text{5} & \text{0}\text{.20} & \text{0}\text{.80+0}\text{.20 =1}\text{.00} \\ \text{4} & \text{13} & \text{F} & \text{5} & \text{0}\text{.20} & \text{1}\text{.00 + 0}\text{.20 = 1}\text{.20} \\ \text{1} & \text{15} & \text{F} & \text{5} & \text{0}\text{.20} & \text{1}\text{.20 + 0}\text{.20 =1}\text{.40} \\ \text{4} & \text{15} & \text{F} & \text{5} & \text{0}\text{.20} & \text{1}\text{.40 + 0}\text{.20 = 1}\text{.60} \\ \text{5} & \text{16} & \text{F} & \text{5} & \text{0}\text{.20} & \text{1}\text{.60 + 0}\text{.20 = 1}\text{.80} \\ \text{2} & \text{17} & \text{F} & \text{5} & \text{0}\text{.20} & \text{1}\text{.80 + 0}\text{.20 = 2}\text{.0} \\ \text{1} & \text{17} & \text{S} & \text{4} & {} & {} \\ \text{2} & \text{19} & \text{S} & \text{3} & {} & {} \\ \text{3} & \text{20} & \text{F} & \text{3} & \text{0}\text{.33} & \text{2}\text{.00 + 0}\text{.33 = 2}\text{.33} \\ \text{5} & \text{22} & \text{F} & \text{3} & \text{0}\text{.33} & \text{2}\text{.33 + 0}\text{.33 = 2}\text{.66} \\ \text{4} & \text{24} & \text{S} & \text{2} & {} & {} \\ \text{3} & \text{25} & \text{F} & \text{2} & \text{0}\text{.50} & \text{2}\text{.66 + 0}\text{.50 = 3}\text{.16} \\ \text{5} & \text{25} & \text{F} & \text{2} & \text{0}\text{.50} & \text{3}\text{.16 + 0}\text{.50 = 3}\text{.66} \\ \text{3} & \text{26} & \text{S} & \text{1} & {} & {} \\ \text{5} & \text{28} & \text{S} & \text{0} & {} & {} \\ \end{matrix}$$
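The three steps and the table above can be reproduced with a short Python sketch. The function name `mcf_estimate` and the data layout are illustrative choices, not part of Weibull++. The sketch assumes that all recurrences at a tied age are placed before suspensions at that age and that remaining ties are ordered by unit ID, which is one of the orderings the 1st step allows.

```python
def mcf_estimate(histories):
    """histories: {unit_id: (recurrence_ages, censoring_age)}.

    Returns rows (age, unit_id, state, r, mcf); mcf is None for suspensions.
    """
    events = []
    for unit, (recurrences, censor_age) in histories.items():
        # Flag 0 sorts recurrences ahead of suspensions (flag 1) at a tied age
        events += [(age, 0, unit) for age in recurrences]
        events.append((censor_age, 1, unit))
    events.sort()                      # 1st step: order all ages

    r = len(histories)                 # r_1 = N
    mcf = 0.0
    rows = []
    for age, is_censor, unit in events:
        if is_censor:
            r -= 1                     # 2nd step: one fewer unit under observation
            rows.append((age, unit, "S", r, None))
        else:
            mcf += 1.0 / r             # 3rd step: add 1/r_i at each recurrence
            rows.append((age, unit, "F", r, mcf))
    return rows


# Example 1 data: failure ages and censoring age (months) for each unit
histories = {
    1: ([5, 10, 15], 17),
    2: ([6, 13, 17], 19),
    3: ([12, 20, 25], 26),
    4: ([13, 15], 24),
    5: ([16, 22, 25], 28),
}

rows = mcf_estimate(histories)
for age, unit, state, r, mcf in rows:
    print(unit, age, state, r, "" if mcf is None else f"{mcf:.2f}")
```

The sketch carries unrounded values, so the last three estimates print as 2.67, 3.17 and 3.67, whereas the table adds the rounded increment 0.33 and shows 2.66, 3.16 and 3.66.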

Confidence Limits for the MCF
The upper and lower confidence limits for $$M({{t}_{i}})$$ are:


 * $$\begin{align}

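& {{M}_{U}}({{t}_{i}})= & {{M}^{*}}({{t}_{i}})\cdot {{e}^{\tfrac{{{K}_{\alpha }}\cdot \sqrt{Var[{{M}^{*}}({{t}_{i}})]}}{{{M}^{*}}({{t}_{i}})}}} \\ & {{M}_{L}}({{t}_{i}})= & \frac{{{M}^{*}}({{t}_{i}})}{{{e}^{\tfrac{{{K}_{\alpha }}\cdot \sqrt{Var[{{M}^{*}}({{t}_{i}})]}}{{{M}^{*}}({{t}_{i}})}}}} \end{align}$$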

where $$\alpha $$ ( $$50\%<\alpha <100\%$$ ) is the confidence level, $${{K}_{\alpha }}$$ is the $$\alpha $$ standard normal percentile and $$Var[{{M}^{*}}({{t}_{i}})]$$ is the variance of the MCF estimate at recurrence age $${{t}_{i}}$$. The variance is calculated as follows:


 * $$Var[{{M}^{*}}({{t}_{i}})]=Var[{{M}^{*}}({{t}_{i-1}})]+\frac{1}{r_{i}^{2}}\left[ \underset{j\in {{R}_{i}}}{\overset{}{\mathop \sum }}\,{{\left( {{d}_{ji}}-\frac{1}{{{r}_{i}}} \right)}^{2}} \right]$$

where $${{r}_{i}}$$ is defined in the 2nd step of the MCF calculation, $${{R}_{i}}$$ is the set of the units that have not been suspended by age $${{t}_{i}}$$ and $${{d}_{ji}}$$ is defined as follows:


 * $$\begin{align}

& {{d}_{ji}}= & 1\text{ if the }{{j}^{\text{th }}}\text{unit had an event recurrence at age }{{t}_{i}} \\ & {{d}_{ji}}= & 0\text{ if the }{{j}^{\text{th }}}\text{unit did not have an event reoccur at age }{{t}_{i}} \end{align}$$
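As a worked check of the variance recursion, consider the first recurrence in Example 1 (age 5, with $${{r}_{1}}=5$$ units under observation). Assuming the variance before the first recurrence is zero (this starting value is not stated explicitly here, but it is consistent with the Example 1 data), one unit has $${{d}_{j1}}=1$$ and the other four have $${{d}_{j1}}=0$$, so:

$$Var[{{M}^{*}}({{t}_{1}})]=0+\frac{1}{{{5}^{2}}}\left[ {{\left( 1-\frac{1}{5} \right)}^{2}}+4{{\left( 0-\frac{1}{5} \right)}^{2}} \right]=\frac{0.64+0.16}{25}=0.032$$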

Example 2
Using the data in Example 1, estimate the 95% confidence bounds.

Solution to Example 2
Using the variance equation above, the variance values $$Va{{r}_{i}}$$ listed in the table below can be obtained. Then, using the confidence limit equations and $${{K}_{\alpha }}=1.645$$ for a 95% confidence level, the confidence bounds can be obtained as follows:

$$\begin{matrix} ID & Months & State & MC{{F}_{i}} & Va{{r}_{i}} & MC{{F}_{{{L}_{i}}}} & MC{{F}_{{{U}_{i}}}} \\ \text{1} & \text{5} & \text{F} & \text{0}\text{.20} & \text{0}\text{.032} & 0.0459 & 0.8709 \\ \text{2} & \text{6} & \text{F} & \text{0}\text{.40} & \text{0}\text{.064} & 0.1413 & 1.1320 \\ \text{1} & \text{10} & \text{F} & \text{0}\text{.60} & \text{0}\text{.096} & 0.2566 & 1.4029 \\ \text{3} & \text{12} & \text{F} & \text{0}\text{.80} & \text{0}\text{.128} & 0.3834 & 1.6694 \\ \text{2} & \text{13} & \text{F} & \text{1}\text{.00} & \text{0}\text{.160} & 0.5179 & 1.9308 \\ \text{4} & \text{13} & \text{F} & \text{1}\text{.20} & \text{0}\text{.192} & 0.6582 & 2.1879 \\ \text{1} & \text{15} & \text{F} & \text{1}\text{.40} & \text{0}\text{.224} & 0.8028 & 2.4413 \\ \text{4} & \text{15} & \text{F} & \text{1}\text{.60} & \text{0}\text{.256} & 0.9511 & 2.6916 \\ \text{5} & \text{16} & \text{F} & \text{1}\text{.80} & \text{0}\text{.288} & 1.1023 & 2.9393 \\ \text{2} & \text{17} & \text{F} & \text{2}\text{.00} & \text{0}\text{.320} & 1.2560 & 3.1848 \\ \text{1} & \text{17} & \text{S} & {} & {} & {} & {} \\ \text{2} & \text{19} & \text{S} & {} & {} & {} & {} \\ \text{3} & \text{20} & \text{F} & \text{2}\text{.33} & \text{0}\text{.394} & 1.4990 & 3.6321 \\ \text{5} & \text{22} & \text{F} & \text{2}\text{.66} & \text{0}\text{.468} & 1.7486 & 4.0668 \\ \text{4} & \text{24} & \text{S} & {} & {} & {} & {} \\ \text{3} & \text{25} & \text{F} & \text{3}\text{.16} & \text{0}\text{.593} & 2.1226 & 4.7243 \\ \text{5} & \text{25} & \text{F} & \text{3}\text{.66} & \text{0}\text{.718} & 2.5071 & 5.3626 \\ \text{3} & \text{26} & \text{S} & {} & {} & {} & {} \\ \text{5} & \text{28} & \text{S} & {} & {} & {} & {} \\ \end{matrix}$$
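The variance and bound values in the table above can be cross-checked with the following Python sketch. The function name `mcf_with_bounds` is hypothetical. The sketch assumes that the variance starts at zero and that $${{K}_{\alpha }}$$ is the 0.95 standard normal percentile (about 1.645), taken here from Python's `statistics.NormalDist`; these are the choices that reproduce the tabulated values.

```python
import math
from statistics import NormalDist


def mcf_with_bounds(histories, k):
    """Return rows (age, unit, mcf, var, lower, upper) for each recurrence."""
    events = []
    for unit, (recurrences, censor_age) in histories.items():
        events += [(age, 0, unit) for age in recurrences]   # recurrences first at ties
        events.append((censor_age, 1, unit))
    events.sort()

    at_risk = set(histories)           # R_i: units not yet suspended
    mcf, var = 0.0, 0.0                # assumed starting values
    rows = []
    for age, is_censor, unit in events:
        if is_censor:
            at_risk.discard(unit)
            continue
        r = len(at_risk)
        mcf += 1.0 / r
        # d_ji is 1 for the unit with the recurrence and 0 for the others in R_i
        var += sum(((1.0 if j == unit else 0.0) - 1.0 / r) ** 2
                   for j in at_risk) / r ** 2
        factor = math.exp(k * math.sqrt(var) / mcf)
        rows.append((age, unit, mcf, var, mcf / factor, mcf * factor))
    return rows


# Example 1 data: failure ages and censoring age (months) for each unit
histories = {
    1: ([5, 10, 15], 17),
    2: ([6, 13, 17], 19),
    3: ([12, 20, 25], 26),
    4: ([13, 15], 24),
    5: ([16, 22, 25], 28),
}

k = NormalDist().inv_cdf(0.95)         # approximately 1.645
bounds = mcf_with_bounds(histories, k)
for age, unit, mcf, var, lower, upper in bounds:
    print(unit, age, f"{mcf:.2f} {var:.3f} {lower:.4f} {upper:.4f}")
```

The bounds are computed from unrounded MCF values, so the MCF column prints 2.67, 3.17 and 3.67 where the table shows 2.66, 3.16 and 3.66, while the variance and bound columns agree with the table to the digits shown.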

The analysis presented in Examples 1 and 2 can be obtained automatically in Weibull++ using the Non-Parametric RDA Specialized Folio, as shown next.



Note: In the above Folio, the $$F$$  refers to failures and  $$E$$  refers to suspensions (or censoring ages).

The results with calculated MCF values and upper and lower 95% confidence limits are shown next along with the graphical plot.





Example 3
The following table displays transmission repairs on a sample of 14 cars with manual transmission in a preproduction road test [31]. Here + denotes the censoring ages (how long a car has been observed).

$$\begin{matrix} Car ID & Mileage \\ \text{1} & \text{27099+} \\ \text{2} & \text{21999+} \\ \text{3} & \text{11891, 27583+} \\ \text{4} & \text{19966+} \\ \text{5} & \text{26146+} \\ \text{6} & \text{3648, 13957, 23193+} \\ \text{7} & \text{19823+} \\ \text{8} & \text{2890, 22707+} \\ \text{9} & \text{2714, 19275+} \\ \text{10} & \text{19803+} \\ \text{11} & \text{19630+} \\ \text{12} & \text{22056+} \\ \text{13} & \text{22940+} \\ \text{14} & \text{3240, 7690, 18965+} \\ \end{matrix}$$

The car manufacturer seeks to estimate the mean cumulative number of repairs per car by 24,000 test miles (equivalently 5.5 x 24,000 = 132,000 customer miles) and to observe whether the population repair rate increases or decreases as the population ages.

Solution to Example 3
The data is entered into a Non-Parametric RDA Specialized Folio in Weibull++ as follows.



The results are as follows:



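The results indicate that after 13,957 miles of testing, the estimated mean cumulative number of repairs per car is 0.5. Because no further repairs were observed between 13,957 and 24,000 miles, the estimate does not change over that range. Therefore, by 24,000 test miles, the estimated mean cumulative number of repairs per car is still 0.5.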

The MCF plot is shown next.



A smooth curve through the MCF plot has a derivative that decreases as the population ages. That is, the repair rate decreases as the population ages. This is typical of products with manufacturing defects.
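Both conclusions of Example 3 can be checked numerically with a short Python sketch. The helper `mcf_at` is hypothetical, and the three equal mileage intervals are an arbitrary choice made here only to show the trend; the sketch is not a substitute for the Weibull++ results.

```python
def mcf_at(histories, t):
    """MCF estimate at age t from (recurrence_ages, censoring_age) pairs."""
    events = []
    for recurrences, censor_age in histories:
        events += [(age, 0) for age in recurrences]   # recurrences first at ties
        events.append((censor_age, 1))
    events.sort()

    r = len(histories)
    mcf = 0.0
    for age, is_censor in events:
        if age > t:
            break
        if is_censor:
            r -= 1                 # unit leaves observation
        else:
            mcf += 1.0 / r         # add 1/r_i at each recurrence
    return mcf


# Example 3 data: (repair mileages, censoring mileage) for cars 1 to 14
cars = [
    ([], 27099), ([], 21999), ([11891], 27583), ([], 19966), ([], 26146),
    ([3648, 13957], 23193), ([], 19823), ([2890], 22707), ([2714], 19275),
    ([], 19803), ([], 19630), ([], 22056), ([], 22940), ([3240, 7690], 18965),
]

mcf_24k = mcf_at(cars, 24000)
print("MCF at 24,000 test miles:", round(mcf_24k, 3))

# Increase in the MCF over three equal mileage intervals (illustrative split)
edges = [0, 8000, 16000, 24000]
increments = [mcf_at(cars, hi) - mcf_at(cars, lo)
              for lo, hi in zip(edges, edges[1:])]
for (lo, hi), inc in zip(zip(edges, edges[1:]), increments):
    print(f"{lo}-{hi} miles: {inc:.3f} repairs per car")
```

All seven repairs occur before the first car is censored, so each adds 1/14 to the estimate. The increase in the MCF shrinks from one interval to the next, which is the decreasing repair rate described above.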