Chapter 4C: Grouped Data Parameter Estimation

==Grouped Data Analysis==
The grouped data type in Weibull++ is used for tests in which groups of units have the same time-to-failure, units are grouped together in intervals, or groups of units are suspended at the same time. However, you must be cautious when using the different parameter estimation methods, because they treat grouped data in different ways. ReliaSoft designed Weibull++ to treat grouped data in different ways in order to maximize the options available to you.

===When Using Rank Regression (Least Squares)===
When using grouped data, Weibull++ plots the data point corresponding to the highest rank position in each group. In other words, given 3 groups of 10 units, each group failing at 100, 200 and 300 hours respectively, the three plotted points will be the end point of each group: the 10th rank position out of 30, the 20th rank position out of 30 and the 30th rank position out of 30. This procedure is identical to standard procedures for using grouped data [19]. When grouped data is used, it is assumed that the failures occurred at some time in the interval between the previous and current time-to-failure. In our example, this is the same as saying that 10 units failed in the interval between zero and 100 hours, another 10 units failed in the interval between 100 and 200 hours, and another 10 units failed in the interval between 200 and 300 hours. The rank regression analysis automatically takes this into account. If this assumption of interval failure is incorrect, and 10 units failed exactly at 100 hours, 10 failed exactly at 200 hours and 10 failed exactly at 300 hours, it is recommended that you enter the data as non-grouped when using rank regression, or select the '''Use all data if grouped''' option from the Set Analysis tab of the Folio Control Panel.

====The Mathematics====
Median ranks are used to obtain an estimate of the unreliability, <math>Q({{T}_{j}}),</math> for each failure at a 50% confidence level. In the case of grouped data, the ranks are estimated for each group of failures, instead of for each failure. For example, for a group of 10 failures at 100 hours, 10 at 200 hours and 10 at 300 hours, Weibull++ estimates the median ranks (<math>Z</math> values) by solving the cumulative binomial equation with the appropriate values for the order number and the total number of test units. For the 10 failures at 100 hours, the median rank, <math>Z,</math> is estimated by using:

<math>0.50=\underset{k=j}{\overset{N}{\mathop \sum }}\,\left( \begin{matrix} N \\ k \\ \end{matrix} \right){{Z}^{k}}{{\left( 1-Z \right)}^{N-k}}</math>

with:

<math>N=30,\text{ }j=10</math>

where the single <math>Z</math> obtained for the group represents the probability of 10 failures occurring out of 30. For the 10 failures at 200 hours, <math>Z</math> is estimated from the same equation with <math>N=30,\text{ }j=20,</math> representing the probability of 20 failures out of 30, and for the 10 failures at 300 hours with <math>N=30,\text{ }j=30,</math> representing the probability of 30 failures out of 30.
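To make the computation concrete, the following Python sketch (ours, not ReliaSoft's; it assumes SciPy is installed, and the helper name <code>median_rank</code> is illustrative) solves the cumulative binomial equation above for each group:

<syntaxhighlight lang="python">
# Solve 0.50 = sum_{k=j..N} C(N,k) Z^k (1-Z)^(N-k) for the median rank Z.
from scipy.optimize import brentq
from scipy.stats import beta, binom

def median_rank(j, N, level=0.50):
    # binom.sf(j - 1, N, Z) is P(j or more failures out of N), i.e. the sum above.
    return brentq(lambda Z: binom.sf(j - 1, N, Z) - level, 1e-12, 1 - 1e-12)

for j in (10, 20, 30):
    z = median_rank(j, 30)
    # Cross-check against the equivalent beta-quantile identity,
    # Z = beta.ppf(0.5, j, N - j + 1), a standard median-rank shortcut.
    assert abs(z - beta.ppf(0.5, j, 30 - j + 1)) < 1e-9
    print(f"j={j:2d}, N=30: Z = {z:.5f}")
</syntaxhighlight>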
===When Using Maximum Likelihood===
When using maximum likelihood methods, each individual time is explicitly used in the calculation of the parameters, so there is no difference between entering a group of 10 units failing at 100 hours and entering 10 individual failures at 100 hours. This is inherent in the standard MLE method; no matter how the data are entered (i.e., grouped or non-grouped), the results will be identical. When using maximum likelihood, we highly recommend entering redundant data in groups, as this significantly speeds up the calculations.
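As a quick check of this equivalence, the following sketch (made-up data and starting values, not Weibull++ internals) maximizes the Weibull log-likelihood once with grouped counts and once with the same 30 failures entered individually; both runs return the same <math>\widehat{\beta }</math> and <math>\widehat{\eta }:</math>

<syntaxhighlight lang="python">
# Grouped entry only weights the log-likelihood, so the MLE is unchanged.
import numpy as np
from scipy.optimize import minimize

def weibull_nll(params, times, counts):
    beta_, eta = np.exp(params)  # optimize in log space to keep both positive
    t = np.asarray(times, dtype=float)
    n = np.asarray(counts, dtype=float)
    logpdf = (np.log(beta_ / eta) + (beta_ - 1.0) * np.log(t / eta)
              - (t / eta) ** beta_)
    return -np.sum(n * logpdf)

grouped = ([100.0, 200.0, 300.0], [10, 10, 10])
expanded = ([100.0] * 10 + [200.0] * 10 + [300.0] * 10, [1] * 30)

fits = [minimize(weibull_nll, x0=np.log([1.5, 200.0]), args=data)
        for data in (grouped, expanded)]
print(np.exp(fits[0].x), np.exp(fits[1].x))  # identical estimates
</syntaxhighlight>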
==ReliaSoft's Alternate Ranking Method (RRM) Step-by-Step Example==
This section illustrates the ReliaSoft ranking method (RRM), an iterative improvement on the standard ranking method (SRM), using an example based on the two-parameter Weibull distribution. The method can easily be generalized to other models.
Consider the test data shown in Table B.1.
===Initial Parameter Estimation===
As a preliminary step, we need to provide a crude estimate of the Weibull parameters for these data. To begin, we extract the exact times-to-failure, 10, 40 and 50, and append the midpoints of the interval failures: 50 (for the interval from 20 to 80) and 47.5 (for the interval from 10 to 85). Our extracted list now consists of the data in Table B.2.
Using traditional rank regression, we obtain the initial parameter estimates:
<math>\begin{align}
  & {{\widehat{\beta }}_{0}}= & 1.91367089 \\
& {{\widehat{\eta }}_{0}}= & 43.91657736 
\end{align}</math>
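For readers who want to reproduce this step approximately, a bare-bones rank regression on X looks like the sketch below (our own illustrative code, using Benard's median-rank approximation and treating the five extracted times as a complete sample; the actual example also accounts for the censored items in Table B.1, so this yields only a crude starting point rather than the exact figures above):

<syntaxhighlight lang="python">
# Rank regression on X (RRX) for a complete sample of Weibull failure times.
import numpy as np

def weibull_rrx(times):
    t = np.sort(np.asarray(times, dtype=float))
    n = len(t)
    ranks = np.arange(1, n + 1)
    mr = (ranks - 0.3) / (n + 0.4)       # Benard's median-rank approximation
    y = np.log(-np.log(1.0 - mr))        # linearization: y = beta*(ln t - ln eta)
    x = np.log(t)
    slope, intercept = np.polyfit(y, x, 1)  # regress x on y (RRX)
    return 1.0 / slope, np.exp(intercept)   # beta-hat, eta-hat

print(weibull_rrx([10.0, 40.0, 47.5, 50.0, 50.0]))
</syntaxhighlight>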
'''Step 1'''
For all intervals, we obtain a weighted "midpoint" using:
<math>\begin{align}
  {{{\hat{t}}}_{m}}\left( \hat{\beta },\hat{\eta } \right)= & \frac{\int_{LI}^{TF}t\text{ }f(t;\hat{\beta },\hat{\eta })dt}{\int_{LI}^{TF}f(t;\hat{\beta },\hat{\eta })dt}, \\
  = & \frac{\int_{LI}^{TF}t\tfrac{{\hat{\beta }}}{{\hat{\eta }}}{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{\hat{\beta }-1}}{{e}^{-{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{{\hat{\beta }}}}}}dt}{\int_{LI}^{TF}\tfrac{{\hat{\beta }}}{{\hat{\eta }}}{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{\hat{\beta }-1}}{{e}^{-{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{{\hat{\beta }}}}}}dt} 
\end{align}</math>
This transforms our data into the format in Table B.3.
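Numerically, the weighted midpoint is just the conditional mean of the failure time within the interval; a direct reading of the formula in Python (our helper names, assuming SciPy) is:

<syntaxhighlight lang="python">
# Conditional mean of a Weibull failure time, given failure in [lo, hi].
import numpy as np
from scipy.integrate import quad

def weibull_pdf(t, beta_, eta):
    return (beta_ / eta) * (t / eta) ** (beta_ - 1.0) * np.exp(-(t / eta) ** beta_)

def weighted_midpoint(lo, hi, beta_, eta):
    num, _ = quad(lambda t: t * weibull_pdf(t, beta_, eta), lo, hi)
    den, _ = quad(lambda t: weibull_pdf(t, beta_, eta), lo, hi)
    return num / den

# Using the initial estimates; compare with the plain midpoints of 50 and 47.5.
print(weighted_midpoint(20, 80, 1.91367089, 43.91657736))
print(weighted_midpoint(10, 85, 1.91367089, 43.91657736))
</syntaxhighlight>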
'''Step 2'''
Now we arrange the data as in Table B.4.
'''Step 3'''
We now consider the left and right censored data, as in Table B.5.
In general, for left censored data:
* The increment term for <math>n</math> left censored items at time <math>{{t}_{0}},</math> with a time-to-failure of <math>{{t}_{i}},</math> is zero when <math>{{t}_{0}}\le {{t}_{i-1}}.</math>
* When <math>{{t}_{0}}>{{t}_{i-1}},</math> the contribution is:
<math>\frac{n}{{{F}_{0}}({{t}_{0}})-{{F}_{0}}(0)}\underset{{{t}_{i-1}}}{\overset{MIN({{t}_{i}},{{t}_{0}})}{\mathop \int }}\,{{f}_{0}}\left( t \right)dt</math>
or:
<math>n\frac{{{F}_{0}}(MIN({{t}_{i}},{{t}_{0}}))-{{F}_{0}}({{t}_{i-1}})}{{{F}_{0}}({{t}_{0}})-{{F}_{0}}(0)}</math>
where <math>{{t}_{i-1}}</math> is the time-to-failure previous to the <math>{{t}_{i}}</math> time-to-failure and <math>n</math> is the number of units associated with that time-to-failure (or units in the group).
In general, for right censored data:
* The increment term for <math>n</math> right censored items at time <math>{{t}_{0}},</math> with a time-to-failure of <math>{{t}_{i}},</math> is zero when <math>{{t}_{0}}\ge {{t}_{i}}.</math>
* When <math>{{t}_{0}}<{{t}_{i}},</math> the contribution is:
<math>\frac{n}{{{F}_{0}}(\infty )-{{F}_{0}}({{t}_{0}})}\underset{MAX({{t}_{0}},{{t}_{i-1}})}{\overset{{{t}_{i}}}{\mathop \int }}\,{{f}_{0}}\left( t \right)dt</math>
or:
<math>n\frac{{{F}_{0}}({{t}_{i}})-{{F}_{0}}(MAX({{t}_{0}},{{t}_{i-1}}))}{{{F}_{0}}(\infty )-{{F}_{0}}({{t}_{0}})}</math>
where <math>{{t}_{i-1}}</math> is the time-to-failure previous to the <math>{{t}_{i}}</math> time-to-failure and <math>n</math> is the number of units associated with that time-to-failure (or units in the group).
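Translated directly, the two contribution formulas can be coded as follows (a sketch with our own helper names; <math>{{F}_{0}}</math> is taken to be the Weibull CDF at the current parameter estimates):

<syntaxhighlight lang="python">
import numpy as np

def F0(t, beta_, eta):
    # Weibull CDF at the current estimates; F0(0) = 0 and F0(inf) = 1.
    return 1.0 - np.exp(-(t / eta) ** beta_)

def left_increment(n, t0, t_prev, t_i, beta_, eta):
    # n items left censored at t0, contribution toward the (t_prev, t_i] slot.
    if t0 <= t_prev:
        return 0.0
    num = F0(min(t_i, t0), beta_, eta) - F0(t_prev, beta_, eta)
    return n * num / (F0(t0, beta_, eta) - F0(0.0, beta_, eta))

def right_increment(n, t0, t_prev, t_i, beta_, eta):
    # n items right censored at t0, contribution toward the (t_prev, t_i] slot.
    if t0 >= t_i:
        return 0.0
    num = F0(t_i, beta_, eta) - F0(max(t0, t_prev), beta_, eta)
    return n * num / (1.0 - F0(t0, beta_, eta))
</syntaxhighlight>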
'''Step 4'''
Sum up the increments (horizontally in rows), as in Table B.6.
'''Step 5'''
Compute new mean order numbers (MON), as shown in Table B.7, utilizing the increments obtained in Table B.6: each new MON is the sum of the "number of items," the "previous MON" and the current "increment."
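Stated as a recursion, with illustrative arrays standing in for the table columns (the real values come from Tables B.6 and B.7):

<syntaxhighlight lang="python">
# MON_i = previous MON + number of items in the row + the row's increment.
def mean_order_numbers(n_items, increments):
    mons, prev = [], 0.0
    for n, inc in zip(n_items, increments):
        prev = prev + n + inc
        mons.append(prev)
    return mons
</syntaxhighlight>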
'''Step 6'''
Compute the median ranks based on these new MONs, as shown in Table B.8.
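Since the new MONs are generally non-integer, one way to evaluate this step (an assumption about the mechanics; the reference only names the step) is the beta-quantile form of the cumulative binomial equation, which accepts fractional order numbers:

<syntaxhighlight lang="python">
from scipy.stats import beta

def median_rank_from_mon(mon, n_total):
    # Generalizes the cumulative binomial median rank to non-integer order numbers.
    return beta.ppf(0.5, mon, n_total - mon + 1.0)

# Benard's approximation, (MON - 0.3) / (N + 0.4), gives nearly the same value.
print(median_rank_from_mon(2.6, 7), (2.6 - 0.3) / (7 + 0.4))
</syntaxhighlight>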
'''Step 7'''
Compute new <math>\beta </math> and <math>\eta </math> values using standard rank regression, based upon the data shown in Table B.9.
'''Step 8'''
Return to Step 1 and repeat the process until acceptable convergence is reached on the parameters (i.e., the parameter values stabilize).
===Results===
The results of the first five iterations are shown in Table B.10.
Using Weibull++ with rank regression on X yields:
<math>{{\widehat{\beta }}_{RRX}}=1.82890,\text{ }{{\widehat{\eta }}_{RRX}}=41.69774</math>
The direct MLE solution yields:
<math>{{\widehat{\beta }}_{MLE}}=2.10432,\text{ }{{\widehat{\eta }}_{MLE}}=42.31535</math>
