Preventive Maintenance

This article also appears in the System analysis reference. Preventive maintenance (PM) is a schedule of planned maintenance actions aimed at the prevention of breakdowns and failures. The primary goal of preventive maintenance is to prevent the failure of equipment before it actually occurs. It is designed to preserve and enhance equipment reliability by replacing worn components before they actually fail. Preventive maintenance activities include equipment checks, partial or complete overhauls at specified periods, oil changes, lubrication and so on. In addition, workers can record equipment deterioration so they know to replace or repair worn parts before they cause system failure. Recent technological advances in tools for inspection and diagnosis have enabled even more accurate and effective equipment maintenance. The ideal preventive maintenance program would prevent all equipment failure before it occurs.

Value of Preventive Maintenance
There are multiple misconceptions about preventive maintenance. One such misconception is that PM is unduly costly. This logic dictates that it would cost more for regularly scheduled downtime and maintenance than it would normally cost to operate equipment until repair is absolutely necessary. This may be true for some components; however, one should compare not only the costs but the long-term benefits and savings associated with preventive maintenance. Without preventive maintenance, for example, costs for lost production time from unscheduled equipment breakdown will be incurred. Also, preventive maintenance will result in savings due to an increase of effective system service life.

Long-term benefits of preventive maintenance include:


 * Improved system reliability.
 * Decreased cost of replacement.
 * Decreased system downtime.
 * Better spares inventory management.

Long-term effects and cost comparisons usually favor preventive maintenance over performing maintenance actions only when the system fails.

When Does Preventive Maintenance Make Sense?
Preventive maintenance is a logical choice if, and only if, the following two conditions are met:


 * Condition #1: The component in question has an increasing failure rate. In other words, the failure rate of the component increases with time, implying wear-out. Preventive maintenance of a component whose times to failure are assumed to follow an exponential distribution (which implies a constant failure rate) does not make sense.
 * Condition #2: The overall cost of the preventive maintenance action must be less than the overall cost of a corrective action.

If both of these conditions are met, then preventive maintenance makes sense. Additionally, based on the ratio of the two costs, an optimum time for such an action can be computed for a single component. This is detailed in later sections.
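As a rough illustration of the two conditions, the following sketch assumes a Weibull life distribution, which the text above does not specify. For a Weibull distribution the failure rate increases with time only when the shape parameter beta is greater than 1, and beta = 1 reduces to the exponential (constant failure rate) case. The function names and the parameter values in the usage lines are hypothetical.

```python
def weibull_hazard(t, beta, eta):
    """Weibull failure rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def pm_makes_sense(beta, cost_planned, cost_unplanned):
    """Check both conditions for an assumed Weibull component.

    Condition #1: increasing failure rate, i.e. beta > 1.
    Condition #2: the planned action costs less than the corrective action.
    """
    return beta > 1 and cost_planned < cost_unplanned


# Hypothetical values, chosen for illustration only.
print(pm_makes_sense(beta=2.5, cost_planned=1.0, cost_unplanned=5.0))  # wear-out
print(pm_makes_sense(beta=1.0, cost_planned=1.0, cost_unplanned=5.0))  # constant rate
```

With beta = 1 the check fails whatever the costs are, which matches the statement in Condition #1 about the exponential distribution.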

The Fallacy of "Constant Failure Rate" and "Preventive Replacement"
Even though we alluded to the fact in the last section, it is important to make it explicitly clear that if a component has a constant failure rate (i.e., its times to failure follow an exponential distribution), then preventive maintenance of the component will have no effect on the component's failure occurrences. To illustrate this, consider a component with an $$MTTF\,\!$$ = $$100\,\!$$ hours, or $$\lambda =0.01\,\!$$ failures per hour, and with preventive replacement every 50 hours. The reliability vs. time graph for this case is illustrated in the following figure, where the component is replaced every 50 hours, thereby resetting the component's reliability to one. At first glance, it may seem that the preventive maintenance action is actually maintaining the component at a higher reliability.



However, consider the following cases for a single component:

Case 1: The component's reliability from 0 to 60 hours:
 * With preventive maintenance, the component was replaced with a new one at 50 hours so the overall reliability is based on the reliability of the new component for 10 hours, $$R(t=10)=90.48%\,\!$$, times the reliability of the previous component, $$R(t=50)=60.65%\,\!$$. The result is $$R(t=60)=54.88%.\,\!$$
 * Without preventive maintenance, the reliability would be the reliability of the same component operating to 60 hours, or $$R(t=60)=54.88%\,\!$$.

Case 2: The component's reliability from 50 to 60 hours:
 * With preventive maintenance, the component was replaced at 50 hours, so this is solely based on the reliability of the new component for a mission of 10 hours, or $$R(t=10)=90.48%\,\!$$.
 * Without preventive maintenance, the reliability would be the conditional reliability of the same component operating to 60 hours, having already survived to 50 hours, or $${{R}_{C}}(t=10|T=50)=R(60)/R(50)=90.48%\,\!$$.

As can be seen, in each case the reliability is the same with and without preventive maintenance.
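The arithmetic in the two cases can be reproduced with a few lines of Python. This is only a sketch of the example above, using the exponential reliability function $$R(t)={{e}^{-\lambda t}}\,\!$$ with $$\lambda =0.01\,\!$$ per hour; the variable names are illustrative.

```python
import math

LAMBDA = 0.01  # failures per hour, i.e. MTTF = 100 hours (from the example)


def reliability(t, lam=LAMBDA):
    """Exponential reliability R(t) = exp(-lam * t)."""
    return math.exp(-lam * t)


# Case 1: reliability from 0 to 60 hours.
case1_with_pm = reliability(50) * reliability(10)   # old unit to 50 h, new unit for 10 h
case1_without_pm = reliability(60)                  # same unit for 60 h

# Case 2: reliability from 50 to 60 hours.
case2_with_pm = reliability(10)                         # new unit, 10 h mission
case2_without_pm = reliability(60) / reliability(50)    # conditional reliability

print(f"Case 1: with PM = {case1_with_pm:.4f}, without PM = {case1_without_pm:.4f}")
print(f"Case 2: with PM = {case2_with_pm:.4f}, without PM = {case2_without_pm:.4f}")
```

The equality holds for any replacement time because $${{e}^{-\lambda a}}\cdot {{e}^{-\lambda b}}={{e}^{-\lambda (a+b)}}\,\!$$, which is the memoryless property of the exponential distribution.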

Determining Preventive Replacement Time
As mentioned earlier, if the component has an increasing failure rate, then a carefully designed preventive maintenance program is beneficial to system availability. Otherwise, the costs of preventive maintenance might actually outweigh the benefits. The objective of a good preventive maintenance program is to either minimize the overall costs (or downtime, etc.) or meet a reliability objective. In order to achieve this, an appropriate interval (time) for scheduled maintenance must be determined. One way to do that is to use the optimum age replacement model, as presented next. The model adheres to the conditions discussed previously:
 * The component is exhibiting behavior associated with a wear-out mode. That is, the failure rate of the component is increasing with time.
 * The cost for planned replacements is significantly less than the cost for unplanned replacements.

The following figure shows the Cost Per Unit Time vs. Time plot. The corrective replacement cost per unit time increases as the replacement interval increases; in other words, the less often you perform a PM action, the higher your corrective costs will be. As a component operates for longer times, its failure rate increases and it becomes more likely to fail before the next scheduled replacement, thus requiring more corrective actions. The opposite is true for the preventive replacement costs: the longer you wait to perform a PM, the lower these costs, and if you do PM too often, they increase. Combining both costs shows that there is an optimum point that minimizes the total cost. In other words, one must strike a balance between the risk (costs) associated with a failure and the time between PM actions.



Optimum Age Replacement Policy
To determine the optimum time for such a preventive maintenance action (replacement), we need to mathematically formulate a model that describes the associated costs and risks. In developing the model, it is assumed that if the unit fails before time $$t\,\!$$, a corrective action will occur and if it does not fail by time $$t\,\!$$, a preventive action will occur. In other words, the unit is replaced upon failure or after a time of operation, $$t\,\!$$, whichever occurs first. Thus, the optimum replacement time can be found by minimizing the cost per unit time, $$CPUT\left( t \right).\,\!$$ $$CPUT\left( t \right)\,\!$$ is given by:


 * $$\begin{align}
CPUT\left( t \right)= & \frac{\text{Total Expected Replacement Cost per Cycle}}{\text{Expected Cycle Length}} \\ = & \frac{{{C}_{P}}\cdot R\left( t \right)+{{C}_{U}}\cdot \left[ 1-R\left( t \right) \right]}{\int_{0}^{t}R\left( s \right)ds} \end{align}\,\!$$

where:


 * $$R(t)\,\!$$ = reliability at time $$t\,\!$$.
 * $${{C}_{P}}\,\!$$ = cost of planned replacement.
 * $${{C}_{U}}\,\!$$ = cost of unplanned replacement.

The optimum replacement time interval, $$t\,\!$$, is the time that minimizes $$CPUT\left( t \right).\,\!$$ This can be found by solving for $$t\,\!$$ such that:


 * $$\frac{\partial \left[ CPUT(t) \right]}{\partial t}=0\,\!$$

Or by solving for a $$t\,\!$$ that satisfies the following equation:


 * $$\frac{\partial \left[ \tfrac{{{C}_{P}}\cdot R\left( t \right)+{{C}_{U}}\cdot \left[ 1-R\left( t \right) \right]}{\int_{0}^{t}R\left( s \right)ds} \right]}{\partial t}=0\,\!$$

Interested readers can refer to Barlow and Hunter [2] for more details on this model.
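Rather than solving the derivative equation analytically, the minimum of $$CPUT\left( t \right)\,\!$$ can also be located numerically. The following is a minimal sketch under stated assumptions: it uses a Weibull reliability function with hypothetical parameters (beta = 2.5, eta = 1000 hours) and hypothetical costs ($${{C}_{P}}=1\,\!$$, $${{C}_{U}}=5\,\!$$), integrates $$R(s)\,\!$$ with the trapezoid rule, and applies a golden-section search that assumes $$CPUT\left( t \right)\,\!$$ has a single minimum on the search interval. It is not the method used by any particular software package.

```python
import math


def weibull_reliability(t, beta, eta):
    """Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


def cput(t, cost_planned, cost_unplanned, reliability, steps=2000):
    """Cost per unit time for replacement at age t (t > 0).

    Numerator: C_P * R(t) + C_U * (1 - R(t)).
    Denominator: integral of R(s) from 0 to t (trapezoid rule).
    """
    h = t / steps
    area = 0.5 * (reliability(0.0) + reliability(t))
    for i in range(1, steps):
        area += reliability(i * h)
    area *= h
    r_t = reliability(t)
    return (cost_planned * r_t + cost_unplanned * (1.0 - r_t)) / area


def optimum_replacement_time(cost_planned, cost_unplanned, reliability,
                             t_low, t_high, tol=0.1):
    """Golden-section search for the t in [t_low, t_high] that minimizes CPUT."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = t_low, t_high
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if (cput(c, cost_planned, cost_unplanned, reliability)
                < cput(d, cost_planned, cost_unplanned, reliability)):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2.0


# Hypothetical component: Weibull with beta = 2.5, eta = 1000 hours.
def example_reliability(t):
    return weibull_reliability(t, beta=2.5, eta=1000.0)


t_opt = optimum_replacement_time(1.0, 5.0, example_reliability, 1.0, 2000.0)
print(f"Optimum replacement time: about {t_opt:.0f} hours")
print(f"CPUT at optimum: {cput(t_opt, 1.0, 5.0, example_reliability):.6f}")
```

If the same functions are called with beta = 1 (the exponential case), $$CPUT\left( t \right)\,\!$$ keeps decreasing as $$t\,\!$$ grows, so the search simply runs to the upper bound of the interval. This is consistent with the earlier point that preventive replacement brings no benefit under a constant failure rate.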

In BlockSim (Version 8 and above), you can use the Optimum Replacement window to determine the optimum replacement time either for an individual block or for multiple blocks in a diagram simultaneously. When working with multiple blocks, the calculations can be for individual blocks or for one or more groups of blocks. For each item that is included in the optimization calculations, you will need to specify the cost for a planned replacement and the cost for an unplanned replacement. This is done by calculating the costs for replacement based on the item settings using equations or simulation and then, if desired, manually entering any additional costs for either type of replacement in the corresponding columns of the table.

The equations used to calculate the costs of planned and unplanned tasks for each item based on its associated universal reliability definition (URD) are as follows:


 * For the cost of planned tasks, here denoted as PM cost:


 * $$\begin{align}
\text{PM Cost}= & \left(\text{PM Down Time Rate}+ \text{Block Level Down Time Rate} \right) \cdot \left( \text{MTTPM}+\text{Pool Delay} +\text{Crew Delay} \right) \\ & + \text{Crew Labor Rate} \cdot \text{MTTPM} + \text{Cost per PM} + \text{Cost per Pool} +\text{Cost per Crew} \end{align}\,\!$$


 * Only PM tasks based on item age or system age (fixed or dynamic intervals) are considered. If there is more than one PM task based on item age, only the first one is considered.


 * For the cost of the unplanned task, here denoted as CM cost:
 * $$\begin{align}
\text{CM Cost}= & \left(\text{CM Down Time Rate}+ \text{Block Level Down Time Rate} \right) \cdot \left( \text{MTTR}+\text{Pool Delay} +\text{Crew Delay} \right) \\ & + \text{Crew Labor Rate} \cdot \text{MTTR} + \text{Cost per CM} + \text{Cost per Pool} +\text{Cost per Crew} +\text{Block Level Cost per Failure} \end{align}\,\!$$
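The two cost equations can be transcribed directly into code. The sketch below is only an illustration of the arithmetic: the function and parameter names are hypothetical and are not part of BlockSim, and the input values in the usage lines are made up.

```python
def pm_cost(*, pm_down_time_rate, block_down_time_rate, mttpm, pool_delay,
            crew_delay, crew_labor_rate, cost_per_pm, cost_per_pool,
            cost_per_crew):
    """Cost of one planned (PM) task, following the PM Cost equation above."""
    down_time = mttpm + pool_delay + crew_delay
    return ((pm_down_time_rate + block_down_time_rate) * down_time
            + crew_labor_rate * mttpm
            + cost_per_pm + cost_per_pool + cost_per_crew)


def cm_cost(*, cm_down_time_rate, block_down_time_rate, mttr, pool_delay,
            crew_delay, crew_labor_rate, cost_per_cm, cost_per_pool,
            cost_per_crew, block_cost_per_failure):
    """Cost of one unplanned (CM) task, following the CM Cost equation above."""
    down_time = mttr + pool_delay + crew_delay
    return ((cm_down_time_rate + block_down_time_rate) * down_time
            + crew_labor_rate * mttr
            + cost_per_cm + cost_per_pool + cost_per_crew
            + block_cost_per_failure)


# Made-up inputs (rates in cost per hour, times in hours).
planned = pm_cost(pm_down_time_rate=10.0, block_down_time_rate=5.0, mttpm=2.0,
                  pool_delay=0.5, crew_delay=0.5, crew_labor_rate=20.0,
                  cost_per_pm=100.0, cost_per_pool=10.0, cost_per_crew=5.0)
unplanned = cm_cost(cm_down_time_rate=30.0, block_down_time_rate=5.0, mttr=4.0,
                    pool_delay=0.5, crew_delay=0.5, crew_labor_rate=20.0,
                    cost_per_cm=300.0, cost_per_pool=10.0, cost_per_crew=5.0,
                    block_cost_per_failure=50.0)
print(f"PM cost = {planned}, CM cost = {unplanned}")
```

The resulting values play the roles of $${{C}_{P}}\,\!$$ and $${{C}_{U}}\,\!$$ in the cost per unit time equation.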

When using simulation, for costs associated with planned replacements, all preventive tasks based on item age or system age (fixed or dynamic intervals) are considered. Because each item is simulated as a system (i.e., in isolation from any other item), tasks triggered in other ways are not considered.