Additional Analyses

Repairable Systems Analysis Through Simulation described the process of using discrete event simulation to perform basic system reliability, availability and maintainability analyses. This chapter discusses two additional types of analyses that can be performed with simulation: Throughput Analysis and Life Cycle Cost Analysis.

=Throughput Analysis=

In the prior sections, we concentrated on failure and repair actions of a system. In doing so, we viewed the system from a system/component up/down perspective. One could take this analysis a step further and consider system throughput as a part of the analysis. To define system throughput, assume that each component in the system processes (or makes) something while operating. As an example, consider the case shown next with two components in series, processing/producing 10 and 20 items per unit time (ipu), respectively.



In this case, the system configuration is not only the system's reliability-wise configuration but also its production/processing sequence. In other words, the first component processes/produces 10 ipu and the second component can process/produce up to 20 ipu. However, a block can only process/produce items it receives from the blocks before it. Therefore, the second component in this case is only receiving 10 ipu from the block before it. If we assume that neither component can fail, then the maximum items processed from this configuration would be 10 ipu. If the system were to operate for 100 time units, then the throughput of this system would be 1000 items, or $$(100\cdot 10)$$.

=Throughput Metrics and Terminology=

In looking at throughput, one needs to define some terminology and metrics that will describe the behavior of the system and its components when doing such analyses. Some of the terminology used in BlockSim is given next.
 * System Throughput: System throughput is the total amount of items processed/produced by the system over the defined period of time. In the two-component example, this is  $$1000$$  items over  $$100$$  time units.
 * Component Throughput: The total amount of items processed or produced by each component (block). In the two-component example, this is  $$1000$$  items each.
 * Component Maximum Capacity: The maximum number of items that the component (block) could have processed/produced. This is simply the block's throughput rate multiplied by the run time.  In the two-component example, this is  $$100\cdot 10=1000$$  for the first component and  $$100\cdot 20=2000$$  for the second component.
 * Component Uptime Capacity: The maximum number of items the component could have processed/produced while it was up and running. In the two-component example, this is  $$1000$$  for the first component and  $$2000$$  for the second component, since we are assuming that the components cannot fail.  If the components could fail, this number would be the component's uptime multiplied by its throughput rate.
 * Component Excess Capacity: The additional amount a component could have processed/produced while up and running. In the two-component example, this is  $$0$$  for the first component and  $$1000$$  for the second component.
 * Component Actual Utilization: The ratio of the component throughput and the component maximum capacity. In the two-component example, this is  $$100\%$$  for the first component and  $$50\%$$  for the second component.
 * Component Uptime Utilization: The ratio of the component throughput and the component uptime capacity. In the two-component example, this is  $$100\%$$  for the first component and  $$50\%$$  for the second component.  Note that if the components had failed and experienced downtime, this number would be different for each component.
 * Backlog: Items that the component could not process are kept in a backlog. Depending on the settings, a backlog may or may not be processed when an opportunity arises.  The available backlog metrics include:
 * Component Backlog: The amount of a backlog present at the component at the end of the run (simulation).
 * Component Processed Backlog: The amount of backlog processed by the component.
 * Excess Backlog: Under specific settings in BlockSim, components can accept only a limited backlog. In these cases, a backlog that was rejected is stored in the Excess Backlog category.
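
The metrics defined above can be reproduced for the two-component example with a few lines of arithmetic. The following is a minimal sketch, not BlockSim's implementation; the function name `series_metrics` and the dictionary keys are hypothetical, and the blocks are assumed never to fail, so uptime capacity equals maximum capacity.

```python
def series_metrics(rates, run_time):
    """Throughput metrics for never-failing blocks in series (illustrative only)."""
    results = []
    incoming = float("inf")  # the first block is not limited by a predecessor
    for rate in rates:
        max_capacity = rate * run_time   # throughput rate x run time
        uptime_capacity = max_capacity   # assumption: no downtime
        # A block can only process what it receives from the block before it.
        throughput = min(incoming, max_capacity)
        results.append({
            "throughput": throughput,
            "max_capacity": max_capacity,
            "uptime_capacity": uptime_capacity,
            "excess_capacity": uptime_capacity - throughput,
            "actual_utilization": throughput / max_capacity,
            "uptime_utilization": throughput / uptime_capacity,
        })
        incoming = throughput
    return results


metrics = series_metrics([10, 20], run_time=100)
for name, block in zip(["first", "second"], metrics):
    print(name, block)
```

With rates of 10 and 20 ipu over 100 time units, the sketch gives a throughput of 1000 for both blocks, an excess capacity of 0 and 1000, and utilizations of 100% and 50%, in agreement with the values listed above.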

Overview of Throughput Analysis
To examine throughput, consider the following scenarios.

Scenario 1


Consider the system shown in figure above. Blocks $$A$$  through  $$I$$  produce a number of items per unit time as identified next to each letter (e.g.  $$A$$  : 100 implies  $$100$$  items per time unit for  $$A$$ ). The connections shown in the RBD show the physical path of the items through the process (or production line). For the sake of simplicity, also assume that the blocks can never fail and that items are routed equally to each path. This then implies that the following occurs over a single time unit:
 * Unit $$A$$  makes 100 items and routes 33.33 to  $$B$$, 33.33 to  $$C$$  and 33.33 to  $$D$$.
 * In turn, $$B$$,  $$C$$  and  $$D$$  route their 33.33 to  $$E$$ ,  $$F$$ ,  $$G$$  and  $$H$$  (8.33 to each path).
 * $$E$$, $$F$$ ,  $$G$$  and  $$H$$  route 25 each to  $$I$$.
 * $$I$$ processes all 100.
 * The system produces 100 items.

Thus, the following table would represent the throughput and excess capacity of each block after one time unit. Run summary for Scenario 1.

Scenario 2
Now consider figure below where it is assumed that block $$E$$  has failed. Then:
 * Unit $$A$$  makes 100 items and routes 33.33 to  $$B$$, 33.33 to  $$C$$  and 33.33 to  $$D$$.
 * In turn, $$B$$,  $$C$$  and  $$D$$  route their 33.33 to  $$F$$ ,  $$G$$  and  $$H$$  (11.11 to each path that has an operating block at the end).
 * $$F$$, $$G$$  and  $$H$$  route 33.33 each to  $$I$$.
 * $$I$$ processes all 100.
 * The system produces 100 items.

A summary result table is shown next: Run summary for Scenario 2.

Scenario 3
Consider figure below where both $$E$$  and  $$H$$  have failed. Then:
 * Unit $$A$$  makes 100 items and routes 33.33 to  $$B$$, 33.33 to  $$C$$  and 33.33 to  $$D$$.
 * In turn, $$B$$,  $$C$$  and  $$D$$  route their 33.33 to  $$F$$  and  $$G$$  (16.67 to each path that has an operating block at the end).
 * $$F$$ and  $$G$$  get 50 items each.
 * $$F$$ and  $$G$$  process and route 40 each (their maximum processing capacity) to  $$I$$ .  Both have a backlog of 10 since they could not process all 50 items they received.
 * $$I$$ processes all 80.
 * The system produces 80 items.

Run summary for Scenario 3.

Utilization summary for Scenario 3.

It can be easily seen that the bottlenecks in the system are the blocks $$F$$  and  $$G$$.
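
The three scenarios can be traced with a short routing sketch. This is an illustration under stated assumptions rather than BlockSim's algorithm: the function name `run_one_time_unit` is hypothetical, items are split equally across operating blocks only, the capacity of 40 is given in the text only for $$F$$ and $$G$$ (applying it to $$E$$ and $$H$$ is an assumption that does not affect these results), and $$B$$, $$C$$, $$D$$ and $$I$$ are assumed never to be the limiting blocks.

```python
def run_one_time_unit(failed=(), source_rate=100.0, stage2_capacity=40.0):
    """Route one time unit of items through A -> {B,C,D} -> {E,F,G,H} -> I."""
    stage1 = ["B", "C", "D"]
    # Only operating second-stage blocks receive items.
    stage2 = [b for b in ["E", "F", "G", "H"] if b not in failed]

    # A splits its output equally across B, C and D.
    per_stage1 = source_rate / len(stage1)

    # Each of B, C and D splits its share equally across the operating blocks.
    received = {b: len(stage1) * per_stage1 / len(stage2) for b in stage2}

    # A block processes at most its capacity; the remainder becomes backlog.
    processed = {b: min(received[b], stage2_capacity) for b in stage2}
    backlog = {b: received[b] - processed[b] for b in stage2}

    # Assumption: I can process everything it receives.
    system_output = sum(processed.values())
    return received, processed, backlog, system_output


for label, failed in [("Scenario 1", ()), ("Scenario 2", ("E",)), ("Scenario 3", ("E", "H"))]:
    received, processed, backlog, output = run_one_time_unit(failed)
    print(label, "system output:", round(output, 2), "backlog:", backlog)
```

With no failures each second-stage block receives 25 items, with $$E$$ failed the remaining three receive 33.33 each, and with $$E$$ and $$H$$ failed the 50 items sent to each of $$F$$ and $$G$$ exceed their capacity of 40, which is what makes them the bottlenecks.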

Block Settings
In BlockSim, specific throughput properties can be set, as shown in figure below.

Throughput: The number of items that the block can process per unit time.

Allocation: Specify the allocation scheme across multiple paths (i.e., equal or weighted). This option is shown in figure above. To explain these settings, consider the example shown in figure below, which uses the same notation as before.



If the Weighted allocation across paths option is chosen, then the 60 items made by $$A$$  will be allocated to  $$B$$,  $$C$$  and  $$D$$  based on their throughput capabilities. Specifically, the portion that each block will receive, $${{P}_{i}}$$, is:


 * $$P_{i}=\frac{\text{Throughput}_{i}}{\sum_{j=1}^{N}\text{Throughput}_{j}} \quad \text{(eqn 1)}$$

The actual amount is then the (portion $$\cdot $$  available units). In this case, the portion allocated to $$B$$  is  $$\tfrac{10}{60},$$  the portion allocated to  $$C$$  is  $$\tfrac{20}{60}$$  and the portion allocated to  $$D$$  is  $$\tfrac{30}{60}$$. When a total of 60 units is processed through $$A$$,  $$B$$  will get 10,  $$C$$  will get 20 and  $$D$$  will get 30.
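
The allocation rule in eqn 1 can be sketched in a few lines. The helper name `allocate` is hypothetical and this is only an illustration of the arithmetic, not BlockSim's code; the equal-share scheme, the other allocation option discussed in this section, is included for comparison.

```python
def allocate(available, throughputs, weighted=True):
    """Split `available` items across downstream blocks.

    weighted=True  -> each block gets Throughput_i / sum(Throughput_j) (eqn 1)
    weighted=False -> every block gets an equal share
    """
    if weighted:
        total = sum(throughputs.values())
        return {name: available * rate / total for name, rate in throughputs.items()}
    share = available / len(throughputs)
    return {name: share for name in throughputs}


rates = {"B": 10, "C": 20, "D": 30}
weighted = allocate(60, rates, weighted=True)
equal = allocate(60, rates, weighted=False)
print("weighted:", weighted)
print("equal:   ", equal)
```

For the 60 items made by $$A$$, the weighted scheme gives 10, 20 and 30 items to $$B$$, $$C$$ and $$D$$, while the equal scheme gives 20 to each regardless of how much each block can process.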

The results would then be as shown in table below.

Throughput summary using weighted allocation across paths.

If the Allocate equal share to all paths option is chosen, then 20 units will be sent to each of $$B$$,  $$C$$ and $$D$$  regardless of their processing capacity, yielding the results shown in table below.

Throughput summary using an equal allocation across paths.

Send units to failed blocks: Decide whether items should be sent to failed blocks. If this option is not selected, the throughput units are allocated only to operational units. If this option is selected, units are also allocated to failed blocks and become part of the failed block's backlog.

In the special case in which one or more blocks fail, causing a disruption in the path, and the Send units to failed blocks option is not selected, the blocks that have a path to the failed block(s) will not be able to process any items, because processed items cannot be sent to the forward block. In this case, these blocks will keep all items received in their backlog. As an example, and using figure below, if $$E$$  is failed (and  $$E$$  cannot accept items in its backlog while failed), then  $$B$$,  $$C$$  and  $$D$$  cannot forward any items to it. Thus, they will not process any items sent to them from $$A$$. Items sent from $$A$$  will be placed in the backlogs of blocks  $$B$$,  $$C$$  and  $$D$$.

Process/Ignore backlog: Identify how a block handles backlog. A block can ignore or process backlog. Items that cannot be processed are kept in a backlog bin and are processed as needed. Additionally, you can set the maximum number of items that can be stored in the backlog. When you choose to ignore the backlog, BlockSim will still report items that cannot be processed in the backlog column. However, it will let the backlog accumulate and never process it. In the case of a limited backlog, BlockSim will not accumulate more backlog than the maximum allowed and will discard all items sent to the block if they exceed its backlog capacity. It will keep count of the units that did not make it in the backlog in a category called Excess Backlog. To illustrate this, reconsider Scenario 3, but with both $$F$$  and  $$G$$  having a limited backlog of 5. After a single time unit of operation, the results would be as shown in tables below.

 Scenario 3 summary with F and G having a limited backlog and after one time unit.

 Scenario 3 summary with F and G having a limited backlog and after two time units.
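
The limited-backlog bookkeeping can be sketched for a single block such as $$F$$ in Scenario 3 (50 items received per time unit, capacity 40, maximum backlog 5). The function name `step_with_limited_backlog` is hypothetical, and the order of operations (new arrivals first, then backlog with any spare capacity) is an assumption made for this illustration rather than a documented BlockSim rule.

```python
def step_with_limited_backlog(incoming, capacity, backlog, excess, max_backlog):
    """Advance one block by one time unit.

    Returns (processed, backlog, excess_backlog) after the step.
    """
    # Process new arrivals first, up to the block's capacity.
    processed_new = min(incoming, capacity)
    # Any spare capacity is used to work off existing backlog.
    processed_backlog = min(backlog, capacity - processed_new)
    backlog -= processed_backlog

    # Items that could not be processed go to the backlog if there is room.
    overflow = incoming - processed_new
    accepted = min(overflow, max_backlog - backlog)
    backlog += accepted
    # Whatever the backlog cannot hold is counted as excess backlog.
    excess += overflow - accepted
    return processed_new + processed_backlog, backlog, excess


backlog, excess = 0, 0
history = []
for time_unit in (1, 2):
    processed, backlog, excess = step_with_limited_backlog(50, 40, backlog, excess, 5)
    history.append((time_unit, processed, backlog, excess))
    print(f"t={time_unit}: processed={processed}, backlog={backlog}, excess backlog={excess}")
```

Under these assumptions the block holds a backlog of 5 after both the first and the second time unit, while the excess backlog grows from 5 to 15, because the block never has spare capacity to work the backlog down.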

Note that the blocks will never be able to process the backlog in this example. However, if we were to observe the system for a longer operation time and through failures and repairs of the other blocks, there would be opportunities for the blocks to process their backlogs and catch up. It is very important to note that when simulating a system with failures and repairs in BlockSim, you must define the block as one that operates through system failure if you wish for backlog to be processed. If this option is not set, the block will not operate through system failure and thus will not be able to process any backlog items when components that cause system failure (from an RBD perspective) fail.

Variable Throughput
In many real-world cases throughput can change over time (i.e., throughput through a single component is not a constant but a function of time). The discussion in this chapter is devoted to cases of constant, non-variable throughput. BlockSim does model variable throughput using phase diagrams. These are discussed in Introduction to Reliability Phase Diagrams.

Performing Throughput Analysis
The prior sections illustrated the basic concepts in throughput analysis. However, they did not take into account the reliability and maintenance properties of the blocks and the system. In a complete analysis, these would also need to be incorporated. The following simple example incorporates failures and repairs.

A Simple Throughput Analysis Example
To illustrate failures and repairs and their effects on system throughput, consider the simple system shown in figure below, but with $$E$$  operating.

In addition, consider the following deterministic failure and repair characteristics:

Also:
 * Set all units to operate through system failure.
 * Do not add spare part pools or crews (use defaults).
 * Do not send items to failed units.
 * Use a weighted allocation scheme.

Then the system behavior from 0 to 100 time units is given in table below. The system event history is as follows:



Once the system history has been established, we can examine the throughput behavior of this system from 0 to 100 by observing the sequence of events and their subsequent effect on system throughput.

Event 1: $$B$$ Fails at 50
Event overview:
 * At 50, $$B$$  fails.
 * From 0 to 50, $$A$$  processes  $$50\cdot 60=3000$$  items.
 * 500 are sent to $$B$$, 1000 to  $$C$$  and 1500 to  $$D$$ .  There is no excess capacity at  $$B$$ ,  $$C$$  or  $$D$$.
 * $$B$$, $$C$$  and  $$D$$  process and send 3000 items to  $$E$$ .  Because the capacity of  $$E$$  is 3500,  $$E$$  now has an excess capacity of 500.
 * The next table summarizes these results:



Event 2: $$B$$ is Down 50 to 54
Event overview:
 * From 50 to 54, $$B$$  is down.
 * $$A$$ processes 240 items and sends 96 to  $$C$$  and 144 to  $$D$$.
 * $$C$$ and  $$D$$  can only process 80 and 120 respectively during this time.  Thus, they get backlogs of 16 and 24 respectively.
 * The 200 processed are sent to $$E$$ .   $$E$$  has an excess capacity of 80 during this time period.
 * The next table summarizes these results:



Event 3: All Up 54 to 55
The next table summarizes the results:

Event 4: $$C$$ is Down 55 to 59
The next table summarizes the results:

Event 5: All Up 59 to 60
The next table summarizes the results:

Event 6: $$D$$ is Down 60 to 64
The next table summarizes the results:

Event 7: All Up 64 to 65
The next table summarizes the results:

Event 8: $$A$$ is Down 65 to 69
Between 65 and 69, $$A$$  fails. This stops the flow of items in the system and provides an opportunity for the other blocks to process their backlogs. As an example, $$B$$  processes 40 items from the 60 items in its backlog. Specifically:

Event 9: All Up 69 to 70
The next table summarizes the results:

Event 10: $$E$$ is Down 70 to 74
From 70 to 74, $$E$$  is down. Because we specified that we will not send items to failed units, $$B$$,  $$C$$  and  $$D$$  receive items from  $$A$$  but they do not process them, since processing would require that items be sent to  $$E$$. The items received by $$B$$,  $$C$$  and  $$D$$  are added to their respective backlogs. Furthermore, since they could have processed them if $$E$$  had been up, all three blocks have an excess capacity for this period. Specifically:

It should be noted that if we had allowed items to be sent to failed blocks, $$B$$,  $$C$$  and  $$D$$  would have processed the items received and the backlog would have been at  $$E$$. The rest of the time, all units are up.

Event 11: All Up 74 to 100
The next table summarizes the results:

Exploring the Results
BlockSim provides all of these results via the Simulation Results Explorer. Figure below shows the system throughput summary. System level results present the total system throughput, which is 5484 items in this example. Additionally, the results include the uptime utilization of each component. The block level result summary, shown in Figure "Block level summary", provides additional results for each item. Finally, specific throughput results and metrics for each block are provided, as shown in Figures "Specific results per block".
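
The event-by-event bookkeeping of this example can be checked with a short deterministic sketch. It is not BlockSim's simulation engine: the names `RATES`, `DOWNTIME` and `simulate` are hypothetical, the throughput rates are inferred from the event descriptions (60 for $$A$$; 10, 20 and 30 for $$B$$, $$C$$ and $$D$$; 70 for $$E$$, from its capacity of 3500 over 50 time units), and the downtime windows are taken from the event headings.

```python
RATES = {"A": 60, "B": 10, "C": 20, "D": 30, "E": 70}  # E inferred from 3500 / 50
DOWNTIME = {"B": (50, 54), "C": (55, 59), "D": (60, 64), "A": (65, 69), "E": (70, 74)}
END_TIME = 100


def is_up(block, start):
    """True if `block` is operating during the interval that begins at `start`."""
    down_from, down_to = DOWNTIME[block]
    return not (down_from <= start < down_to)


def simulate():
    backlog = {"B": 0.0, "C": 0.0, "D": 0.0}
    system_throughput = 0.0
    # Interval boundaries: start, end, and every failure/repair time.
    times = sorted({0, END_TIME, *[t for span in DOWNTIME.values() for t in span]})
    for start, end in zip(times, times[1:]):
        dt = end - start
        produced = RATES["A"] * dt if is_up("A", start) else 0
        up = [b for b in backlog if is_up(b, start)]
        total_rate = sum(RATES[b] for b in up)
        for b in up:
            # Weighted allocation over operating blocks only (eqn 1).
            work = backlog[b] + produced * RATES[b] / total_rate
            # With E down and no sending to failed blocks, nothing is forwarded.
            capacity = RATES[b] * dt if is_up("E", start) else 0
            processed = min(work, capacity)
            backlog[b] = work - processed
            # E (70 ipu) can always absorb what B, C and D (60 ipu) forward.
            system_throughput += processed
    return system_throughput, backlog


total, final_backlog = simulate()
print("System throughput:", total)
print("Final backlogs:", final_backlog)
```

Under these assumptions the sketch returns a system throughput of 5484 items, matching the total reported above, and leaves backlogs of 60, 96 and 120 at $$B$$, $$C$$ and $$D$$ at time 100.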



More Complex Scenarios
The examples presented here, even though trivial, form the basis of throughput analysis in BlockSim. The principles remain the same no matter how complex the system.

=Life Cycle Cost Analysis=

A life cycle cost analysis involves the analysis of the costs of a system or a component over its entire life span. Typical costs for a system may include:
 * Acquisition costs (or design and development costs).
 * Operating costs:
 ** Cost of failures.
 ** Cost of repairs.
 ** Cost for spares.
 ** Downtime costs.
 ** Loss of production.
 * Disposal costs.

A complete life cycle cost (LCC) analysis may also include other costs, as well as other accounting/financial elements (such as discount rates, interest rates, depreciation, present value of money, etc.).

For the purpose of this reference, it is sufficient to say that if one has all the required cost values (inputs), then a complete LCC analysis can be performed easily in a spreadsheet, since it really involves summations of costs and perhaps some computations involving interest rates. With respect to the cost inputs for such an analysis, the costs involved are either deterministic (such as acquisition costs, disposal costs, etc.) or probabilistic (such as cost of failures, repairs, spares, downtime, etc.). Most of the probabilistic costs are directly related to the reliability and maintainability characteristics of the system.

Estimating the associated probabilistic costs is the challenging aspect of LCC analysis. In this section, we will use some of the cost inputs available in BlockSim to obtain such costs, using the following example.

Obtaining Costs for an LCC Analysis
Consider the manufacturing line (or system) shown in Figure "Manufacturing line diagram". The block properties are given in Figure "Properties for blocks in manufacturing line", pool properties in Figure "Pool properties for maintenance on blocks in the manufacturing line" and crew properties in Figure "Crew properties for maintenance on blocks in the manufacturing line". All blocks identified with the same letter have the same properties (i.e. $$A=A1=A2$$,  $$B=B1=B2=B3=B4$$  and  $$C=C1=C2=C3=C4$$ ).









This system was analyzed in BlockSim for a period of operation of 8,760 hours, or one year. The simulation settings are shown in figure below.



The system overview is shown in figure below.



Most of the variable costs of interest were obtained directly from BlockSim. Figure below shows the overall system costs.

Our total costs from the summary are $92,197.64. Note that an additional cost was defined in the problem statement that is not included in the summary. This cost, the operating cost per item per hour of operation, can be obtained by looking at the uptime of each block and then multiplying this by the cost per hour. This is given next.

If we also assume a revenue of $100 per unit produced, then the total revenue is our throughput multiplied by the per unit revenue, or $$31,685\cdot \$100=\$3,168,500$$.   The total costs are  $$\$92,197+\$313,813=\$406,010$$.
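
The closing arithmetic can be reproduced as follows. This is a minimal sketch using only the figures quoted above; the variable names are hypothetical, the summary cost is used as rounded in the text, and the final revenue-minus-cost line is an added illustration that the text itself does not report.

```python
summary_costs = 92_197      # from the BlockSim cost summary, as rounded in the text
operating_costs = 313_813   # operating cost derived from block uptimes
units_produced = 31_685     # system throughput over 8,760 hours
revenue_per_unit = 100      # assumed revenue per unit produced

total_costs = summary_costs + operating_costs
total_revenue = units_produced * revenue_per_unit
# Not stated in the text: simple difference, ignoring discounting and other LCC elements.
revenue_minus_costs = total_revenue - total_costs

print(f"Total costs:         ${total_costs:,}")
print(f"Total revenue:       ${total_revenue:,}")
print(f"Revenue minus costs: ${revenue_minus_costs:,}")
```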