Reliability HotWire: eMagazine for the Reliability Professional

Issue 29, July 2003

Reliability Basics
Using Pools and Crews in System Analysis (Part I)

In order to make system analysis more realistic, you may wish to consider additional sources of delay times in the analysis or study the effect of limited resources. As an example, you can utilize a repair distribution to identify how long it takes to restore a component. The factors that you choose to consider in this time may include the time it takes to complete the repair and/or the time it takes to obtain a crew, spare part, etc. While all of these factors may be included in the repair duration, optimized usage of these resources can only be achieved if the resources are studied individually and their dependencies are identified.

In Part I of this article, we will examine the use of crews in system analysis. In Part II, which will be presented in next month's issue of the HotWire, we will cover spare part pools and the use of pools and crews together.

As an example, consider the situation where two components in parallel fail at the same time and only a single repair person is available. Because this person is not able to execute the repair on both components simultaneously, an additional delay will be encountered, which also needs to be included in the modeling. One way to accomplish this is to assign a specific repair crew to each component.

Utilizing Crews

BlockSim 6 allows you to assign one or more maintenance crews to each component via the Block Properties window, as shown in Figure 1. Note that there may be different crews for each action, i.e. corrective, preventive and/or inspection.


Figure 1: Block Properties window

A policy needs to be defined for each named crew, as shown in Figure 2.


Figure 2: Crew Policy window

This policy identifies basic properties for the crew, such as:

  • Logistic delays (i.e. how long it takes for the crew to arrive).
  • The number of simultaneous tasks the crew can perform.
  • The cost per hour for the crew.
  • Any additional cost per incident.
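As a rough illustration, the four policy properties above can be collected in a small record. This is only a sketch: the class name `CrewPolicy`, its field names and the `call_cost` helper are hypothetical and are not part of BlockSim. The sample values (delay of 20, single task, 1 cost unit per time unit, 10 per incident) are the ones used in the example later in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrewPolicy:
    """Hypothetical container for the crew properties listed above."""
    name: str
    logistic_delay: float             # how long it takes for the crew to arrive
    max_tasks: Optional[int] = 1      # None models an unlimited number of simultaneous tasks
    cost_per_hour: float = 0.0        # time-based cost while the crew is utilized
    cost_per_incident: float = 0.0    # additional flat cost per accepted call

    def call_cost(self, repair_time: float) -> float:
        """Cost of one accepted call: the incident charge plus the time-based
        charge over the logistic delay and the repair itself."""
        utilized = self.logistic_delay + repair_time
        return self.cost_per_incident + self.cost_per_hour * utilized


crew_a = CrewPolicy("Crew A", logistic_delay=20, max_tasks=1,
                    cost_per_hour=1.0, cost_per_incident=10.0)
print(crew_a.call_cost(repair_time=10))  # 10 + 1 * (20 + 10) = 40.0
```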

Example

To illustrate the use of crews in BlockSim 6, consider the deterministic scenario described by the following RBD and properties.

RBD

Unit   Failure   Repair   Crew
A      100       10       Crew A: Delay = 20, Single Task
B      120       20       Crew A: Delay = 20, Single Task
C      140       20       Crew A: Delay = 20, Single Task
D      160       10       Crew A: Delay = 20, Single Task


Figure 3: Sequence of events using crews

The System Up/Down plot in Figure 3 illustrates the sequence of events, which are:

  1. At 100, A fails. It takes 20 to get the crew and 10 to repair the unit, thus the component is repaired by 130. The system is failed/down during this time.
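  2. At 150, B fails since it would have accumulated an operating age of 120 by this time (it did not accumulate age during the 30 time units that the system was down). The crew is free, so B incurs only the logistic delay of 20 and the repair of 20, and is repaired by 190.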
  3. At 170, C fails. Upon this failure, C requests the only available crew. However, this crew is currently engaged by B and, since the crew can only perform one task at a time, it cannot respond immediately to the request by C. Thus, C will remain failed until the crew becomes available. The crew will finish with unit B at 190 and will then be dispatched to C. Upon dispatch, the logistic delay will again be considered and C will be repaired by 230. The system continues to operate until the failures of B and C overlap (i.e. the system is down from 170 to 190).
  4. At 210, D fails. The crew is engaged by C until 230, so D waits 20 for the crew to become available. The logistic delay of 20 and the repair of 10 then follow.
  5. D is up at 260.
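The failure times in this sequence (100, 150, 170 and 210) can be checked with a few lines of arithmetic. The sketch below assumes, as step 2 implies, that a block accumulates operating age only while the system is up, so the clock time of a block's first failure is its failure age plus the system downtime that precedes it. The function name `failure_clock_time` is hypothetical, and the downtime intervals are read from the steps above rather than computed from the RBD.

```python
def failure_clock_time(operating_age, system_down_intervals):
    """Clock time at which a block reaches `operating_age`, assuming the block
    does not age while the system is down (an inference from the example)."""
    clock = operating_age
    for start, end in sorted(system_down_intervals):
        if start >= clock:
            break               # this downtime begins at or after the failure
        clock += end - start    # shift the failure by the time spent not operating
    return clock


# System downtime stated in the sequence of events: 100-130 and 170-190.
down = [(100, 130), (170, 190)]
failure_ages = {"A": 100, "B": 120, "C": 140, "D": 160}

clock_times = {unit: failure_clock_time(age, down)
               for unit, age in failure_ages.items()}
print(clock_times)  # {'A': 100, 'B': 150, 'C': 170, 'D': 210}
```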

Figure 4 displays an example of some of the possible crew results (details) and these results are discussed next.


Figure 4: Crew results shown in the BlockSim 6 Simulation Results Explorer

Explanation of Crew Details

  1. Each request made to a crew is logged.

  2. If a request is successful (i.e. the crew is available), the call is logged once in the Calls Received counter and once in the Accepted Calls counter.

  3. If a request is not accepted (i.e. the crew is busy), the call is logged once in the Calls Received counter and once in the Rejected Calls counter. When the crew is free and can be called upon again, the call is logged once in the Calls Received counter and once in the Accepted Calls counter.

  4. In this scenario, there were two instances when the crew was not available (Rejected Calls = 2) and there were four instances when the crew performed an action (Accepted Calls = 4), for a total of six calls (Calls Received = 6).

  5. Percent Accepted and Percent Rejected are the ratios of calls accepted and calls rejected with respect to the total calls received.

  6. Total Utilization is the total time that the crew was utilized. It includes both the time required to complete the repair action and the logistic time. In this case, this is 140, or:

     (20 + 10) + (20 + 20) + (20 + 20) + (20 + 10) = 140

  7. Average Call Duration is the average duration of each crew utilization and it also includes both logistic and repair time. It is the total utilization divided by the number of accepted calls. In this case, this is 140 / 4 = 35.

  8. Total Wait Time is the time that blocks in need of a repair "waited" for this crew. In this case, it is 40 (C and D each waited 20).

  9. Total Crew Costs are the total costs for this crew. This includes the per incident charge as well as the per unit time costs. In this case, this is 180. There were four incidents at 10 each for a total of 40, as well as 140 time units of utilization at 1 cost unit per time unit.

  10. Average Cost per Call is the total cost divided by the number of accepted calls. In this case, this is 180 / 4 = 45.
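The crew details above follow from four accepted calls and can be reproduced with simple arithmetic. The sketch below is only a check of the numbers in this example, not a description of BlockSim's internals. The function `crew_details` and its dictionary keys are hypothetical names, and it assumes that each block that had to wait was rejected exactly once, which holds in this scenario.

```python
def crew_details(accepted_calls, cost_per_time, cost_per_incident):
    """accepted_calls: list of (unit, logistic_delay, repair_time, wait_time).
    Assumes every call with a non-zero wait was rejected exactly once."""
    accepted = len(accepted_calls)
    rejected = sum(1 for _, _, _, wait in accepted_calls if wait > 0)
    received = accepted + rejected
    utilization = sum(delay + repair for _, delay, repair, _ in accepted_calls)
    total_cost = accepted * cost_per_incident + utilization * cost_per_time
    return {
        "calls_received": received,
        "accepted_calls": accepted,
        "rejected_calls": rejected,
        "percent_accepted": 100.0 * accepted / received,
        "percent_rejected": 100.0 * rejected / received,
        "total_utilization": utilization,
        "average_call_duration": utilization / accepted,
        "total_wait_time": sum(wait for _, _, _, wait in accepted_calls),
        "total_crew_costs": total_cost,
        "average_cost_per_call": total_cost / accepted,
    }


# (unit, logistic delay, repair time, time spent waiting for the crew)
calls = [("A", 20, 10, 0), ("B", 20, 20, 0), ("C", 20, 20, 20), ("D", 20, 10, 20)]
details = crew_details(calls, cost_per_time=1, cost_per_incident=10)
for name, value in details.items():
    print(f"{name}: {value}")
```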

Note that crew costs that are attributed to individual blocks can be obtained from the Blocks reports, as shown in Figure 5.


Figure 5: Allocation of crew costs

How BlockSim Handles Crews

  1. Crew logistic time is added to each repair time.

  2. The logistic time is always present and the same, regardless of where the crew was called from (i.e. whether the crew was at another job or idle at the time of the request).
  3. A crew can perform either a finite number of simultaneous tasks or an infinite number.
  4. If the finite limit of tasks is reached, the crew will not respond to any additional request until the number of tasks the crew is performing is less than its finite limit.
  5. If a crew is not available to respond, the component will "wait" until a crew becomes available.
  6. BlockSim maintains the queue of rejected calls and will dispatch the crew to the next repair on a "first come, first served" basis.
  7. Multiple crews can be assigned to a single block (see Looking at Multiple Crews).
  8. If a crew has not been assigned to a block, it is assumed that no crew restrictions exist and a default crew is utilized. The default crew can perform an infinite number of simultaneous tasks and has no delays or costs.
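The queueing rules above can be sketched as a small scheduling function. This is an illustrative approximation under stated assumptions, not BlockSim's actual algorithm: the name `dispatch_times` and its arguments are hypothetical, request times are supplied as fixed inputs, and each call occupies the crew for the logistic time plus the repair time. A `max_tasks` of `None` stands for a crew with an infinite number of simultaneous tasks.

```python
import heapq


def dispatch_times(requests, max_tasks=None):
    """requests: list of (request_time, busy_duration) sorted by request_time,
    where busy_duration = logistic time + repair time.
    Returns (dispatch_time, finish_time) per request, served first come, first served."""
    active = []   # min-heap of finish times for tasks currently in progress
    schedule = []
    for requested_at, duration in requests:
        start = requested_at
        # Drop tasks that were already finished when this request arrived.
        while active and active[0] <= start:
            heapq.heappop(active)
        if max_tasks is not None and len(active) >= max_tasks:
            # Task limit reached: the block waits until the earliest task finishes.
            start = heapq.heappop(active)
        finish = start + duration
        heapq.heappush(active, finish)
        schedule.append((start, finish))
    return schedule


# Requests from the single-crew example: failure time, delay of 20 plus repair time.
requests = [(100, 30), (150, 40), (170, 40), (210, 30)]
single = dispatch_times(requests, max_tasks=1)
unlimited = dispatch_times(requests, max_tasks=None)
print(single)     # [(100, 130), (150, 190), (190, 230), (230, 260)]
print(unlimited)  # [(100, 130), (150, 190), (170, 210), (210, 240)]
```

With a single task, C and D are dispatched at 190 and 230 instead of at their failure times of 170 and 210, which is the 20 units of waiting each that the example reports. With no task limit, no block waits.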

Looking at Multiple Crews

Multiple crews may be available to perform maintenance for a particular component. When multiple crews have been assigned to a block in BlockSim, the crews are assigned to perform maintenance based on their order in the crew list, as shown in Figure 6.


Figure 6: A single component with two corrective maintenance crews assigned

When more than one crew is assigned to a block and the first crew is unavailable, the next crew in the list is called upon, and so forth. As an example, consider the prior case with one modification: Crews A and B are both assigned to all blocks.

RBD

Unit   Failure   Repair   Crews
A      100       10       A, B
B      120       20       A, B
C      140       20       A, B
D      160       10       A, B

Crew A: Delay = 20, Single Task
Crew B: Delay = 30, Single Task

The system would behave as shown in Figure 8.


Figure 8: System up/down plot utilizing two crews

In this case, Crew B was utilized for the C repair since Crew A was busy. Crew A was used for all other repairs. It is very important to note that once a crew has been assigned to a task, it will complete the task. For example, if we were to change the delay time for Crew B to 100, the system behavior would be as shown in Figure 9.


Figure 9: System up/down plot with the delay time for Crew B changed to 100.

In other words, even though Crew A would have finished the repair on C more quickly, Crew B was assigned the task because Crew A was not available at the time that a crew was needed.

Additional Rules on Crews

  1. If all assigned crews are engaged, then the next crew that will be chosen is the crew that can get there first.
    • This looks at how long it would take a particular crew to complete its current task (or all tasks in its queue) and its logistic time.
  2. If a crew is available, it gets utilized regardless of what its logistic delay time is.
    • In other words, if a crew with a shorter logistic time is busy, but almost done, and another crew with a much higher logistic time is currently free, the free crew will be assigned to the task.
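The selection rules can be summarized in a short function. This is a sketch under stated assumptions rather than BlockSim's implementation: `choose_crew` and the dictionary keys `name`, `delay` and `free_at` are hypothetical, and `free_at` stands for the time at which a crew will have completed its current task and any tasks in its queue.

```python
def choose_crew(crews, now):
    """crews: list of dicts in crew-list order with 'name', 'delay', 'free_at'.
    Returns (chosen crew, dispatch time)."""
    # A free crew is used regardless of its logistic delay; list order breaks ties.
    for crew in crews:
        if crew["free_at"] <= now:
            return crew, now
    # All crews engaged: choose the crew that can get there first, i.e. the
    # smallest (time to finish queued work + logistic delay).
    best = min(crews, key=lambda c: c["free_at"] + c["delay"])
    return best, best["free_at"]


# C fails at 170 while Crew A is busy with B until 190 and Crew B is idle.
crews = [{"name": "A", "delay": 20, "free_at": 190},
         {"name": "B", "delay": 100, "free_at": 0}]
crew, dispatched = choose_crew(crews, now=170)
restored = dispatched + crew["delay"] + 20   # repair time of C is 20
print(crew["name"], restored)  # B 290

# If both crews were engaged, the crew with the earlier arrival would be chosen.
busy = [{"name": "A", "delay": 20, "free_at": 190},
        {"name": "B", "delay": 100, "free_at": 180}]
crew2, dispatched2 = choose_crew(busy, now=170)
print(crew2["name"], dispatched2)  # A 190
```

With Crew B's delay of 100, C is restored at 290 even though waiting for Crew A would have restored it at 230, which matches the behavior described for the modified example.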

In Part II of this article, which will be presented in next month's issue of the HotWire, we will take a look at utilizing spare part pools.

ReliaSoft Corporation

Copyright 2003 ReliaSoft Corporation, ALL RIGHTS RESERVED