Estimating the Expected Number of Failures for Items with Minimal or Perfect Repair
The area under the failure rate curve
constructed from the first times-to-failure of a set of components is often used as an
estimate of the number of spare parts needed during a mission. This method of estimating
spares is applicable only under certain conditions. The primary assumptions of the method
are that minimal repair is performed on the components in the population and that the size
of the population remains constant over time. However, in many cases, reliability engineers
erroneously apply the method to estimate the number of failures of non-repairable
components. This article uses an example to clarify when it is appropriate to
estimate the number of failures using the area under the failure rate curve and when other
methods must be employed.
Example
A company has produced a new widget and the management has decided
(without performing a reliability study) that they will provide a
300-hour warranty on the component. The management wants to minimize warranty costs on a population of 20 fielded
widgets. Each new widget costs $10,000 to produce. It is up to the engineer to determine
the best policy to keep the warranty costs to a minimum.
The engineer is given five widgets to test. The widgets are all tested to
failure, and the following times-to-failure are obtained: 75 hours, 123 hours, 164 hours,
170 hours and 197 hours. The engineer enters these failure times into
a Weibull++ Folio and
calculates the parameters using rank regression on X and a 2-parameter Weibull
distribution. The resulting parameters are β = 2.7948 and η = 164.9250 hours, as shown
in Figure 1.

Figure 1: Computation of Weibull Parameters from Component Test Data
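The fit reported in Figure 1 can be approximated outside Weibull++. The following is a minimal Python sketch, not the Weibull++ implementation. It assumes exact median ranks as plotting positions and an ordinary least-squares regression of ln(t) on ln(-ln(1 - F)), which is one common reading of "rank regression on X." The names `median_rank`, `beta_hat` and `eta_hat` are illustrative.

```python
import math

def median_rank(i, n):
    """Exact median rank: the p for which P(at least i of n have failed) = 0.5."""
    def prob_at_least_i(p):
        return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
                   for k in range(i, n + 1))
    lo, hi = 0.0, 1.0
    for _ in range(100):          # bisection; prob_at_least_i increases with p
        mid = (lo + hi) / 2
        if prob_at_least_i(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Times-to-failure (hours) from the five tested widgets.
times = sorted([75.0, 123.0, 164.0, 170.0, 197.0])
n = len(times)

# Linearized Weibull: ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta)
x = [math.log(t) for t in times]
y = [math.log(-math.log(1.0 - median_rank(i, n))) for i in range(1, n + 1)]

x_bar = sum(x) / n
y_bar = sum(y) / n

# Regression on X: fit x = intercept + slope * y (errors minimized in x).
slope = (sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
         / sum((yi - y_bar) ** 2 for yi in y))
intercept = x_bar - slope * y_bar

beta_hat = 1.0 / slope           # shape parameter
eta_hat = math.exp(intercept)    # scale parameter, hours

print(f"beta = {beta_hat:.4f}, eta = {eta_hat:.4f} hours")
```

Under these assumptions the estimates should land close to the β and η reported in the article.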
The engineer’s first thought is to determine
how many widgets will fail during the warranty period. In other words, he wants to determine
the number of components out of the original 20 that will fail during a 300-hour
mission. He recalls that the cumulative distribution function, or
cdf, denoted by F(T),
describes the probability that a component will fail during a mission of duration T.
Therefore, he determines the expected number of first failures using the following
equation:

$$\text{Expected number of first failures} = 20 \cdot F(300)$$

First he uses the Quick Calculation Pad to determine the percent
expected to fail by 300 hours, as shown in Figure 2, and he then multiplies this value by 20 to
find that all 20 original components are expected to fail by 300 hours.

Figure 2: Determining the Percent of Units that Experience at Least One Failure at Time = 300 hours
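The Quick Calculation Pad result can be cross-checked by evaluating the 2-parameter Weibull cdf, F(T) = 1 - exp(-(T/η)^β), directly. This is a small Python sketch using the parameters reported in the article. The function name `weibull_cdf` is illustrative.

```python
import math

BETA = 2.7948          # shape, from the article
ETA = 164.9250         # scale in hours, from the article
FLEET_SIZE = 20
WARRANTY_HOURS = 300.0

def weibull_cdf(t, beta, eta):
    """Probability that a new component fails by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

prob_fail = weibull_cdf(WARRANTY_HOURS, BETA, ETA)
expected_first_failures = FLEET_SIZE * prob_fail

print(f"F(300) = {prob_fail:.4f}")
print(f"Expected first failures = {expected_first_failures:.2f}")
```

The probability comes out above 99%, so essentially all 20 units are expected to experience at least one failure within the warranty period.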
The engineer knows he has two options. He could
repair the widgets upon failure and send them back into the field, or he could replace the widgets
with new ones. There are two extreme situations he wants to consider. One is to perform the
minimum amount of repair to keep the original 20 widgets operating through 300 hours, and the
other is to replace each failed widget with a new one.
The engineer estimates that it would be possible to repair the widget for an average of $4,500
including parts and labor. For his analysis, the engineer assumes that the widget would undergo
minimal repair, which means that the age of the widget when it is put back into service is identical
to its age at failure.
He recalls that the failure rate function, or hazard function, denoted by λ(T), describes
the number of failures per unit time for a component of age T. Integrating the failure rate
function over the warranty period will provide the expected number of (minimal) repairs
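necessary per component in the population. So the expected number of repairs the engineer
will need to perform on each component is given by:

$$\text{Expected number of repairs per component} = \int_{0}^{300} \lambda(T)\, dT$$
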
In order to calculate this value using
Weibull++, the engineer adds a General Spreadsheet to the
project. He then uses the built-in function EFAILURES() to
compute the expected number of failures in the interval from
0.0001 hours to 300 hours, as shown in Figure 3. (Note that Weibull++ does not
allow time equal to zero as a limit for these calculations since failure rate can be
undefined at this time depending on the parameters of the chosen distribution.)

Figure 3: Using the EFAILURES Function in a Weibull++ General Spreadsheet
The engineer
determines that he will need to repair each widget an average of 5.32 times,
as shown in Figure 4.

Figure 4: Results of EFAILURES Function
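The area under the failure rate curve can also be checked independently. For a 2-parameter Weibull distribution the hazard is λ(T) = (β/η)(T/η)^(β-1), and its integral from a to b has the closed form (b/η)^β - (a/η)^β. The Python sketch below compares that closed form with a numerical integration using Simpson's rule. It does not reproduce how EFAILURES() works internally, and the helper names are illustrative.

```python
BETA = 2.7948      # shape, from the article
ETA = 164.9250     # scale in hours, from the article
T_START = 0.0001   # lower limit used in the article
T_END = 300.0      # warranty period in hours

def weibull_hazard(t, beta, eta):
    """Failure rate (hazard) of a 2-parameter Weibull at age t."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number of intervals n."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

numeric = simpson(lambda t: weibull_hazard(t, BETA, ETA), T_START, T_END)
closed_form = (T_END / ETA) ** BETA - (T_START / ETA) ** BETA

print(f"numeric integral = {numeric:.4f}")
print(f"closed form      = {closed_form:.4f}")
```

Both values should agree with the roughly 5.32 repairs per widget reported by the spreadsheet.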
Thus, for his
20 fielded components, he will need to make about 100 repairs (20 × 5.32 ≈ 106), and will need a warranty
budget of roughly $450,000 (about 100 repairs at $4,500 each).
Next, the engineer considers the case where
each failed widget is replaced with a new one. This situation is referred to as
"perfect repair." In order to address this situation, the engineer decides to
simulate his population of 20 components using a reliability block diagram in
BlockSim. He uses a single block to
represent his 20 components, as shown in Figure 5.

Figure 5: Creating a Population of 20 Fielded Components
The engineer imports the parameters of
the component failure distribution from his Weibull++ spreadsheet and specifies
corrective maintenance of zero duration, as shown in Figures 6 and 7.

Figure 6: Importing the Failure Distribution from Weibull++ to BlockSim

Figure 7: Specifying the Corrective Maintenance Distribution for the 20 Fielded Components
The engineer runs
1,000 simulations, each lasting 300 hours. These
settings are shown in Figure 8.

Figure 8: Specifying the Simulation Settings in BlockSim
The predicted
number of failures is computed by summing the Expected NOF column in the Block Summary
section of the simulation results, as shown in Figure 9.

Figure 9: Expected Total Number of Replacements for 20 Fielded Components
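The perfect repair case can be approximated with a simple Monte Carlo simulation of a renewal process, in which every failed unit is instantly replaced by a new one. The Python sketch below is not BlockSim's algorithm. It assumes zero replacement time and independent Weibull lifetimes, and the function name and random seed are arbitrary choices.

```python
import random

BETA = 2.7948      # shape, from the article
ETA = 164.9250     # scale in hours, from the article
FLEET_SIZE = 20
MISSION_HOURS = 300.0
N_RUNS = 1000      # same number of simulations as in the article

def simulate_replacements(n_units, mission, shape, scale, rng):
    """Count replacements for a fleet when each failure gets a brand-new unit."""
    total = 0
    for _ in range(n_units):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape
        clock = rng.weibullvariate(scale, shape)
        while clock <= mission:
            total += 1
            clock += rng.weibullvariate(scale, shape)
    return total

rng = random.Random(2024)
avg_replacements = sum(
    simulate_replacements(FLEET_SIZE, MISSION_HOURS, BETA, ETA, rng)
    for _ in range(N_RUNS)
) / N_RUNS

print(f"Average replacements per 300-hour mission: {avg_replacements:.1f}")
```

The average should fall in the low thirties for the fleet of 20. The exact figure varies with the seed and the number of runs.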
The engineer
finds that, if the organization chooses to replace failed widgets with new ones, 33 new
widgets would be needed to support the
population of 20 fielded components during the warranty period. For
this case, he would need to allocate $330,000 in warranty costs.
Comparing this cost to the estimated $450,000 required to support a
minimal repair policy, the engineer concludes that it is better to
replace the failed widgets with new ones than to repair the
fielded widgets.
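The cost comparison is simple arithmetic, sketched here in Python with the figures quoted in the article. The article rounds the minimal repair case to about 100 repairs. Using the unrounded 5.32 repairs per widget gives a somewhat higher repair budget, and the conclusion is the same.

```python
FLEET_SIZE = 20
REPAIRS_PER_UNIT = 5.32     # expected minimal repairs per widget in 300 hours
REPAIR_COST = 4500          # dollars per repair, parts and labor
REPLACEMENTS = 33           # expected replacements for the fleet (perfect repair)
NEW_UNIT_COST = 10000       # dollars per new widget

repair_total = FLEET_SIZE * REPAIRS_PER_UNIT * REPAIR_COST
replace_total = REPLACEMENTS * NEW_UNIT_COST

print(f"Minimal repair policy: ${repair_total:,.0f}")
print(f"Replacement policy:    ${replace_total:,.0f}")
print("Cheaper policy:", "replace" if replace_total < repair_total else "repair")
```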
Both methods of
estimating the number of failures shown in this example can be
correct, but they depend on different assumptions about the
efficacy of the repair. When repair is assumed to make the
component "as bad as old," it is appropriate to estimate the
number of failures using the area under the failure rate curve.
The method of estimation using BlockSim, on the other
hand, can be made appropriate for any level of repair from "as
good as new," as shown here, to "as bad as old," by adjusting
the restoration factor on the Corrective Maintenance page of the
Block Properties window. Here, a restoration factor of 1
represents perfect repair, whereas a restoration factor of 0
would represent minimal repair. For more information on
restoration factors, please see the Reliability HotWire
article, "Restoration
Factors in BlockSim."
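The effect of a restoration factor between these two extremes can be illustrated with a virtual age simulation. The Python sketch below assumes a Kijima-style model in which each repair removes the fraction RF of the unit's accumulated age. The article does not confirm that BlockSim uses this exact model, and all function names are illustrative. With RF = 0 the sketch reduces to minimal repair and with RF = 1 to perfect repair.

```python
import math
import random

BETA = 2.7948      # shape, from the article
ETA = 164.9250     # scale in hours, from the article
MISSION_HOURS = 300.0

def next_failure_age(virtual_age, rng):
    """Sample the age at next failure, conditional on survival to virtual_age."""
    u = 1.0 - rng.random()   # in (0, 1], avoids log(0)
    return ETA * ((virtual_age / ETA) ** BETA - math.log(u)) ** (1.0 / BETA)

def failures_in_mission(rf, rng):
    """Count failures of one unit in the mission for restoration factor rf."""
    clock = 0.0          # calendar time in hours
    virtual_age = 0.0    # effective age after the latest repair
    count = 0
    while True:
        age_at_failure = next_failure_age(virtual_age, rng)
        clock += age_at_failure - virtual_age
        if clock > MISSION_HOURS:
            return count
        count += 1
        # Assumed repair model: remove the fraction rf of the accumulated age.
        virtual_age = (1.0 - rf) * age_at_failure

def mean_failures(rf, n_histories=20000, seed=7):
    rng = random.Random(seed)
    return sum(failures_in_mission(rf, rng) for _ in range(n_histories)) / n_histories

mean_minimal = mean_failures(0.0)   # "as bad as old"
mean_partial = mean_failures(0.5)   # intermediate repair
mean_perfect = mean_failures(1.0)   # "as good as new"

for label, value in [("RF = 0", mean_minimal), ("RF = 0.5", mean_partial),
                     ("RF = 1", mean_perfect)]:
    print(f"{label}: {value:.2f} failures per unit in 300 hours")
```

Under these assumptions the RF = 0 result should approach the area under the failure rate curve, while the RF = 1 result should approach the per-unit average from the replacement simulation.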