Statistical Tests for Effectiveness of Corrective Actions in RGA 7
During a multiphase reliability growth test, failure modes are discovered and
corrective actions are implemented during or at the end of each phase. Growth
between phases can be tracked by comparing the MTBF or failure intensity of each
phase, but how can we verify that the observed growth (if any) is statistically
significant or, in other words, that the corrective actions have been effective in
eliminating the failure modes? In this article we present a new tool in RGA 7 that
tests the effectiveness of corrective actions implemented during or at the end of
a phase.
Introduction
Suppose a reliability growth test is split into two phases: Phase 1 and Phase 2. The
corrective actions for the discovered failure modes are incorporated during or at
the end of Phase 1. The system is then operated during Phase 2. The general question
is whether or not the corrective actions have been effective. More specifically,
there are two questions that can be addressed regarding the effectiveness of the
corrective actions:
- Is the average failure intensity for Phase 2 statistically less than the
  average failure intensity for Phase 1?
- Is the average failure intensity for Phase 2 statistically less than the
  Crow-AMSAA (NHPP) instantaneous failure intensity at the end of Phase 1?
The answer to each of these questions indicates whether the observed growth is
statistically significant. Next we present the methodology for answering these
questions.
Average Failure Intensities Test
This test compares the average failure intensity during Phase 2 to the average
failure intensity during Phase 1 in order to determine whether the corrective
actions have been effective.
The average failure intensity for Phase 1 is:
\bar{\lambda}_{1} = \frac{N_{1}}{T_{1}}
where T_{1} is the Phase 1 test time and N_{1} is the number of failures during
Phase 1.
Similarly, the average failure intensity for Phase 2 is:
\bar{\lambda}_{2} = \frac{N_{2}}{T_{2}}
where T_{2} is the Phase 2 test time and N_{2} is the number of failures during
Phase 2.
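The overall test time, T, is:
T = T_{1} + T_{2}
The overall number of failures, N, is:
N = N_{1} + N_{2}
We also define the probability of success, P, as:
P = \frac{T_{2}}{T}
that is, the probability that any given failure falls in Phase 2 if the failure
intensity is the same in both phases.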
If the cumulative binomial probability B(k;P,N) of observing up to N_{2} failures
is less than or equal to a specified statistical significance level α, then the
average failure intensity for Phase 2 is statistically less than the average
failure intensity for Phase 1 at the specified significance level.
The cumulative binomial distribution is given by:
B(k;P,N) = \sum_{i=0}^{k} \binom{N}{i} P^{i} (1-P)^{N-i}
which gives the probability that the number of test failures, f, is less than or
equal to the number of allowable failures, k, in N trials (in this case
k = N_{2}) when each trial has a probability of success P.
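The test above can be sketched in a few lines of Python. This is only an illustrative sketch, not the RGA 7 implementation: the function names `binomial_cdf` and `average_intensities_test` are hypothetical, and it assumes P = T_{2}/T, which reproduces the result reported in the article's example (18 failures in 3,000 hours, then 5 failures in 6,000 hours).

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return B(k; p, n): probability of at most k successes in n trials."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def average_intensities_test(n1, t1, n2, t2, alpha=0.10):
    """Compare the Phase 2 average failure intensity with that of Phase 1.

    Assumption: if both phases share one failure intensity, each of the
    N = n1 + n2 failures falls in Phase 2 with probability P = t2 / (t1 + t2).
    Returns the cumulative binomial probability and whether the test passed.
    """
    n_total = n1 + n2
    p = t2 / (t1 + t2)
    p_value = binomial_cdf(n2, n_total, p)
    return p_value, p_value <= alpha


# Data from the article's example.
p_value, passed = average_intensities_test(18, 3000.0, 5, 6000.0, alpha=0.10)
print(f"B = {p_value:.3e}, passed = {passed}")
```

With the example data the probability comes out at roughly 1.31 x 10^{-5}, well below the 10% significance level, so the test passes.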
Average vs. Demonstrated Failure Intensities Test
The purpose of this test is to compare the average failure intensity during Phase 2
with the Crow-AMSAA (NHPP) instantaneous failure intensity (or demonstrated failure
intensity) at the end of Phase 1.
Once again, the average failure intensity for Phase 2 is given by:
\bar{\lambda}_{2} = \frac{N_{2}}{T_{2}}
where T_{2} is the Phase 2 test time and N_{2} is the number of failures during
Phase 2.
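The Crow-AMSAA (NHPP) model estimate of failure intensity at time T_{1} is:
\hat{\lambda}_{D} = \hat{\lambda} \hat{\beta} T_{1}^{\hat{\beta} - 1}
The Crow-AMSAA (NHPP) estimate is approximately distributed as a random variable
with standard deviation \hat{\lambda}_{D} \sqrt{2 / N_{1}} [2]. We therefore treat
\hat{\lambda}_{D} as an approximate Poisson random variable with a number of
failures:
N_{D} = \frac{N_{1}}{2}
The Phase 1 test time is:
T_{D} = \frac{N_{D}}{\hat{\lambda}_{D}}
The overall test time is:
T = T_{D} + T_{2}
The overall number of failures is:
N = N_{D} + N_{2}
We also define P as:
P = \frac{T_{2}}{T}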
If the cumulative binomial probability B(k;P,N) of observing up to N_{2} failures
is less than or equal to a specified statistical significance level α, then the
average failure intensity for Phase 2 is statistically less than the Crow-AMSAA
(NHPP) instantaneous failure intensity at the end of Phase 1, at the specified
significance level.
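The steps of this second test can be sketched in Python as follows. This is a hedged illustration rather than the RGA 7 implementation: the function names are hypothetical, and it assumes the equivalent Phase 1 failure count is N_{1}/2, the equivalent Phase 1 test time is that count divided by the demonstrated failure intensity, and P = T_{2}/T. These assumptions reproduce the article's example result to within rounding. The article does not say how an odd N_{1} is handled, so the sketch simply rejects it.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return B(k; p, n): probability of at most k successes in n trials."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def demonstrated_intensity_test(beta, lam, n1, t1, n2, t2, alpha=0.10):
    """Compare the Phase 2 average intensity with the Crow-AMSAA
    instantaneous (demonstrated) intensity at the end of Phase 1."""
    if n1 % 2 != 0:
        # Assumption of this sketch only: keep the binomial trial count integer.
        raise ValueError("this sketch assumes an even Phase 1 failure count")
    lambda_d = lam * beta * t1 ** (beta - 1.0)  # demonstrated intensity at t1
    n_d = n1 // 2                               # equivalent Poisson failures
    t_d = n_d / lambda_d                        # equivalent Phase 1 test time
    n_total = n_d + n2
    p = t2 / (t_d + t2)
    p_value = binomial_cdf(n2, n_total, p)
    return lambda_d, p_value, p_value <= alpha


# Data and parameters from the article's example.
lambda_d, p_value, passed = demonstrated_intensity_test(
    beta=0.4056, lam=0.5725, n1=18, t1=3000.0, n2=5, t2=6000.0, alpha=0.10
)
print(f"lambda_D = {lambda_d:.6f}, B = {p_value:.4f}, passed = {passed}")
```

With the example data, the probability is close to 0.09 and the test passes at the 10% level. Small differences from the article's 0.090577 can arise from rounding of the published parameters.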
Example
A manufacturer is performing a reliability growth test that is divided into two
phases. The data obtained from the growth test can be seen in Table 1. Phase 1 has
a duration of 3,000 hours and 18 failures were observed during this phase. Phase 2
has a duration of 6,000 hours and 5 failures were observed during this phase. The
Crow-AMSAA (NHPP) parameters were found to be β = 0.4056 and λ = 0.5725.
Table 1: Data obtained from a reliability growth test with two phases
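The average failure intensity for Phase 1 is:
\bar{\lambda}_{1} = \frac{N_{1}}{T_{1}} = \frac{18}{3000} = 0.006 failures per hour
The average failure intensity for Phase 2 is:
\bar{\lambda}_{2} = \frac{N_{2}}{T_{2}} = \frac{5}{6000} \approx 0.000833 failures per hour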
The average failure intensity in Phase 2 is less than the average failure
intensity in Phase 1. However, we want to investigate whether this difference is
statistically significant at the 10% significance level.
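The total test time is:
T = T_{1} + T_{2} = 3000 + 6000 = 9000 hours
The total number of failures is:
N = N_{1} + N_{2} = 18 + 5 = 23
P is calculated as:
P = \frac{T_{2}}{T} = \frac{6000}{9000} \approx 0.6667
Using the above values, the cumulative binomial probability is:
B(5; 0.6667, 23) = \sum_{i=0}^{5} \binom{23}{i} (0.6667)^{i} (0.3333)^{23-i} \approx 1.31 \times 10^{-5}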
Since 1.31 x 10^{-5} is less than 0.1, we can conclude that at the
10% significance level, the average failure intensity for Phase 2 is statistically
less than the average failure intensity for Phase 1.
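As a cross-check, the binomial sum can be written out exactly. Assuming P = T_{2}/T = 6000/9000 = 2/3 (an assumption that is consistent with the reported result), every term shares the factor 3^{-23}:

```latex
\begin{aligned}
B(5;\, 2/3,\, 23)
  &= \sum_{i=0}^{5} \binom{23}{i} \left(\tfrac{2}{3}\right)^{i} \left(\tfrac{1}{3}\right)^{23-i}
   = \frac{1}{3^{23}} \sum_{i=0}^{5} \binom{23}{i}\, 2^{i} \\
  &= \frac{1 + 46 + 1012 + 14168 + 141680 + 1076768}{94143178827}
   = \frac{1233675}{94143178827}
   \approx 1.31 \times 10^{-5}
\end{aligned}
```

The i = 5 term dominates the sum, so the result is driven almost entirely by the probability of seeing exactly 5 of the 23 failures in Phase 2.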
Figure 1 shows the Statistical Test for Corrective Actions window in
RGA 7 with the result of the Average Failure Intensities Test highlighted. As
shown above, the result of the test comparing Phase 1 to Phase 2 at
the 10% significance level is "Passed." This means that the average failure
intensity for the second phase is statistically less than the average failure
intensity for the first phase.
Figure 1: Average Failure Intensities Test
As mentioned above, the effectiveness of the corrective actions can also be
evaluated by comparing the average failure intensity of Phase 2 with the
Crow-AMSAA (NHPP) instantaneous failure intensity at the end of Phase 1.
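The demonstrated failure intensity at the end of Phase 1 is:
\hat{\lambda}_{D} = \hat{\lambda} \hat{\beta} T_{1}^{\hat{\beta} - 1} = 0.5725 \times 0.4056 \times 3000^{0.4056 - 1} \approx 0.00199 failures per hour
The number of failures is:
N_{D} = \frac{N_{1}}{2} = \frac{18}{2} = 9
The Phase 1 test time is:
T_{D} = \frac{N_{D}}{\hat{\lambda}_{D}} \approx \frac{9}{0.00199} \approx 4520 hours
Therefore, the overall test time is:
T = T_{D} + T_{2} \approx 4520 + 6000 = 10520 hours
The overall number of failures is:
N = N_{D} + N_{2} = 9 + 5 = 14
P is calculated as:
P = \frac{T_{2}}{T} \approx \frac{6000}{10520} \approx 0.570
The cumulative binomial probability is:
B(5; P, 14) = \sum_{i=0}^{5} \binom{14}{i} P^{i} (1-P)^{14-i} = 0.090577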
Since 0.090577 is less than 0.1, we can conclude that at the 10% significance
level, the average failure intensity for Phase 2 is statistically less than the
instantaneous failure intensity at the end of Phase 1.
Figure 2 shows the Statistical Test for Corrective Actions window in
RGA 7 with the results of the Average vs. Demonstrated Failure Intensities
Test highlighted. Again, as shown above, the test passed; therefore the
reduction in failure intensity attributed to the corrective actions is
statistically significant at the 10% level.
Figure 2: Average vs. Demonstrated Failure Intensities Test
Some of the numerical values that we calculated above can be generated by clicking
the Report button in the Statistical Test for Corrective Actions window.
Figure 3 shows the generated report.
Figure 3: Report generated for the Statistical Test for Effectiveness of Corrective Actions
Conclusions
In this article, we discussed two methodologies for evaluating the effectiveness of
corrective actions during a multiphase reliability growth
test. As we have seen, by using the cumulative binomial probability we can
compare the failure intensities between two phases and determine whether the
difference between them is statistically significant and, therefore, whether the
corrective actions that were implemented were effective.
References
[1] ReliaSoft Corporation, Reliability Growth & Repairable System Analysis
Reference, Tucson, AZ: ReliaSoft Publishing, 2009.
[2] Crow, L. H., "Confidence Interval Procedures for the Weibull Process with
Applications to Reliability Growth," Technometrics, Vol. 24, No. 1, pp. 67-72, 1982.
