Reliability HotWire Issue 107, January 2010
Reliability Basics

Statistical Tests for Effectiveness of Corrective Actions in RGA 7

During a multi-phase reliability growth test, failure modes are discovered and corrective actions are implemented during or at the end of each phase. Growth between phases can be tracked by comparing MTBF or failure intensity, but how can we verify that the observed growth (if any) is statistically significant or, in other words, that the corrective actions have been effective in eliminating the failure modes? In this article we present a new tool in RGA 7 that evaluates the effectiveness of corrective actions implemented during or at the end of a phase.

Introduction

Suppose a reliability growth test is split into two phases: Phase 1 and Phase 2. The corrective actions for the discovered failure modes are incorporated during or at the end of Phase 1. The system is then operated during Phase 2. The general question is whether or not the corrective actions have been effective. More specifically, two questions can be addressed regarding the effectiveness of the corrective actions:

- Is the average failure intensity for Phase 2 statistically less than the average failure intensity for Phase 1?
- Is the average failure intensity for Phase 2 statistically less than the Crow-AMSAA (NHPP) instantaneous failure intensity at the end of Phase 1?

The answer to each of these questions indicates whether the observed growth is statistically significant. Next we present the methodology for answering these questions.

Average Failure Intensities Test

This test compares the average failure intensity during Phase 2 to the average failure intensity during Phase 1 in order to determine whether the corrective actions have been effective. The average failure intensity for Phase 1 is:

$$\bar{\lambda}_1 = \frac{N_1}{T_1}$$

where $T_1$ is the Phase 1 test time and $N_1$ is the number of failures during Phase 1.
Similarly, the average failure intensity for Phase 2 is:

$$\bar{\lambda}_2 = \frac{N_2}{T_2}$$

where $T_2$ is the Phase 2 test time and $N_2$ is the number of failures during Phase 2. The overall test time, $T$, is:

$$T = T_1 + T_2$$

The overall number of failures, $N$, is:

$$N = N_1 + N_2$$

We also define the probability of success, $P$, as:

$$P = \frac{T_2}{T}$$

If the failure intensity were the same in both phases, each of the $N$ failures would fall in Phase 2 with probability $P$, so the number of Phase 2 failures would follow a binomial distribution with $N$ trials and success probability $P$. If the cumulative binomial probability $B(k; P, N)$ of observing up to $N_2$ failures is less than or equal to a specified statistical significance $\alpha$, then the average failure intensity for Phase 2 is statistically less than the average failure intensity for Phase 1 at the specified significance level. The cumulative binomial distribution is given by:

$$B(k; P, N) = \Pr(f \le k) = \sum_{i=0}^{k} \binom{N}{i} P^{i} (1 - P)^{N - i}$$

which gives the probability that the test failures, $f$, are less than or equal to the number of allowable failures, $k$, in $N$ trials (in this case $k = N_2$) when each trial has a probability $P$ of success.

Average vs. Demonstrated Failure Intensities Test

The purpose of this test is to compare the average failure intensity during Phase 2 with the Crow-AMSAA (NHPP) instantaneous failure intensity (or demonstrated failure intensity) at the end of Phase 1. Once again, the average failure intensity for Phase 2 is given by:

$$\bar{\lambda}_2 = \frac{N_2}{T_2}$$

where $T_2$ is the Phase 2 test time and $N_2$ is the number of failures during Phase 2. The Crow-AMSAA (NHPP) model estimate of failure intensity at time $T_1$ is:

$$\hat{\lambda}_i(T_1) = \lambda \beta T_1^{\beta - 1}$$

The Crow-AMSAA (NHPP) estimate $\hat{\lambda}_i(T_1)$ is approximately distributed as a random variable with standard deviation $\hat{\lambda}_i(T_1)\sqrt{2/N_1}$ [2]. We therefore treat $\hat{\lambda}_i(T_1)$ as an approximate Poisson random variable with a number of failures:

$$N_1^{*} = \frac{N_1}{2}$$

The Phase 1 test time is:

$$T_1^{*} = \frac{N_1^{*}}{\hat{\lambda}_i(T_1)}$$

The overall test time is:

$$T = T_1^{*} + T_2$$

The overall number of failures is:

$$N = N_1^{*} + N_2$$

We also define $P$ as:

$$P = \frac{T_2}{T}$$

If the cumulative binomial probability $B(k; P, N)$ of observing up to $N_2$ failures is less than or equal to a specified statistical significance $\alpha$, then the average failure intensity for Phase 2 is statistically less than the Crow-AMSAA (NHPP) instantaneous failure intensity at the end of Phase 1, at the specified significance level.
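Both tests reduce to evaluating a cumulative binomial probability, so they are straightforward to sketch outside of RGA 7. The Python below is an illustrative sketch only, not the RGA 7 implementation; all function and parameter names are hypothetical. For the second test it assumes that the equivalent Poisson failure count for Phase 1 is half of the observed Phase 1 failures and that the equivalent Phase 1 test time is that count divided by the demonstrated failure intensity. The article does not say how an odd Phase 1 failure count is handled, so the sketch simply rejects that case.

```python
from math import comb


def binomial_cdf(k: int, p: float, n: int) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def average_intensities_test(n1, t1, n2, t2, alpha):
    """Compare the Phase 2 average intensity with the Phase 1 average.

    Returns (cumulative binomial probability, True if the test passes).
    """
    p = t2 / (t1 + t2)  # share of the overall test time spent in Phase 2
    prob = binomial_cdf(n2, p, n1 + n2)
    return prob, prob <= alpha


def crow_amsaa_intensity(lam, beta, t):
    """Crow-AMSAA (NHPP) instantaneous failure intensity at time t."""
    return lam * beta * t ** (beta - 1)


def average_vs_demonstrated_test(n1, lam, beta, t1, n2, t2, alpha):
    """Compare the Phase 2 average intensity with the demonstrated
    (instantaneous) intensity at the end of Phase 1.

    Assumption: Phase 1 is replaced by an equivalent Poisson sample of
    n1 / 2 failures observed over n1 / (2 * demonstrated intensity) hours.
    """
    if n1 % 2 != 0:
        # Assumption of this sketch: a non-integer trial count is not handled.
        raise ValueError("this sketch requires an even Phase 1 failure count")
    n1_eq = n1 // 2                                       # equivalent failures
    t1_eq = n1_eq / crow_amsaa_intensity(lam, beta, t1)   # equivalent time
    p = t2 / (t1_eq + t2)
    prob = binomial_cdf(n2, p, n1_eq + n2)
    return prob, prob <= alpha
```

Each test function takes the failure counts, the test times and the significance level, and returns the cumulative binomial probability together with a pass/fail flag, so the probability can be reported alongside the decision.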
Example

A manufacturer is performing a reliability growth test that is divided into two phases. The data obtained from the growth test can be seen in Table 1. Phase 1 has a duration of 3,000 hours and 18 failures were observed during this phase. Phase 2 has a duration of 6,000 hours and 5 failures were observed during this phase. The Crow-AMSAA (NHPP) parameters were found to be β = 0.4056 and λ = 0.5725.

Table 1: Data obtained from a reliability growth test with two phases

The average failure intensity for Phase 1 is:

$$\bar{\lambda}_1 = \frac{N_1}{T_1} = \frac{18}{3000} = 0.006 \text{ failures per hour}$$

The average failure intensity for Phase 2 is:

$$\bar{\lambda}_2 = \frac{N_2}{T_2} = \frac{5}{6000} \approx 0.000833 \text{ failures per hour}$$

The average failure intensity in Phase 2 is less than the average failure intensity in Phase 1. However, we want to investigate whether this difference is significant at the 10% significance level. The total test time is:

$$T = T_1 + T_2 = 3000 + 6000 = 9000 \text{ hours}$$

The total number of failures is:

$$N = N_1 + N_2 = 18 + 5 = 23$$

P is calculated as:

$$P = \frac{T_2}{T} = \frac{6000}{9000} \approx 0.6667$$

Using the above values, the cumulative binomial probability is:

$$B(5; 0.6667, 23) = \sum_{i=0}^{5} \binom{23}{i} (0.6667)^{i} (1 - 0.6667)^{23 - i} \approx 1.31 \times 10^{-5}$$

Since $1.31 \times 10^{-5}$ is less than 0.1, we can conclude that at the 10% significance level, the average failure intensity for Phase 2 is statistically less than the average failure intensity for Phase 1. Figure 1 shows the Statistical Test for Corrective Actions window in RGA 7 with the result of the Average Failure Intensities Test highlighted. As calculated above, the result of the test comparing Phase 1 to Phase 2 at the 10% significance level is "Passed." This means that the average failure intensity for the second phase is statistically less than the average failure intensity for the first phase.

Figure 1: Average Failure Intensities Test

As mentioned above, the effectiveness of the corrective actions can also be evaluated by comparing the average failure intensity of Phase 2 with the Crow-AMSAA (NHPP) instantaneous failure intensity at the end of Phase 1.
The demonstrated failure intensity at the end of Phase 1 is:

$$\hat{\lambda}_i(T_1) = \lambda \beta T_1^{\beta - 1} = (0.5725)(0.4056)(3000)^{0.4056 - 1} \approx 0.001991 \text{ failures per hour}$$

The number of failures, taken as half of the observed Phase 1 failures for the Poisson approximation, is:

$$N_1^{*} = \frac{N_1}{2} = \frac{18}{2} = 9$$

The Phase 1 test time, expressed as the time equivalent to those failures at the demonstrated failure intensity, is:

$$T_1^{*} = \frac{N_1^{*}}{\hat{\lambda}_i(T_1)} = \frac{9}{0.001991} \approx 4520.3 \text{ hours}$$

Therefore, the overall test time is:

$$T = T_1^{*} + T_2 \approx 4520.3 + 6000 = 10520.3 \text{ hours}$$

The overall number of failures is:

$$N = N_1^{*} + N_2 = 9 + 5 = 14$$

P is calculated as:

$$P = \frac{T_2}{T} = \frac{6000}{10520.3} \approx 0.5703$$

The cumulative binomial probability is:

$$B(5; 0.5703, 14) = \sum_{i=0}^{5} \binom{14}{i} (0.5703)^{i} (1 - 0.5703)^{14 - i} \approx 0.090577$$

Since 0.090577 is less than 0.1, we can conclude that at the 10% significance level, the average failure intensity for Phase 2 is statistically less than the instantaneous failure intensity at the end of Phase 1. Figure 2 shows the Statistical Test for Corrective Actions window in RGA 7 with the results of the Average vs. Demonstrated Failure Intensities Test highlighted. Again, as calculated above, the test passed; the improvement attributed to the corrective actions is therefore statistically significant.

Figure 2: Average vs. Demonstrated Failure Intensities Test

Some of the numerical values calculated above can be generated by clicking the Report button in the Statistical Test for Corrective Actions window. Figure 3 shows the generated report.

Figure 3: Report generated for the Statistical Test for Effectiveness of Corrective Actions

Conclusions

In this article, we discussed two methodologies for evaluating the effectiveness of corrective actions during a multi-phase reliability growth test. As we have seen, the cumulative binomial probability lets us compare the failure intensities of two phases and determine whether the difference between them is statistically significant and, therefore, whether the corrective actions that were implemented were effective.

References

[1] ReliaSoft Corporation, Reliability Growth & Repairable System Analysis Reference, Tucson, AZ: ReliaSoft Publishing, 2009.
[2] Crow, L. H., "Confidence Interval Procedures for the Weibull Process with Applications to Reliability Growth," Technometrics, Vol. 24, No. 1, pp. 67-72, 1982.

Copyright 2010 ReliaSoft Corporation, ALL RIGHTS RESERVED