Build Equivalent Single System During Reliability Growth Testing
[Editor's Note: This article has been updated
since its original publication to reflect a more recent
version of the software interface.]
During reliability growth testing, multiple systems are commonly tested concurrently. The cumulative test hours and failure information from all the systems under test are used for reliability growth modeling. For this analysis, it is critical to track the cumulative test hours accumulated by all the systems in the same configuration (i.e., with the same "fixes" implemented). When a failure occurs during reliability growth testing, a fix is implemented not only on the system that failed but also on the other systems in the same test. Ideally, all the systems would be stopped and the same fix implemented on each before the test resumes. In reality, however, the same fix is rarely implemented simultaneously on every system. Delays are common, and they complicate the calculation of cumulative test hours. In this article, we illustrate how to correctly obtain the cumulative test hours for each failure mode and use them to build an equivalent single system (ESS) for reliability growth modeling and prediction using RGA.
Example
Assume four systems are each tested for 500 hours. Failures (F) and implemented fixes (I) are recorded in the following table. Three failure modes, BC1, BC2 and BC3, are observed. The BC1 failure mode is fixed at different times on different systems. Each BC1 failure is identified in the final column with a unique number for easy reference later in the article. The question posed by engineers is: how can you build an equivalent single system using these data?
System ID   Event   Time to Event (hr)   Failure Mode   Failure ID for BC1
---------   -----   ------------------   ------------   ------------------
System 1    F       100                  BC1            #1
System 1    I       200                  BC1
System 1    F       250                  BC1            #2
System 1    I       300                  BC1
System 1    F       350                  BC1            #3
System 1    F       450                  BC1            #4
System 2    F       280                  BC1            #5
System 2    I       300                  BC1
System 2    F       380                  BC1            #6
System 2    I       400                  BC1
System 2    F       480                  BC1            #7
System 3    F       290                  BC1            #8
System 3    F       390                  BC1            #9
System 3    I       400                  BC1
System 3    F       490                  BC1            #10
System 3    I       500                  BC1
System 4    F       5                    BC2
System 4    F       12                   BC3
System 4    F       270                  BC1            #11
System 4    F       370                  BC1            #12
System 4    F       470                  BC1            #13
System 4    I       500                  BC1
System 4    I       500                  BC2
System 4    I       500                  BC3

To calculate the cumulative event time for the equivalent
single system (ESS), we assume failure modes are independent
of each other. The cumulative test time is calculated for
each failure mode and then combined together to build the
equivalent single system. Figure 1 shows the event times
for BC1.
Figure 1: Event Times for Failure Mode BC1
For each BC1 failure, the corresponding cumulative time for the ESS is calculated below. In these calculations, 1400 (I) is the ESS time of the first BC1 fix, which is derived later in this section.
- For failure #1 at 100 hours on system 1 (S1), the cumulative test time is 100 x 4 = 400. This is because the four systems have the same configuration for the time period up to 100 hours. When the failure occurs at 100 hours, the accumulated test time from all four systems is 400 hours.
- For failure #2 at 250 hours (S1), the failure occurs 50 hours after the fix on S1, and no other system has been fixed yet. So it occurs at 1400 (I) + 50 = 1450 hours on the ESS.
- For failure #3 at 350 hours (S1), its time on the ESS is 1400 (I) + 150 (S1) + 50 (S2) + 0 (S3) + 0 (S4) = 1600.
- For failure #4 at 450 hours (S1), its time on the ESS is 1400 (I) + 250 (S1) + 150 (S2) + 50 (S3) + 0 (S4) = 1850.
- For failure #5 at 280 hours (S2), its time on the ESS is 200 (S1) + 280 x 3 = 1040. This failure occurs before the I event on S2, so it belongs to the original configuration. S1 was fixed at 200 hours and contributes only 200 hours; the other three systems contribute 280 hours each.
- For failure #6 at 380 hours (S2), its time on the ESS is 1400 (I) + 180 (S1) + 80 (S2) + 0 (S3, S4) = 1660.
- For failure #7 at 480 hours (S2), its time on the ESS is 1400 (I) + 280 (S1) + 180 (S2) + 80 (S3) + 0 (S4) = 1940.
- For failure #8 at 290 hours (S3), its time on the ESS is 290 (S3) + 290 (S4) + 290 (S2) + 200 (S1) = 1070.
- For failure #9 at 390 hours (S3), its time on the ESS is 390 (S3) + 390 (S4) + 300 (S2) + 200 (S1) = 1280.
- For failure #10 at 490 hours (S3), its time on the ESS is 1400 (I) + 90 (S3) + 190 (S2) + 290 (S1) = 1970.
- For failure #11 at 270 hours (S4), its time on the ESS is 270 (S4) + 270 (S3) + 270 (S2) + 200 (S1) = 1010.
- For failure #12 at 370 hours (S4), its time on the ESS is 370 (S4) + 370 (S3) + 300 (S2) + 200 (S1) = 1240.
- For failure #13 at 470 hours (S4), its time on the ESS is 470 (S4) + 400 (S3) + 300 (S2) + 200 (S1) = 1370.
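The thirteen hand calculations above follow one pattern, which can be checked with a short script. This is only a sketch for verifying the arithmetic, not the implementation used in RGA. The names `FIRST_FIX`, `BC1_FAILURES` and `ess_failure_time` are hypothetical, and the script assumes that only the first BC1 fix on each system matters.

```python
# First BC1 fix (I event) time on each system, read from the data table.
FIRST_FIX = {"S1": 200, "S2": 300, "S3": 400, "S4": 500}


def ess_failure_time(t, system, first_fix=FIRST_FIX):
    """Map a BC1 failure at local time t on `system` to cumulative ESS time."""
    if t <= first_fix[system]:
        # Failure in the original configuration: add the hours each system
        # has spent in that configuration, capped at its own fix time.
        return sum(min(t, fix) for fix in first_fix.values())
    # Failure in the fixed configuration: start from the ESS fix time and
    # add the hours each system has run since its own fix.
    ess_fix = sum(first_fix.values())
    return ess_fix + sum(max(0, t - fix) for fix in first_fix.values())


# (failure ID, system, local failure time) for the 13 BC1 failures.
BC1_FAILURES = [
    (1, "S1", 100), (2, "S1", 250), (3, "S1", 350), (4, "S1", 450),
    (5, "S2", 280), (6, "S2", 380), (7, "S2", 480),
    (8, "S3", 290), (9, "S3", 390), (10, "S3", 490),
    (11, "S4", 270), (12, "S4", 370), (13, "S4", 470),
]

ess_times = {fid: ess_failure_time(t, s) for fid, s, t in BC1_FAILURES}
for fid, ess_t in ess_times.items():
    print(f"#{fid}: {ess_t}")
```

Running the script reproduces the values listed above, for example 400 for failure #1, 1040 for failure #5 and 1970 for failure #10.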
For each BC1 fix, the corresponding cumulative time for the ESS is calculated below.
- The first fix is implemented at 200 hours on system 1 (S1), at 300 hours on system 2 (S2), at 400 hours on system 3 (S3) and at 500 hours on system 4 (S4). So the cumulative time before the fix is 200 + 300 + 400 + 500 = 1400. This means the test has accumulated 1400 operating hours from the four systems in the same configuration.
- The second fix occurs at 300 hours on system 1, at 400 hours on system 2 and at 500 hours on system 3, but these fixes are not used in the calculation of the ESS. The Crow Extended model does not need to know when this fix was implemented. More information on recurring failures for the same mode after an I event will be covered in a future article.
Since we assume failure modes are independent of each other, the same procedure is used to get the cumulative event times for BC2 and BC3.
- For the BC2 failure at 5 hours (S4), the cumulative time is 5 x 4 = 20.
- For the BC3 failure at 12 hours (S4), the cumulative time is 12 x 4 = 48.
Finally:
- For the BC2 and BC3 I events at 500 hours (S4), the cumulative time is 500 x 4 = 2000.
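To see how the per-mode results combine into one data set, the sketch below starts from the raw event table and assembles the ESS events for all three modes. It rests on several assumptions that do not come from RGA: a system with no I event for a mode is treated as staying in the original configuration until the end of the test (500 hours), only the first I event per system and mode is used, each mode gets one I row at its ESS fix time, and the rows are sorted by cumulative time. The names `EVENTS`, `first_fix_times` and `build_ess` are hypothetical.

```python
TEST_END = 500  # every system is tested for 500 hours
SYSTEMS = ["S1", "S2", "S3", "S4"]

# (system, event, time, mode) rows from the data table; F = failure, I = fix.
EVENTS = [
    ("S1", "F", 100, "BC1"), ("S1", "I", 200, "BC1"), ("S1", "F", 250, "BC1"),
    ("S1", "I", 300, "BC1"), ("S1", "F", 350, "BC1"), ("S1", "F", 450, "BC1"),
    ("S2", "F", 280, "BC1"), ("S2", "I", 300, "BC1"), ("S2", "F", 380, "BC1"),
    ("S2", "I", 400, "BC1"), ("S2", "F", 480, "BC1"),
    ("S3", "F", 290, "BC1"), ("S3", "F", 390, "BC1"), ("S3", "I", 400, "BC1"),
    ("S3", "F", 490, "BC1"), ("S3", "I", 500, "BC1"),
    ("S4", "F", 5, "BC2"), ("S4", "F", 12, "BC3"), ("S4", "F", 270, "BC1"),
    ("S4", "F", 370, "BC1"), ("S4", "F", 470, "BC1"),
    ("S4", "I", 500, "BC1"), ("S4", "I", 500, "BC2"), ("S4", "I", 500, "BC3"),
]


def first_fix_times(mode):
    """First I event per system for `mode`; systems with none run to TEST_END."""
    fix = {s: TEST_END for s in SYSTEMS}
    for system, event, time, m in EVENTS:
        if m == mode and event == "I":
            fix[system] = min(fix[system], time)
    return fix


def build_ess():
    """Return sorted (ESS time, event, mode) rows for every failure mode."""
    rows = []
    for mode in sorted({m for _, _, _, m in EVENTS}):
        fix = first_fix_times(mode)
        ess_fix = sum(fix.values())
        for system, event, time, m in EVENTS:
            if m != mode or event != "F":
                continue
            if time <= fix[system]:
                # Failure before this system received the fix.
                ess_t = sum(min(time, f) for f in fix.values())
            else:
                # Failure after this system received the fix.
                ess_t = ess_fix + sum(max(0, time - f) for f in fix.values())
            rows.append((ess_t, "F", mode))
        rows.append((ess_fix, "I", mode))
    return sorted(rows)


ess = build_ess()
for ess_t, event, mode in ess:
    print(ess_t, event, mode)
```

Under these assumptions the script yields 20 and 48 for the BC2 and BC3 failures, 1400 for the BC1 fix and 2000 for the BC2 and BC3 fixes, in agreement with the manual calculations.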
From the above calculations, it can be seen that the basic rule is to accumulate the test hours of the systems that share the same configuration. Several general rules for BC modes are summarized below:
- Each failure of a BC mode that occurs before any fix (I event) for that mode has been implemented is placed on the equivalent single system by multiplying the failure time on the system by the total number of systems under test.
- The implemented fix (I event) time in the equivalent single system is calculated by adding the test time invested in each system before that I event takes place. It is the total time that the systems have spent in the same configuration in terms of that specific mode.
- After a fix has been implemented in one or more systems (I event) and the same BC mode occurs in another system that has not yet received the fix, the failure time in the equivalent single system is calculated by adding the test time of the failed system until this failure and one of the following for each of the other systems:
  - The test time until the implemented fix (I event), if the I event on that system occurred earlier than this failure.
  - The time of this failure, if the I event on that system occurred later than this failure or that system did not have any I event for that BC mode.
- After a fix for a mode has been implemented in one or more systems (I event) and the same BC mode occurs in a system that has already received the fix, the failure time in the equivalent single system is calculated by adding the test time of each system after its I event was implemented to the equivalent I event time.
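These rules can be condensed into a single expression. The notation is introduced here for illustration only and is not part of the original analysis: n is the number of systems, I_i is the time of the first fix for the mode on system i (taken as the end of the test if system i never receives the fix), and t is the time of a failure of that mode on system j. Assuming, as in the example, that only the first fix on each system matters:

```latex
T_I = \sum_{i=1}^{n} I_i,
\qquad
T_{\mathrm{ESS}}(t) =
\begin{cases}
\displaystyle\sum_{i=1}^{n} \min(t,\, I_i), & t \le I_j \\[2ex]
\displaystyle T_I + \sum_{i=1}^{n} \max(0,\, t - I_i), & t > I_j
\end{cases}
```

For failure #5 (t = 280 on S2, with first fixes at 200, 300, 400 and 500 hours), the first branch gives 200 + 280 + 280 + 280 = 1040. For failure #3 (t = 350 on S1), the second branch gives 1400 + 150 + 50 + 0 + 0 = 1600. Both match the manual results.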
It can be seen that the calculation for building the
ESS is tedious when the number of systems and the number of failures
are large. Luckily, RGA has implemented all the
calculations using the Multiple Systems with Event Code
data type. This new data type and the ability to transfer
the data to the ESS data type will be available in
version 7.5.1 or higher, which will be released by
ReliaSoft in June 2010. Licensed users can obtain the
latest service release from
http://RGA.ReliaSoft.com/updates.htm.
The data in the above table can be entered into RGA
as shown in Figure 2.
Figure 2: Multiple Systems with Event Code
RGA also can graphically display the failure times of
each system and the corresponding ESS. The plot is called
the System Operation plot and is given in Figure 3.
Figure 3: System Operation Plot
The values of the points on the "Equivalent" line in
the plot are the
same as the values we have calculated manually. Click the Transfer to
New Data Type icon and select Equivalent Single
System as shown in
Figure 4.
Figure 4: Transfer Data Type Window
The transferred data appears in a separate worksheet,
as shown in Figure 5.
Figure 5: Transferred Data
The values in Figure 5 are indeed the same as the values we calculated manually. Note that all the BC modes were renamed to BD modes. This is because all the fixes are delayed fixes (not implemented right at the failure time). For more detail on the definition of failure mode classification, please refer to [1].
Using the newly obtained ESS, we can build the model and make predictions. All the calculations in the Multiple Systems with Event Code data type are based on the ESS.
Conclusion
In this article, we illustrated how to correctly calculate the cumulative test hours when multiple systems are under the same reliability growth test. For this analysis, it is critical to get the number of test hours for the systems with the same configuration. We used an example to demonstrate the step-by-step calculation of the event times for the ESS in RGA. Once the ESS has been obtained, we can use it to build a reliability growth model and calculate reliability metrics such as the demonstrated MTBF, demonstrated failure intensity and growth rate.
References
[1] ReliaSoft Corporation. "Crow Extended Model."
ReliaSoft Corporation. 2010.
http://reliawiki.org/index.php/Crow_Extended
