Reliability HotWire: eMagazine for the Reliability Professional

Issue 37, March 2004

Hot Topics
Software Reliability Growth

[Editor's Note: This article has been updated since its original publication to reflect a more recent version of the software interface.]

When considering reliability growth, some sort of hardware is typically being analyzed. But the same theory and analysis procedures can also be applied to the analysis of software. The faults (bugs) that are found during each day's testing of the software can be recorded and then analyzed, just as would be done for hardware. This article will explore how software reliability growth can be analyzed using RGA.

Software for a particular application is under development. The customer's reliability requirement is that, at most, 1 fault occurs per 8 hours of continuous operation.

Testing begins when the software reaches the "Beta" phase. Three employees are assigned to perform continuous testing during business hours, which results in 3 x 8 = 24 hours of software testing per day. The software faults are reported and captured in a FRACAS (failure reporting, analysis and corrective action system) using a data entry interface similar to the one shown next.

FRACAS system

A new compile of the software is available for testing every week, and the design engineers implement fixes within that week, with the exception of the last two weeks of testing.

Assume that the following data set was extracted from the FRACAS system:

Number of Faults   Days of Testing (cumulative)
45                 5
37                 10
19                 15
16                 20
25                 23
16                 26
10                 28

Data Entry

The data set is grouped by the number of days until a new compile of the software is available. Using a data sheet configured for grouped failure times in RGA, the data were analyzed with the Crow-AMSAA model. The data entered into RGA are shown next.

RGA Data Entry

Analysis and Discussion

The failure intensity goal for this software is 1 fault per 8 hours of operation, or 1/8 = 0.125 faults per hour. With 24 hours of testing per day, this corresponds to a goal of 0.125 * 24 = 3 faults per day. The achieved failure intensity can be estimated using the Quick Calculation Pad (QCP), as shown next.


Currently, the achieved failure intensity is 4.625 faults per day. Therefore, the question is: "If we continue testing with the same growth rate, when will we achieve the goal of 3 faults per day?"
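The QCP result can also be cross-checked by hand. The sketch below (illustrative Python, not RGA itself) fits the Crow-AMSAA model to the grouped data by solving the grouped-data maximum likelihood equation for beta with simple bisection, then evaluates the instantaneous failure intensity rho(T) = lambda * beta * T^(beta - 1) at T = 28 days.

```python
import math

# Grouped fault data from the FRACAS extract:
# faults[i] faults observed in the interval (days[i-1], days[i]] of testing.
faults = [45, 37, 19, 16, 25, 16, 10]
days = [5, 10, 15, 20, 23, 26, 28]
N, T_end = sum(faults), days[-1]

def mle_residual(beta):
    """Crow-AMSAA grouped-data MLE equation for beta; zero at the estimate."""
    lhs, t_prev = 0.0, 0.0
    for n, t in zip(faults, days):
        hi = t**beta * math.log(t)
        lo = t_prev**beta * math.log(t_prev) if t_prev > 0 else 0.0
        lhs += n * (hi - lo) / (t**beta - t_prev**beta)
        t_prev = t
    return lhs - N * math.log(T_end)

# The residual decreases in beta, so a simple bisection finds the root.
lo_b, hi_b = 0.1, 2.0
for _ in range(60):
    mid = 0.5 * (lo_b + hi_b)
    if mle_residual(mid) > 0:
        lo_b = mid
    else:
        hi_b = mid
beta = 0.5 * (lo_b + hi_b)
lam = N / T_end**beta                    # lambda MLE given beta
rho = lam * beta * T_end**(beta - 1)     # instantaneous failure intensity
print(f"beta = {beta:.4f}, intensity at {T_end} days = {rho:.3f} faults/day")
```

With this data set, beta comes out to about 0.77 and the intensity to roughly 4.6 faults per day, in line with the QCP result.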

The Time calculation option is used to answer this question, as shown next.


Therefore, 185 - 28 = 157 additional days of testing and development are required (test-analyze-and-fix) to achieve the failure intensity goal. This is shown graphically in the following failure intensity plot.
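This time estimate follows directly from the fitted model. Because the MLE of lambda is N/T^beta, the instantaneous intensity at the end of test reduces to rho(T) = N * beta / T, so the reported 4.625 faults/day at T = 28 with N = 168 total faults implies beta = 4.625 * 28 / 168, or about 0.771. Inverting rho(T) = lambda * beta * T^(beta - 1) then gives the test time at which the goal is reached. A minimal sketch of that arithmetic (illustrative, not RGA's exact numerics):

```python
# Back out the Crow-AMSAA parameters from the reported end-of-test intensity,
# then invert the intensity function to find when the goal is met.
N, T = 168, 28.0             # total faults and cumulative test days
rho_now, rho_goal = 4.625, 3.0

beta = rho_now * T / N       # rho(T) = N*beta/T at the MLE, so beta ~ 0.771
lam = N / T**beta            # lambda MLE given beta
T_goal = (rho_goal / (lam * beta)) ** (1.0 / (beta - 1.0))
extra = T_goal - T
print(f"goal reached at ~{T_goal:.0f} days, i.e. ~{extra:.0f} additional days")
```

This reproduces the approximately 185 total days (157 additional days) given by the Time calculation.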

Instantaneous Failure Intensity plot

From this plot, it can be seen that there is a jump in the failure intensity between 20 and 23 days. This jump is the reason that so much additional development time is estimated to be required. If the data set is analyzed for up to 20 days of testing, we get:

Instantaneous Failure Intensity plot


In this case, it is estimated that it will take 11 more days of development to reach the failure intensity goal. So the question is: "What happened when the failure intensity jumped on the 23rd day of testing and development?"

It turns out that new functionality was implemented at the request of a customer, which caused a major redesign of some general modules of the software. This type of jump is typical in both software and hardware development when new features are introduced.

Due to these significant changes, it was decided to reset the clock and track the reliability growth from the 20th day forward. In other words, the origin of the test was set at 20 days and the data thereafter were considered as follows:

Number of Faults   Days of Testing (cumulative)
25                 3
16                 6
10                 8

This data set was then re-analyzed with the following results:

Instantaneous Failure Intensity plot


Therefore, when considering this data set, 33 more days of developmental testing are required.
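The reset analysis can be cross-checked the same way: fit the Crow-AMSAA model to the three post-reset intervals via the standard grouped-data MLE, then invert the intensity function for the 3-faults/day goal. The following Python sketch is illustrative only and assumes the same grouped-data likelihood that RGA uses for this model.

```python
import math

# Post-reset grouped data: faults[i] in (days[i-1], days[i]] after the new origin.
faults = [25, 16, 10]
days = [3, 6, 8]
N, T_end = sum(faults), days[-1]

def mle_residual(beta):
    """Crow-AMSAA grouped-data MLE equation for beta; zero at the estimate."""
    lhs, t_prev = 0.0, 0.0
    for n, t in zip(faults, days):
        hi = t**beta * math.log(t)
        lo = t_prev**beta * math.log(t_prev) if t_prev > 0 else 0.0
        lhs += n * (hi - lo) / (t**beta - t_prev**beta)
        t_prev = t
    return lhs - N * math.log(T_end)

lo_b, hi_b = 0.1, 2.0
for _ in range(60):                  # bisection on the decreasing residual
    mid = 0.5 * (lo_b + hi_b)
    if mle_residual(mid) > 0:
        lo_b = mid
    else:
        hi_b = mid
beta = 0.5 * (lo_b + hi_b)
lam = N / T_end**beta
T_goal = (3.0 / (lam * beta)) ** (1.0 / (beta - 1.0))
print(f"beta = {beta:.3f}; goal met after ~{T_goal - T_end:.0f} more days")
```

With only three intervals the estimate is rough, but it lands near the 33 additional days quoted above.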

Of course, it is too early to make firm predictions based on just 8 days of testing, but this result can be used to get a general idea of the remaining development time and to come up with a new testing plan. In this case, it was decided that 3 more employees would be added to testing and, if possible, that a new compile would be created every 2 days. This yielded a much more aggressive testing and development plan, with the objective of completing the project within one month.

ReliaSoft Corporation

Copyright 2004-2014 ReliaSoft Corporation, ALL RIGHTS RESERVED