When considering reliability growth, the system under analysis is typically hardware. But the same theory and analysis procedures can also be applied to software. The faults (bugs) found during each day's testing of the software can be recorded and then analyzed, just as would be done for hardware. This article explores how software reliability growth can be analyzed using RGA 6.
Software for a particular application is under development. The customer reliability requirement is at most 1 fault per 8 hours of continuous operation.
Testing begins when the software reaches the "Beta" phase. Three employees are assigned to perform continuous testing during business hours, which results in 24 hours of software testing per day. The software faults are reported and captured in a FRACAS (failure reporting, analysis and corrective action system) using a data entry interface similar to the one shown next.
A new compile of the software becomes available for testing every week, and design engineers implement fixes within a week, with the exception of the last two weeks of testing.
Assume that the following data set was extracted from the FRACAS system:
The data set is grouped by the number of days until a new compile of the software is available. Using the Data Entry Spreadsheet in RGA 6 for grouped data, the Crow-AMSAA model was used for the analysis. The data entered into the Data Entry Spreadsheet is shown next.
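The grouped-data Crow-AMSAA fit can be sketched in Python. The article's actual FRACAS counts appear only in the screenshots, so the group boundaries and fault counts below are hypothetical placeholders; the likelihood equation and bisection solver follow the standard grouped-data formulation of the model.

```python
import math

# Grouped-data Crow-AMSAA MLE sketch. The article's FRACAS data is shown
# only in a screenshot, so these boundaries/counts are hypothetical.
boundaries = [7.0, 14.0, 21.0, 28.0]  # cumulative test time (days) per group
counts     = [45, 32, 38, 30]         # faults observed in each interval

def score(beta):
    """Grouped-data likelihood equation; its root is the MLE beta-hat."""
    total, t_end, s, prev = sum(counts), boundaries[-1], 0.0, 0.0
    for t, n in zip(boundaries, counts):
        num = t**beta * math.log(t) - (prev**beta * math.log(prev) if prev > 0 else 0.0)
        s += n * num / (t**beta - prev**beta)
        prev = t
    return s - total * math.log(t_end)

# Plain bisection for beta-hat; the bracket [0.01, 5] is assumed wide enough.
lo, hi = 0.01, 5.0
for _ in range(100):
    mid = (lo + hi) / 2.0
    if score(lo) * score(mid) <= 0:
        hi = mid
    else:
        lo = mid
beta_hat = (lo + hi) / 2.0
lambda_hat = sum(counts) / boundaries[-1] ** beta_hat  # lambda-hat = N / T^beta
print(beta_hat, lambda_hat)
```

With counts that taper off over equal intervals, the fitted beta-hat falls below 1, indicating reliability growth.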
Analysis and Discussion
The failure rate goal for this software was 1 fault per 8 hours of operation, or 1/8 = 0.125 faults per hour. Over one day of testing (24 hours), the failure intensity goal is therefore 0.125 * 24 = 3 faults per day. The achieved failure intensity can be estimated using the Quick Calculation Pad (QCP), as shown next.
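The goal conversion is simple arithmetic; the sketch below also shows the Crow-AMSAA instantaneous failure intensity that the QCP reports. The `lam` and `beta` arguments are left as hypothetical placeholders, since the article's fitted values live in the screenshots.

```python
# Goal conversion: at most 1 fault per 8 hours, with 24 test-hours per
# calendar day (three testers working 8-hour shifts).
goal_per_hour = 1 / 8              # 0.125 faults per hour
goal_per_day = goal_per_hour * 24  # 3 faults per day

# Crow-AMSAA instantaneous failure intensity rho(T) = lam * beta * T**(beta - 1).
# lam and beta are hypothetical placeholders, not the article's fitted values.
def intensity(T, lam, beta):
    return lam * beta * T ** (beta - 1)

print(goal_per_day)  # -> 3.0
```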
Currently, the achieved failure intensity is 4.625 faults per day. Therefore, the question is: "If we continue testing with the same growth rate, when will we achieve the goal of 3 faults per day?"
The Time/Stage QCP calculation option is used to answer this question, as shown next.
Therefore, 185 - 28 = 157 additional days of testing and development (test-analyze-and-fix) are required to achieve the failure intensity goal. This is shown graphically in the following failure intensity plot.
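Under the Crow-AMSAA model, the Time/Stage calculation amounts to inverting the instantaneous intensity rho(T) = lam * beta * T^(beta - 1) for T. A minimal sketch, with `lam` and `beta` set to illustrative values roughly consistent with the article's numbers (the true estimates are only in the screenshots):

```python
# Solve lam * beta * T**(beta - 1) == goal for T (valid when beta != 1).
# lam and beta are illustrative placeholders, not the article's estimates.
lam, beta = 12.9, 0.77
goal = 3.0       # target failure intensity, faults per day
t_now = 28       # days of testing completed so far

t_goal = (goal / (lam * beta)) ** (1.0 / (beta - 1.0))
extra_days = t_goal - t_now
print(t_goal, extra_days)
```

Because beta < 1 (reliability growth), the intensity decreases with test time and the solved t_goal lands beyond the current 28 days.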
From this plot, it can be seen that there is a jump in the failure intensity between 20 and 23 days. This jump is the reason the estimated remaining development time is so long. If only the first 20 days of test data are analyzed, we get:
In this case, it is estimated that it will take 11 more days of development to reach the failure intensity goal. So the question is: "What happened when the failure intensity jumped on the 23rd day of testing and development?"
It turns out that new functionality was implemented at the request of a customer, which caused major redesign of some general modules of the software. This type of jump is typical in both software and hardware development when new features are introduced.
Due to these significant changes, it was decided to reset the clock and track the reliability growth from the 20th day forward. In other words, the origin of the test was set at 20 days and the data thereafter were considered as follows:
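Re-setting the origin simply shifts the cumulative test times so that day 20 becomes time zero. A sketch, using hypothetical day values since the article's regrouped data appears only in a screenshot:

```python
# Shift the test origin to day 20: every cumulative time after the redesign
# is measured from that point. The day values here are hypothetical.
origin = 20
cum_days = [23, 25, 28]                 # cumulative test days after the jump
shifted = [t - origin for t in cum_days]
print(shifted)  # -> [3, 5, 8]
```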
This data set was then re-analyzed with the following results:
Therefore, when considering this data set, 33 more days of developmental testing are required.
Of course, it is too early to make any predictions based on just 8 days of testing, but this result can be used to get a general idea of the remaining development time required and to come up with a new testing plan. In this case, it was decided that 3 more employees needed to be added to testing and, if possible, that a new compile needed to be created every 2 days. This yielded a much more aggressive testing and development plan with the objective of completing the project within one month.
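A back-of-the-envelope check on the new plan: doubling the test staff doubles the test-hours logged per calendar day, so the roughly 33 remaining test days would accumulate in about half the calendar time. This simple scaling ignores the faster compile cadence and is only an illustration:

```python
# Rough calendar-time estimate under the new plan. Assumes test time scales
# linearly with headcount; ignores the effect of the 2-day compile cycle.
remaining_test_days = 33    # from the re-analysis (24 test-hours each)
staffing_factor = 6 / 3     # 3 testers -> 6 testers
calendar_days = remaining_test_days / staffing_factor
print(calendar_days)  # -> 16.5
```

This crude estimate is consistent with the stated objective of completing the project within one month.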
Copyright 2004 ReliaSoft Corporation, ALL RIGHTS RESERVED