Reliability HotWire

Issue 41, July 2004

Reliability Basics

Fielded Systems in Reliability Growth (Part II)

In last month's issue of HotWire, we presented repairable system analysis for reliability growth in the article Fielded Systems in Reliability Growth (Part I). Fleet analysis will be presented in Part II of this article. Fleet analysis is similar to repairable system data analysis. However, the Power Law model is applied to the fleet failures rather than the system failures. In other words, in repairable system data analysis, the number of system failures versus system time is modeled. In fleet analysis, the number of fleet failures versus fleet time is modeled. Therefore, in both cases, the model is the same but the data are treated differently.

In last month's article, the system failures in the example were reported in general terms (usually hours). However, the field data that are actually received may not always be in this form. For instance, the field data may be returned in miles. Therefore, it is important to be aware of the units in which the data are reported, as well as the units in which you would like to conduct the analysis. It may be necessary to convert from, say, miles to hours or vice versa.

The failure intensity of a fielded system may change over time (e.g., it increases if the system wears out). However, if a fleet of systems is considered and the number of fleet failures versus fleet time is modeled, the failures might appear random, that is, they may occur at a roughly constant rate. This is because a fleet contains a mixture of new and old systems, and when the failures of this mixture are viewed from a cumulative fleet time point of view, they may be random. Figures 1 and 2 illustrate this concept. Figure 1 shows the number of failures over system age. It can clearly be seen that the failure intensity increases as the systems age (wearout). The superposition system line, which brings the failures from the different systems onto a single timeline, shows the same trend. On the other hand, if the failures of the same four systems are combined from a fleet perspective, where fleet failures are observed over cumulative fleet hours, then the failures seem to be random. Figure 2 illustrates this on the cumulative timeline.

Figure 1: Repairable systems analysis System Operation plot

Figure 2: Fleet systems analysis System Operation plot

Methodology

Figures 1 and 2 illustrate that the difference between repairable system data analysis and fleet analysis lies in the way the data set is treated. In fleet analysis, the time-to-failure data from each system are "stacked" onto a single cumulative timeline. For example, consider the two systems in Table 1.

Table 1: Failure data for two systems

System    Times-to-Failure (hrs)    System End Time (hrs)
1         3, 7                      10
2         4, 9, 13                  15

The data are first converted to a cumulative timeline, as follows:

• System 1 is considered first, so its failure times of 3 and 7 hours are the first two points on the cumulative timeline.
• System 1's end time is 10 hours and System 2's first failure is at 4 hours. This failure time is added to System 1's end time to give a cumulative failure time of 10 + 4 = 14 hours.
• The second failure for System 2 occurred 5 hours after the first failure. This time interval is added to the cumulative timeline to give 19 hours.
• The third failure for System 2 occurred 4 hours after the second failure. Therefore, the cumulative failure time is 19 + 4 = 23 hours.
• System 2's end time is 15 hours, or 2 hours after the last failure. Therefore, the total cumulative operating time for the fleet is 23 + 2 = 25 hours.
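• In general, the cumulative operating time, $Y_j$, of the jth fleet failure is calculated by:

$$Y_j = X_{i,q} + \sum_{r=1}^{q-1} T_r, \qquad j = 1, 2, \ldots, N$$

where:

• $X_{i,q}$ is the time of the ith failure of the qth system, with $q = 1, 2, \ldots, K$
• $T_q$ is the end time of the qth system
• K is the total number of systems
• N is the total number of failures for all of the systems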
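The stacking procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `fleet_timeline` and the representation of each system as a (failure times, end time) pair are assumptions, not part of the original analysis.

```python
def fleet_timeline(systems):
    """Stack per-system failure times onto one cumulative fleet timeline.

    `systems` is a list of (failure_times, end_time) pairs, given in the
    order in which the systems are to be stacked (hypothetical layout).
    Returns (cumulative_failure_times, total_fleet_time).
    """
    cumulative = []
    offset = 0  # sum of the end times of all systems stacked so far
    for failure_times, end_time in systems:
        # Each failure is shifted by the end times of the earlier systems.
        cumulative.extend(offset + x for x in sorted(failure_times))
        offset += end_time
    return cumulative, offset


# Table 1 data: (times-to-failure, end time), in hours.
system_1 = ([3, 7], 10)
system_2 = ([4, 9, 13], 15)

print(fleet_timeline([system_1, system_2]))  # ([3, 7, 14, 19, 23], 25)
print(fleet_timeline([system_2, system_1]))  # ([4, 9, 13, 18, 22], 25)
```

The second call stacks System 2 first. The total fleet time is unchanged, but the individual cumulative failure times differ.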

As this example demonstrates, the cumulative timeline depends on the order in which the systems are considered. If the data in Table 1 are stacked with System 2 first, the cumulative failure times become 4, 9, 13, 18 and 22 hours, with the same end time of 25 hours. The order of the systems therefore affects the individual points on the timeline. However, in the next step of the analysis, the data from the cumulative timeline will be grouped into time intervals, which effectively eliminates the importance of the order of the systems.

Keep in mind that this will NOT always be true. It holds only when the order of the systems was random to begin with. If there is some logic or pattern in the order of the systems, the pattern will remain even after the cumulative timeline is converted to grouped data. For example, consider a system that wears out with age, so that its failures occur more frequently as it gets older. A fleet of such systems will contain both new and old systems. If the collected data are ordered from the newest to the oldest system, then even after grouping, the pattern of fewer failures in the early intervals and more failures in the later intervals will still be present. If the objective of the analysis is to determine the difference between newer and older systems, then that ordering is acceptable. However, if the objective is to determine the reliability of the fleet, then the systems should be randomly ordered.

Data Analysis

Once the cumulative timeline has been generated, it is then converted into grouped data. For this, a group interval is required. The group interval length is specified by the analyst and it is chosen so that it is representative of the data. The analyst should verify that the recommended interval is valid from a practical perspective and alter it if necessary. Also note that the intervals do not have to be of equal length.

For the system data in Table 1, the data could be grouped into 5-hour intervals. This interval length is sufficiently large to ensure that there are failures within each interval. The grouped data set is given in Table 2.

Table 2: Grouped data for the systems in Table 1

Failures in Interval    Interval End Time (hrs)
1                       5
1                       10
1                       15
1                       20
1                       25
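The grouping step can be sketched as follows. The function name `group_failures` is hypothetical, and the convention that a failure falling exactly on an interval boundary is counted in the interval ending at that boundary is an assumption made for illustration.

```python
from bisect import bisect_left


def group_failures(cumulative_times, interval_ends):
    """Count the failures that fall in each interval (previous end, this end].

    `interval_ends` must be sorted in increasing order. The intervals do
    not have to be of equal length.
    """
    counts = [0] * len(interval_ends)
    for t in cumulative_times:
        idx = bisect_left(interval_ends, t)  # first interval end >= t
        if idx == len(interval_ends):
            raise ValueError(f"failure at {t} lies beyond the last interval")
        counts[idx] += 1
    return counts


# Cumulative timeline for Table 1 (System 1 stacked first), 5-hour intervals.
timeline = [3, 7, 14, 19, 23]
print(group_failures(timeline, [5, 10, 15, 20, 25]))  # [1, 1, 1, 1, 1]
```

With the other stacking order (4, 9, 13, 18, 22), the same intervals again contain one failure each, which shows how grouping can absorb the effect of system order in this small example.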

The Crow-AMSAA model for grouped data is used for the data in Table 2 and the parameters of the model are solved by satisfying the following ML equations:
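In the standard formulation of the Crow-AMSAA (power law) model for grouped data, these equations are commonly written in the form below. The notation is assumed here for illustration and is not taken from the article: $k$ is the number of intervals, $T_i$ is the end time of the ith interval (with $T_0 = 0$ and $T_0^{\hat{\beta}} \ln T_0$ taken as 0), $n_i$ is the number of failures in the ith interval and $n = \sum_{i=1}^{k} n_i$.

$$\sum_{i=1}^{k} n_i \left[ \frac{T_i^{\hat{\beta}} \ln T_i - T_{i-1}^{\hat{\beta}} \ln T_{i-1}}{T_i^{\hat{\beta}} - T_{i-1}^{\hat{\beta}}} - \ln T_k \right] = 0$$

$$\hat{\lambda} = \frac{n}{T_k^{\hat{\beta}}}$$

The first equation has no closed-form solution and is solved numerically for $\hat{\beta}$. The second equation then gives $\hat{\lambda}$ directly.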

Example

Table 3 presents data for a fleet of 27 systems. A cycle is a complete history from overhaul to overhaul, and the failure history for the last completed cycle of each system is recorded. This is a random sample of data from the fleet, and the systems are listed in the order in which they were selected. Suppose the data are to be grouped into intervals ending at 10,000, 20,000, 30,000 and 40,000, with the final interval ending at the termination time. Conduct the fleet analysis.

Table 3: Fleet data for 27 systems

System    Cycle Time (Tj)    Number of Failures (Nj)    Failure Time (Xij)
1         1396               1                          1396
2         4497               1                          4497
3         525                1                          525
4         1232               1                          1232
5         227                1                          227
6         135                1                          135
7         19                 1                          19
8         812                1                          812
9         2024               1                          2024
10        943                2                          316, 943
11        60                 1                          60
12        4234               2                          4233, 4234
13        2527               2                          1877, 2527
14        2105               2                          2074, 2105
15        5079               1                          5079
16        577                2                          546, 577
17        4085               2                          453, 4085
18        1023               1                          1023
19        161                1                          161
20        4767               2                          36, 4767
21        6228               3                          3795, 4375, 6228
22        68                 1                          68
23        1830               1                          1830
24        1241               1                          241
25        2573               2                          871, 2573
26        3556               1                          3556
27        186                1                          186
Total     52110              37

The system data in Table 3 can be grouped into intervals ending at 10,000, 20,000, 30,000, 40,000 and 52,110 (the total fleet time). The grouped data are given in Table 4.

Table 4: Grouped fleet data

Interval End Time    Cumulative Observed Failures
10,000               8
20,000               16
30,000               22
40,000               27
52,110               37
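As a check on the grouping, the short Python sketch below stacks the Table 3 systems in their listed order and counts the failures up to each interval end. The variable names and the inline stacking loop are illustrative assumptions, and the Table 3 values are used exactly as listed.

```python
from bisect import bisect_left
from itertools import accumulate

# Table 3, in listed order: (cycle time Tj, failure times Xij).
fleet = [
    (1396, [1396]), (4497, [4497]), (525, [525]), (1232, [1232]),
    (227, [227]), (135, [135]), (19, [19]), (812, [812]), (2024, [2024]),
    (943, [316, 943]), (60, [60]), (4234, [4233, 4234]),
    (2527, [1877, 2527]), (2105, [2074, 2105]), (5079, [5079]),
    (577, [546, 577]), (4085, [453, 4085]), (1023, [1023]), (161, [161]),
    (4767, [36, 4767]), (6228, [3795, 4375, 6228]), (68, [68]),
    (1830, [1830]), (1241, [241]), (2573, [871, 2573]), (3556, [3556]),
    (186, [186]),
]
interval_ends = [10_000, 20_000, 30_000, 40_000, 52_110]

offset = 0  # cumulative fleet time at the start of the current system
per_interval = [0] * len(interval_ends)
for cycle_time, failures in fleet:
    for x in failures:
        # A failure on a boundary is counted in the interval ending there.
        per_interval[bisect_left(interval_ends, offset + x)] += 1
    offset += cycle_time

cumulative = list(accumulate(per_interval))
print(offset)        # 52110, the total fleet time
print(per_interval)  # [8, 8, 6, 5, 10]
print(cumulative)    # [8, 16, 22, 27, 37]
```

The running totals reproduce the failure counts listed in Table 4, which shows that those counts are cumulative rather than per-interval.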

The ML estimates of λ and β for this data set, based on the above time intervals, are then given by:
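As a rough sketch of how such estimates could be obtained numerically, the Python snippet below solves the standard grouped-data Crow-AMSAA ML equation for β by bisection and then computes λ as the total number of failures divided by the final time raised to the power β. The function names, the bisection bracket and the conversion of the Table 4 counts to per-interval counts are assumptions made for illustration. This is not necessarily how RGA performs the calculation.

```python
import math


def score(beta, ends, counts):
    """Left-hand side of the grouped-data ML equation for beta.

    ends   -- interval end times T_1..T_k (T_0 = 0 is implied)
    counts -- number of failures n_i in each interval
    """
    T = [0.0] + [float(t) for t in ends]
    log_Tk = math.log(T[-1])

    def t_beta_log_t(t):
        # t**beta * ln(t), with the value at t = 0 taken as 0
        return 0.0 if t == 0 else t ** beta * math.log(t)

    total = 0.0
    for i, n_i in enumerate(counts, start=1):
        num = t_beta_log_t(T[i]) - t_beta_log_t(T[i - 1])
        den = T[i] ** beta - T[i - 1] ** beta
        total += n_i * (num / den - log_Tk)
    return total


def grouped_mle(ends, counts, lo=0.05, hi=5.0, iterations=200):
    """Return (lambda, beta); the bracket [lo, hi] for beta is an assumption."""
    f_lo = score(lo, ends, counts)
    if f_lo * score(hi, ends, counts) > 0:
        raise ValueError("no sign change in the assumed bracket for beta")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, ends, counts)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    beta = 0.5 * (lo + hi)
    lam = sum(counts) / float(ends[-1]) ** beta
    return lam, beta


interval_ends = [10_000, 20_000, 30_000, 40_000, 52_110]
cumulative_failures = [8, 16, 22, 27, 37]  # Table 4
# Per-interval counts are the differences of the cumulative counts.
counts = [c - p for c, p in zip(cumulative_failures, [0] + cumulative_failures)]

lam, beta = grouped_mle(interval_ends, counts)
print(f"beta = {beta:.4f}, lambda = {lam:.6f}")
```

In the power law model, an estimate of β close to 1 corresponds to an approximately constant fleet failure intensity, while values above or below 1 indicate an increasing or decreasing intensity.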

Figure 3 shows the System Operation plot.

Figure 3: System Operation plot

Once this step in the fleet analysis has been completed, the next step is to calculate the reliability growth of the fleet using the Crow Extended model. The Crow Extended model, developed by Dr. Larry Crow, allows for the quantification of future reliability growth based on planned improvements. The Crow Extended model application as it relates to fleet analysis will be presented in detail in next month's issue of HotWire. RGA supports both repairable and fleet systems analysis for reliability growth data.