
Incorporating Recurrent Event Data Analysis in Reliability Block Diagrams
In a repairable system, the occurrence of a failure is not an independent event, and in most cases the sequence of failure events is not identically distributed. When a repairable system fails because one component malfunctions, the remaining components are still operational and each has a current age. The next system failure therefore depends on the current ages of the components, which makes the system-level failure events dependent. If data are collected at the component/subsystem level (the Lowest Replaceable Unit, or LRU), then a reliability block diagram (RBD) approach can be used. However, this approach requires detailed information, including failure and repair data at the LRU level. If detailed information is not available, then Parametric Recurrent Event Data Analysis (Parametric RDA) can be used instead. This article introduces the concept of parametric RDA and then presents an example that compares the results generated in a Weibull++ Parametric RDA folio with those from a BlockSim simulation diagram that uses corrective tasks with partial restoration.
Defining a Repairable System
A system is designed to accomplish a specific desired function and may be a collection of subsystems, assemblies and/or components. It can be repairable or nonrepairable. The appropriate method to analyze the data collected during the system’s operation depends on this distinction. A repairable system can be restored to operating condition via corrective activities upon its failure. This definition distinguishes the failure distribution models used for analyzing life lengths prior to failure from the models used to represent periods of operation that might extend across several failures over the life length of the system [1]. Using these appropriate models and methods, the analysis of a repairable system provides several results, including the number of failures over a fixed time interval, the reliability of the system over a certain period, the availability of the system, the inventory needed for spare parts, the rough cost of maintaining the system, the optimum overhaul time, etc.
For example, imagine that you are in charge of a gold mine operation and are responsible for analyzing the data collected. A gold mine is a complex system composed of various subsystems and components, such as crushers, mills, thickeners, rotary screens, etc. You decide that the best option to analyze this mixed (component-level and subsystem-level) data is the RBD approach. For the subsystems, where you only have the failure times and where you know that the repairs partially rejuvenate the subsystem, you can use Parametric RDA.
The Parametric RDA approach is based on the General Renewal Process (GRP) model and analyzes the system-level data, while assuming that the repair partially renews the system. This method supplies the failure rate distribution that the unit follows, as well as the efficiency of each repair, which is called the restoration factor. You can use Weibull++ to generate and publish the model and then use the model in the system-level analysis using an RBD in BlockSim.
Partial Rejuvenation - Parametric Recurrent Event Data Analysis
The Parametric RDA folio in Weibull++ is a tool for modeling recurrent event data. After analyzing the system-level data and estimating the trend, it can predict the total number of recurrences over a specified length of time. This method treats the failure and repair data of a repairable system as one type of recurrence data to predict the future failure frequency.
The parametric analysis approach uses the GRP model [5]. It models the rate of occurrence of events over time (i.e., repairable system data) while ignoring the repair times. It is well suited to understanding the effects of the repairs on the age of the system. In Weibull++, the GRP model provides the capability to model systems with partial renewal (general repair or imperfect repair/maintenance) and allows for a variety of predictions such as reliability, expected failures, etc.
This model introduces the concept of virtual age. Let t_{1},t_{2},…,t_{n} represent the successive failure times and let χ_{1},χ_{2},...,χ_{n} represent the times between failures (χ_{i} = t_{i} - t_{i-1}, with t_{0} = 0). Assume that after each event, actions are taken to improve the system performance and let q be the action effectiveness factor.
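There are two GRP models:
Type I: ν_{i} = ν_{i-1} + qχ_{i}
Type II: ν_{i} = q(ν_{i-1} + χ_{i})
where ν_{i} is the virtual age of the system right after the i^{th} repair, with ν_{0} = 0 for a system that starts as new.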
The Type I model assumes that the i^{th} repair cannot remove the damage incurred before the (i-1)^{th} repair, or in other words, it can only reduce the additional age χ_{i} to qχ_{i}. The Type II model assumes that at the i^{th} failure, the virtual age has been accumulated to ν_{i-1}+χ_{i}. The i^{th} repair will remove the cumulative damage from both the current and previous failures by reducing the virtual age to q(ν_{i-1}+χ_{i}).
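The two virtual age recursions described above can be sketched in a few lines of Python. This is only an illustration: the function name virtual_ages, the starting value ν_{0} = 0 and the sample times between failures are assumptions made for the example and are not taken from Weibull++.

```python
from typing import List


def virtual_ages(times_between_failures: List[float], q: float,
                 model: str = "II") -> List[float]:
    """Return the virtual age right after each repair (v_1, ..., v_n)."""
    if model not in ("I", "II"):
        raise ValueError("model must be 'I' or 'II'")
    ages = []
    v_prev = 0.0  # assumption: the system starts as new, so v_0 = 0
    for x in times_between_failures:
        if model == "I":
            # Type I: only the age added since the last repair is scaled by q
            v = v_prev + q * x
        else:
            # Type II: all of the age accumulated so far is scaled by q
            v = q * (v_prev + x)
        ages.append(v)
        v_prev = v
    return ages


if __name__ == "__main__":
    gaps = [100.0, 80.0, 60.0]  # hypothetical times between failures, in days
    print("Type I :", virtual_ages(gaps, q=0.5, model="I"))
    print("Type II:", virtual_ages(gaps, q=0.5, model="II"))
```

With q = 1 both recursions return the cumulative operating time (no age is removed), and with q = 0 both return zero after every repair.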
The power law function models the rate of recurrence and is formulated as:
λ(t) = λβt^{β-1}
The model parameters are estimated using the MLE method [3].
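To make the MLE step more concrete, the following Python sketch evaluates a log-likelihood for failure-terminated data under a power law GRP. It assumes that the time to the next failure, given a virtual age v, follows the conditional density λβ(v+χ)^{β-1}exp(-λ[(v+χ)^{β} - v^{β}]). The function name grp_log_likelihood and the sample failure times are hypothetical, and this is not presented as the routine that Weibull++ uses internally.

```python
import math
from typing import Sequence


def grp_log_likelihood(failure_times: Sequence[float], beta: float,
                       lam: float, q: float, model: str = "II") -> float:
    """Log-likelihood of failure-terminated data under a power law GRP."""
    log_lik = 0.0
    v_prev = 0.0  # assumption: virtual age is 0 at the start of observation
    t_prev = 0.0
    for t in failure_times:
        x = t - t_prev               # time since the previous failure
        age = v_prev + x             # virtual age at the moment of failure
        # log of the power law intensity evaluated at the failure
        log_lik += math.log(lam * beta) + (beta - 1.0) * math.log(age)
        # minus the cumulative intensity accumulated since the last repair
        log_lik -= lam * (age ** beta - v_prev ** beta)
        # update the virtual age with the chosen repair model
        v_prev = v_prev + q * x if model == "I" else q * (v_prev + x)
        t_prev = t
    return log_lik


if __name__ == "__main__":
    times = [90.0, 170.0, 260.0, 330.0]  # hypothetical failure times, in days
    print(grp_log_likelihood(times, beta=1.3, lam=0.002, q=0.75))
```

An optimizer would then search for the values of β, λ and q that maximize this function. As a sanity check on the sketch, setting q = 1 reduces it to the log-likelihood of the NHPP Power Law model.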
Restoration Factors in BlockSim
The theory behind the restoration factors (RF) used in BlockSim is the same as the theory behind the Parametric RDA folio in Weibull++. The concept of a restoration factor may be used in cases in which you want to model imperfect repair, or a repair with a used component. The restoration factor represents the percentage of the accumulated age that can be removed from the component by a repair or other maintenance action, either age accumulated only since the last action (Type I), or the total accumulated age (Type II).
The restoration factor in BlockSim is defined as a number between 0 and 100% [2, 4]. If the restoration factor is:
 100%, then the component is AS-GOOD-AS-NEW after repair, which in effect implies that the starting age of the component is 0.
 0%, then the component is the same as it was prior to repair (AS-BAD-AS-OLD), which in effect implies that the starting age of the component is the same as the age of the component at failure.
 Greater than 0% but less than 100% (partial restoration), then the starting age of the component is equal to (100 - RF)% of the age of the component at failure.
As a side note, if you know that the system is AS-BAD-AS-OLD after each repair, you could run this analysis in RGA by using the NHPP Power Law model, which practically means that the restoration factor used in BlockSim should be 0%. In the corrective task, you would select the "To same as it was when it failed" option.
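The three restoration factor cases listed above reduce to a single rule, sketched here in Python. The function name starting_age and the sample age of 200 days are hypothetical and are used only for illustration.

```python
def starting_age(age_at_failure: float, rf_percent: float) -> float:
    """Age of the component right after repair for a given restoration factor."""
    if not 0.0 <= rf_percent <= 100.0:
        raise ValueError("restoration factor must be between 0 and 100")
    # The repair removes RF% of the age, so (100 - RF)% of it remains
    return age_at_failure * (100.0 - rf_percent) / 100.0


if __name__ == "__main__":
    for rf in (100.0, 25.0, 0.0):
        # hypothetical component that fails at an age of 200 days
        print(f"RF = {rf:5.1f}% -> starting age = {starting_age(200.0, rf)} days")
```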
Example
To better understand how restoration factors work in repairable systems analysis, return to the gold mine example and use a Weibull++ Parametric RDA folio to analyze a data set collected at the subsystem level for two rotary screens. You will then compare the results with a simulation run in BlockSim to show that both approaches give similar results and that a data set collected at the subsystem level can be used in an RBD approach as part of a more complex system. In this case, the two rotary screens then become part of the RBD representation of the whole gold mine, which is composed of various other subsystems and components.
In this case, you analyze the successive failure data for two rotary screens operating at a gold mine, recorded over a period of more than 2.5 years. The following table shows the failure logs of the two units: unit 1 was observed from 0 to 970 days and unit 2 from 0 to 987 days. Each observation period ended at the time of the last failure. You are interested in the Cumulative Mean Time Between Failures (MTBF) and the Cumulative Number of Failures in the first 2,000 days (~5.5 years) of operation of each unit.
For this analysis, you choose the 3-parameter setting and the Type II virtual age model because you assume that the repairs can fix all of the wear-out and damage accumulated up to the last failure that occurred. (For example, the 3rd repair removes both the damage that occurred during the time between the 2nd and 3rd failures and the cumulative damage accumulated during the time from the first failure to the 3rd failure.)
When using the default simulation settings, the parameters of the model are estimated to be Beta = 1.287, Lambda = 0.00247 and q = 0.75, as shown below.
As shown next, you use the QCP to calculate the cumulative MTBF and the cumulative number of failures expected to occur over 2,000 days. The results are 67.67 days and 29.556 failures.
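One way to cross-check figures of this kind outside the software is a small Monte Carlo simulation of the Type II GRP with the estimated parameters. The Python sketch below assumes that the time to the next failure, given virtual age ν, has the conditional survival function exp(-λ[(ν+χ)^{β} - ν^{β}]) and samples it by inversion. The function names, the seed and the number of runs are choices made for this example, and the sketch is not presented as the calculation that the QCP performs.

```python
import math
import random


def simulate_failures(beta: float, lam: float, q: float,
                      horizon: float, rng: random.Random) -> int:
    """Count the failures in [0, horizon] on one simulated Type II GRP path."""
    t = 0.0      # calendar time, in days
    v = 0.0      # virtual age right after the latest repair
    count = 0
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], keeps log(u) finite
        # Invert the conditional survival function given virtual age v
        x = (v ** beta - math.log(u) / lam) ** (1.0 / beta) - v
        t += x
        if t > horizon:
            return count
        count += 1
        v = q * (v + x)  # Type II: the repair scales all accumulated age by q


def expected_failures(beta: float, lam: float, q: float, horizon: float,
                      n_runs: int = 10000, seed: int = 0) -> float:
    """Average number of failures over n_runs simulated histories."""
    rng = random.Random(seed)
    total = sum(simulate_failures(beta, lam, q, horizon, rng)
                for _ in range(n_runs))
    return total / n_runs


if __name__ == "__main__":
    n = expected_failures(beta=1.287, lam=0.00247, q=0.75, horizon=2000.0)
    print(f"expected failures in 2,000 days: {n:.3f}")
    print(f"cumulative MTBF: {2000.0 / n:.2f} days")
```

Because the result depends on the random seed and the number of runs, it should be compared with the reported values only approximately.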
Next you publish the analysis results as a Synthesis model resource so that you can use it in BlockSim to run the simulation for the rotary screen data.
In a BlockSim simulation diagram, you create a single block that represents the rotary screen.
As shown next, in the block URD, you select the failure distribution model as the published model, Rotary_Screen_RDA, and add a corrective task to model the restoration factor that you calculated in the Weibull++ Parametric RDA folio. In the corrective task, you choose to use a partial restoration amount of 25%, because the action effectiveness factor (q) was estimated as 0.75 (75%) in the Weibull++ Parametric RDA folio. You also select the All accumulated damage option because you used the Type II GRP in the Parametric RDA calculations.
For the simulation, you choose a simulation time of 2,000 days and set the number of simulations to 10,000. You then obtain the results shown in the following table.
As you can see, the expected number of failures and the MTBF are estimated as 29.483 failures and 1628.079 hours (67.837 days), respectively. The results obtained in BlockSim using restoration factors are very close to the ones obtained in the Weibull++ Parametric RDA folio (29.556 failures and 67.67 days).
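The arithmetic behind this comparison can be reproduced with a short Python check. It assumes that the cumulative MTBF is the operating time divided by the expected number of failures, which is consistent with the figures quoted in this example. The variable and function names are illustrative only.

```python
HOURS_PER_DAY = 24.0
SIMULATION_TIME_DAYS = 2000.0


def cumulative_mtbf(total_time: float, expected_failures: float) -> float:
    """Cumulative MTBF, taken here as total time / expected number of failures."""
    return total_time / expected_failures


# Values quoted in the article
weibull_failures = 29.556        # Weibull++ Parametric RDA folio
blocksim_failures = 29.483       # BlockSim simulation
blocksim_mtbf_hours = 1628.079   # BlockSim simulation

blocksim_mtbf_days = blocksim_mtbf_hours / HOURS_PER_DAY
weibull_mtbf_days = cumulative_mtbf(SIMULATION_TIME_DAYS, weibull_failures)
relative_gap = abs(weibull_failures - blocksim_failures) / weibull_failures

if __name__ == "__main__":
    print(f"BlockSim MTBF: {blocksim_mtbf_days:.3f} days")
    print(f"Weibull++ cumulative MTBF: {weibull_mtbf_days:.2f} days")
    print(f"relative gap in expected failures: {relative_gap:.2%}")
```

The gap between the two estimates of the expected number of failures is well under one percent.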
Conclusion
In this article, we presented the theory and mathematical formulations behind restoration factors in repairable systems analysis with a basic example. The example illustrated two different methods to perform the analysis: parametric RDA and BlockSim simulation using restoration factors. Estimating restoration factors in a Weibull++ Parametric RDA folio and applying them in a BlockSim simulation diagram with the published failure model is a useful approach for integrating the component, subsystem or the system analyzed into a more complex system.
References
[1] http://www.ReliaSoft.com/newsletter/v7i1/avoiding_mistake.htm
[2] http://www.weibull.com/hotwire/issue77/relbasics77.htm
[3] http://reliawiki.org/index.php/Recurrent_Event_Data_Analysis
[4] http://www.ReliaWiki.org/index.php/Imperfect_Repairs
[5] Mettas, Adamantios and Zhao, Wenbiao, "Modeling and Analysis of Repairable Systems with General Repair," presented at the 2005 RAMS event in Alexandria, Virginia, 2005.