Reliability HotWire

Issue 117, November 2010

Reliability Basics

Understanding Updated Life Data Results

It is very common for reliability engineers to update their test results with new data as tests progress, or to augment their test results with fielded system information. An analyst who is not careful in interpreting the results can easily draw wrong conclusions about the reliability of the product. In this article, we present a scenario where updated life data analysis results can seem confusing to the analyst. The purpose of the article is to highlight the impact that new data can have on life data analysis results and to show one way to handle such a situation in Weibull++.

The Scenario

An aerospace manufacturer is looking at the reliability of a new component with an intended mission duration of 2,100 hours. The reliability engineer has internal test data for the beta version and has also collected reliability information from the company’s beta customers after 2,500 hours of usage, in which no failures were seen.

The reliability engineer, Lisa, decides to use a 2-parameter Weibull distribution to analyze the data. Since there are relatively few failure data points and a lot of suspended data (30 units from customer beta sites), she decides to use Maximum Likelihood Estimation (MLE) as the statistical analysis method for the data. MLE uses the actual suspension times, not just their relative position in terms of where they occurred in the data set, as rank regression would do. The life data analysis results are shown in Figure 1.

Figure 1: Folio with Internal Test Data and Suspensions from Beta Testing After 2,500 Hours of Usage
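
To illustrate how MLE uses the actual suspension times, the sketch below writes out a Weibull log-likelihood in which each suspended unit contributes the log of the reliability at its own running time, and then maximizes it. It is only an illustration in Python, not the algorithm Weibull++ uses: the function names, the bisection bracket and the failure times are assumptions made here, since the article's data set appears only in the figures.

```python
import math

def weibull_log_likelihood(beta, eta, failures, suspensions):
    """Log-likelihood of a 2-parameter Weibull with suspended (right-censored) units."""
    ll = 0.0
    for t in failures:
        # Exact failure: log of the density f(t)
        ll += math.log(beta / eta) + (beta - 1.0) * math.log(t / eta) - (t / eta) ** beta
    for s in suspensions:
        # Suspension: log of the reliability R(s); the actual time s enters here
        ll -= (s / eta) ** beta
    return ll

def fit_weibull_mle(failures, suspensions, lo=0.05, hi=50.0, tol=1e-10):
    """Maximize the likelihood by solving the profile score equation for beta.

    The bracket [lo, hi] is an assumption; the root must lie inside it.
    """
    times = list(failures) + list(suspensions)
    t_max = max(times)
    r = len(failures)
    mean_log_fail = sum(math.log(t) for t in failures) / r

    def score(beta):
        # Times are scaled by t_max so the powers cannot overflow
        w = [(t / t_max) ** beta for t in times]
        weighted = sum(wi * math.log(t) for wi, t in zip(w, times)) / sum(w)
        return weighted - 1.0 / beta - mean_log_fail

    # score() increases with beta, so bisection finds its single root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    # For a given beta, eta has a closed form
    eta = t_max * (sum((t / t_max) ** beta for t in times) / r) ** (1.0 / beta)
    return beta, eta

# Hypothetical failure times in hours (NOT the article's data)
failures = [1900.0, 2600.0, 3100.0, 3500.0, 3900.0, 4300.0, 4800.0, 5400.0]

for susp_time in (2500.0, 7000.0):
    beta, eta = fit_weibull_mle(failures, [susp_time] * 30)
    r_mission = math.exp(-((2100.0 / eta) ** beta))
    print(f"suspensions at {susp_time:.0f} h: beta={beta:.4f}, eta={eta:.0f}, R(2100)={r_mission:.4f}")
```

Changing the suspension time changes the likelihood itself, so both beta and eta can move when the model is refit.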

Lisa then uses the Quick Calculation Pad (QCP) to obtain the reliability estimate at the mission duration of 2,100 hours, as shown in Figure 2.

Figure 2: Reliability Estimate at 2,100 Hours with Suspended Data at 2,500 Hours

Based on the original data set, the reliability at 2,100 hours is calculated to be 97.65%. She creates a report distributing this information to her organization.

Later in the process, Lisa is ready to update the report with new data from the beta sites. There have still been no failures in the field, and each of the 30 beta units has now accumulated 7,000 hours of operation. She expects the calculated reliability to increase, since the units have accumulated more time without failing. To keep the analysis consistent, she again fits a 2-parameter Weibull distribution using MLE. Figure 3 shows the updated life data analysis results with the 30 beta site units still operating at 7,000 hours.

Figure 3: Folio with Internal Test Data and Suspensions from Beta Testing After 7,000 Hours of Usage

She then uses the QCP to obtain the updated estimate of the reliability at the mission duration of 2,100 hours, as shown in Figure 4.

Figure 4: Reliability Estimate at 2,100 Hours with Suspended Data at 7,000 Hours

To her surprise, the new reliability estimate is now 95.61%, which is lower than the previous estimate of 97.65%. How is this possible? The beta site units did not exhibit any failures and had more accumulated hours. The reliability should have gone up.

Explanation

This is a typical problem of looking at data in isolation and not understanding the impact of the selection of the model. When a 2-parameter Weibull distribution is fitted, both model parameters are re-estimated from the data: the beta parameter, which is the slope of the line on the Weibull probability plot, and the eta parameter, which is the characteristic life of the component (i.e., the estimated time by which 63.2% of the components will have failed).
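
For reference, the standard 2-parameter Weibull reliability function, written with the beta and eta parameters described above, shows where the 63.2% figure comes from:

```latex
R(t) = \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right],
\qquad
F(\eta) = 1 - R(\eta) = 1 - e^{-1} \approx 0.632
```

Because the exponent equals 1 at t = eta, this holds for any value of beta.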

If you take a closer look at the data, you’ll notice that both the beta and eta parameters are different in the original and updated results. The slope has changed from 5.0151 to 1.5634, and the eta parameter has changed from 4,427 to 15,280. Figure 5 shows a MultiPlot of the two Weibull distributions. Before the point at which the two probability lines cross each other, the updated analysis (shown in blue) yields lower reliabilities. After the lines cross, the updated analysis yields higher reliabilities. Since the mission duration of 2,100 hours is before the two lines meet, the updated analysis resulted in a lower reliability estimate for 2,100 hours.

Figure 5: Weibull Probability Plot of the Original and Updated Analyses
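
The two reliability estimates and the crossing point can be checked directly from the reported parameters. The short Python sketch below is not part of Weibull++, and its function names are illustrative. It evaluates the Weibull reliability function at 2,100 hours for both parameter sets and solves for the time at which the two lines intersect. Because the published parameters are rounded, the results should match the article's 97.65% and 95.61% only to within rounding.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta) for a 2-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))

def crossover_time(p1, p2):
    """Time at which two Weibull lines with different slopes intersect."""
    (b1, e1), (b2, e2) = p1, p2
    # Solve (t/e1)^b1 = (t/e2)^b2 for t by taking logarithms
    log_t = (b1 * math.log(e1) - b2 * math.log(e2)) / (b1 - b2)
    return math.exp(log_t)

# (beta, eta) as reported in the article
original = (5.0151, 4427.0)    # suspensions at 2,500 hours
updated = (1.5634, 15280.0)    # suspensions at 7,000 hours

r_orig = weibull_reliability(2100.0, *original)
r_upd = weibull_reliability(2100.0, *updated)
t_cross = crossover_time(original, updated)
print(f"original: {r_orig:.4f}  updated: {r_upd:.4f}  lines cross near {t_cross:.0f} hours")
```

With these rounded parameters the lines cross a little beyond 2,500 hours, so any mission time shorter than that, including 2,100 hours, shows a lower reliability under the updated fit.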

Solution

There is nothing inherently wrong with the MultiPlot shown in Figure 5. The analyst just needs to understand the impact of re-estimating the probability line from the updated data. The more parameters the model estimates, the more "degrees of freedom" the line has to change, in both slope and intercept.

If Lisa wants the updated report to reflect the additional failure-free time accumulated by the suspended beta site units, one possible solution is to superimpose the same slope on both probability plots.

In this case, a good estimate of the actual slope would involve the original test data and good engineering judgment in terms of the expected failure rate behavior. Remember that a high beta can indicate wear-out, while a beta close to 1 indicates more random failure types, such as insufficient design margins, external events, etc. A beta less than 1 indicates early life failures, such as those caused by misassembly, manufacturing issues or damage during shipping or storage.
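
The link between beta and failure rate behavior can be seen by evaluating the Weibull failure rate function at an early and a late time. This is a minimal Python sketch; the eta value and the evaluation times are hypothetical and chosen only for illustration.

```python
import math

def weibull_failure_rate(t, beta, eta):
    """Failure rate h(t) = (beta/eta) * (t/eta)^(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

eta = 1000.0  # hypothetical characteristic life, hours
for beta in (0.5, 1.0, 3.0):
    early = weibull_failure_rate(100.0, beta, eta)
    late = weibull_failure_rate(900.0, beta, eta)
    if late < early:
        trend = "decreasing (early life failures)"
    elif late > early:
        trend = "increasing (wear-out)"
    else:
        trend = "constant (random failures)"
    print(f"beta={beta}: failure rate is {trend}")
```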

The original internal test data set, excluding the 30 suspended units from the customer beta sites, is analyzed using a 2-parameter Weibull distribution and Rank Regression on X (RRX), since the data set is now complete (exact times to failure only) and small. For comparison purposes, the data are also analyzed using MLE, and Figure 6 shows a MultiPlot of the two methods, where the black line represents the RRX analysis and the blue line represents the MLE analysis.

Figure 6: Weibull Probability Plots Including Internal Test Data Only, Using MLE and RRX
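
For readers who want to see what rank regression on X involves for a complete data set, here is a minimal Python sketch. It assumes Benard's approximation for the median ranks, which may differ from the method Weibull++ uses, and the failure times are hypothetical rather than the article's data.

```python
import math

def fit_weibull_rrx(failure_times):
    """Rank regression on X for complete (no suspensions) data."""
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        median_rank = (i - 0.3) / (n + 0.4)  # Benard's approximation (assumed)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - median_rank)))
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Least squares of x on y: x = a + b*y, minimizing horizontal deviations
    s_xy = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    b = s_xy / s_yy
    a = x_bar - b * y_bar
    # On the probability plot y = beta*x - beta*ln(eta), so x = y/beta + ln(eta)
    return 1.0 / b, math.exp(a)

# Hypothetical failure times in hours (NOT the article's data)
failures = [1900.0, 2600.0, 3100.0, 3500.0, 3900.0, 4300.0, 4800.0, 5400.0]
beta, eta = fit_weibull_rrx(failures)
print(f"RRX estimates: beta={beta:.4f}, eta={eta:.0f}")
```

Unlike MLE, this method only uses the ordering of the data to assign plotting positions, which is why it is usually reserved for complete or lightly censored data sets.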

Using engineering judgment about the nature of the expected failure modes, a single value of beta is chosen for all further analyses. In this case, it is appropriate to accept the most conservative estimate of beta. Since the mission duration of 2,100 hours falls in the early part of the distribution, a smaller beta predicts more failures in that region and therefore a lower reliability. The RRX beta of 2.8219, the more conservative of the two estimates, is therefore chosen.

After choosing a beta value, every updated analysis with new suspended data can be done by superimposing that value of beta on the data set. In that case, a 1-parameter Weibull distribution would be used and the value of beta would be provided, as shown in Figure 7 for the data set including the suspensions at 2,500 hours.

Figure 7: Superimposing a Beta Parameter Value onto the Analysis
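
When beta is supplied rather than estimated, only eta remains to be fitted. The Python sketch below assumes the 1-parameter fit is done by MLE, which the article does not state, and uses hypothetical failure times; only the beta value of 2.8219 and the suspension times come from the article.

```python
import math

def fit_eta_fixed_beta(failures, suspensions, beta):
    """MLE of eta when beta is fixed: eta^beta = (sum of t^beta over all units) / (number of failures)."""
    total = sum(t ** beta for t in failures) + sum(s ** beta for s in suspensions)
    return (total / len(failures)) ** (1.0 / beta)

def weibull_reliability(t, beta, eta):
    return math.exp(-((t / eta) ** beta))

beta = 2.8219  # RRX slope chosen in the article
# Hypothetical failure times in hours (NOT the article's data)
failures = [1900.0, 2600.0, 3100.0, 3500.0, 3900.0, 4300.0, 4800.0, 5400.0]

results = {}
for susp_time in (2500.0, 7000.0):
    eta = fit_eta_fixed_beta(failures, [susp_time] * 30, beta)
    results[susp_time] = (eta, weibull_reliability(2100.0, beta, eta))
    print(f"suspensions at {susp_time:.0f} h: eta={eta:.0f}, R(2100)={results[susp_time][1]:.4f}")
```

With the slope held fixed, extra failure-free running time can only increase eta, so the reliability at any mission time can only go up.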

The same process would be used in the analysis including the customer beta site units suspended at 7,000 hours. Using this process ensures that the slope remains the same in the two data sets, making it easier to draw comparisons, as shown in the MultiPlot in Figure 8.

Figure 8: Weibull Probability Plot of the Original and Updated Analysis with the Same Superimposed Beta

Figures 9 and 10 show the new calculated values of reliability at 2,100 hours. The data set with the 30 customer beta site units suspended at 2,500 hours exhibits a reliability of 91.44% and the data set with the customer beta site units suspended at 7,000 hours exhibits a reliability of 98.96%. The improvement obtained by adding more hours to the suspended units is now more evident.

Figure 9: Reliability Estimate at 2,100 Hours with Suspended Data at 2,500 Hours and Superimposed Beta

Figure 10: Reliability Estimate at 2,100 Hours with Suspended Data at 7,000 Hours and Superimposed Beta
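
Since beta is held at 2.8219 in both analyses, the whole improvement shows up as a shift in eta. As a rough check, the Python sketch below inverts the reliability function to back out the eta implied by each reported reliability. These values are back-calculated here from the rounded percentages and are not taken from the article's figures.

```python
import math

def eta_from_reliability(t, reliability, beta):
    """Invert R(t) = exp(-(t/eta)^beta) to solve for eta."""
    return t / (-math.log(reliability)) ** (1.0 / beta)

beta = 2.8219
eta_2500 = eta_from_reliability(2100.0, 0.9144, beta)  # suspensions at 2,500 hours
eta_7000 = eta_from_reliability(2100.0, 0.9896, beta)  # suspensions at 7,000 hours
print(f"implied eta: {eta_2500:.0f} hours and {eta_7000:.0f} hours")
```

Under these assumptions the implied characteristic life roughly doubles between the two analyses, while the slope of the line stays the same.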

Conclusion

When updating life data after a previous analysis, it is important to understand the impact of the choice of the model and the statistical analysis method upon the results. A good practice is to look at probability plots in order to understand the total behavior of the model. When the analysis and further updates are performed without examining the overall behavior of the statistical models, wrong conclusions can be drawn.