Reliability HotWire
Issue 87, May 2008

Reliability Basics

Residual Plots for Lifetime Distributions

Residual plots are widely used in linear regression analyses. By examining the pattern in a residual plot, one can identify whether additional variables should be included in the regression model. Residual plots can also help analysts find outliers in the data set. More often, residual plots are used to diagnose whether a model or a distribution fits the data well. In this article, we will use DOE++ to illustrate the usefulness of residual plots in life data analysis. Only DOE++ offers Reliability DOE, which is specifically intended to handle life data obtained from designed experiments, either with or without censoring, and allows you to use the Weibull, lognormal or exponential distribution to analyze data.

For a simple linear regression, the equation is:

$$y = \beta_0 + \beta_1 x + \epsilon$$

Usually, the least squares method is used to estimate the model parameters in linear regression. Once the parameters are estimated, the residual for the $i$th observation can be calculated by:

$$e_i = y_i - \hat{y}_i$$

where $y_i$ is the observed value and $\hat{y}_i$ is the fitted value. In linear regression, residuals are assumed to be normally distributed. Therefore, for convenience, they are often transformed to the standard form:

$$z_i = \frac{e_i}{\hat{\sigma}}$$

where $\hat{\sigma}$ is the standard deviation estimated from all the residuals and $z_i$ is a variable following the standard normal distribution.

The above concepts about residuals can be extended to life data analysis. However, for life data analysis, maximum likelihood estimation (MLE) is often used instead of least squares estimation. In the following paragraphs, we will discuss how to obtain residual plots for the two most commonly used life distributions, Weibull and lognormal, when MLE is used to estimate the model parameters.

For the Weibull distribution, the scale parameter, $\eta$, is considered to be a function of variables such as different stresses and their interactions.
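As a point of reference for the linear regression case described above, the calculation of residuals and their standardized form can be sketched in Python. This is only an illustrative sketch: the data values are hypothetical, the function names `fit_line` and `standardized_residuals` are made up for this example, and the use of n - 2 as the divisor when estimating the standard deviation is an assumption (the article only says the standard deviation is estimated from all the residuals).

```python
import math


def fit_line(x, y):
    """Least squares estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def standardized_residuals(x, y):
    """Residuals (observed minus fitted) divided by their estimated standard deviation."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Assumption: n - 2 degrees of freedom, because two parameters were fitted.
    sigma_hat = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return [e / sigma_hat for e in residuals]


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
z = standardized_residuals(x, y)
print("intercept:", b0, "slope:", b1)
print("standardized residuals:", z)
```

In life data analysis the fitted line is replaced by a life-stress function, in which a life characteristic such as the Weibull scale parameter depends on the stresses.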
For example, if there are two stresses, the function between $\eta$ and these two stresses is:

$$\ln(\eta) = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB$$

where A is stress one and B is stress two. As you can see, the above function is linear in the coefficients. For the lognormal distribution, the location parameter, $\mu'$, is a function of stresses. It is:

$$\mu' = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB$$

$\eta$ and $\mu'$ are so-called "life characteristics." The above functions are called life-stress functions. When stresses do not affect the life of a product, or when the product is operating at constant stresses, the above life-stress functions are reduced to $\ln(\eta) = \beta_0$ and $\mu' = \beta_0$.

The standardized residual for the Weibull distribution is defined as:

$$\hat{e}_i = \hat{\beta}\left[\ln(t_i) - \ln(\hat{\eta}_i)\right]$$

where $\hat{\beta}$ (without a subscript) is the estimated Weibull shape parameter and $\hat{e}_i$ follows the standard smallest extreme value (SEV) distribution, and the standardized residual for the lognormal distribution is defined as:

$$\hat{e}_i = \frac{\ln(t_i) - \hat{\mu}'_i}{\hat{\sigma}'}$$

where $t_i$ is the failure time of the $i$th observation, $\hat{\sigma}'$ is the estimated standard deviation of the log failure times and $\hat{e}_i$ follows a standard normal distribution. Be aware that residuals make sense only when observations are times to failure, although traditionally, people also calculate residuals for suspensions and interval data.

Example:

Assume a two-stress accelerated life test was conducted. The engineer wants to find out which factor is more significant and then determine which life distribution better describes the data. Residual plots will be used to diagnose the model. The procedure starts with a Design of Experiments (DOE) analysis and continues with an Accelerated Life Test (ALT) analysis. The first part of the procedure is demonstrated next using the DOE++ software; the ALT analysis is omitted because it is outside the scope of this article. Figure 1 shows the data set obtained from the DOE.

Figure 1: Data in Actual Values

This is a two-factor design. The data also can be viewed in terms of coded values, as shown in Figure 2. In this figure, -1 means the stress is at its low level (408 for temperature, 0.5 for current) and 1 means the stress is at its high level (423 for temperature, 0.7 for current).
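The coding of the stress levels and the standardized residuals described above can be sketched in Python as follows. This is a sketch under stated assumptions, not DOE++'s implementation: the function names are hypothetical, the life-stress function is assumed to be linear in the coded stresses with an optional interaction term, and the coefficient, shape and failure time values are invented for illustration (they are not the results of the example). Only the low and high stress levels are taken from the article.

```python
import math


def to_coded(actual, low, high):
    """Map an actual stress level to the coded scale (low -> -1, high -> +1)."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range


def life_stress(coeffs, a, b):
    """Assumed life-stress function: b0 + b1*A + b2*B + b12*A*B (coded A and B)."""
    b0, b1, b2, b12 = coeffs
    return b0 + b1 * a + b2 * b + b12 * a * b


def weibull_std_residual(t, shape, ln_eta):
    """shape * (ln t - ln eta); standard SEV if the Weibull model holds."""
    return shape * (math.log(t) - ln_eta)


def lognormal_std_residual(t, log_mean, log_std):
    """(ln t - log-mean) / log-std; standard normal if the lognormal model holds."""
    return (math.log(t) - log_mean) / log_std


# Stress levels from the example: temperature 408/423, current 0.5/0.7.
temp_coded = to_coded(408, 408, 423)      # low temperature
current_coded = to_coded(0.7, 0.5, 0.7)   # high current

# Hypothetical coefficients (b0, b1, b2, b12), shape and failure time.
coeffs = (6.0, -0.3, -0.8, 0.0)
ln_eta = life_stress(coeffs, temp_coded, current_coded)
failure_time = 200.0

print("coded levels:", temp_coded, current_coded)
print("Weibull residual:", weibull_std_residual(failure_time, 2.0, ln_eta))
print("lognormal residual:", lognormal_std_residual(failure_time, ln_eta, 0.5))
```

Because both residuals are computed from the log of the failure time, only exact times to failure are passed in here, consistent with the caution above about suspensions and interval data.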
Figure 2: Data in Coded Values

If we choose to use the Weibull distribution, the results are:

Figure 3: Results for Weibull Distribution

For the lognormal distribution, the results are:

Figure 4: Results for Lognormal Distribution

From both results, we can see that current has a more significant effect on life than temperature because it has a smaller p-value. In DOE++, a significant effect is shown in red. The significant effects also can be identified from the Pareto chart. Figure 5 shows the Pareto chart for the Weibull distribution results.

Figure 5: Pareto Chart for Effects

However, one question remains: Which distribution fits the data better? One simple way to determine this is to compare the log-likelihood values. For the Weibull distribution, the value is -166.9625; for the lognormal distribution, it is -168.7175. The fact that the Weibull distribution has the larger (less negative) log-likelihood value indicates that it fits the data better than the lognormal distribution.

Another way to evaluate which distribution fits the data better is to check the residual plots, which give a visual representation of the results. In DOE++, there are four different residual plots: Residual Probability, Residual vs. Fitted Value, Residual vs. Run, and Residual vs. Factor.

The Residual vs. Run plot can be used to check whether the failure time is affected by the sequence of the test runs. The Residual vs. Run plot for the Weibull distribution is shown in Figure 6.

Figure 6: Residual vs. Run Plot for Weibull Distribution

Because there is no obvious pattern in Figure 6, we can conclude that the results are not affected by the test sequence. The two dashed lines are the critical values at a significance level of 0.1. Their values are calculated based on the SEV distribution. If a residual point falls outside these two lines, the model does not fit that observation very well. In Figure 6, we can see there is one point far below the lower critical line.

The Residual vs.
Run plot for the lognormal distribution is shown below.

Figure 7: Residual vs. Run Plot for Lognormal Distribution

As you can see, there are more points outside the critical lines in Figure 7 than in Figure 6, with both at the same significance level of 0.1. Clearly, the Weibull distribution gives a better fit to the data set. At this point, the engineer can transfer the data to ALTA, and perform an ALT analysis using the Weibull distribution and an appropriate life-stress relationship.

Conclusion:

In this article, we discussed how to use residual plots to check the fit of a life distribution to a data set. Residual plots are widely used in linear regression. By extending them to life data analysis, reliability engineers will have a better way to understand their analysis results.

Copyright 2008 ReliaSoft Corporation, ALL RIGHTS RESERVED