Residual Plots for Lifetime Distributions
Residual plots are widely used in linear regression analysis.
By examining the pattern in a residual plot, one can identify
whether there are additional variables that should be included in
the regression model. Residual plots can also help analysts find
outliers in the data set. Most often, residual plots are used to
diagnose whether a model or a distribution fits the data well.
In this article, we will use DOE++ to illustrate the usefulness of
residual plots in life data analysis. Only DOE++ offers Reliability
DOE, which is specifically intended to handle life data obtained
from designed experiments, with or without censoring, and allows
you to use the Weibull, lognormal or exponential distribution to
analyze the data.
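For a simple linear regression, the equation is:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where $y$ is the response, $x$ is the predictor, $\beta_0$ and
$\beta_1$ are the model parameters and $\varepsilon$ is the random
error.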
Usually, the least squares method is used to estimate the
model parameters in linear regression. Once the parameters are
estimated, the residual for the $i$th observation can be
calculated by:
$$e_i = y_i - \hat{y}_i$$
where $y_i$ is the observed value and $\hat{y}_i$ is the fitted
value.
In linear regression, residuals are assumed to be normally
distributed. Therefore, for convenience, they are often
transformed to the standard form:
$$z_i = \frac{e_i}{s}$$
where $e_i$ is the residual for the $i$th observation, $s$ is the
standard deviation estimated from all the residuals and $z_i$ is a
variable following the standard normal distribution.
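The calculation above can be sketched in a few lines of Python. This is a minimal illustration and not part of DOE++: the function name and the sample data are hypothetical, and the standard deviation is estimated as the square root of the residual sum of squares divided by n - 2, which is one common convention for simple linear regression.

```python
import math

def standardized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares and return
    (b0, b1, residuals, standardized residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # least squares slope
    b0 = y_bar - b1 * x_bar        # least squares intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Assumption: s is estimated with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return b0, b1, residuals, [e / s for e in residuals]

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, e, z = standardized_residuals(x, y)
print(b0, b1)
print(z)
```

If the model is adequate, the standardized residuals should look like a sample from the standard normal distribution, with no trend against the predictor or the run order.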
The above concepts about residuals can be extended to life
data analysis. However, for life data analysis, maximum
likelihood estimation (MLE) is often used instead of least
squares estimation. In the following paragraphs, we will discuss
how to obtain the residual plots for the two most commonly used
life distributions, Weibull and lognormal, when MLE is used to
estimate the model parameters.
For the Weibull distribution, the scale parameter, $\eta$, is
considered to be a function of variables such as different
stresses and their interactions. For example, if there are two
stresses, the function between $\eta$ and these two stresses is:
$$\ln(\eta) = \beta_0 + \beta_1 A + \beta_2 B$$
where A is stress one and B is stress two (an interaction term,
$\beta_{12} AB$, can be added in the same way). As you can see,
the above function is a linear equation.
For the lognormal distribution, the location parameter, $\mu$, is
a function of stresses. It is:
$$\mu = \beta_0 + \beta_1 A + \beta_2 B$$
$\eta$ and $\mu$ are so-called "life characteristics." The above
functions are called life-stress functions. When stresses do not
affect the life of a product, or when the product is operating at
constant stresses, the above life-stress functions are reduced
to the constants $\ln(\eta) = \beta_0$ and $\mu = \beta_0$.
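The standardized residual for the Weibull distribution is
defined as:
$$\hat{e}_i = \hat{\beta}\left[\ln(t_i) - \ln(\hat{\eta}_i)\right]$$
where $\hat{\beta}$ is the estimated shape parameter,
$\hat{\eta}_i$ is the estimated scale parameter for the $i$th
observation and $\hat{e}_i$ follows the standard smallest extreme
value (SEV) distribution. The standardized residual for the
lognormal distribution is defined as:
$$\hat{e}_i = \frac{\ln(t_i) - \hat{\mu}_i}{\hat{\sigma}}$$
where $t_i$ is the failure time of the $i$th observation,
$\hat{\mu}_i$ and $\hat{\sigma}$ are the estimated location and
scale parameters and $\hat{e}_i$ follows a standard normal
distribution. Be aware that residuals make sense only when
observations are times to failure, although traditionally,
people also calculate residuals for suspensions and interval
data.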
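As a rough sketch of these definitions, the following Python functions compute the standardized residual of a single failure time, assuming the usual forms: shape parameter times (ln t - ln eta) for the Weibull distribution, and (ln t - mu) / sigma for the lognormal distribution. The function names and all coefficient values are hypothetical and are not taken from the example that follows; the life characteristic is assumed to be a linear function of two coded stresses without an interaction term.

```python
import math

def life_characteristic(b0, b1, b2, a, b):
    """Linear life-stress function: returns ln(eta) for the Weibull
    model or mu for the lognormal model at stress levels (a, b)."""
    return b0 + b1 * a + b2 * b

def weibull_std_residual(t, shape, ln_eta):
    """Standardized Weibull residual; follows the standard SEV
    distribution when the model is correct."""
    return shape * (math.log(t) - ln_eta)

def lognormal_std_residual(t, mu, sigma):
    """Standardized lognormal residual; follows the standard normal
    distribution when the model is correct."""
    return (math.log(t) - mu) / sigma

# Hypothetical fitted coefficients, for illustration only.
ln_eta = life_characteristic(5.0, -0.4, -0.8, a=1, b=-1)
print(weibull_std_residual(150.0, shape=2.0, ln_eta=ln_eta))
print(lognormal_std_residual(150.0, mu=ln_eta, sigma=0.6))
```

A failure time equal to the predicted eta (Weibull) or to exp(mu) (lognormal) gives a residual of zero; earlier failures give negative residuals and later failures give positive ones.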
Example:
Assume a two-stress accelerated life test was conducted. The
engineer wants to find out which factor is more significant and
then determine which life distribution better describes the
data. Residual plots will be used to diagnose the model. The
procedure starts with a Design of Experiments (DOE) analysis and
continues with an Accelerated Life Test (ALT) analysis. The first
part of the procedure is demonstrated next using the DOE++
software; the ALT analysis is omitted because it is outside the
scope of this article.
Figure 1 shows the data set obtained from the DOE.

Figure 1: Data in Actual Values
This is a two-factor design. The data can also be viewed in
terms of coded values, as shown in Figure 2. In this figure, -1
means the stress is at its low level (408 for temperature, 0.5
for current) and 1 means the stress is at its high level (423
for temperature, 0.7 for current).

Figure 2: Data in Coded Values
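The conversion between actual and coded values can be sketched as below. This is the standard two-level coding (center the value at the midpoint of the range, then divide by half the range); the helper name is hypothetical, and the sketch is not meant to describe DOE++ internals.

```python
def to_coded(actual, low, high):
    """Map an actual stress value onto the coded scale, where the
    low level becomes -1 and the high level becomes +1."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range

# Stress levels quoted in the text above.
print(to_coded(408, low=408, high=423))   # temperature at its low level
print(to_coded(423, low=408, high=423))   # temperature at its high level
print(to_coded(0.5, low=0.5, high=0.7))   # current at its low level
print(to_coded(0.7, low=0.5, high=0.7))   # current at its high level
```

Coding puts both factors on the same scale, so the estimated coefficients can be compared directly regardless of the original units.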
If we choose to use the Weibull distribution, the results
are:

Figure 3: Results for Weibull Distribution
For the lognormal distribution, the results are:

Figure 4: Results for Lognormal Distribution
From both results, we can see that current has a more
significant effect on life than temperature because it has a
smaller p value. In DOE++, a significant effect is shown in red.
The significant effects can also be identified from the Pareto
chart. Figure 5 shows the Pareto chart for the Weibull
distribution results.

Figure 5: Pareto Chart for Effects
However, one question remains: Which distribution fits the
data better? One simple way to determine this is by comparing
the log-likelihood values. For the Weibull distribution, it is
-166.9625; for the lognormal distribution, it is -168.7175.
Both models have the same number of parameters, so the values
can be compared directly. The fact that the Weibull distribution
has the larger (less negative) log-likelihood value indicates
that it fits the data better than the lognormal distribution.
Another way to evaluate which distribution fits the data better
is by checking the residual plots, which give a visual
representation of the results.
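To make the likelihood comparison concrete, the sketch below evaluates the log-likelihood of each model for complete (uncensored) failure times, given parameter values that are assumed to have been estimated already. The function names, data and parameter values are all hypothetical; they are not the data set of Figure 1, and the sketch does not handle suspensions or interval data.

```python
import math

def weibull_log_likelihood(times, shape, etas):
    """Sum of ln f(t) for the Weibull pdf, with one scale
    parameter (eta) per observation."""
    total = 0.0
    for t, eta in zip(times, etas):
        z = math.log(t) - math.log(eta)
        total += (math.log(shape) - math.log(eta)
                  + (shape - 1.0) * z - math.exp(shape * z))
    return total

def lognormal_log_likelihood(times, mus, sigma):
    """Sum of ln f(t) for the lognormal pdf, with one location
    parameter (mu) per observation."""
    total = 0.0
    for t, mu in zip(times, mus):
        z = (math.log(t) - mu) / sigma
        total += -math.log(t * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z
    return total

# Hypothetical failure times and fitted parameters.
times = [120.0, 95.0, 60.0, 40.0]
ll_weibull = weibull_log_likelihood(times, 2.5, [130.0, 100.0, 70.0, 45.0])
ll_lognormal = lognormal_log_likelihood(times, [4.7, 4.5, 4.1, 3.7], 0.4)
better = "Weibull" if ll_weibull > ll_lognormal else "lognormal"
print(ll_weibull, ll_lognormal, better)
```

The model with the larger log-likelihood is preferred only when the two models have the same number of parameters, as is the case here; otherwise a penalized criterion would be needed.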
In DOE++, there are four different residual plots. They are:
- Residual Probability
- Residual vs. Fitted Value
- Residual vs. Run
- Residual vs. Factor
The Residual vs. Run plot can be used to check
whether the failure
time is affected by the sequence of the test runs. The Residual
vs. Run plot for the Weibull distribution is shown in Figure 6.

Figure 6: Residual vs. Run Plot for Weibull Distribution
Because there is no obvious pattern in Figure 6, there is no
indication that the results are affected by the test sequence.
The two dashed lines are the critical values at a significance
level of 0.1. Their values are calculated based on the SEV
distribution. If a residual point is beyond these two lines, it
means the model does not fit that observation very well. In
Figure 6, we can see there is one point far below the lower
critical line.
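The critical lines can be approximated with the quantile functions of the two reference distributions. The sketch below assumes that the significance level is split equally between the two tails (0.05 in each tail for a level of 0.1); whether DOE++ uses exactly this convention is an assumption. The standard SEV distribution has the cdf F(z) = 1 - exp(-exp(z)), so its quantile is ln(-ln(1 - p)).

```python
import math
from statistics import NormalDist

def sev_quantile(p):
    """Quantile of the standard smallest extreme value distribution."""
    return math.log(-math.log(1.0 - p))

def critical_values(alpha, dist):
    """Lower and upper critical values for standardized residuals,
    assuming alpha is split equally between the two tails."""
    p_low, p_high = alpha / 2.0, 1.0 - alpha / 2.0
    if dist == "sev":          # Weibull residuals
        return sev_quantile(p_low), sev_quantile(p_high)
    if dist == "normal":       # lognormal residuals
        nd = NormalDist()
        return nd.inv_cdf(p_low), nd.inv_cdf(p_high)
    raise ValueError("dist must be 'sev' or 'normal'")

print(critical_values(0.1, "sev"))      # roughly (-2.97, 1.10)
print(critical_values(0.1, "normal"))   # roughly (-1.64, 1.64)
```

Note that the SEV limits are not symmetric about zero: the distribution has a long lower tail, so the lower critical line sits farther from zero than the upper one.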
The Residual vs. Run plot for the lognormal distribution is
shown below.

Figure 7: Residual vs. Run Plot for Lognormal Distribution
As you can see, there are more points outside the critical lines
in Figure 7 than in Figure 6, with both at the same significance level of
0.1. Clearly, the Weibull distribution gives a better fit to the
data set.
At this point, the engineer can transfer the data to ALTA,
and perform an ALT analysis using the Weibull distribution and
an appropriate life-stress relationship.
Conclusion:
In this article, we discussed how to use residual plots to check
the fit of a life distribution to a data set. Residual plots are
widely used in linear regression. By extending them to life data
analysis, reliability engineers will have a better way to
understand their analysis results.