Residual Plots for Lifetime Distributions
Residual plots are widely used
in linear regression analyses. By examining the pattern of
residual plots, one can identify whether there are additional
variables that should be included in the regression model.
Residual plots also can help analysts find outliers in the data
set. More often, residual plots are used to diagnose whether a
model or a distribution can fit the data well. In this article,
we will use DOE++ to illustrate the usefulness of residual plots
in life data analysis. Only
DOE++ offers Reliability DOE, which is specifically intended
to handle life data obtained from designed experiments, either
with or without censoring, and allows you to use the Weibull,
lognormal or exponential distribution to analyze data.
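For a simple linear regression,
the equation is:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where $y$ is the response, $x$ is the predictor, $\beta_0$ and
$\beta_1$ are the model parameters, and $\varepsilon$ is the
random error.
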
Usually, the least squares
method is used to estimate the model parameters in linear
regression. Once the parameters are estimated, the residual for
the i-th observation can be calculated by:

$$e_i = y_i - \hat{y}_i$$

where $y_i$ is the observed value and $\hat{y}_i$ is
the fitted value.
In linear regression, residuals
are assumed to be normally distributed. Therefore, for
convenience, they are often transformed to the standard form:

$$z_i = \frac{e_i}{s}$$

where $e_i$ is the residual of the i-th observation, $s$ is
the standard deviation estimated from all the residuals and
$z_i$ is a variable following the standard normal distribution.
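
The least squares fit and the standardized residuals described
above can be sketched in a few lines of Python. This is a minimal
illustration and not the DOE++ implementation. The function names
are hypothetical, and dividing by n - 2 when estimating the
standard deviation is an assumption (a common choice for a
two-parameter line), since the article does not say which
estimator is used.

```python
import math

def fit_simple_linear(x, y):
    """Least squares estimates (b0, b1) for the line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def standardized_residuals(x, y):
    """Return (raw residuals, residuals divided by their estimated std. dev.)."""
    b0, b1 = fit_simple_linear(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Assumption: n - 2 degrees of freedom, because two parameters were fitted.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return residuals, [e / s for e in residuals]

# Small made-up data set, for illustration only.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
raw, standardized = standardized_residuals(x, y)
print(raw)
print(standardized)
```

With an intercept in the model, the raw residuals sum to zero, which
is a quick sanity check on the fit.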
The above concepts about
residuals can be extended to life data analysis. However, for
life data analysis, instead of using least squares estimation,
the maximum likelihood estimation (MLE) is often used. In the
following paragraphs, we will discuss how to get the residual
plots for the two most popularly used life distributions,
Weibull and lognormal, when the MLE is used to estimate model
parameters.
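For the Weibull distribution,
the scale parameter, $\eta$,
is considered to be a function of variables such as different
stresses and their interactions. For example, if there are two
stresses, the function between $\eta$
and these two stresses is:

$$\ln(\eta) = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB$$

where A is stress one, B is stress two, and the subscripted
$\beta$ terms are the model coefficients (not to be confused
with the Weibull shape parameter, $\beta$). As you can see, the
above function is linear in the coefficients.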
For the lognormal distribution, the
location parameter, $\mu$,
is a function of stresses. It is:

$$\mu = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB$$

$\eta$ and $\mu$
are so-called "life characteristics." The above functions are
called life-stress functions. When stresses do not affect the
life of a product, or when the product is operating at constant
stresses, the above life-stress functions are reduced to
$\ln(\eta) = \beta_0$ and $\mu = \beta_0$.
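The standardized residual for
the Weibull distribution is defined as:

$$\hat{e}_i = \hat{\beta}\left[\ln(t_i) - \ln(\hat{\eta}_i)\right]$$

where $\hat{\beta}$ is the estimated Weibull shape parameter,
$\hat{\eta}_i$ is the scale parameter given by the fitted
life-stress function for the i-th observation, and $\hat{e}_i$
follows
the standard smallest extreme value (SEV) distribution. The
standardized residual for the lognormal distribution is defined as:

$$\hat{e}_i = \frac{\ln(t_i) - \hat{\mu}_i}{\hat{\sigma}}$$

where $t_i$
is
the failure time of the i-th observation
and $\hat{e}_i$
follows a standard normal distribution. Be aware that residuals
make sense only when observations are times to failure, although
traditionally, people also calculate residuals for suspensions
and interval data.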
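
As a rough sketch, the standardized residuals can be computed as
follows. It assumes the commonly used forms e = beta * (ln t - ln eta)
for the Weibull distribution and e = (ln t - mu) / sigma for the
lognormal distribution, with a life-stress function that is linear
in two coded stresses plus their interaction. The function names
and all numeric values are hypothetical and are not results from
the example in this article.

```python
import math

def life_stress(coeffs, a, b):
    """Linear life-stress function b0 + b1*A + b2*B + b12*A*B.

    Returns ln(eta) for the Weibull model or mu for the lognormal model.
    """
    b0, b1, b2, b12 = coeffs
    return b0 + b1 * a + b2 * b + b12 * a * b

def weibull_std_residual(t, ln_eta, beta):
    """Standardized Weibull residual; SEV-distributed if the model holds."""
    return beta * (math.log(t) - ln_eta)

def lognormal_std_residual(t, mu, sigma):
    """Standardized lognormal residual; standard normal if the model holds."""
    return (math.log(t) - mu) / sigma

# Hypothetical coefficients and parameters, for illustration only.
coeffs = (6.0, -0.2, -0.5, 0.0)
location = life_stress(coeffs, a=-1, b=1)  # stress A low, stress B high

print(weibull_std_residual(250.0, location, beta=2.0))
print(lognormal_std_residual(250.0, location, sigma=0.5))
```

A failure time equal to the fitted scale parameter gives a Weibull
residual of zero, and a failure time one sigma above the fitted
location (on the log scale) gives a lognormal residual of one.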
Example:
Assume a two-stress accelerated life test was conducted. The
engineer wants to find out which factor is more significant and
proceed with determining which life distribution can better
describe the data. Residual plots will be used to diagnose the
model. The procedure will start by performing a Design of
Experiments (DOE) analysis, and continue with performing an
Accelerated Life Test (ALT) analysis. The first part of the
procedure will be demonstrated next using the DOE++ software,
but the ALT analysis will be omitted because it is outside the
scope of the article.
Figure 1 shows the data set
obtained from the DOE.

Figure 1:
Data in Actual Values
This is a two-factor design.
The data also can be viewed in terms of coded values, as shown
in Figure 2. In this figure, -1 means the stress is at its low
level (408 for temperature, 0.5 for current) and 1 means the
stress is set to the high level (423 for temperature, 0.7 for
current).

Figure 2:
Data in Coded Values
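
The mapping between actual and coded values can be sketched as a
simple linear rescaling, using the low and high levels quoted
above (408 and 423 for temperature, 0.5 and 0.7 for current). The
function name is hypothetical, and the formula assumes the usual
two-level coding in which the midpoint maps to 0.

```python
def to_coded(actual, low, high):
    """Map an actual stress value to the coded scale (low -> -1, high -> +1)."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range

# Levels taken from the example: temperature 408/423, current 0.5/0.7.
for temperature in (408, 423):
    print(temperature, to_coded(temperature, 408, 423))
for current in (0.5, 0.7):
    print(current, to_coded(current, 0.5, 0.7))
```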
If we choose to use the Weibull
distribution, the results are:

Figure 3:
Results for Weibull Distribution
For the lognormal distribution,
the results are:

Figure 4:
Results for Lognormal Distribution
From both results, we can see
that current has a more significant effect on the life than
temperature because it has a smaller p value. In DOE++,
a significant effect is shown in red. The significant effects
also can be identified from the Pareto chart. Figure 5 shows the
Pareto chart for the Weibull distribution results.

Figure 5:
Pareto Chart for Effects
However, one question remains:
Which distribution fits the data better? One simple way to
determine this is by comparing the log-likelihood values. For the
Weibull distribution, it is -166.9625; for the lognormal
distribution, it is -168.7175. The fact that the Weibull
distribution has a larger log-likelihood value indicates that it
fits the data better than the lognormal distribution. Another
way to evaluate which distribution fits the data better is by
checking the residual plots, which give a visual representation
of the results.
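
The likelihood comparison above amounts to a small piece of
arithmetic, sketched here. The two values are the ones reported
in this article; they are treated as log-likelihoods because they
are negative. Comparing them directly assumes both models have the
same number of parameters, which holds when the same life-stress
terms are used with one extra shape or scale parameter each.

```python
import math

# Values reported in the article, interpreted as log-likelihoods.
log_likelihood = {"Weibull": -166.9625, "lognormal": -168.7175}

# The model with the larger log-likelihood fits the data better.
best = max(log_likelihood, key=log_likelihood.get)
difference = log_likelihood["Weibull"] - log_likelihood["lognormal"]

print("Better fit:", best)
print("Log-likelihood difference:", round(difference, 4))
# exp(difference) is the ratio of the two maximized likelihoods.
print("Likelihood ratio:", math.exp(difference))
```

The difference of about 1.755 log-likelihood units favors the
Weibull distribution, which agrees with the conclusion in the text.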
In DOE++, there are four
different residual plots. They are:
- Residual Probability
- Residual vs. Fitted Value
- Residual vs. Run
- Residual vs. Factor
The Residual vs. Run plot can
be used to check whether the failure time is affected by the
sequence of the test runs. The Residual vs. Run plot for the
Weibull distribution is shown in Figure 6.

Figure 6:
Residual vs. Run Plot for Weibull Distribution
Because there is no obvious
pattern in Figure 6, we can conclude that the results are not
affected by the test sequence. The two dashed lines are the
critical values at a significance level of 0.1. Their values are
calculated based on the SEV distribution. If a residual point is
beyond these two lines, it means the model cannot fit that
observation very well. In Figure 6, we can see there is one
point far below the lower critical line.
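
One plausible way to obtain such critical values is sketched below.
It assumes a two-sided limit that puts half of the significance
level in each tail. The article does not state how DOE++ splits
the 0.1, so this is an assumption, and the function names are
hypothetical. The standard SEV distribution function used here is
F(z) = 1 - exp(-exp(z)). The standard normal case applies to
lognormal residuals.

```python
import math
from statistics import NormalDist

def sev_quantile(p):
    """Quantile of the standard SEV distribution, F(z) = 1 - exp(-exp(z))."""
    return math.log(-math.log(1.0 - p))

def critical_values(alpha, distribution):
    """Lower and upper critical values, assuming alpha/2 in each tail."""
    low_p, high_p = alpha / 2.0, 1.0 - alpha / 2.0
    if distribution == "weibull":
        return sev_quantile(low_p), sev_quantile(high_p)
    if distribution == "lognormal":
        standard_normal = NormalDist()
        return standard_normal.inv_cdf(low_p), standard_normal.inv_cdf(high_p)
    raise ValueError("distribution must be 'weibull' or 'lognormal'")

def points_outside(residuals, lower, upper):
    """Residuals that fall beyond either critical line."""
    return [r for r in residuals if r < lower or r > upper]

lower, upper = critical_values(0.1, "weibull")
print(lower, upper)
print(critical_values(0.1, "lognormal"))
```

Because the SEV distribution is skewed to the left, the lower
critical value lies farther from zero than the upper one, whereas
the normal limits are symmetric.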
The Residual vs. Run plot for
the lognormal distribution is shown below.

Figure 7:
Residual vs. Run Plot for Lognormal Distribution
As you can see, there are more
points outside the critical lines in Figure 7 than in Figure 6,
with both at the same significance level of 0.1. Clearly, the
Weibull distribution gives a better fit to the data set.
At this point, the engineer can
transfer the data to ALTA, and perform an ALT analysis using the
Weibull distribution and an appropriate life-stress
relationship.
Conclusion:
In this article, we discussed how to use residual plots to
check the fit of a life distribution to a data set. Residual
plots are widely used in linear regression. By extending them to
life data analysis, reliability engineers will have a better way
to understand their analysis results.