In a previous Hotwire article, we took a look at probability plotting. The rank regression or least squares parameter estimation method essentially "automates" the probability plotting method mathematically. In this article, we will take a look at the maximum likelihood estimation (MLE) method. This is considered to be one of the most robust parameter estimation techniques.
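The MLE method starts from the probability density function (pdf) of the assumed distribution. The expression that the next paragraph describes is not reproduced in this text; in standard notation, and assuming a continuous random variable x, the pdf can be written in the general form:

```latex
f(x;\theta_1,\theta_2,\ldots,\theta_k)
```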
where x represents the data (times-to-failure) and θ1, θ2,..., θk are the parameters to be estimated. For a two-parameter Weibull distribution, for example, these would be beta (β) and eta (η). For complete data, the likelihood function is a product of the pdf functions, with one element for each data point in the data set:
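The product described above is not reproduced in this text. Assuming independent observations, its standard form is:

```latex
L(\theta_1,\theta_2,\ldots,\theta_k \mid x_1,x_2,\ldots,x_n)
  = \prod_{i=1}^{n} f(x_i;\theta_1,\theta_2,\ldots,\theta_k)
```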
where n is the number of failure data points in the complete data set, and xi is the ith failure time. It is often mathematically easier to manipulate this function by first taking the logarithm of it. This log-likelihood function then has the form:
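The log-likelihood expression is not reproduced in this text. Writing Λ for the log-likelihood (the symbol is a notational assumption) and L for the likelihood function, taking the natural logarithm turns the product into a sum:

```latex
\Lambda = \ln L = \sum_{i=1}^{n} \ln f(x_i;\theta_1,\theta_2,\ldots,\theta_k)
```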
It then remains to find the parameter values that maximize this function. This is most commonly done by taking the partial derivative of the log-likelihood function with respect to each parameter and setting it equal to zero:
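The condition itself is not reproduced in this text. With Λ denoting the log-likelihood function (a notational assumption) and k parameters, it can be written as:

```latex
\frac{\partial \Lambda}{\partial \theta_j} = 0, \qquad j = 1, 2, \ldots, k
```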
This results in a number of equations with an equal number of unknowns, which can be solved simultaneously. This can be a relatively simple matter if there are closed-form solutions for the partial derivatives. In situations where this is not the case, numerical techniques need to be employed.
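The one-parameter exponential distribution gives a short worked illustration of these steps. The original expression is not reproduced in this text; assuming the usual exponential pdf f(t) = λe^(-λt) and a complete data set of failure times t1, t2,..., tn, the likelihood function would be:

```latex
L(\lambda \mid t_1,t_2,\ldots,t_n)
  = \prod_{i=1}^{n} \lambda e^{-\lambda t_i}
  = \lambda^{n} e^{-\lambda \sum_{i=1}^{n} t_i}
```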
where lambda (λ) is the parameter we are trying to estimate. Since the log-likelihood function is easier to manipulate mathematically, we derive this by taking the natural logarithm of the likelihood function. For the exponential distribution, the log-likelihood function has the form:
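The expression is not reproduced in this text. Assuming a complete data set of n failure times t1, t2,..., tn and the exponential pdf f(t) = λe^(-λt), with Λ denoting the log-likelihood (a notational assumption), it works out to:

```latex
\Lambda = \sum_{i=1}^{n} \ln\!\left(\lambda e^{-\lambda t_i}\right)
        = n \ln \lambda - \lambda \sum_{i=1}^{n} t_i
```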
Taking the derivative of the equation with respect to λ and setting it equal to zero results in:
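The resulting equation is not reproduced in this text. Assuming the exponential log-likelihood Λ = n ln λ - λ Σ ti for n complete failure times ti, differentiating term by term gives:

```latex
\frac{\partial \Lambda}{\partial \lambda}
  = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0
```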
From this point, it is a simple matter to rearrange this equation to solve for λ:
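The rearranged equation is not reproduced in this text. Starting from the condition n/λ - Σ ti = 0 for n complete failure times ti, and writing the estimate with a hat (a notational assumption), the solution is:

```latex
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i}
```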
This gives the closed-form maximum likelihood estimate for the parameter of the one-parameter exponential distribution. This is one of the simplest examples available, but it illustrates the process well. The methodology is more complex for distributions with multiple parameters or without closed-form solutions.
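The same approach extends to data sets that contain suspensions (right-censored observations). The expression that the next paragraph describes is not reproduced in this text; in its standard form, each failure contributes the pdf and each suspension contributes the probability of surviving past the suspension time:

```latex
L(\theta_1,\ldots,\theta_k \mid x_1,\ldots,x_n,\, y_1,\ldots,y_m)
  = \prod_{i=1}^{n} f(x_i;\theta_1,\ldots,\theta_k)
    \cdot \prod_{j=1}^{m} \left[1 - F(y_j;\theta_1,\ldots,\theta_k)\right]
```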
where m is the number of suspended data points, yj is the jth suspension and F(yj;θ1, θ2,..., θk) is the cdf. With this function, the analysis process proceeds as described previously: take the natural logarithm of the likelihood function, take the partial derivatives with respect to the parameters and solve simultaneously.
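For the one-parameter exponential distribution, these steps still lead to a closed-form answer when suspensions are present: each suspension contributes ln(1 - F(y)) = -λy to the log-likelihood, so the estimate becomes the number of failures divided by the total accumulated time. The following is a minimal Python sketch of that calculation. The function names and the data values are hypothetical and chosen only for illustration.

```python
import math


def exponential_log_likelihood(lam, failures, suspensions=()):
    """Log-likelihood of a one-parameter exponential model.

    Each failure time x contributes ln f(x) = ln(lam) - lam * x.
    Each suspension time y contributes ln(1 - F(y)) = -lam * y.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    total_time = sum(failures) + sum(suspensions)
    return len(failures) * math.log(lam) - lam * total_time


def exponential_mle(failures, suspensions=()):
    """Closed-form estimate: number of failures over total accumulated time."""
    return len(failures) / (sum(failures) + sum(suspensions))


# Hypothetical data, in hours (not taken from the article).
failures = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
suspensions = [100.0, 150.0]

lam_complete = exponential_mle(failures)               # failures only
lam_censored = exponential_mle(failures, suspensions)  # failures and suspensions
print(lam_complete, lam_censored)
```

Changing the suspension times changes the estimate, because the suspension values themselves enter the likelihood function.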
The likelihood function for the suspended data helps illustrate some of the advantages that MLE analysis has over other parameter estimation techniques. First and foremost, MLE methodology takes into account the values of the suspension times, as is illustrated in the previous equation. Probability plotting and rank regression only take into account the relative location of the suspensions, not the actual time-to-suspension values. This makes MLE a much more powerful tool when dealing with data sets that contain a relatively large number of suspensions. A second advantage of the MLE method is that it is theoretically possible to derive parameter estimates for data sets containing nothing but suspensions. (Note, however, that the mathematics of the partial derivatives make it impossible to solve for more than one parameter with data sets consisting of nothing but suspensions. Either a one-parameter distribution must be used or values for other parameters in the distribution must be assumed. It is generally not recommended to draw important conclusions from analyses of data sets containing only suspensions.)
Note that MLE uses only the failure and suspension times; at no point are reliability/unreliability values or estimates incorporated. As was discussed in the previous Reliability Basics article, data points are placed on a probability plot with the failure time as the x-coordinate and an unreliability estimate as the y-coordinate. Because maximum likelihood estimation does not use these unreliability estimates, the line based on the MLE parameter estimates does not always track the plotted points. This does not mean that one method or the other is "wrong," just that the line and the points were obtained using different techniques.
Thus, the "peak" of the likelihood surface function corresponds to the values of the parameters that maximize the likelihood function, i.e. the MLE estimates for the distribution's parameters.
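The idea of a peak on the likelihood surface can be illustrated numerically. The sketch below evaluates the two-parameter Weibull log-likelihood for complete data over a coarse grid of beta and eta values and reports the grid point with the highest value. The data set, grid ranges and function names are assumptions made for this example, and a grid search is used only because it mirrors the surface picture; it is not how analysis software would normally locate the maximum.

```python
import math


def weibull_log_likelihood(beta, eta, failures):
    """Log-likelihood of a two-parameter Weibull model for complete data."""
    total = 0.0
    for x in failures:
        z = x / eta
        # ln f(x) = ln(beta/eta) + (beta - 1) * ln(x/eta) - (x/eta)^beta
        total += math.log(beta / eta) + (beta - 1.0) * math.log(z) - z ** beta
    return total


def grid_peak(failures, betas, etas):
    """Return (log-likelihood, beta, eta) at the highest grid point."""
    return max(
        (weibull_log_likelihood(b, e, failures), b, e)
        for b in betas
        for e in etas
    )


# Hypothetical failure times, in hours (not taken from the article).
failures = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
betas = [0.5 + 0.02 * i for i in range(176)]  # 0.50 to 4.00
etas = [30.0 + 1.0 * i for i in range(171)]   # 30 to 200

peak_ll, peak_beta, peak_eta = grid_peak(failures, betas, etas)
print(peak_beta, peak_eta, peak_ll)
```

A finer grid, or a numerical optimizer started near the grid peak, would refine the estimates further.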
Comments on Maximum Likelihood Estimation
Unfortunately, the sample size needed for the asymptotic (large-sample) properties of MLE to hold can be quite large: from thirty to fifty to more than a hundred exact failure times, depending on the application. With fewer data points, the estimates can be biased. It is known, for example, that MLE estimates of the shape parameter of the Weibull distribution are biased for small sample sizes, and the effect can increase with the amount of censoring. This bias can cause discrepancies in analysis.
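The small-sample bias of the Weibull shape estimate can be seen in a simple simulation. The sketch below repeatedly draws complete samples from a Weibull distribution with a known shape parameter, solves the likelihood equation for beta by bisection, and averages the estimates for several sample sizes. The true parameter values, sample sizes, trial count, seed and function names are all assumptions made for this example.

```python
import math
import random


def weibull_shape_mle(failures, lo=0.01, hi=100.0, tol=1e-8):
    """Solve the Weibull shape likelihood equation (complete data) by bisection."""
    logs = [math.log(x) for x in failures]
    mean_log = sum(logs) / len(logs)
    log_max = max(logs)

    def score(beta):
        # sum(x^b * ln x) / sum(x^b) - 1/b - mean(ln x), which increases with b.
        # Weights are scaled by max(x)^b so that large b does not overflow.
        weights = [math.exp(beta * (lx - log_max)) for lx in logs]
        weighted = sum(w * lx for w, lx in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / beta - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def mean_shape_estimate(true_beta, true_eta, n, trials, seed=0):
    """Average the shape estimate over many simulated complete samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # random.weibullvariate takes (scale, shape) in that order.
        sample = [rng.weibullvariate(true_eta, true_beta) for _ in range(n)]
        total += weibull_shape_mle(sample)
    return total / trials


# Assumed true values: beta = 2.0, eta = 100.0.
for n in (5, 20, 50):
    print(n, mean_shape_estimate(2.0, 100.0, n, trials=500))
```

Under these assumptions, the average estimate should sit above the true shape value for the smallest samples and move toward it as the sample size grows.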
There are also pathological situations in which the asymptotic properties of the MLE do not apply. One of these is estimating the location parameter of the three-parameter Weibull distribution when the shape parameter has a value close to 1. These problems, too, can cause major discrepancies.
As a rule of thumb, our recommendation is to use rank regression techniques when the sample size is small and the data are not heavily censored. When heavy or uneven censoring is present, when a high proportion of the data points are interval data and/or when the sample size is sufficient, MLE should be preferred.
Copyright 2003 ReliaSoft Corporation, ALL RIGHTS RESERVED