Reliability HotWire: eMagazine for the Reliability Professional

Issue 38, April 2004

Hot Topics

An Accelerated Life Testing Model Involving Performance Degradation

Competing risk problems involving degradation failures are becoming increasingly common and important in practice. In this article, we investigate the modeling of competing risk problems that involve both catastrophic and degradation failures under accelerated conditions. The degradation process is modeled as a Brownian motion process whose first passage time to a boundary is treated as the soft failure, and hard failures are modeled with a Weibull distribution. This allows us to model accelerated testing in a natural way, make inferences about the parameters of the degradation process and predict the reliability of products at operating conditions. The methodology is demonstrated and validated using a real case study.

The problem of competing risks encompasses the study of any failure process in which there is more than one distinct cause or type of failure. It arises naturally in the area of reliability. Existing reliability methods for competing risk problems deal only with products operating at normal conditions that are subject to hard (catastrophic) failures, that is, the abrupt and complete cessation of the product's function. Quality and reliability improvements have led to few or no hard failures at normal conditions, or even at accelerated conditions. On the other hand, the performance of most products deteriorates continuously over time. When an important performance parameter gradually degrades to a critical threshold level, the system or component is defined to have a soft (degradation) failure. Many products exhibit this failure mode, such as semiconductors, mechanical systems and microelectronics.

Reliability and Failure Rate

  • N units are subjected to an accelerated test.
  • Test units can fail in either of two types of failure modes: one of k1 hard failure modes or one of k2 soft failure modes. The component fails when the first of these k (k = k1 + k2) competing failure modes occurs.
  • Two or more causes of failure may occur simultaneously. In the current framework, such a joint event can be handled by defining it as an additional failure type.
  • For the jth soft failure mode (j = 1, ..., k2), the related performance degradation measure Yj(t) is an increasing function of time t, and its pdf is hsj(y | t). When a unit or system degrades to an unacceptable level Sj (Yj(t) ≥ Sj), it fails due to degradation mode j.

Competing risk problems involving performance degradation can be described mathematically as follows. For a given unit, let Ti be a random variable with cumulative distribution function Fhi(t), i = 1, 2, ..., k1 (hard failure modes), and let Tj be a random variable with cumulative distribution function Fsj(t), j = 1, 2, ..., k2 (degradation failure modes). We observe the time of failure, T ≥ 0, where T = min(Ti, Tj) over i = 1, 2, ..., k1 and j = 1, 2, ..., k2, and the cause of the failure, J, among a finite set of possible causes, say J ∈ {1, 2, ..., k1 + k2}; the observation may be censored. Regardless of the component distributions, the component reliability can be expressed as:

and the failure rate function is:

where Rhi(t) and λhi(t) are the reliability and failure rate functions for the ith hard failure mode. They can be written as:

and:

Rsj(t) and λsj(t) are the reliability and failure rate functions for the jth degradation failure mode, respectively. These functions can be described by the degradation measure Yj and its unacceptable level Sj:

and:

In analyzing failure times with competing risks, we would also like to know the probability that a product fails due to a particular failure mode within a mission duration t, and the probability that it eventually fails due to that failure mode. These probabilities allow us to predict which failure mode is likely to cause a product failure during a given operating time or over the product's entire life, and thus indicate which failure mode is more critical.
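For reference, the reliability and failure rate quantities defined earlier in this section can be written in a standard series-system form. The sketch below assumes that the k1 + k2 failure modes act independently, which the text does not state explicitly; f_{hi} denotes the density of F_{hi} and is a symbol introduced here for illustration only.

```latex
\begin{align*}
R(t) &= \prod_{i=1}^{k_1} R_{hi}(t)\,\prod_{j=1}^{k_2} R_{sj}(t), &
\lambda(t) &= \sum_{i=1}^{k_1} \lambda_{hi}(t) + \sum_{j=1}^{k_2} \lambda_{sj}(t), \\
R_{hi}(t) &= 1 - F_{hi}(t), &
\lambda_{hi}(t) &= \frac{f_{hi}(t)}{R_{hi}(t)}, \\
R_{sj}(t) &= P\{Y_j(t) < S_j\} = \int_{-\infty}^{S_j} h_{sj}(y \mid t)\,dy, &
\lambda_{sj}(t) &= -\frac{d}{dt}\,\ln R_{sj}(t).
\end{align*}
```

The third line relies on Y_j(t) being increasing in t: the unit has survived degradation mode j up to time t exactly when Y_j(t) is still below S_j.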

For this problem, the probability that the unit fails due to the ith hard failure mode in a mission duration t is as follows:

Therefore, the probability that the product fails due to the ith hard failure mode can be expressed as:

Similarly, we can obtain the probability that the product fails due to the jth degradation failure mode within a mission duration t, and the probability that the product eventually fails due to the jth degradation failure mode, respectively:

and:
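As a rough numerical illustration of these mode-specific probabilities, the sketch below evaluates the standard expression P_k(t) = ∫ λ_k(u) R(u) du over [0, t], where R(u) is the product of the individual mode reliabilities. It assumes independent failure modes, and the function name `cause_probability` and the constant failure rates are hypothetical values chosen only so that the result can be checked by hand; they are not taken from the article.

```python
import math

def cause_probability(hazard_k, reliabilities, t, steps=20000):
    """Trapezoidal estimate of the probability of failing from mode k by time t.

    hazard_k      : failure rate function of the mode of interest
    reliabilities : reliability functions of ALL competing modes
                    (independence between modes is assumed)
    """
    h = t / steps
    total = 0.0
    for n in range(steps + 1):
        u = n * h
        overall = 1.0
        for rel in reliabilities:
            overall *= rel(u)          # R(u) = product of mode reliabilities
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * hazard_k(u) * overall
    return total * h

# Hypothetical constant failure rates (per hour), for illustration only.
LAM_HARD = 0.002
LAM_SOFT = 0.006

def rel_hard(u):
    return math.exp(-LAM_HARD * u)

def rel_soft(u):
    return math.exp(-LAM_SOFT * u)

modes = [rel_hard, rel_soft]
p_hard = cause_probability(lambda u: LAM_HARD, modes, 5000.0)
p_soft = cause_probability(lambda u: LAM_SOFT, modes, 5000.0)

# For long missions these approach LAM_HARD / (LAM_HARD + LAM_SOFT)
# and LAM_SOFT / (LAM_HARD + LAM_SOFT), respectively.
print(p_hard, p_soft)
```

Letting the mission duration grow large gives the probability that the unit eventually fails from each mode, and comparing the two values indicates which mode is more critical under the assumed rates.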

Competing Risk Model and Statistical Inference

In this section, we consider units or systems with two competing failure modes at accelerated conditions with a stress covariate, z. The first is a soft failure mode due to the degradation process, Y(t), and the second is a catastrophic failure mode due to complete cessation of the product's function.

Without loss of generality, we assume that the degradation process can be described by the following model:

where W(t) is a standard Brownian motion on [0, ∞), σ > 0 is the diffusion parameter (σ² is the variance parameter), μ is the drift parameter and Y0 is the initial degradation level at t0. Suppose that the drift parameter μ = μ(z) depends on the stress conditions. For a given critical threshold, S, the lifetime of the product is the instant at which the degradation process first exceeds S.

For Y0 < S, the lifetime follows an inverse Gaussian distribution with a Lebesgue density function.
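The inverse Gaussian lifetime can be sketched numerically as follows. The code assumes the degradation model has the common form Y(t) = Y0 + μ(t − t0) + σW(t − t0) with μ > 0, which is an assumption about the form of the model, and it uses the textbook first-passage density and distribution function for that process. The function names and parameter values are hypothetical.

```python
import math

def std_normal_cdf(x):
    # erfc keeps accuracy in the far lower tail
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def first_passage_pdf(t, y0, s, mu, sigma, t0=0.0):
    """Density of the first time the drifting Brownian motion reaches s."""
    tau = t - t0
    if tau <= 0.0:
        return 0.0
    d = s - y0                               # distance to the threshold
    scale = d / (sigma * math.sqrt(2.0 * math.pi * tau ** 3))
    return scale * math.exp(-(d - mu * tau) ** 2 / (2.0 * sigma ** 2 * tau))

def first_passage_cdf(t, y0, s, mu, sigma, t0=0.0):
    """Probability that the threshold s has been reached by time t."""
    tau = t - t0
    if tau <= 0.0:
        return 0.0
    d = s - y0
    root = sigma * math.sqrt(tau)
    return (std_normal_cdf((mu * tau - d) / root)
            + math.exp(2.0 * mu * d / sigma ** 2)
            * std_normal_cdf(-(mu * tau + d) / root))

# Hypothetical parameter values, for illustration only.
y0, s, mu, sigma = 0.0, 10.0, 0.1, 0.5
mean_life = (s - y0) / mu                    # mean of the inverse Gaussian lifetime
p_fail_by_mean = first_passage_cdf(mean_life, y0, s, mu, sigma)
print(mean_life, p_fail_by_mean)
```

Under these assumptions the mean lifetime is (S − Y0)/μ, so a stress that increases the drift shortens the expected time to soft failure.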

The lifetime distribution depends on the parameters of the degradation process, Y0, t0, μ and σ², and on the threshold level S. Given n units under test, there are n independent degradation processes Yi(t) corresponding to these units. Let ti1, ..., timi be the mi observation points of the realization of Yi(t), with t0 < ti1 < ... < timi < ∞; thus, the censored observations in the ith realization have the form:

To obtain the likelihood function, we must find the pdf of a truncated Wiener process, since the surviving units have not reached the degradation threshold level S during the test. Let Yj-1 and Yj be the degradation measures at times tj-1 and tj, respectively. The pdf of Y(tj), conditional on Y(τ) < S for tj-1 < τ ≤ tj, is given as follows:
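A standard way to write such a truncated density uses the reflection principle for Brownian motion with drift. The sketch below is that textbook form, not necessarily the exact expression used by the authors. It returns the sub-density of moving from Yj-1 to a value y below S without touching S in between; integrating it over y gives the probability of surviving the interval, and dividing by that probability would give the conditional pdf. The function names and numbers are hypothetical.

```python
import math

def std_normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def killed_transition_pdf(y, y_prev, dt, mu, sigma, s):
    """Sub-density of Y(t_j) = y given Y(t_{j-1}) = y_prev, with no
    crossing of the threshold s during the interval of length dt."""
    if y >= s or y_prev >= s:
        return 0.0
    var = sigma ** 2 * dt
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    # Unrestricted Gaussian transition term
    direct = math.exp(-(y - y_prev - mu * dt) ** 2 / (2.0 * var))
    # Reflected term removes paths that touched s; the exponents are
    # combined before exponentiating to avoid overflow
    reflected = math.exp(2.0 * mu * (s - y_prev) / sigma ** 2
                         - (y + y_prev - 2.0 * s - mu * dt) ** 2 / (2.0 * var))
    return norm * (direct - reflected)

def survival_probability(y_prev, dt, mu, sigma, s):
    """Probability of not reaching s within dt, starting from y_prev < s."""
    d = s - y_prev
    root = sigma * math.sqrt(dt)
    return (std_normal_cdf((d - mu * dt) / root)
            - math.exp(2.0 * mu * d / sigma ** 2)
            * std_normal_cdf(-(d + mu * dt) / root))

# Hypothetical values, for illustration only.
y_prev, s, mu, sigma, dt = 2.0, 10.0, 0.1, 0.5, 50.0
print(killed_transition_pdf(6.0, y_prev, dt, mu, sigma, s),
      survival_probability(y_prev, dt, mu, sigma, s))
```

In a likelihood built from successive inspections, one such term would be contributed for each interval in which a unit is observed to remain below the threshold.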

On the other hand, for the hard failure, we assume that the time to a hard failure can be modeled as a Weibull-distributed random variable with the probability density function f(t) defined as follows:

f(t) = (β/η) (t/η)^(β − 1) exp[−(t/η)^β]

where t ≥ 0, β > 0 and η > 0. The scale parameter η is directly proportional to the mean time to failure, while the shape parameter β (or slope) determines the shape of the failure rate function, making it a decreasing function when β < 1, constant when β = 1 and increasing when β > 1. Thus, the likelihood function of the competing risk problem can be expressed as:

where Δ = 1 if a hard failure occurs and Δ = 0 if a soft failure occurs.
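The effect of the Weibull shape parameter on the hard-failure rate described above can be checked with a short sketch. It uses the standard two-parameter Weibull functions; the value of η and the evaluation times are hypothetical.

```python
import math

def weibull_reliability(t, beta, eta):
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    # Failure rate; requires t > 0 when beta < 1
    return (beta / eta) * (t / eta) ** (beta - 1.0)

def weibull_pdf(t, beta, eta):
    # The density is the failure rate times the reliability
    return weibull_hazard(t, beta, eta) * weibull_reliability(t, beta, eta)

ETA = 1000.0  # hypothetical scale parameter, in hours
for beta in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(t, beta, ETA) for t in (100.0, 500.0, 900.0)]
    print(beta, rates)
```

With β = 0.5 the printed rates fall over time, with β = 1 they stay at 1/η, and with β = 2 they rise, matching the three cases in the text.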

Model Validation

In this section, we analyze an accelerated testing experiment that was conducted in the Quality and Reliability Engineering Laboratory at Rutgers University. The purpose of this experiment is to study the effect of stress on light emitting diodes (LEDs) and to predict their reliability under operating conditions.

The reliability of LEDs is strongly dependent on the degradation mode and device characteristics, such as current versus optical output power and operating temperature. The influence of physical degradation on the degradation rate of the device characteristics is affected by the device characteristics themselves. The correlation between reliability and degradation modes is not so common. LED degradation modes are studied and rapid degradation is found to be related to the generation or growth of dark spot/line defects. At higher current density, voltage or temperature, rapid power reduction due to dark spot/line defect generation occurs. Two primary causes for the dark spot/line defects are identified. They are the precipitation of host atoms and the migration of electrode metal into the semiconductor. Solder and heat sink degradation is the cause of sudden failure (Fukuda, Fujita, and Iwane, 1983). We can consider this failure mode as a hard failure mode.

In our experiment, we assume that an LED fails either when its performance reaches a specified rapid degradation level, which is defined by an additional test (degradation failure), or suddenly due to solder and heat sink degradation (hard failure). This is a typical competing risk problem.

To continuously record the failure times of testing components and to control the applied factors, an automatic accelerated life testing environment is designed. Figure 1 depicts the layout of the experimental equipment.

The AT-MIO-16 is a multifunction analog, digital and timing I/O board. The AMUX-64T multiplexer board is a four-to-one multiplexer that can process 64 single-ended inputs or 32 differential inputs. The data acquisition board is used to convert the LED performance degradation information into a voltage signal. Figures 2 and 3 show the samples of an LED test.

Figure 1: Layout of experimental equipment

Figure 2: Samples of an LED test

Figure 3: LEDs testing set

The experiment is conducted at three different stress levels: 40 mA, 35 mA and 28 mA. We use the data obtained at the 40 mA and 35 mA stress levels to estimate the model and then validate the model using the 28 mA data. At each stress level, we conduct six accelerated life testing experiments with 32 samples each, for a total of 192 samples per stress level. In each test, a circuit board containing 32 randomly chosen LEDs is placed in a temperature chamber where the temperature and the current in the circuit are held constant. The light intensity of the LEDs is then measured at room temperature every 50 hours. We use an inverse function to transform the original decreasing degradation paths (LED light intensity) into monotonically increasing functions of time. Table 1 shows the inverse LEDs light intensity and hard failure times for some of the samples.

Table 1: Inverse LEDs light intensity and hard failure times

The Inverse Power Law (IPL) life-stress relationship is commonly used for non-thermal accelerated stresses and is given by:

L(V) = 1 / (K V^n)

where L represents a quantifiable life measure, V represents the stress level, and K and n are model parameters to be determined (K > 0). In this particular competing risk problem, we use the IPL to describe the life-stress relationship. Thus, the IPL-Weibull-Brownian competing risk model can be derived by setting the Weibull scale parameter:

and the drift parameter:

where I is the stress level (mA). This yields the following model:

Remark: Yij is the performance degradation measure of the ith LED at time tj, and S is the critical threshold value, which we predetermine according to the user's requirement:

All these values are known and can be obtained from the experimental data without difficulty. K1, K2, n1, n2, σ and β are unknown and need to be estimated. For data involving both catastrophic and degradation failures occurring under accelerated life testing, six parameters is a reasonable number, especially for a competing risk problem.
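To make the role of the six parameters concrete, the following sketch evaluates a competing-risk reliability at a given drive current. It is an illustration under stated assumptions, not the authors' fitted model: the Weibull scale is taken as η(I) = 1/(K1·I^n1) following the IPL, the drift is assumed to take the form μ(I) = K2·I^n2 so that a higher current speeds up degradation, the two modes are assumed independent, and every numerical value is hypothetical.

```python
import math

def std_normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def weibull_scale(current_ma, k1, n1):
    return 1.0 / (k1 * current_ma ** n1)      # IPL applied to the Weibull scale

def drift(current_ma, k2, n2):
    return k2 * current_ma ** n2              # ASSUMED form for the drift

def competing_risk_reliability(t, current_ma, p, y0, s):
    """R(t) = R_hard(t) * R_soft(t) at the given current, modes independent."""
    if t <= 0.0:
        return 1.0
    eta = weibull_scale(current_ma, p["K1"], p["n1"])
    mu = drift(current_ma, p["K2"], p["n2"])
    r_hard = math.exp(-((t / eta) ** p["beta"]))
    d = s - y0
    root = p["sigma"] * math.sqrt(t)
    # Probability that the drifting Brownian motion has not yet reached s
    r_soft = (std_normal_cdf((d - mu * t) / root)
              - math.exp(2.0 * mu * d / p["sigma"] ** 2)
              * std_normal_cdf(-(d + mu * t) / root))
    return r_hard * r_soft

# Hypothetical parameter values; NOT estimates from the LED experiment.
params = {"K1": 6.4e-8, "n1": 2.0, "K2": 2.55e-7, "n2": 2.0,
          "sigma": 0.005, "beta": 1.5}
Y0, S = 0.0, 1.0                              # hypothetical initial level and threshold

r28 = competing_risk_reliability(2000.0, 28.0, params, Y0, S)
r40 = competing_risk_reliability(2000.0, 40.0, params, Y0, S)
print(r28, r40)
```

With parameters estimated from the higher stress levels, the same function could be evaluated at the lowest stress level to produce predictions for comparison with observed survival fractions.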

In order to estimate the unknown parameters, we use a numerical method to maximize the log-likelihood function. The resultant model is then used to estimate reliability at 28 mA using the accelerated testing data. The estimated values of reliability are then compared with the experimental data collected at the same level. Table 2 compares the reliability values obtained from our competing risk model and those obtained experimentally.

Table 2: Reliability of LEDs at 28 mA

Based on Table 2, the estimated reliability closely matches the experimental reliability, implying that our model provides good estimates of the LEDs' reliability. Figure 4 shows the reliability of the LEDs at 28 mA.

Figure 4: Reliability of the LEDs at 28 mA

Remark: If the accelerated stress is temperature or humidity, we can model the ALT by relating the Weibull scale parameter and the drift parameter to the stress through the Arrhenius or Eyring relationship. If the accelerated stress is non-thermal, as in this particular case, we can consider using the IPL. If there are multiple stresses, we can use a combination of the Arrhenius, Eyring and IPL models to capture the life-stress relationship.

Conclusions

In this article, we developed an IPL-Weibull-Brownian model for analyzing competing risk data involving performance degradation and hard failures obtained at accelerated operating conditions. The model is also validated experimentally by conducting accelerated testing on the LEDs subject to high test driving current. Three experiments are conducted at different operating conditions. The results of two experiments are used in estimating the parameters and, subsequently, the reliability of the LEDs is estimated at the same stress conditions of the third experiment. Comparing the reliability estimates obtained by the proposed model with those obtained using the data from the third experiment indicates that the proposed model is valid and accurate.

 

This article is based on the paper "An Accelerated Life Testing Model Involving Performance Degradation" by Wenbiao Zhao (ReliaSoft) and E.A. Elsayed (Rutgers University), which was presented at the 2004 Annual Reliability and Maintainability Symposium.

ReliaSoft Corporation

Copyright 2004 ReliaSoft Corporation, ALL RIGHTS RESERVED