In the previous section of this on-line reference, Load Sharing, we discussed the case of a system with load sharing components. This is a form of redundancy with dependent components. That is, the failure of one component affects the failure of the other(s). In this section, we will discuss another form of redundancy, standby redundancy. Under standby redundancy, the redundant components are set to be under a lighter load condition (or no load) while not needed and under the operating load when they are activated.
In standby redundancy, the components are set to have two states, an active state and a standby state. Components in standby redundancy have two failure distributions, one for each state. When in the standby state, they have a quiescent (or dormant) failure distribution and when operating, they have an active failure distribution.
In the case that both quiescent and active failure distributions are the same, the units are in a simple parallel configuration (also called a hot standby configuration). When the rate of failure of the standby component is less in quiescent mode than in active mode, then that is called a warm standby configuration. Lastly, when the rate of failure of the standby component is zero in quiescent mode (i.e. the component cannot fail when in standby), then you have a cold standby configuration.
Consider two components in a standby configuration. Component 1 is the active component with a Weibull failure distribution and parameters β = 1.5 and η = 1,000. Component 2 is the standby component. When Component 2 is operating, it also has a Weibull failure distribution with β = 1.5 and η = 1,000. Furthermore, assume the following cases for the quiescent distribution.
Case 1: The quiescent distribution is the same as the active (hot standby).
Case 2: The quiescent distribution is a Weibull with β = 1.5 and η = 2,000 (warm standby).
Case 3: The component cannot fail in quiescent mode (cold standby).
In this case, the reliability of the system at some time, t, can be obtained using the following equation:
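$$R(t) = R_1(t) + \int_0^t f_1(x) \cdot R_{2;SB}(x) \cdot \frac{R_{2;A}(t_e + t - x)}{R_{2;A}(t_e)} \, dx \tag{26}$$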
Where:
R1 is the reliability of the active component.
f1 is the pdf of the active component.
R2;SB is the reliability of the standby component when in standby mode (quiescent reliability).
R2;A is the reliability of the standby component when in active mode.
te is the equivalent operating time for the standby unit if it had been operating at an active mode, such that:
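$$R_{2;SB}(x) = R_{2;A}(t_e) \tag{27}$$
Here x is the time at which the active component fails and the standby component is switched on.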
Eqn. (27) can be solved for te and substituted into Eqn. (26).
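The calculation described above can be sketched numerically. The following is a minimal Python sketch, not BlockSim's implementation: it assumes Weibull distributions throughout, assumes shape parameters greater than 1, and uses a simple Simpson's rule for the integral. The function names (weibull_rel, weibull_pdf, simpson, standby_reliability) are illustrative only. For Weibull distributions, solving Eqn. (27) gives te = η_A (x / η_SB)^(β_SB / β_A), which is what the code uses.

```python
import math

def weibull_rel(t, beta, eta):
    """Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))

def weibull_pdf(t, beta, eta):
    """Weibull pdf; this sketch assumes beta > 1, so f(0) = 0."""
    if t <= 0.0:
        return 0.0
    return (beta / eta) * (t / eta) ** (beta - 1) * weibull_rel(t, beta, eta)

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with an even number of intervals n."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def standby_reliability(t, active1, active2, quiescent):
    """System reliability at time t for one active and one standby unit.

    active1, active2: (beta, eta) of components 1 and 2 when operating.
    quiescent: (beta, eta) of component 2 in standby, or None for cold standby.
    """
    beta1, eta1 = active1
    beta2, eta2 = active2

    def integrand(x):
        if quiescent is None:
            r_sb, te = 1.0, 0.0  # cold standby: no aging while dormant
        else:
            beta_q, eta_q = quiescent
            r_sb = weibull_rel(x, beta_q, eta_q)
            # Equivalent active age: solve R_2;SB(x) = R_2;A(te) for te.
            te = eta2 * (x / eta_q) ** (beta_q / beta2)
        conditional = (weibull_rel(te + t - x, beta2, eta2)
                       / weibull_rel(te, beta2, eta2))
        return weibull_pdf(x, beta1, eta1) * r_sb * conditional

    return weibull_rel(t, beta1, eta1) + simpson(integrand, 0.0, t)

ACTIVE = (1.5, 1000.0)
hot = standby_reliability(1000.0, ACTIVE, ACTIVE, (1.5, 1000.0))
warm = standby_reliability(1000.0, ACTIVE, ACTIVE, (1.5, 2000.0))
cold = standby_reliability(1000.0, ACTIVE, ACTIVE, None)
print(f"hot={hot:.4f} warm={warm:.4f} cold={cold:.4f}")
```

As a sanity check, the hot standby case should reduce to the simple parallel result 1 - (1 - R1(t))^2, and the warm standby case should agree with the 70.57% figure quoted for 1000 hours later in this section.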
Figure 5.24 illustrates the example as entered in BlockSim using a standby container.
Figure 5.24: Standby container.
The active and standby blocks are within a container, which is used to specify standby redundancy. Since the standby component has two distributions (active and quiescent), the Block Properties window of the standby block has two pages for specifying each one. Figures 5.25 and 5.26 illustrate these pages.
Figure 5.25: Defining the active failure distribution.
Figure 5.26: Defining the quiescent failure distribution.
The results for 1000 hours are given in the following table:
Note that even though the beta for the quiescent distribution is the same as in the active distribution, it is possible that the two can be different. That is, the failure modes present during the quiescent mode could be different than the modes present during the active mode. In that sense, the two distribution types can be different as well (e.g. lognormal when quiescent and Weibull when active).
In many cases when considering standby systems, a switching device may also be present that switches from the failed active component to the standby component. The reliability of the switch can also be incorporated into Eqn. (26). The next section explores this.
BlockSim's System Reliability Equation window returns a single token for the reliability of units in a standby configuration. This is the same as the load sharing case discussed in the previous section of this on-line reference.
In many cases when dealing with standby systems, a switching device is present that will switch to the standby component in the case of the failure of the active component. Therefore, the failure properties of the switch must also be included in the analysis.
In most cases when the reliability of a switch is to be included in the analysis, two probabilities can be considered. The first and most common one is the probability of the switch performing the action (i.e. switching) when requested to do so. This is called "Switch Probability per Request" in BlockSim and is expressed as a static probability (e.g. 90%). The second probability is the quiescent reliability of the switch. This is the reliability of the switch as it ages (e.g. the switch might wear out with age due to corrosion, material degradation, etc.). Thus, it is possible for the switch to fail before the active component. However, a switch failure does not by itself cause the system to fail; the system fails only if the switch is needed and has already failed. For example, if the switch fails but the active component survives until the mission end time, then the system does not fail. However, if the active component fails and the switch has also failed, then the system cannot be switched to the standby component and it therefore fails.
In analyzing standby components with a switching device, either or both failure probabilities (during the switching or while waiting to switch) can be considered for the switch, since each probability can represent different failure modes. For example, the switch probability per request may represent software-related issues or the probability of detecting the failure of an active component and the quiescent probability may represent wear-out type failures of the switch.
To illustrate the formulation, consider the previous example, where we originally assumed perfect switching. To examine the effects of including an imperfect switch, we will assume that when the active component fails there is a 90% probability that the switch will switch from the active component to the standby component. In addition, assume that the switch can also fail due to a wear-out failure mode described by a Weibull distribution with β = 1.7 and η = 5,000.
Therefore, the reliability of the system at some time, t, is given by the following equation.
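$$R(t) = R_1(t) + \int_0^t f_1(x) \cdot R_{2;SB}(x) \cdot \frac{R_{2;A}(t_e + t - x)}{R_{2;A}(t_e)} \cdot R_{SW;Q}(x) \cdot R_{SW;REQ} \, dx \tag{28}$$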
Where:
R1 is the reliability of the active component.
f1 is the pdf of the active component.
R2;SB is the reliability of the standby component when in standby mode (quiescent reliability).
R2;A is the reliability of the standby component when in active mode.
RSW;Q is the quiescent reliability of the switch.
RSW;REQ is the switch probability per request.
te is the equivalent operating time for the standby unit if it had been operating at an active mode.
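The formulation with a switch can be sketched in the same way. This is a minimal Python sketch under the same assumptions as the standby equation itself (Weibull distributions, shape parameters greater than 1, Simpson's rule for the integral); it is not BlockSim's implementation, and the function name standby_with_switch and its argument layout are illustrative only. The two switch terms multiply only the integrand, never the R1(t) term.

```python
import math

def weibull_rel(t, beta, eta):
    """Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))

def weibull_pdf(t, beta, eta):
    """Weibull pdf; this sketch assumes beta > 1, so f(0) = 0."""
    if t <= 0.0:
        return 0.0
    return (beta / eta) * (t / eta) ** (beta - 1) * weibull_rel(t, beta, eta)

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with an even number of intervals n."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def standby_with_switch(t, active1, active2, quiescent, switch_q, p_request):
    """Reliability at t of a two-unit standby system with an imperfect switch.

    quiescent: (beta, eta) of the standby unit while dormant, None if cold.
    switch_q:  (beta, eta) of the switch wear-out mode, None if it cannot age.
    p_request: static probability that the switch works when requested.
    """
    beta1, eta1 = active1
    beta2, eta2 = active2

    def integrand(x):
        if quiescent is None:
            r_sb, te = 1.0, 0.0
        else:
            beta_q, eta_q = quiescent
            r_sb = weibull_rel(x, beta_q, eta_q)
            te = eta2 * (x / eta_q) ** (beta_q / beta2)  # equivalent active age
        conditional = (weibull_rel(te + t - x, beta2, eta2)
                       / weibull_rel(te, beta2, eta2))
        r_sw_q = 1.0 if switch_q is None else weibull_rel(x, *switch_q)
        return weibull_pdf(x, beta1, eta1) * r_sb * conditional * r_sw_q * p_request

    return weibull_rel(t, beta1, eta1) + simpson(integrand, 0.0, t)

ACTIVE = (1.5, 1000.0)
WARM = (1.5, 2000.0)
perfect = standby_with_switch(1000.0, ACTIVE, ACTIVE, WARM, None, 1.0)
imperfect = standby_with_switch(1000.0, ACTIVE, ACTIVE, WARM, (1.7, 5000.0), 0.9)
print(f"perfect switch={perfect:.4f} imperfect switch={imperfect:.4f}")
```

With a perfect switch (no wear-out and a request probability of 1) the function falls back to the switchless standby result, which is a convenient check on the implementation.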
This problem can be solved in BlockSim by including these probabilities in the container's properties, as shown in Figures 5.27 and 5.28. In BlockSim, the standby container is acting as the switch.
Figure 5.27: Standby container (switch) failure distribution while waiting to switch.
Figure 5.28: Standby container (switch) failure probabilities while attempting to switch.
Note that there are additional properties that can be specified in BlockSim for a switch, such as "Switch Restart Probability," "Finite Restarts" and "Switch Delay Time." In many applications, the switch is re-tested (or re-cycled) if it fails to switch the first time. In these cases, it is possible that it switches on the second, third or nth attempt. The "Switch Restart Probability" specifies the probability of success of each additional trial (or attempt) and the "Finite Restarts" specifies the total number of attempts. The probability of success of n consecutive trials is calculated by BlockSim using the binomial distribution and this probability is then incorporated into Eqn. (28). The "Switch Delay Time" property is mostly related to repairable systems and is considered in BlockSim only when using simulation. When using the analytical solution (i.e. without repairs), this property is ignored.
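To make the effect of restarts concrete, the following Python sketch shows one plausible reading of the description above. It is an assumption, not BlockSim's documented formula: the first request succeeds with the per-request probability, and each of a fixed number of additional attempts succeeds independently with the restart probability. The function name and its arguments are hypothetical.

```python
def effective_switch_probability(p_request, p_restart, restarts):
    """Probability that the switch eventually works (illustrative assumption).

    p_request: probability of success on the first request.
    p_restart: probability of success on each additional attempt.
    restarts:  number of additional attempts allowed after the first.
    """
    # Binomial probability of zero successes in `restarts` independent trials.
    p_all_restarts_fail = (1.0 - p_restart) ** restarts
    return p_request + (1.0 - p_request) * (1.0 - p_all_restarts_fail)

print(effective_switch_probability(0.9, 0.5, 2))
```

Under this reading, the returned value would take the place of the switch probability per request in the system reliability equation.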
Solving the analytical solution (as given by Eqn. 28), the following results are obtained.
From the table above, it can be seen that the presence of a switching device has a significant effect on the reliability of a standby system. It is therefore important when modeling standby redundancy to incorporate the reliability properties of the switching device. Note that this methodology is not the same as treating the switching device as another component in series with the standby subsystem. That would be valid only if the failure of the switch resulted in the failure of the system (e.g. switch failing open). In Eqn. (28), the "Switch Probability per Request" and the quiescent reliability of the switch are present only in the second term of the equation. Treating these two failure modes as a series configuration with the standby subsystem would imply that they are also present when the active component is functioning (first term of Eqn. 28). This is invalid and would underestimate the reliability of the system. In other words, these two failure modes become significant only when the active component fails.
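As an example, consider the warm standby case, for which the reliability of the system without the switch is 70.57% at 1000 hours. If we had modeled the switching device as being in series with the warm standby subsystem, then we would get:
$$R_S(1000) = 0.7057 \cdot R_{SW;Q}(1000) \cdot R_{SW;REQ} = 0.7057 \cdot e^{-(1000/5000)^{1.7}} \cdot 0.9 \approx 0.7057 \cdot 0.9372 \cdot 0.9 \approx 0.5953$$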
In the case where a switch failure mode causes the standby subsystem to fail, then this mode can be modeled as an individual block in series with the standby subsystem.
Consider a car with four new tires and a full-size spare. Assume the following failure characteristics:
The tires follow a Weibull distribution with a β = 4 and an η = 40,000 miles while on the car due to wear.
The tires can also fail due to puncture or other random causes. For this, assume a constant failure rate of one occurrence every 50,000 miles.
When not on the car (spare), a tire's probability of failing also has a Weibull distribution with a β = 2 and η = 120,000 miles.
You are embarking on a 1,000 mile trip. If a tire fails during this trip, you will replace it with the spare. However, you will not repair the spare during the trip. In other words, you will continue the trip with the spare on the car and if the spare fails you will be stranded. Determine the probability that you will be stranded.
Active failure distribution for tires:
Due to wear-out, Weibull β = 4 and η = 40,000 miles.
Due to random puncture, exponential with mean μ = 50,000 miles.
Quiescent failure distribution, Weibull β = 2 and η = 120,000 miles.
The block diagram for each tire has two blocks in series, one block representing the wear-out mode and the other the random puncture mode, as shown next:
There are five tires, four active and one standby (represented in the diagram by a standby container with a 4-out-of-5 requirement), as shown next:
For the standby "Wear" block, set the active failure and the quiescent distributions, while for the "Puncture" block, only set the active puncture distribution (because the tire cannot fail due to puncture while stored). Using BlockSim, the probability of being stranded is found to be 0.003 or 0.3%.
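The BlockSim result can be cross-checked with a short numerical calculation. The Python sketch below is one way to write the 4-out-of-5 standby problem analytically and is not BlockSim's internal algorithm. It assumes that the trip succeeds if either no tire on the car fails, or exactly one of the four fails at mileage x, the spare is still good at x, and the spare and the other three tires all survive to the end of the trip. The spare's wear mode is aged in storage through an equivalent mileage te, while its puncture mode starts only when it is mounted. All function names are illustrative.

```python
import math

TRIP = 1000.0                 # miles
WEAR = (4.0, 40000.0)         # Weibull (beta, eta) for wear on the car
PUNCTURE_MEAN = 50000.0       # exponential mean, miles
QUIESCENT = (2.0, 120000.0)   # Weibull (beta, eta) for the stored spare

def weibull_rel(t, beta, eta):
    return math.exp(-((t / eta) ** beta))

def active_rel(t):
    """Tire on the car: wear and puncture blocks in series."""
    return weibull_rel(t, *WEAR) * math.exp(-t / PUNCTURE_MEAN)

def active_pdf(t):
    """pdf of the series pair = (sum of the two hazard rates) * reliability."""
    beta, eta = WEAR
    hazard = (beta / eta) * (t / eta) ** (beta - 1) + 1.0 / PUNCTURE_MEAN
    return hazard * active_rel(t)

def spare_rel(x, t):
    """Spare is good at mileage x and then survives on the car from x to t."""
    beta_a, eta_a = WEAR
    beta_q, eta_q = QUIESCENT
    r_quiescent = weibull_rel(x, beta_q, eta_q)
    te = eta_a * (x / eta_q) ** (beta_q / beta_a)  # equivalent wear mileage
    wear = weibull_rel(te + t - x, beta_a, eta_a) / weibull_rel(te, beta_a, eta_a)
    puncture = math.exp(-(t - x) / PUNCTURE_MEAN)
    return r_quiescent * wear * puncture

def simpson(func, a, b, n=1000):
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def trip_reliability(t):
    no_failure = active_rel(t) ** 4
    # One of four tires fails at x; the other three last the whole trip.
    one_failure = simpson(
        lambda x: 4 * active_pdf(x) * active_rel(t) ** 3 * spare_rel(x, t), 0.0, t)
    return no_failure + one_failure

p_stranded = 1.0 - trip_reliability(TRIP)
print(f"probability of being stranded: {p_stranded:.4f}")
```

Under these assumptions the computed probability rounds to the 0.003 reported above.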
See Also:
Time-Dependent System Reliability (Analytical)
©1999-2007. ReliaSoft Corporation. ALL RIGHTS RESERVED.