Improving Reliability

Reliability engineers are often asked to decide whether one or more components should be improved in order to achieve a minimum required system reliability. (Note: This minimum required system reliability is for a specified time.) There are two approaches to improving the reliability of a system: fault avoidance and fault tolerance. Fault avoidance is achieved by using high-quality, high-reliability components and is usually less expensive than fault tolerance. Fault tolerance, on the other hand, is achieved by redundancy, which can increase design complexity and add cost through additional weight, space, etc.

Before deciding whether to improve the reliability of a system by fault tolerance or fault avoidance, a reliability assessment for each component in the system should be made. Once the reliability values for the components have been quantified, an analysis can be performed in order to determine if that system's reliability goal will be met. If it becomes apparent that the system's reliability will not be adequate to meet the desired goal at the specified mission duration, steps can be taken to determine the best way to improve the system's reliability so that it will reach the desired target.

Consider a system with three components connected reliability-wise in series. The reliabilities for each component for a given time are: R1 = 70%, R2 = 80% and R3 = 90%. A reliability goal, RG = 85%, is required for this system.

The current reliability of the system is:

$$R_S = R_1 \cdot R_2 \cdot R_3 = (0.70)(0.80)(0.90) = 0.504 = 50.4\%$$

Obviously, this is far short of the system's required reliability performance. It is apparent that the reliability of the system's constituent components will need to be increased in order for the system to meet its goal. First, we will try increasing the reliability of one component at a time to see whether the reliability goal can be achieved.

Figure 6.9: Change in system reliability of a three-unit series system due to increasing the reliability of just one component.

Figure 6.9 shows that even by raising the individual component reliability to a hypothetical value of 1 (100% reliability, which implies that the component will never fail), the overall system reliability goal will not be met by improving the reliability of just one component. The next logical step would be to try to increase the reliability of two components. The question now becomes: which two? One might also suggest increasing the reliability of all three components. A basis for making such decisions needs to be found in order to avoid the "trial and error" aspect of altering the system's components randomly in an attempt to achieve the system reliability goal.
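The one-at-a-time check summarized in Figure 6.9 can be reproduced numerically. The following is a minimal Python sketch for the three-unit series example; the function names are illustrative and are not part of BlockSim.

```python
from math import prod

def series_reliability(reliabilities):
    """Reliability of components connected reliability-wise in series."""
    return prod(reliabilities)

def best_case_single_improvements(reliabilities):
    """System reliability when each component in turn is raised to 1.0."""
    results = []
    for i in range(len(reliabilities)):
        improved = list(reliabilities)
        improved[i] = 1.0  # hypothetical component that never fails
        results.append(series_reliability(improved))
    return results

components = [0.70, 0.80, 0.90]
goal = 0.85

print(f"current system reliability: {series_reliability(components):.3f}")
for i, r_sys in enumerate(best_case_single_improvements(components), start=1):
    print(f"component {i} raised to 1.0 -> system {r_sys:.2f}, "
          f"meets goal: {r_sys >= goal}")
```

Even the best case (Component 1 made perfect) leaves the system at 0.80 x 0.90 = 0.72, which is below the 0.85 goal.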

As we have seen, the reliability goal for the preceding example could not be achieved by increasing the reliability of just one component. There are cases, however, where increasing the reliability of one component results in achieving the system reliability goal. Consider, for example, a system with three components connected reliability-wise in parallel. The reliabilities for each component for a given time are: R1 = 60%, R2 = 70% and R3 = 80%. A reliability goal, RG = 99%, is required for this system. The initial system reliability is:

$$R_S = 1 - (1 - R_1)(1 - R_2)(1 - R_3) = 1 - (0.40)(0.30)(0.20) = 0.976 = 97.6\%$$

The current system reliability is inadequate to meet the goal. Once again, we can try to meet the system reliability goal by raising the reliability of just one of the three components in the system.

Figure 6.10: Meeting a reliability goal requirement by increasing a component's reliability.

From Figure 6.10, it can be seen that the reliability goal can be reached by improving Component 1, Component 2 or Component 3. The reliability engineer is now faced with another dilemma: which component's reliability should be improved? This presents a new aspect to the problem of allocating the reliability of the system. Since we know that the system reliability goal can be achieved by increasing at least one unit, the question becomes one of how to do this most efficiently and cost effectively. We will need more information to make an informed decision as to how to go about improving the system's reliability. How much does each component need to be improved for the system to meet its goal? How feasible is it to improve the reliability of each component? Would it actually be more efficient to slightly raise the reliability of two or three components rather than radically improving only one?
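The first of these questions (how much each component alone would need to be improved) can be answered directly for the parallel example by solving the system equation for one component while holding the others fixed. The sketch below is illustrative only; the helper names are assumptions and not BlockSim functions. It says nothing yet about which option is cheapest.

```python
from math import prod

def parallel_reliability(reliabilities):
    """Reliability of components connected reliability-wise in parallel."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

def required_single_reliability(reliabilities, index, goal):
    """Reliability that component `index` must reach, with the other
    components unchanged, for the parallel system to meet `goal`."""
    others_unreliability = prod(
        1.0 - r for j, r in enumerate(reliabilities) if j != index
    )
    return 1.0 - (1.0 - goal) / others_unreliability

components = [0.60, 0.70, 0.80]
goal = 0.99

print(f"initial system reliability: {parallel_reliability(components):.3f}")
for i in range(len(components)):
    needed = required_single_reliability(components, i, goal)
    print(f"component {i + 1}: {components[i]:.2f} -> {needed:.4f}")
```

For these inputs, Component 1 would have to rise from 0.60 to about 0.833, Component 2 from 0.70 to 0.875, or Component 3 from 0.80 to about 0.917.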

In order to answer these questions, we must introduce another variable into the problem: cost. Cost does not necessarily have to be in dollars. It could be described in terms of non-monetary resources, such as time. By associating cost values to the reliabilities of the system's components, we can find an optimum design that will provide the required reliability at a minimum cost.

Cost/Penalty Function

There is always a cost associated with changing a design due to change of vendors, use of higher-quality materials, retooling costs, administrative fees, etc. The cost as a function of the reliability for each component must be quantified before attempting to improve the reliability. Otherwise, the design changes may result in a system that is needlessly expensive or overdesigned. Developing the "cost of reliability" relationship will give the engineer an understanding of which components to improve and how to best concentrate the effort and allocate resources in doing so. The first step will be to obtain a relationship between the cost of improvement and reliability.

The preferred approach would be to formulate the cost function from actual cost data. This can be done from past experience. If a reliability growth program is in place, the costs associated with each stage of improvement can also be quantified. Defining the different costs associated with different vendors or different component models is also useful in formulating a model of component cost as a function of reliability.

However, there are many cases where no such information is available. For this reason, a general (default) behavior model of the cost versus the component's reliability was developed for performing reliability optimization in BlockSim. The objective of this function is to model an overall cost behavior for all types of components. Of course, it is impossible to formulate a model that will be precisely applicable to every situation; but the proposed relationship is general enough to cover most applications. In addition to the default model formulation, BlockSim does allow the definition of user-defined cost models.

Quantifying the Cost/Penalty Function

One needs to quantify a cost function for each component, Ci, in terms of the reliability, Ri, of each component, or:

$$C_i = C_i(R_i) \tag{4}$$

This function should:

Thus, for the cost function to comply with these needs, the following conditions should be adhered to:

The following default cost function (also used in BlockSim) adheres to all of these conditions and acts like a penalty function for increasing a component's reliability. Furthermore, an exponential behavior for the cost is assumed since it should get exponentially more difficult to increase the reliability. See Mettas [21].

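$$C_i(R_i;\, f_i, R_{min,i}, R_{max,i}) = \exp\left[(1 - f_i)\,\frac{R_i - R_{min,i}}{R_{max,i} - R_i}\right] \tag{5}$$

Where:

C_i is the penalty (cost) function of component i, expressed as a function of its reliability, Ri.
f_i is the feasibility of increasing the reliability of component i relative to the other components; it takes values between 0 and 1.
Rmin,i is the minimum (current) reliability of component i at the specified time.
Rmax,i is the maximum achievable reliability of component i at the specified time.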

Note that this penalty function is dimensionless. It essentially acts as a weighting factor that describes the difficulty in increasing the component reliability from its current value, relative to the other components.

Examining the cost function given by Eqn. (5), the following observations can be made:

The Feasibility Term, f

The feasibility term in Eqn. (5) is a constant (or an equation parameter) that represents the difficulty in increasing a component's reliability relative to the rest of the components in the system. Depending on the design complexity, technological limitations, etc., certain components can be very hard to improve. Clearly, the more difficult it is to improve the reliability of the component, the greater the cost. Figure 6.11 illustrates the behavior of the function defined in Eqn. (5) for different values of f. It can be seen that the lower the feasibility value, the more rapidly the cost function approaches infinity.

Figure 6.11: Behavior of the cost function for different feasibility values.
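The trend in Figure 6.11 can be explored with a few lines of Python. The sketch below assumes the exponential penalty form proposed by Mettas, exp[(1 - f)(R - Rmin)/(Rmax - R)], with f between 0 and 1; the function name and the sample values (Rmin = 0.70, Rmax = 0.99) are illustrative assumptions, not BlockSim defaults.

```python
import math

def penalty(r, r_min, r_max, f):
    """Assumed default penalty: exp((1 - f) * (r - r_min) / (r_max - r)).

    r      reliability being evaluated, r_min <= r < r_max
    r_min  current (minimum) reliability of the component
    r_max  maximum achievable reliability of the component
    f      feasibility, assumed to lie between 0 and 1
    """
    if not r_min <= r < r_max:
        raise ValueError("r must satisfy r_min <= r < r_max")
    return math.exp((1.0 - f) * (r - r_min) / (r_max - r))

r_min, r_max = 0.70, 0.99  # illustrative values
for f in (0.1, 0.5, 0.9):
    costs = [penalty(r, r_min, r_max, f) for r in (0.70, 0.80, 0.90, 0.98)]
    print(f"f = {f}: " + ", ".join(f"{c:.3g}" for c in costs))
```

At R = Rmin the penalty equals 1 for every feasibility value, and for any given R above Rmin the penalty is larger when f is smaller, which matches the statement that low-feasibility components become expensive more quickly.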

Several methods can be used to obtain a feasibility value. Weighting factors for allocating reliability have been proposed by many authors and can be used to quantify feasibility. These weights depend on certain factors of influence, such as the complexity of the component, the state of the art, the operational profile, the criticality, etc. Engineering judgment based on past experience, supplier quality, supplier availability and other factors can also be used in determining a feasibility value. Overall, the assignment of a feasibility value is going to be a subjective process. Of course, this problem is negated if the relationship between the cost and the reliability for each component is known because one can use regression methods to estimate the parameter value.

Maximum Achievable Reliability

For the purposes of reliability optimization, we also need to define a limiting reliability that a component will approach, but not reach. The costs near the maximum achievable reliability are very high and the actual value for the maximum reliability is usually dictated by technological or financial constraints. In deciding on a value to use for the maximum achievable reliability, the current state of the art of the component in question and other similar factors will have to be considered. In the end, a realistic estimation based on engineering judgment and experience will be necessary to assign a value to this input.

Note that the time associated with this maximum achievable reliability is the same as that of the overall system reliability goal. Almost any component can achieve a very high reliability value, provided the mission time is short enough. For example, a component with an exponential distribution and a failure rate of one failure per hour has a reliability that drops below 1% for missions greater than five hours. However, it can achieve a reliability of 99.9% as long as the mission is no longer than about 3.6 seconds. For the purposes of optimization in BlockSim, the reliability values of the components are associated with the time for which the system reliability goal is specified. For example, if the problem is to achieve a system goal of 99% reliability at 1000 hours, the maximum achievable reliability values entered for the individual components would be the maximum reliability that each component could attain for a mission of 1000 hours.
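The dependence of reliability on mission time in the exponential example follows from R(t) = exp(-lambda t), which can be inverted to find the longest mission that still meets a given reliability. A small Python check, with illustrative function names:

```python
import math

def exponential_reliability(failure_rate, t):
    """R(t) = exp(-failure_rate * t) for an exponential life distribution."""
    return math.exp(-failure_rate * t)

def max_mission_time(reliability, failure_rate):
    """Longest mission time for which R(t) >= reliability."""
    return -math.log(reliability) / failure_rate

failure_rate = 1.0  # failures per hour

print(f"R(5 h) = {exponential_reliability(failure_rate, 5.0):.4f}")
seconds = max_mission_time(0.999, failure_rate) * 3600.0
print(f"mission time for R = 99.9%: {seconds:.2f} s")
```

With one failure per hour, the reliability at five hours is roughly 0.7%, and a 99.9% reliability requires a mission of only a few seconds.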

As the component reliability, Ri, approaches the maximum achievable reliability, Rmax,i, the cost function approaches infinity. The maximum achievable reliability acts as a scale parameter for the cost function. By decreasing Rmax,i, the cost function is compressed between Rmin,i and Rmax,i, as shown in Figure 6.12.

Figure 6.12: Effect of the maximum achievable reliability on the cost function.
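The compression effect shown in Figure 6.12 can be seen by evaluating the penalty at the same reliability for different values of the maximum achievable reliability. As before, this sketch assumes the exponential form exp[(1 - f)(R - Rmin)/(Rmax - R)]; the numbers used (Rmin = 0.70, f = 0.5) are illustrative only.

```python
import math

def penalty(r, r_min, r_max, f):
    """Assumed default penalty; diverges as r approaches r_max from below."""
    if not r_min <= r < r_max:
        return math.inf  # outside the allowed range for this component
    return math.exp((1.0 - f) * (r - r_min) / (r_max - r))

r_min, f = 0.70, 0.5  # illustrative values
for r_max in (0.95, 0.99, 0.999):
    at_090 = penalty(0.90, r_min, r_max, f)
    near_limit = penalty(r_max - 0.005, r_min, r_max, f)
    print(f"Rmax = {r_max}: cost at R = 0.90 is {at_090:.3g}, "
          f"cost 0.005 below Rmax is {near_limit:.3g}")
```

Lowering Rmax raises the cost of reaching the same reliability (here R = 0.90), because the whole curve is squeezed into a narrower interval above Rmin.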

Cost Function

Once the cost functions for the individual components have been determined, it becomes necessary to develop an expression for the overall system cost. For a system of n components, this takes the form of:

$$C = \sum_{i=1}^{n} C_i(R_i)$$

In other words, the cost of the system is simply the sum of the costs of its components. This is regardless of the form of the individual component cost functions. They can be of the general behavior model in BlockSim or they can be user-defined. (Note: ReliaSoft does not recommend mixing different types of cost functions as it may lead to incorrect or misleading optimization results. This is due to the fact that the default general cost function is dimensionless, whereas many user-defined cost functions would be in terms of dollars per reliability percentage.) Once the overall cost function for the system has been defined, the problem becomes one of minimizing the cost function while remaining within the constraints defined by the target system reliability and the reliability ranges for the components. The latter constraints in this case are defined by the minimum and maximum reliability values for the individual components.

BlockSim employs a nonlinear programming technique to minimize the system cost function. The system has a minimum (current) and theoretical maximum reliability value that is defined by the minimum and maximum reliabilities of the components and by the way the system is configured. That is, the structural properties of the system are accounted for in the determination of the optimum solution. For example, the optimization for a system of three units in series will differ from the optimization for a system consisting of the same three units in parallel. The optimization occurs by varying the reliability values of the components within their respective constraints of maximum and minimum reliability in a way that the overall system goal is achieved. There can be any number of different combinations of component reliability values that achieve the system goal. The optimization routine finds the combination that results in the lowest overall system cost. (Note: The solution is restricted to the specific combination of component reliabilities that yields the lowest system cost and a system reliability equal to the reliability goal.)

Determining the Optimum Allocation Scheme

To determine the optimum reliability allocation, the analyst first determines the system reliability equation (which defines the constraint for the optimization). As an example, and again for a trivial system with three components in series, this would be:

$$R_S = R_1 \cdot R_2 \cdot R_3 \tag{6}$$

If a target reliability of 90% is sought, then Eqn. (6) is recast as:

$$0.90 = R_1 \cdot R_2 \cdot R_3 \tag{7}$$

The objective now is to solve for R1, R2 and R3 so that the equality in Eqn. (7) is satisfied. To obtain an optimum solution, we also need to utilize our cost functions, i.e. define the total allocation costs as:

$$C_T = C_1(R_1) + C_2(R_2) + C_3(R_3) \tag{8}$$

With the cost equation defined, the optimum values for R1, R2 and R3 are the values that satisfy the reliability requirement, Eqn. (7), at the minimum cost, Eqn. (8). BlockSim uses this methodology during the optimization task.

Defining a Feasibility Policy in BlockSim

In BlockSim, you can choose to use the default feasibility function, as defined by Eqn. (5), or use your own function. Figure 6.13 illustrates the use of the default values using the slider control. Figure 6.14 shows the use of an associated feasibility policy to create a user-defined cost function. When defining your own cost function, you should be aware of/adhere to the following guidelines:

Figure 6.13: Setting the default feasibility function in BlockSim with the feasibility slider. Note that the feasibility slider displays values, SV, from 1 to 9 when moved by the user, with SV = 9 being the hardest.

Figure 6.14: Setting a user-defined feasibility function in BlockSim using an associated feasibility policy. Any user-defined equation can be entered as a function of R.

 

See Also:
Reliability Importance and Optimized Reliability Allocation (Analytical)



©1999-2007. ReliaSoft Corporation. ALL RIGHTS RESERVED.