Reliability Growth Analysis Resources
An Overview of Basic Concepts
and Directory of Other Resources
What is Reliability Growth?
In general, the first
prototypes produced during the development of a new complex system will
contain design, manufacturing and/or engineering deficiencies. Because
of these deficiencies, the initial reliability of the prototypes may be
below the system's reliability goal or requirement. To identify
and correct these deficiencies, the prototypes are often subjected to a
rigorous testing program. During testing, problem areas are identified
and appropriate corrective actions (or redesign) are taken. Reliability
growth is the improvement in the reliability of a product (component,
subsystem or system) over a period of time due to changes in the
product's design and/or the manufacturing process.
The concept of
reliability growth is not just theoretical or absolute. Reliability
growth is related to factors such as the management strategy toward
taking corrective actions, effectiveness of the fixes, reliability
requirements, the initial reliability level, reliability funding and
competitive factors. For example, one management team may take
corrective actions for 90% of the failures seen during testing, while
another management team with the same design and test information may
take corrective actions on only 65% of the failures seen during testing.
Different management strategies may attain different reliability values
with the same basic design. The effectiveness of the corrective actions
is also relative when compared to the initial reliability at the
beginning of testing. If corrective actions give a 400% improvement in
reliability for equipment that initially had one tenth of the
reliability goal, the result (one half of the goal) is less significant
than a 50% improvement in reliability for a system that initially had
one half of the reliability goal (which reaches three quarters of the
goal).
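The arithmetic behind this comparison can be sketched in a few lines. This is an illustrative calculation only: it assumes that an "X% improvement" multiplies reliability by (1 + X/100), and the function name `fraction_of_goal_after_fix` is hypothetical.

```python
def fraction_of_goal_after_fix(initial_fraction, improvement_pct):
    """Return reliability as a fraction of the goal after an improvement.

    Assumes an "X% improvement" means multiplying by (1 + X/100).
    """
    return initial_fraction * (1.0 + improvement_pct / 100.0)


# Equipment starting at one tenth of the goal, improved by 400%
low_start = fraction_of_goal_after_fix(0.10, 400.0)

# System starting at one half of the goal, improved by 50%
high_start = fraction_of_goal_after_fix(0.50, 50.0)

print(f"{low_start:.2f} of goal vs {high_start:.2f} of goal")
```

Under this reading, the larger percentage improvement still leaves the first system further from its goal than the second.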
Elements of a Reliability Growth Program
In a formal reliability
growth program a reliability goal (or goals) is set and should be
achieved during the development testing program with the necessary
allocation or reallocation of resources. Therefore, planning and
evaluating are essential factors in a growth process program. A
comprehensive reliability growth program needs well-structured planning
of the assessment techniques. A reliability growth program differs from
a conventional reliability program in that there is a more objectively
developed growth standard against which assessment techniques are
compared. A comparison between the assessment and the planned value
provides a good estimate of whether or not the program is progressing as
scheduled. If the program does not progress as planned, then new
strategies should be considered. For example, a reexamination of the
problem areas may result in changing the management strategy so that
more problem failure modes surfaced during the testing actually receive
a corrective action instead of a repair. Several important factors for
an effective reliability growth program are:
- Management: the decisions are made regarding the management strategy
  to correct problems or not correct problems and the effectiveness of
  the corrective actions
- Testing: provides opportunities to identify the weaknesses and
  failure modes in the design and manufacturing process
- Failure mode root cause identification: funding, personnel and
  procedures are provided to analyze, isolate and identify the cause of
  failures
- Corrective action effectiveness: design resources to implement
  corrective actions that are effective and support attainment of the
  reliability goals
- Valid reliability assessments
The management strategy
may be driven by budget and schedule but it is defined by the actual
actions of management in correcting reliability problems. If the
reliability of a failure mode is known through analysis or testing, then
management makes the decision either not to fix (no corrective action)
or to fix (implement a corrective action) that failure mode. Generally,
if the reliability of the failure mode meets the expectations of
management, then no corrective actions would be expected. If the
reliability of the failure mode is below expectations, the management
strategy would generally call for the implementation of a corrective
action.
Another part of the
management strategy is the effectiveness of the corrective actions. A
corrective action typically does not eliminate a failure mode from
occurring again. It simply reduces its rate of occurrence. A corrective
action, or fix, for a problem failure mode typically removes a certain
amount of the failure mode's failure intensity, but a certain amount
will remain in the system. The fractional decrease in the problem mode's
failure intensity due to the corrective action is called the
effectiveness factor (EF).
The EF will vary from failure mode to failure mode but a typical average
for government and industry systems has been reported to be about 0.70.
With an EF equal to 0.70, a corrective action for a failure mode removes
about 70% of the failure intensity, but 30% remains in the system.
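As a minimal sketch of this relationship, the snippet below applies an EF to a single failure mode. The starting failure intensity of 0.02 failures per hour and the function name `remaining_failure_intensity` are assumptions made for illustration.

```python
def remaining_failure_intensity(mode_intensity, effectiveness_factor):
    """Failure intensity left in the system after a corrective action."""
    if not 0.0 <= effectiveness_factor <= 1.0:
        raise ValueError("effectiveness factor must be between 0 and 1")
    return (1.0 - effectiveness_factor) * mode_intensity


# Hypothetical failure mode: 0.02 failures per hour before the fix
before = 0.02
after = remaining_failure_intensity(before, 0.70)
removed = before - after

print(f"removed: {removed:.4f} per hour, remaining: {after:.4f} per hour")
```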
Corrective action
implementation raises the following question: "What if some of the fixes
cannot be incorporated during testing?" It is possible that only some
fixes can be incorporated into the product during testing. However,
others may be delayed until the end of the test since it may be too
expensive to stop and then restart the test, or the equipment may be too
complex for performing a complete teardown. Implementing delayed fixes
usually results in a distinct jump in the reliability of the system at
the end of the test phase. For corrective actions implemented during
testing, the additional follow-on testing provides feedback on how
effective the corrective actions are and provides opportunity to uncover
additional problems to correct.
Evaluation of the delayed
corrective actions is provided by projected reliability values. The
demonstrated reliability is based on the actual current system
performance and estimates the system reliability due to corrective
actions incorporated during testing. The projected reliability is based
on the impact of the delayed fixes that will be incorporated at the end
of the test or between test phases.
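The difference between the two values can be illustrated with a deliberately simplified sketch. It assumes a constant failure rate per failure mode, uses made-up failure counts and EF values, and ignores any failure modes not yet observed, which a formal projection model would typically account for. The function names are hypothetical.

```python
# Hypothetical test data: (failures observed, fix delayed to end of test?, assumed EF)
TEST_HOURS = 1000.0
modes = [
    (4, True, 0.70),   # delayed fix planned
    (3, True, 0.70),   # delayed fix planned
    (3, False, 0.0),   # no corrective action planned
]


def demonstrated_intensity(modes, test_hours):
    """Failure intensity based on what was actually seen during the test."""
    return sum(n for n, _, _ in modes) / test_hours


def projected_intensity(modes, test_hours):
    """Failure intensity after delayed fixes remove a fraction EF of each fixed mode."""
    total = 0.0
    for n, delayed_fix, ef in modes:
        rate = n / test_hours
        total += (1.0 - ef) * rate if delayed_fix else rate
    return total


demonstrated_mtbf = 1.0 / demonstrated_intensity(modes, TEST_HOURS)
projected_mtbf = 1.0 / projected_intensity(modes, TEST_HOURS)
print(f"demonstrated MTBF: {demonstrated_mtbf:.1f} h, projected MTBF: {projected_mtbf:.1f} h")
```

The gap between the two MTBF values corresponds to the jump in reliability expected when the delayed fixes are incorporated.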
When does a reliability
growth program take place in the development process? Actually, there is
more than one answer to this question. The modern approach to
reliability acknowledges that typical reliability tasks often do not
yield a system that has attained its reliability goals or its
cost-effective reliability potential. Therefore, reliability
growth may start very early in a program utilizing Integrated
Reliability Growth Testing (IRGT). This approach recognizes that
reliability problems often surface early in engineering tests. The focus
of these engineering tests is typically on performance and not
reliability. IRGT simply piggybacks reliability failure reporting, in an
informal fashion, on all engineering tests. When a potential reliability
problem is observed, reliability engineering is notified and
appropriate design action is taken. IRGT will usually be implemented at
the same time as the basic reliability tasks. In addition to IRGT,
reliability growth may take place during early prototype testing, during
dedicated system testing, during production testing, and from feedback
from any manufacturing or quality testing or inspections. The formal
dedicated testing, or RGDT, will typically take place after the basic
reliability tasks have been completed.
Note that when testing and
assessing against a product's specifications, the test environment must
be consistent with the specified environmental conditions under which
the product specifications are defined. In addition, when testing
subsystems it is important to realize that interaction failure modes may
not be generated until the subsystems are integrated into the total
system.
Reliability Growth Analysis
Reliability growth
analysis is the process of collecting, modeling, analyzing and
interpreting data from the reliability growth development test program
(development testing). In addition, reliability growth analysis can be
done for data collected from the field (fielded systems). Fielded
systems analysis also includes analyzing data from complex repairable
systems. Depending on the metric(s) of interest and the data collection
method, different models can be utilized (or developed) to analyze the
growth processes.
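As one illustration of such a model, the sketch below fits a power-law non-homogeneous Poisson process (often called the Crow-AMSAA model) to cumulative failure times from a time-terminated test, using the standard maximum likelihood estimates. The choice of model, the failure times and the function name `fit_power_law_nhpp` are assumptions for illustration; the text above does not prescribe a specific model.

```python
import math


def fit_power_law_nhpp(failure_times, test_end):
    """MLE for a power-law NHPP with E[N(t)] = lam * t**beta (time-terminated test)."""
    n = len(failure_times)
    beta = n / sum(math.log(test_end / t) for t in failure_times)
    lam = n / test_end ** beta
    return lam, beta


# Hypothetical cumulative failure times (hours) in a 500-hour test
TEST_END = 500.0
times = [12.0, 40.0, 95.0, 180.0, 310.0, 470.0]
lam, beta = fit_power_law_nhpp(times, TEST_END)

# Instantaneous failure intensity and MTBF at the end of the test
intensity = lam * beta * TEST_END ** (beta - 1.0)
mtbf = 1.0 / intensity
print(f"beta={beta:.3f}, lambda={lam:.4f}, MTBF at end of test={mtbf:.1f} h")
```

In this model a fitted beta below 1 means the failure intensity is decreasing over test time, which is the signature of reliability growth.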
For complete details, see ReliaSoft's eTextbook on Reliability Growth.