Template:Elements of a reliability growth program

Elements of a Reliability Growth Program
In a formal reliability growth program, one or more reliability goals are set and should be achieved during the development testing program with the necessary allocation or reallocation of resources. Therefore, planning and evaluating are essential factors in a growth process program. A comprehensive reliability growth program needs well-structured planning of the assessment techniques. A reliability growth program differs from a conventional reliability program in that there is a more objectively developed growth standard against which assessment techniques are compared. A comparison between the assessment and the planned value provides a good estimate of whether or not the program is progressing as scheduled. If the program does not progress as planned, then new strategies should be considered. For example, a reexamination of the problem areas may result in changing the management strategy so that more problem failure modes that surface during the testing actually receive a corrective action instead of a repair. Several important factors for an effective reliability growth program are: •	Management: the decisions made regarding the management strategy to correct problems or not correct problems and the effectiveness of the corrective actions •	Testing: provides opportunities to identify the weaknesses and failure modes in the design and manufacturing process •	Failure mode root cause identification: funding, personnel and procedures are provided to analyze, isolate and identify the cause of failures •	Corrective action effectiveness: design resources to implement corrective actions that are effective and support attainment of the reliability goals •	Valid reliability assessments: using valid statistical methodologies to analyze test data in order to assess reliability The management strategy may be driven by budget and schedule but it is defined by the actual decisions of management in correcting reliability problems. If the reliability of a failure mode is known through analysis or testing, then management makes the decision either not to fix (no corrective action) or to fix (implement a corrective action) that failure mode. Generally, if the reliability of the failure mode meets the expectations of management, then no corrective actions would be expected. If the reliability of the failure mode is below expectations, the management strategy would generally call for the implementation of a corrective action. Another part of the management strategy is the effectiveness of the corrective actions. A corrective action typically does not eliminate a failure mode from occurring again. It simply reduces its rate of occurrence. A corrective action, or fix, for a problem failure mode typically removes a certain amount of the mode's failure intensity, but a certain amount will remain in the system. The fraction decrease in the problem mode failure intensity due to the corrective action is called the effectiveness factor (EF). The EF will vary from failure mode to failure mode but a typical average for government and industry systems has been reported to be about 0.70. With an EF equal to 0.70, a corrective action for a failure mode removes about 70% of the failure intensity, but 30% remains in the system.

Corrective action implementation raises the following question: "What if some of the fixes cannot be incorporated during testing?" It is possible that only some fixes can be incorporated into the product during testing. However, others may be delayed until the end of the test since it may be too expensive to stop and then restart the test, or the equipment may be too complex for performing a complete teardown. Implementing delayed fixes usually results in a distinct jump in the reliability of the system at the end of the test phase. For corrective actions implemented during testing, the additional follow-on testing provides feedback on how effective the corrective actions are and provides opportunity to uncover additional problems that can be corrected.

Evaluation of the delayed corrective actions is provided by projected reliability values. The demonstrated reliability is based on the actual current system performance and estimates the system reliability due to corrective actions incorporated during testing. The projected reliability is based on the impact of the delayed fixes that will be incorporated at the end of the test or between test phases.

When does a reliability growth program take place in the development process? Actually, there is more than one answer to this question. The modern approach to reliability realizes that typical reliability tasks often do not yield a system that has attained the reliability goals or attained the cost-effective reliability potential in the system. Therefore, reliability growth may start very early in a program, utilizing Integrated Reliability Growth Testing (IRGT). This approach recognizes that reliability problems often surface early in engineering tests. The focus of these engineering tests is typically on performance and not reliability. IRGT simply piggybacks reliability failure reporting, in an informal fashion, on all engineering tests. When a potential reliability problem is observed, reliability engineering is notified and appropriate design action is taken. IRGT will usually be implemented at the same time as the basic reliability tasks. In addition to IRGT, reliability growth may take place during early prototype testing, during dedicated system testing, during production testing, and from feedback through any manufacturing or quality testing or inspections. The formal dedicated testing or RDGT will typically take place after the basic reliability tasks have been completed. Note that when testing and assessing against a product's specifications, the test environment must be consistent with the specified environmental conditions under which the product specifications are defined. In addition, when testing subsystems it is important to realize that interaction failure modes may not be generated until the subsystems are integrated into the total system.