March 2017 – FMEA vs FTA

Two Complementary Analyses:
Failure Mode and Effects Analysis (FMEA) and Fault Tree Analysis (FTA) are complementary reliability analyses: FMEA works bottom-up, from low-level component failures to their system-level consequences, while FTA works top-down, from system-level failures to the low-level component failures that could cause them. Both determine the likelihood and criticality of their respective failures. In our last two blog posts, we discussed FTAs and FMEAs, respectively.

FMEA, FTA, or Both?
While both are important tools for predicting a system’s reliability and safety, they often take considerable time and effort. When a project is governed by a Statement of Work (SOW), the SOW usually cites the required kinds of reliability analyses. If the decision to perform reliability analysis is internal, however, does it make sense to undertake just one of the two, or are both really necessary? That decision is typically driven by budget and schedule. Some background on what these analyses involve, and on how they affect the effort and quality of your reliability program, helps in making it.

Performed Independently, but Highly Dependent
FMEA and FTA can be undertaken simultaneously by independent groups of engineering analysts. Both groups need a good understanding of how the subject system performs and the roles of its underlying internal components. Two kinds of information must be shared: the FMEA group determines failure rates of each kind of failure (failure modes) of low-level components and provides the failure rates to the FTA group, while the FTA group determines criticality (degree of undesirability) of system-level failures and provides the criticality values to the FMEA group.

While the FMEA group can get started quickly and work directly from drawings (e.g., schematics if the low-level components are piece-parts, or interconnect diagrams if the low-level components are circuit boards or Line Replaceable Units (LRUs)), the FTA group cannot get started without a Functional Hazard Analysis (FHA), or at least part of one. An FHA considers each of the system’s required functions and assigns a criticality figure to each possible failure mode of each function. Each system failure mode that can have serious or catastrophic consequences becomes the top of a fault tree, and the tree shows how failures of lower-level components contribute to that top-level failure. With low-level component failure rates supplied by the FMEA group, the failure probability at the top of each tree can be computed. The combination of criticality and failure probability, called risk, is often determined using MIL-STD-882 lookup tables. An unacceptably high risk value means the underlying design may have to be modified, or other measures may have to be taken (e.g., new or additional warning notices or improved safety training).
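
As a rough illustration of how a top-of-tree probability can be computed from component failure rates, here is a minimal Python sketch. It is not any particular FTA tool's method. It assumes constant failure rates (exponential model), statistically independent basic events, and a tree in which each basic event appears only once; the component names, rates, and the 10-hour mission time are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    failure_rate: float  # failures per hour (hypothetical FMEA-supplied value)

    def probability(self, hours: float) -> float:
        # Constant-failure-rate assumption: P(fail by t) = 1 - exp(-lambda * t)
        return 1.0 - math.exp(-self.failure_rate * hours)


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]

    def probability(self, hours: float) -> float:
        probs = [node.probability(hours) for node in self.inputs]
        if self.kind == "AND":
            # All inputs must fail (independence assumed)
            return math.prod(probs)
        if self.kind == "OR":
            # At least one input fails (independence assumed)
            return 1.0 - math.prod(1.0 - p for p in probs)
        raise ValueError(f"unknown gate kind: {self.kind}")


# Hypothetical tree: display is lost if both redundant supplies fail,
# or if the single controller fails.
psu_a = BasicEvent("PSU-A no output", 1e-4)
psu_b = BasicEvent("PSU-B no output", 1e-4)
controller = BasicEvent("Controller halt", 1e-6)

loss_of_power = Gate("Loss of power", "AND", [psu_a, psu_b])
top = Gate("Loss of display", "OR", [loss_of_power, controller])

p_top = top.probability(hours=10.0)
print(f"P(top event over 10 h) = {p_top:.3e}")
```

Even with supplies that fail 100 times more often than the controller, the redundant pair contributes far less to the top event than the single controller does, which is the kind of insight a fault tree is meant to surface. Trees with repeated events need a more careful treatment (e.g., minimal cut sets) than this sketch provides.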

What Goes Wrong
Unfortunately, different analysts understand a system’s particulars differently, so the FMEA group’s conclusions will not (ever!) be fully consistent with the FTA group’s conclusions. The most common kinds of inconsistencies are (1) low-level component failures in fault trees that don’t appear in the FMEA, (2) serious system-level consequences cited in the FMEA that don’t appear in any fault tree, (3) different ways of describing the same thing in FMEA and FTA, and (4) assignment of different system-level criticality values to the same failure in FMEA and FTA. There will be other kinds of inconsistencies as well, all of which degrade analysis quality and make results harder to understand.
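
Inconsistencies of kinds (1), (2), and (4) lend themselves to mechanical checking once both analyses are in tabular form. The sketch below shows one way that might look in Python. The data layout, field names, and sample rows are hypothetical, and severity follows the convention that 1 is the most severe.

```python
# Hypothetical FMEA rows: one per component failure mode.
fmea_rows = [
    {"component": "PSU-A", "failure_mode": "no output",
     "system_effect": "loss of display", "severity": 2},
    {"component": "FAN", "failure_mode": "stalled",
     "system_effect": "overheat shutdown", "severity": 1},
    {"component": "LED", "failure_mode": "open",
     "system_effect": "no indication", "severity": 4},
]

# Hypothetical FTA data: basic events used in any tree, and the
# severity assigned to each top-level event.
fta_basic_events = {("PSU-A", "no output"), ("CPU", "halt")}
fta_top_events = {"loss of display": 1}


def find_inconsistencies(fmea_rows, fta_basic_events, fta_top_events,
                         serious_threshold=2):
    """Report kinds (1), (2), and (4) using exact string matching."""
    fmea_modes = {(r["component"], r["failure_mode"]) for r in fmea_rows}

    # Kind (1): fault tree basic events with no matching FMEA row.
    missing_from_fmea = sorted(fta_basic_events - fmea_modes)

    # Kind (2): serious FMEA effects that are not the top of any tree.
    missing_from_fta = sorted({
        r["system_effect"] for r in fmea_rows
        if r["severity"] <= serious_threshold
        and r["system_effect"] not in fta_top_events
    })

    # Kind (4): same system-level failure, different severity values.
    severity_conflicts = sorted({
        (r["system_effect"], r["severity"], fta_top_events[r["system_effect"]])
        for r in fmea_rows
        if r["system_effect"] in fta_top_events
        and r["severity"] != fta_top_events[r["system_effect"]]
    })

    return {
        "missing_from_fmea": missing_from_fmea,
        "missing_from_fta": missing_from_fta,
        "severity_conflicts": severity_conflicts,
    }


report = find_inconsistencies(fmea_rows, fta_basic_events, fta_top_events)
for kind, items in report.items():
    print(kind, items)
```

Because the comparison is by exact string, a kind (3) inconsistency (the same thing described in two different ways) shows up here as a false kind (1) or kind (2) finding, so agreeing on a shared vocabulary for components and effects is a prerequisite for this sort of check.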

Additionally, there will (always!) be inconsistencies among analysts in the same group. Among FMEA analysts, there will be differences in understanding and descriptions of failure modes and consequences; among FTA analysts, there will be common branches that appear in different fault trees (e.g., different parts of the system that depend on a common set of power supplies). When Omnicon was faced with these issues in a very large reliability program with both FMEA and FTA, we developed a software tool to find FMEA/FTA inconsistencies and check that they were properly corrected. A paper describing this tool was presented at the 2016 ARS Conference and is available from Omnicon.
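
To illustrate the common-branch issue only (this is not a description of Omnicon's tool), the following Python sketch lists basic events that appear in more than one fault tree, so that shared branches can be reviewed and modeled the same way everywhere. The tree contents and event names are hypothetical.

```python
from collections import defaultdict

# Hypothetical fault trees, reduced to the set of basic events each one uses.
trees = {
    "loss of display": {"PSU-A no output", "PSU-B no output", "LCD driver fault"},
    "loss of comms": {"PSU-A no output", "PSU-B no output", "radio fault"},
}


def shared_events(trees):
    """Map each basic event used by two or more trees to those trees."""
    owners = defaultdict(set)
    for top_event, events in trees.items():
        for event in events:
            owners[event].add(top_event)
    return {event: sorted(tops) for event, tops in owners.items()
            if len(tops) > 1}


shared = shared_events(trees)
for event in sorted(shared):
    print(event, "->", shared[event])
```

Events flagged this way are also a reminder that the top events of the affected trees are not independent of one another.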

Takeaway
If your choice is limited to either FMEA or FTA, an FTA is probably the way to go because its results are more meaningful. An FTA concentrates on providing insight into serious potential system hazards and highlights underlying candidates (and their combinations or interactions) that might be modified when necessary to reduce a hazard’s risk level. However, there are drawbacks. First, an FHA must be performed beforehand to identify the hazards that will be represented by fault trees. Second, FTA generally requires a software app (and app training) to draw the trees and compute failure probabilities.

In contrast, a FMEA analyzes all failure modes of all low-level components, but most low-level failure modes in a well-designed system will not cause serious consequences at the system level (generally due to design redundancies), so a lot of the analysis effort doesn’t buy much for understanding system-level hazards. However, if you need to predict the failure rate of a system due to any failure (regardless of criticality), a FMEA is required. One plus is that FMEA is generally captured in Microsoft Excel, so no tool training is required.

Omnicon developed the following diagram to illustrate the relationships among FHA, FTA, and FMEA. Data formats are from analyses of a very large system where low-level components were LRUs. In the diagram, SEV is severity, where 1 represents the greatest severity, FM is Failure Mode, and HWCI is Hardware Configuration Item.

[Diagram: traceability among FHA, FTA, and FMEA]
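
On the point that a FMEA is needed to predict a system's failure rate due to any failure: one common simplification is to treat every failure mode as a system failure and add up the failure-mode rates, assuming the rates are constant and the modes are independent. The Python sketch below shows that roll-up; the rows and rates are hypothetical.

```python
# Hypothetical FMEA rows with failure-mode rates in failures per hour.
fmea_rates = [
    {"component": "PSU-A", "failure_mode": "no output", "rate_per_hour": 2e-6},
    {"component": "CPU", "failure_mode": "halt", "rate_per_hour": 5e-6},
    {"component": "LED", "failure_mode": "open", "rate_per_hour": 1e-6},
]


def system_failure_rate(rows):
    """Sum failure-mode rates (series model: any failure counts)."""
    return sum(r["rate_per_hour"] for r in rows)


lam = system_failure_rate(fmea_rates)
mtbf_hours = 1.0 / lam  # valid only under the constant-rate assumption

print(f"system failure rate = {lam:.2e} per hour")
print(f"MTBF = {mtbf_hours:,.0f} hours")
```

This figure counts every failure, including those masked by redundancy, so it says nothing about hazard risk; that is the job of the fault trees.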

Tags: FMEA, FTA, reliability, safety

About Us

Omnicon provides custom engineering consultancy services and support for a range of customers in aerospace, defense, transportation and beyond.