### AN OVERVIEW OF MTBF

*MTBF is an acronym for Mean Time Between Failures (though we kind of like “Make Things Better”!). Typically expressed in units of hours, MTBF indicates the amount of time, on average, between successive system failures. It is a metric often used in the field of reliability engineering to assess failure likelihood with the ultimate goal of improving product quality and reliability. Most often a software tool, such as a reliability prediction package, is used to measure and evaluate MTBF and other related metrics.*

### What is MTBF and what is MTTF?

MTBF, or Mean Time Between Failures, is the average amount of operating time between failures of a system. For example, an MTBF of 100 hours indicates that a system, on average, will successfully operate for 100 hours before experiencing a failure.

MTTF, or Mean Time to Failure, is another often-used reliability engineering metric. MTTF is used when evaluating non-repairable systems. For systems that cannot be repaired upon failure, the MTTF metric indicates how long the system operates until failure.

When discussing repairable systems, MTBF is commonly used. When discussing non-repairable systems, MTTF is used. Sometimes MTTFF, or Mean Time to First Failure, is used for non-repairable systems.

### How do I calculate MTBF and MTTF?

One method of computing MTBF (or MTTF for non-repairable systems) is to track systems in operation and record their times to failure. MTBF is then computed as the average of the failure times. For example, I set up a test with 5 systems; I turn them all on at the same time and then record the time at which each one fails. If my sample results are failures at the following times: 100 hours, 230 hours, 400 hours, 510 hours, and 1000 hours, my MTBF is (100 + 230 + 400 + 510 + 1000) / 5, or 448 hours. This indicates that *on average* my system will fail every 448 hours of operation – not that it will fail precisely at 448 hours.
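The averaging step in the example above can be sketched in a few lines of Python. The function name `mtbf_from_failure_times` is our own, chosen for illustration; it simply takes the arithmetic mean of the recorded failure times.

```python
from statistics import mean

# Failure times (hours) from the five-system example above.
failure_times_hours = [100, 230, 400, 510, 1000]


def mtbf_from_failure_times(times):
    """Return the arithmetic mean of the observed times to failure."""
    if not times:
        raise ValueError("at least one failure time is required")
    return mean(times)


mtbf_hours = mtbf_from_failure_times(failure_times_hours)
print(f"MTBF = {mtbf_hours} hours")
```

This only works as written when every unit in the test has run to failure.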

This is a very simple example of calculating MTBF. Even in this case, you can see that the more systems I test, and the more data I have, the more accurate my MTBF value will be. Also, there are more complexities to consider. How do I account for units that do not fail? What if I want to consider repair times or designs that incorporate redundancy? These factors will have significant impact on the MTBF value. Not only are there more complicated mathematical scenarios, but there is also the issue of feasibility in setting up complex and costly test beds.
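As a rough illustration of the first question, one common way to account for units that do not fail is to divide the total accumulated operating time (failed and surviving units alike) by the number of failures. This is a minimal sketch that assumes a constant failure rate; the test data and the function name `mtbf_with_survivors` are hypothetical.

```python
def mtbf_with_survivors(failure_times, survivor_times):
    """Estimate MTBF as total operating hours divided by number of failures.

    Assumes a constant failure rate. survivor_times holds the hours
    accumulated by units that were still running when the test stopped.
    """
    n_failures = len(failure_times)
    if n_failures == 0:
        raise ValueError("no failures observed; a point estimate is undefined")
    total_hours = sum(failure_times) + sum(survivor_times)
    return total_hours / n_failures


# Hypothetical test: 5 units, stopped at 600 hours; 4 failed, 1 still running.
observed_failures = [100, 230, 400, 510]
still_running = [600]

estimate = mtbf_with_survivors(observed_failures, still_running)
print(f"Estimated MTBF = {estimate} hours")
```

Averaging only the four observed failure times would ignore the 600 hours of successful operation on the surviving unit and give a lower figure, which is why the surviving unit's hours are counted in the numerator.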

Another way to compute MTBF is to use the failure rate of a system in its “useful life” period, the part of the product lifecycle where the failure rate of the system is constant. (For more information about the product lifecycle and the related bathtub curve, see our blog post on this topic.) If the failure rate is known, then MTBF is equal to 1 / failure rate. So, if I know the failure rate of my system is 500 FPMH (failures per million hours), then the MTBF of my system is equal to 1 / (500 failures / 1,000,000 hours), or 2000 hours.
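The unit conversion is easy to get wrong, so here is a small sketch of it in Python. The helper names are our own, and both functions assume the constant failure rate described above.

```python
HOURS_PER_MILLION = 1_000_000


def mtbf_from_fpmh(failures_per_million_hours):
    """Convert a constant failure rate in FPMH to MTBF in hours."""
    if failures_per_million_hours <= 0:
        raise ValueError("failure rate must be positive")
    return HOURS_PER_MILLION / failures_per_million_hours


def fpmh_from_mtbf(mtbf_hours):
    """Convert MTBF in hours back to a failure rate in FPMH."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return HOURS_PER_MILLION / mtbf_hours


print(mtbf_from_fpmh(500))   # the 500 FPMH example from the text
print(fpmh_from_mtbf(2000))  # and the reverse conversion
```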

Failure rates of electromechanical systems can be computed using analytical techniques and methodologies well known and established in the reliability engineering field, such as reliability prediction and reliability block diagram (RBD) analysis.

### Tools for calculating MTBF

As noted above, you can calculate MTBF by setting up test beds and observing and tracking operating times and failures. While this is a straightforward way to directly observe MTBF, it is often impractical for complex systems, and its costs can be prohibitive.

If you have some type of issue tracking system in place, such as a CAPA (Corrective and Preventive Action) system or a FRACAS (Failure Reporting, Analysis, and Corrective Action System), you can use the data available there to evaluate MTBF. For example, your FRACAS will have a record of all reported field failures. Using this data along with operating time information on your system, you can determine a field-based MTBF value.

Most often, MTBF calculations are performed using a software package that evaluates the components of your system and estimates a likely, or predicted, failure rate using mathematical algorithms. *Reliability Prediction software* is the most efficient way to calculate failure rate and MTBF.

### MTBF prediction software

The easiest and most cost-effective approach for assessing the MTBF and failure rate of your system is to use a software tool designed for this purpose. *Reliability Prediction* software has long been used to perform failure rate analysis on complex electromechanical systems.

While Reliability Prediction is the appropriate term for this analysis, it is sometimes referred to as an MTBF prediction calculation, or an MTBF calculator tool. Reliability Prediction tools evaluate failure rate assuming systems are in their “useful life”, or constant failure rate phase of the product lifecycle. In this situation, MTBF is equivalent to the inverse of the failure rate, so either or both metrics can be used.
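For readers who want to see where the inverse relationship comes from, here is a short derivation. It assumes the standard constant-failure-rate (exponential) model; the symbols $\lambda$ for the failure rate and $R(t)$ for the probability of surviving to time $t$ are our notation.

$$
R(t) = e^{-\lambda t}, \qquad
\text{MTBF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}
$$

One consequence of this model is that $R(\text{MTBF}) = e^{-1} \approx 0.368$: only about 37% of units are expected to still be operating once the elapsed time equals the MTBF.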

Reliability Predictions are one of the most widely used techniques in reliability analysis. The methods used to assess failure rate are described in the reliability prediction *standards*. There are a number of reliability prediction standards in use today, including MIL-HDBK-217, Telcordia SR-332 (formerly Bellcore), 217Plus, and China’s GJB/Z 299.

The basis of reliability prediction standards is the statistical analysis of a wide range of component data gathered in the field over a long time period. This data is then analyzed to develop a series of equations that can be used to model the underlying failure characteristics of the devices. The equations developed include a number of variables that affect reliability, such as operating environments, temperatures, quality levels, and stress factors.

To perform a reliability prediction, you begin by listing all the components in your system design. You then compute the estimated failure rates of your components using the equations from the standard you choose to use. The summation of all the components’ failure rates is the predicted failure rate of your system.
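The summation step can be sketched as follows. The parts list and its failure rates are made-up placeholders, not values from any prediction standard, and the function name is our own. The sketch also assumes a simple series design in which any single component failure causes a system failure.

```python
# Hypothetical parts list: (name, quantity, failure rate in FPMH per part).
# These rates are illustrative placeholders, not figures from any standard.
parts = [
    ("resistor", 40, 0.002),
    ("capacitor", 25, 0.010),
    ("microcontroller", 1, 0.150),
    ("connector", 4, 0.050),
]


def system_failure_rate_fpmh(parts_list):
    """Sum quantity * per-part failure rate over all components."""
    return sum(quantity * rate for _name, quantity, rate in parts_list)


system_fpmh = system_failure_rate_fpmh(parts)

# With a constant failure rate, MTBF is the inverse of the failure rate.
predicted_mtbf_hours = 1_000_000 / system_fpmh

print(f"Predicted failure rate: {system_fpmh:.3f} FPMH")
print(f"Predicted MTBF: {predicted_mtbf_hours:,.0f} hours")
```

In a real prediction, each per-part rate would come from the equations of the chosen standard, adjusted for environment, temperature, quality level, and stress.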

For more details on reliability prediction standards, feel free to review our blog post on this topic.

### Why perform MTBF analysis?

The most significant benefit of adding MTBF analysis to your reliability toolkit is the ability to measure, assess, and improve your product quality and reliability. While reliability predictions are most often done during the product design stage, they can be useful during all stages of the product lifecycle.

Using Reliability Prediction analysis during the design stage is uniquely beneficial because it enables you to “design-in” reliability. By assessing your predicted product failure profile before actual manufacture and deployment, and addressing reliability areas of concern ahead of time, you save not only real costs but reputational costs as well.

Reliability Predictions can be used at any other point in the product lifecycle: during prototyping, testing, manufacturing, and deployment. Essentially, any time when you would like a more in-depth review of the reliability of your product, reliability predictions are a useful tool for that purpose.

Armed with a reliability profile of your product, you can assess, improve, and track the reliability of your product or system. The information from a reliability prediction assessment can be used to further your reliability improvement objectives, including:

- Helping to assess the feasibility of a proposed design
- Comparing design alternatives for the most reliable option
- Identifying potential reliability areas of concern
- Performing What-If? analyses to determine the effects of specific design modifications
- Tracking reliability improvement
- Addressing product quality issues in early design before they become problematic
- Decreasing the Cost of Poor Quality (COPQ)
- Meeting contractual compliance requirements