A Guide to MIL-HDBK-217, Telcordia SR-332, and Other Reliability Prediction Methods
How Were Reliability Prediction Methods Developed?
Reliability Predictions are often used in product design and development as part of reliability and quality continuous improvement efforts. To perform a reliability prediction analysis, a standard is employed. Each Reliability Prediction standard offers a set of mathematical formulas to model and calculate the failure rate of a variety of electromechanical components that make up a product or system.
These equations were built by analyzing a huge amount of field data over a long period of time. Statistical analysis was then used to determine the equations which best modeled the failure characteristics of the accumulated data.
The variables used in the reliability calculation formulas to calculate component failure rates vary, but include data such as device ratings, temperatures, operating parameters, and environmental conditions. The result of a reliability prediction analysis is the predicted failure rate or Mean Time Between Failures (MTBF) of a product or system, and of its subsystems, components, and parts.
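The relationship between failure rate and MTBF can be illustrated with a minimal Python sketch. It assumes a constant failure rate and a series system, in which any part failure fails the system so the part failure rates simply add. The function names and the part values are hypothetical and chosen only for illustration; they are not taken from any standard.

```python
def mtbf_hours(failure_rate_fpmh: float) -> float:
    """Convert a constant failure rate (failures per million hours) to MTBF in hours."""
    if failure_rate_fpmh <= 0:
        raise ValueError("failure rate must be positive")
    return 1_000_000 / failure_rate_fpmh


def system_failure_rate(part_rates_fpmh):
    """Series-system assumption: part failure rates add to give the system rate."""
    return sum(part_rates_fpmh)


# Hypothetical part failure rates in failures per million hours.
parts = [0.5, 1.25, 0.25]
total = system_failure_rate(parts)
print(total, mtbf_hours(total))  # 2.0 500000.0
```

Under these assumptions, a system with a total failure rate of 2 failures per million hours has a predicted MTBF of 500,000 hours.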
Reliability Prediction has its historical roots in the military and defense sector, but over the years the methods have been adapted and broadened for use in a wide range of industries. Essentially, the advantages afforded by reliability prediction analyses make them an important part of managing and maintaining reliability and quality objectives.
This article provides an overview of the most commonly used reliability prediction standards.
How To Use Failure Rate Predictions to Improve Reliability
Reliability Predictions offer a path to product improvement by supporting the ability to “design in” reliability. At the early design stage, Reliability Predictions enable you to perform an assessment of likely failure rate characteristics. By predicting failure rates, you can then make design changes as needed for areas of weakness.
Reliability Predictions can also be used to evaluate design options by considering the reliability profiles of the various alternatives. This ability to perform design trade-off analysis with metric-based assessments empowers you to make the best decisions for your business.
What are the Primary Reliability Prediction Standards?
There are several widely accepted Reliability Prediction standards including:
- MIL-HDBK-217 Part Stress
- MIL-HDBK-217 Parts Count
- Telcordia SR-332
- 217Plus Part Count
- China’s GJB/z 299C Part Stress
- China’s GJB/z 299C Parts Count
Additionally, component databases NPRD (Non-electronic Parts Reliability Data) and EPRD (Electronic Parts Reliability Data) are often used in conjunction with the Reliability Prediction standards to augment prediction analyses. The NPRD and EPRD databases include failure data on a wide range of electrical components and electromechanical parts and assemblies. This field-based failure data can be used in your reliability prediction analyses.
When evaluating the similarities and differences between the reliability prediction methods, it is most useful to delve into the calculations used to predict failure rate presented in the various standards. The equations offer valuable insight into the type of data and information you will need about the devices in your system in order to perform a reliability prediction analysis. Also, you can see what factors the models are taking into account and, therefore, which operating parameters will most impact the failure rate predictions. It may also help to assess both the complexity of a particular model, as well as its thoroughness.
The MIL-HDBK-217 Reliability Prediction Standard
MIL-HDBK-217 is one of the most widely known Reliability Prediction standards. It was one of the first models developed, and many other reliability standards available today have their roots in MIL-HDBK-217.
MIL-HDBK-217’s official name is Military Handbook: Reliability Prediction of Electronic Equipment. It was originally developed and published for use by the Department of Defense. Over the years there have been many updates to the MIL-HDBK-217 document, which have resulted in the suffix designations in the document name: MIL-HDBK-217D and MIL-HDBK-217E Notice 1 for example. The current release of MIL-HDBK-217 is MIL-HDBK-217F Notice 2.
There are two primary sections in the MIL-HDBK-217 standard: the Part Stress section and the Parts Count section.
MIL-HDBK-217: Part Stress Section
The Part Stress section leads off the document and includes a number of equations that predict the failure rate for a wide variety of electrical components. For example, the equation for Microcircuits, Gate/Logic Arrays and Microprocessors is:
λp = (C1 * πT + C2 * πE) * πQ * πL
where λp is the failure rate in failures per million hours (failures/10^6 hours, or FPMH)
The factors in the equation are various operating, rated, temperature, and environmental conditions of the device in the system. For the equation above, the following list describes the variables:
- C1 factors in the complexity of the device, such as the number of gates or transistors
- πT factors in the ambient temperature and any temperature rise associated with the device
- C2 factors in the package of the device, or how it is manufactured and placed in the system, such as surface mounted and/or hermetically sealed
- πE factors in the environment that the device is operating in, such as in space, in an aircraft, in the sea, on the ground, etc.
- πQ factors in the quality of the device based on how it is procured
- πL factors in how long the device has been manufactured
The equations, the variables, and the data parameters needed vary for all the different components modeled. The Part Stress section of MIL-HDBK-217 includes complete details on all the equations and how to assess the variables used in the equations.
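As a rough illustration of how the Part Stress equation for Microcircuits, Gate/Logic Arrays and Microprocessors is evaluated, the following Python sketch multiplies out the factors. The class name, function name, and every numeric value are hypothetical placeholders; real factor values must be looked up in the MIL-HDBK-217 tables for the specific device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicrocircuitFactors:
    c1: float    # complexity factor
    pi_t: float  # temperature factor
    c2: float    # package factor
    pi_e: float  # environment factor
    pi_q: float  # quality factor
    pi_l: float  # factor for how long the device has been manufactured


def part_stress_failure_rate(f: MicrocircuitFactors) -> float:
    """Return lambda_p = (C1 * pi_T + C2 * pi_E) * pi_Q * pi_L in FPMH."""
    return (f.c1 * f.pi_t + f.c2 * f.pi_e) * f.pi_q * f.pi_l


# Placeholder values for illustration only, not taken from the handbook.
example = MicrocircuitFactors(c1=0.02, pi_t=0.5, c2=0.005, pi_e=2.0,
                              pi_q=1.0, pi_l=1.0)
print(part_stress_failure_rate(example))  # about 0.02 FPMH
```

Because πQ and πL multiply the whole expression while πT and πE each scale only one term, the sketch also makes it easy to see which inputs dominate a given prediction.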
MIL-HDBK-217: Parts Count Section
The Parts Count reliability prediction is useful in early design stages when the design is still in progress and not all operating parameters are known. Parts Count predictions do not require as many data parameters for analysis compared to Part Stress predictions.
MIL-HDBK-217 Parts Count analyses can be used as an estimation technique, and, in general, are not as accurate as Part Stress analyses. By using Parts Count models, you can obtain early failure rate assessments and then refine them as your product design evolves and is finalized.
For example, the Parts Count equation for the same Microcircuits, Gate/Logic Arrays and Microprocessors category is:
λp = λg * πQ
where λg is a generic failure rate based on a subset of information; in this example it is based on device technology type, environment, and device complexity.
In many cases, Parts Count is used to start a Reliability Prediction analysis. Then, as the product design becomes more solidified and data parameters are established, the Parts Count prediction is moved over to Part Stress, maintaining all the data already entered during the Parts Count assessment.
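A Parts Count roll-up can be sketched in a few lines of Python. The sketch assumes the equipment failure rate is the quantity-weighted sum of λg * πQ over all part types, which is the usual way a parts count is totaled. The part list, the function name, and all numeric values are hypothetical.

```python
# Each entry: (description, quantity, generic failure rate lambda_g in FPMH, quality factor pi_Q).
# All values are placeholders for illustration, not handbook data.
parts = [
    ("microprocessor", 1, 0.10, 2.0),
    ("logic array", 4, 0.05, 1.0),
    ("memory", 2, 0.025, 2.0),
]


def parts_count_failure_rate(part_list) -> float:
    """Sum quantity * lambda_g * pi_Q over every part type in the list."""
    return sum(qty * lam_g * pi_q for _, qty, lam_g, pi_q in part_list)


equipment_rate = parts_count_failure_rate(parts)
print(f"{equipment_rate:.3f} FPMH")  # 0.500 FPMH
```

Only a quantity, a generic rate, and a quality factor are needed per part type, which is why this approach suits early design stages.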
The Telcordia SR-332/Bellcore Standard
Another widely used and accepted Reliability Prediction standard is commonly referred to as Telcordia SR-332. Early on, Telcordia was referred to as the Bellcore standard. The full name of the Telcordia standard is Telcordia: Reliability Prediction Procedure for Electronic Equipment, Special Report SR-332. The Telcordia standard has also been through several updates and revisions, which are designated by the Issue Number. Telcordia Issue 3 is a commonly used standard, while Telcordia SR-332 Issue 4 represents the latest Telcordia Reliability Prediction standard.
Initially, the Bellcore/Telcordia standard was developed for use in the telecommunications industry. Over the years its use has become widespread: today Telcordia is common in the commercial sector and is used throughout a broad range of industries, including those related to military and defense applications.
The basis for the Telcordia models is what is referred to as the “Black Box Technique.” Telcordia SR-332 includes equations for the black-box steady state failure rates of devices, as well as equations for the upper confidence level and standard deviation of the black box steady-state failure rates. Example Telcordia formulas to compute the black-box steady state failure rate of a device are:
λBB = λG * πQ * πS * πT
where λBB is the failure rate in failures per billion hours (failures/10^9 hours, or FITs)
σBB = σG * πQ * πS * πT
where σBB is the standard deviation of the black-box steady state failure rate
The factors used in the equations are:
- λG is the device generic failure rate, which is obtained from a series of tables in the Telcordia standard and is based on device parameters which vary according to the device under analysis
- σG is the standard deviation of the generic steady-state failure rate
- πQ factors in the device quality level
- πS factors in the device stresses, such as electrical stress
- πT factors in the device temperature stress
Additionally, the environmental factor πE, which accounts for the environmental conditions, is applied in the overall failure rate calculation.
Once the device level black-box steady state failure rates are determined, the unit level and system level failure rates can be calculated.
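The black-box calculations can be sketched in Python as follows. The device-level functions follow the two formulas given above. The unit-level roll-up assumes that πE multiplies the quantity-weighted sum of device rates; that is one reading of how the environmental factor enters the calculation, so the standard should be consulted for the exact procedure. All names and numeric values are hypothetical.

```python
def black_box_rate(lambda_g: float, pi_q: float, pi_s: float, pi_t: float) -> float:
    """Device black-box steady-state failure rate in FITs."""
    return lambda_g * pi_q * pi_s * pi_t


def black_box_std(sigma_g: float, pi_q: float, pi_s: float, pi_t: float) -> float:
    """Standard deviation of the device black-box steady-state failure rate."""
    return sigma_g * pi_q * pi_s * pi_t


def unit_rate(devices, pi_e: float) -> float:
    """Assumed roll-up: pi_E times the sum of quantity * device rate (FITs)."""
    return pi_e * sum(qty * rate for qty, rate in devices)


# Placeholder values for illustration only, not taken from the standard's tables.
rate = black_box_rate(lambda_g=10.0, pi_q=1.0, pi_s=1.0, pi_t=2.0)  # 20.0 FITs
std = black_box_std(sigma_g=4.0, pi_q=1.0, pi_s=1.0, pi_t=2.0)      # 8.0 FITs
unit = unit_rate([(2, rate), (1, 5.0)], pi_e=1.5)                   # 67.5 FITs
print(rate, std, unit)
```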
Using the black-box steady state failure rates as a basis, the Telcordia standard includes additional methodologies for augmenting failure assessments by taking into account other data that may be available about the devices, units, or systems under analysis. This additional information is not required, but can be used if available to adjust failure rates to reflect actual product performance.
Telcordia Reliability Predictions can:
- Compute the upper confidence level of steady state failure rates
- Integrate laboratory data from devices, units, or systems with or without burn-in data
- Integrate field data from devices, units, or systems with or without burn-in data
- Determine early life factors based on no burn-in, limited burn-in, or extensive burn-in
Essentially, any available real-world data can be used to further refine the estimated failure rate values. It should be noted that none of this additional data is required to perform a reliability prediction based on the Telcordia standard. It is up to the analyst to determine whether any of this additional data is available and whether it is helpful to include in the reliability prediction analysis. In some cases, Telcordia analyses are initially performed to obtain the black-box steady state failure rates, and then updated as laboratory, field, and burn-in data become available.
In summation, some of the unique features of Telcordia include:
- Models for components not found in MIL-HDBK-217, such as lithium batteries, hard disk drives, AC/DC power supplies, gyroscopes, and many more.
- Early life calculations to help analyze failure rates during initial product introduction, or the early life phase, when infant mortality rates are a factor.
- Augmenting failure rates based on data obtained from laboratory test data. By factoring in test data information, your predictions are weighted according to the amount of test data you have.
- Augmenting failure rates based on data obtained from fielded products. By adjusting your failure rates based on this real-world information, your predictions will more accurately reflect your product performance.
The 217Plus Standard
The 217Plus™ reliability prediction standard was developed by Quanterion Solutions. Work on 217Plus was started under Department of Defense contracts with the Reliability Analysis Center (RAC) and Reliability Information Analysis Center (RIAC), and was released originally under the name PRISM.
The failure rate models of 217Plus have their roots in MIL-HDBK-217, but have enhancements to include the effects of operating profiles, cycling factors, and process grades on reliability.
The official 217Plus standard name is Handbook of 217Plus Reliability Prediction Models. An example equation for capacitors in 217Plus 2015 Notice 1 is:
λP = πG * πC * (λOB * πDCO * πTO * πS + λEB * πDCN * πTE + λTCB * πCR * πDT) + λSJB * πSJDT + λIND
where λP is the failure rate in failures per million calendar hours.
For the equation above, the following list describes the variables:
- πG is the reliability growth factor
- πC is the capacitance factor
- λOB is the operating base device failure rate
- πDCO is the operating duty cycle factor
- πTO is the operating temperature factor
- πS is the stress factor
- λEB is the environmental base failure rate
- πDCN is the non-operating duty cycle factor
- πTE is the non-operating environment temperature factor
- λTCB is the cycling temperature base failure rate
- πCR is the cycling rate factor
- πDT is the delta temperature factor
- λSJB is the solder joint base failure rate
- πSJDT is the solder joint delta temperature factor
- λIND is the induced failure rate
The equations, the variables, and the data parameters vary based on the specific device being modeled.
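The structure of the capacitor equation above can be made concrete with a short Python sketch. It separates the operating, non-operating (environmental), and temperature cycling contributions, which are scaled by πG and πC, from the solder joint and induced terms, which are added afterward. The function name and all numeric values are hypothetical placeholders rather than values from the 217Plus handbook.

```python
def capacitor_failure_rate(*, pi_g, pi_c,
                           lam_ob, pi_dco, pi_to, pi_s,
                           lam_eb, pi_dcn, pi_te,
                           lam_tcb, pi_cr, pi_dt,
                           lam_sjb, pi_sjdt, lam_ind):
    """Evaluate the capacitor model as written in the text (failures per million calendar hours)."""
    operating = lam_ob * pi_dco * pi_to * pi_s
    non_operating = lam_eb * pi_dcn * pi_te
    cycling = lam_tcb * pi_cr * pi_dt
    solder_joint = lam_sjb * pi_sjdt
    return pi_g * pi_c * (operating + non_operating + cycling) + solder_joint + lam_ind


# Placeholder inputs for illustration only.
rate = capacitor_failure_rate(
    pi_g=1.0, pi_c=1.0,
    lam_ob=0.001, pi_dco=0.5, pi_to=2.0, pi_s=1.0,
    lam_eb=0.002, pi_dcn=0.5, pi_te=1.0,
    lam_tcb=0.001, pi_cr=1.0, pi_dt=1.0,
    lam_sjb=0.0005, pi_sjdt=1.0, lam_ind=0.0005,
)
print(f"{rate:.4f}")  # 0.0040
```

Keeping each contribution in its own variable makes it straightforward to see how the duty cycle factors πDCO and πDCN shift the balance between operating and non-operating failures.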
Once the device failure rates are evaluated, they are summed up to determine a base system failure rate. At this point, further analysis can be done at the system level if more data about the system is available, such as test or field data. By factoring in this information, the 217Plus analysis will provide a more accurate predicted failure rate estimation. At the system level, 217Plus can incorporate environmental stresses, operating profile factors, and process grades. If this data is not known, default values are used.
As with MIL-HDBK-217, 217Plus includes a Part Count reliability prediction. It is intended for use in early design, when not all data parameters are finalized, and provides a simpler approach to prediction calculations. The Part Count section of 217Plus includes a number of tables for device failure rates that are based on the combination of the environment and operating profile of the system. In this case, a table lookup will provide the failure rates for your devices without the need for calculations.
China’s GJB/z 299 Reliability Prediction Standard
China’s GJB/z 299 is the most widely used Reliability Prediction standard in the extensive Chinese market. The full name of the standard is GJB/Z 299: Reliability Prediction Model for Electronic Equipment. Its revisions and updates are designated with suffix notations similar to MIL-HDBK-217. The most recent version is China’s GJB/z 299C.
China’s GJB/z 299 Reliability Prediction standard has its roots in MIL-HDBK-217 and has been developed to align with the procedures and devices found in China.
In a similar fashion to MIL-HDBK-217, there are two components of China’s GJB/z 299 standard: the Part Stress section and the Parts Count section. The Part Stress section includes complete details on all the equations and how to assess the variables used in the equations.
Parts Count predictions do not require as many data parameters for analysis compared to Part Stress reliability predictions, and are meant to be used in early design when not all data parameters are known. Typical usage is to start with a Parts Count analysis and then move to a Part Stress prediction as actual design parameters are finalized.
China’s GJB/z 299 also includes an appendix for failure rate analysis for imported components, or those not manufactured in China. This enables the Chinese reliability prediction standard to be used across a broad range of products that include components manufactured across the globe.
An example equation from China’s GJB/z 299C, the latest version, for Bipolar Digital Circuits is:
λp = πQ * [C1 * πT * πV + (C2 + C3) * πE] * πL
where λp is the failure rate in failures per million hours (failures/10^6 hours, or FPMH)
- πQ factors in the quality of the device based on how it is procured
- C1 and C2 factor in the complexity of the device, such as the number of gates or transistors
- πT factors in the ambient temperature and any temperature rise associated with the device itself
- πV factors in the voltage stress
- C3 factors in the package of the device, or how it is manufactured and placed in the system, such as surface mounted and/or hermetically sealed
- πE factors in the environment that the device is operating in, such as in space, in an aircraft, in the sea, on the ground, etc.
- πL factors in how long the device has been in production
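The Bipolar Digital Circuits equation above can be evaluated with a small Python sketch. The function name and the numeric inputs are hypothetical; actual values come from the tables in the GJB/z 299C standard.

```python
def bipolar_digital_failure_rate(*, pi_q, c1, pi_t, pi_v, c2, c3, pi_e, pi_l):
    """Return lambda_p = pi_Q * [C1 * pi_T * pi_V + (C2 + C3) * pi_E] * pi_L in FPMH."""
    stress_term = c1 * pi_t * pi_v        # complexity scaled by temperature and voltage stress
    environment_term = (c2 + c3) * pi_e   # complexity and package scaled by environment
    return pi_q * (stress_term + environment_term) * pi_l


# Placeholder inputs for illustration only.
rate = bipolar_digital_failure_rate(
    pi_q=1.0, c1=0.1, pi_t=0.5, pi_v=1.0,
    c2=0.01, c3=0.01, pi_e=2.0, pi_l=1.0,
)
print(f"{rate:.3f} FPMH")  # 0.090 FPMH
```

Compared with the MIL-HDBK-217 microcircuit equation, the only structural differences in this sketch are the extra voltage stress factor πV and the split of the environment-scaled term into C2 and C3.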
NPRD and EPRD Databases
The NPRD (Non-electronic Parts Reliability Data) and EPRD (Electronic Parts Reliability Data) include failure data on a wide range of electrical components and electromechanical parts and assemblies. These databases glean failure rate information from an array of sources. Failure data spans a variety of environments and quality levels, allowing you to select components that most accurately reflect your usage.
When utilizing NPRD or EPRD databases, there is no equation to be evaluated, and, therefore, no data parameters to enter. You scan the database of components and select one that matches, or most closely matches, the device you are modeling. The component or assembly failure rate obtained on field-based failures can then be used in your reliability prediction.
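The lookup approach can be sketched in Python with a small in-memory table. The records below are invented purely for illustration and are not taken from NPRD or EPRD. The matching rule, which prefers the record agreeing on the most attributes, is also an assumption about what "most closely matches" might mean in practice.

```python
# Placeholder records: (part type, environment, quality, failure rate in FPMH).
# These values are invented and are NOT actual NPRD or EPRD data.
RECORDS = [
    ("valve, solenoid", "ground fixed", "commercial", 3.0),
    ("valve, solenoid", "airborne", "military", 8.0),
    ("pump, hydraulic", "ground fixed", "commercial", 12.0),
]


def lookup_failure_rate(part_type, environment, quality):
    """Return the rate of the best-matching record, or None if the part type is absent."""
    candidates = [r for r in RECORDS if r[0] == part_type]
    if not candidates:
        return None

    def score(record):
        # Count how many of environment and quality agree with the request.
        return (record[1] == environment) + (record[2] == quality)

    return max(candidates, key=score)[3]


print(lookup_failure_rate("valve, solenoid", "airborne", "military"))  # 8.0
print(lookup_failure_rate("relay", "airborne", "military"))            # None
```

In a real analysis the analyst's judgment, rather than a simple score, decides which database entry best represents the device being modeled.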
The latest versions of these databases, NPRD-2016 and EPRD-2014, can be used alongside the prediction standards and work well together. Oftentimes, NPRD-2016 and/or EPRD-2014 can be used to include failure rate estimates for devices not modeled in the prediction standards.
How do I choose which Reliability Prediction method to use?
There are several aspects to consider when selecting a Reliability Prediction method to use for your analyses.
Oftentimes you may not have a choice: there may be contractual requirements, or the model choice may be selected by a systems integrator, or it may be set by a reliability group. For example, many military and defense-based contracts will require you to use MIL-HDBK-217.
In non-military applications, such as commercial industries including telecom, medical devices, and consumer electronics, Telcordia is often the prediction standard used.
217Plus is used in both military and commercial applications. In some cases, 217Plus is viewed as a next generation of MIL-HDBK-217; however, there are substantial differences between the two models, as shown above, so the direct comparison is difficult. In many cases, 217Plus failure rate predictions are not as pessimistic as MIL-HDBK-217.
China’s GJB/z 299 is employed almost exclusively in China, or in companies doing business with Chinese companies. The failure rate estimates from China’s GJB/z 299 tend to be very divergent from the other standards, especially for microelectronic devices.
Factors to Consider in Choosing A Reliability Prediction Standard
One significant factor to consider when determining which standard to use is the environments and part types supported. For example, MIL-HDBK-217 and 217Plus both support a broad list of environments, whereas Telcordia supports a smaller set of environments, which does not include military environments such as aircraft and naval. Also, the types of parts supported by each reliability prediction standard vary, so it is best to select the standard that supports the types of parts included in your design.
While most of the key device types are covered in all of the standards, there is variation, so it is best to review the devices you use in your products to see which model fits. You can also use more than one model, and analysts often do this for complete coverage.
Quality levels are also different between models. MIL-HDBK-217 includes quality levels for both military-level screened devices, as well as commercial quality levels. Telcordia was developed for commercial applications.
Benefits of Choosing Telcordia SR-332
Telcordia can set an upper confidence level on calculations and factor in burn-in data; MIL-HDBK-217 does not offer these features. Telcordia also includes the ability to adjust failure rate estimates based on laboratory test data and/or field data; MIL-HDBK-217 does not include this type of adjustment ability. Additionally, Telcordia includes the ability to calculate infant mortality failure rates. However, some reliability prediction software packages enable you to use these types of adjustments across all models to allow for more flexibility.
The number of data parameters required does vary by device type, but overall, Telcordia generally asks for less data, while MIL-HDBK-217 and 217Plus require more. However, some reliability prediction software packages do not require you to enter all the data parameters and will use average default values, which enables you to perform a prediction with minimal input.
Using the 217Plus & MIL-HDBK-217 Reliability Prediction Methods
217Plus can adjust failure rate estimates based on process grades. Process grade factoring is found only in 217Plus and accounts for various elements that may affect device reliability such as the experience of the design team and wear out. 217Plus also considers the operating profile of your system and provides results in units based on calendar hours – another differentiating factor.
MIL-HDBK-217 failure rate predictions are generally more pessimistic than Telcordia and 217Plus reliability predictions. However, this is variable and depends on the devices in your system.
Choosing One (Or More) Reliability Prediction Standards
Unless you have a contractual requirement to use a specific standard, the selection of the reliability prediction standard should be based on your particular needs related to the design in question. One of the reliability standards may be more commonly used in your industry, or you may review the standards to determine which one includes the environments and components best matching your design.
You can also mix and match standards. For example, one popular way of performing reliability prediction analyses is to use MIL-HDBK-217, Telcordia, and the NPRD/EPRD databases together. In this case, you get coverage of almost all device types used in product design.
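When results from several sources are combined, the units must agree before the failure rates are summed. The Python sketch below assumes MIL-HDBK-217 and NPRD results in FPMH and Telcordia results in FITs, with 1 FPMH equal to 1,000 FITs, and it assumes a series system so that the rates add. The names and numeric values are hypothetical.

```python
FITS_PER_FPMH = 1000.0  # 1 failure per 10^6 hours = 1000 failures per 10^9 hours


def to_fpmh(value: float, unit: str) -> float:
    """Normalize a failure rate to failures per million hours."""
    if unit == "FPMH":
        return value
    if unit == "FIT":
        return value / FITS_PER_FPMH
    raise ValueError(f"unknown unit: {unit}")


# Placeholder subsystem predictions: (source, failure rate, unit).
predictions = [
    ("MIL-HDBK-217", 1.5, "FPMH"),
    ("Telcordia SR-332", 250.0, "FIT"),
    ("NPRD", 3.0, "FPMH"),
]

total_fpmh = sum(to_fpmh(value, unit) for _, value, unit in predictions)
print(total_fpmh)  # 4.75
```

217Plus results are expressed per calendar hour rather than per operating hour, so they would need a separate adjustment before being combined this way; that step is deliberately left out of the sketch.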
All reliability prediction standards mentioned here are widely known and accepted. The choice is based on particular requirements. There is no right or wrong choice – the selection must be based on which standard best suits your needs.
Calculating Failure Rates with Reliability Prediction Software
You can use reliability prediction analysis to help in many aspects of product design and development. Engineers have used reliability prediction analysis for many years across the globe in a wide span of industries to help in the following ways:
- Assessing the feasibility of a proposed design
- Comparing design alternatives for the most reliable option
- Identifying potential reliability areas of concern
- Performing What-If? analyses to determine the effects of specific design modifications
- Tracking reliability improvement
- Addressing product quality issues in early design before they become problematic
- Decreasing the Cost of Poor Quality (COPQ)
- Meeting contractual compliance requirements
So, no matter which method you choose to perform your reliability prediction calculation and analysis, you will gain the advantages inherent in adding this technique to your reliability and quality tool set.
Using reliability prediction software to calculate failure rates makes this process simpler and provides critical data for reliability requirements. The best tools will even allow you to mix and match standards, provide built-in component libraries, and enable you to view how design changes impact reliability.