Description
Consumption is a major concept in economics and is also studied by many other social sciences. Economists are particularly interested in the relationship between consumption and income, and therefore in economics the consumption function plays a major role.
ABSTRACT
Title of the Thesis: LIFE CONSUMPTION MONITORING FOR ELECTRONICS
Degree Candidate: Satchidananda Mishra, Master of Science, 2003
Thesis directed by: Professor Michael Pecht
Department of Mechanical Engineering
Life consumption monitoring is a method to assess product’s reliability based on
its remaining life in a given life cycle environment. The life consumption monitoring
process involves continuous or periodic measurement, sensing, recording, and
interpretation of physical parameters associated with a system’s life cycle environment to
quantify the amount of degradation.
This thesis explains a life consumption monitoring methodology for electronic
products, which includes failure modes, mechanisms and effects analysis (FMMEA),
virtual reliability assessment, monitoring product parameters, data simplification, stress
and damage accumulation analysis and remaining life estimation. It presents two case
studies to estimate the remaining life of identical circuit card assemblies in an automobile
underhood environment using the life consumption monitoring methodology. Failure
modes, mechanisms, and effects analysis along with virtual reliability assessment is used
to determine the dominant failure mechanism in the given life cycle environment.
Temperature and vibration are found to be the environmental factors, which could
potentially cause malfunction of the circuit card assembly through solder joint fatigue.
Temperature sensor and accelerometers are used along with a data logger to monitor and
record the environmental loads during the experiment. A data simplification scheme is
used to make the raw sensor data suitable for further processing. Stress and damage
models are used to estimate the remaining life of the circuit card assembly based on the
simplified data. Performances of the test board assemblies are monitored through
resistance monitoring. The life cycle environment and results for the case studies are
compared with each other. The estimated results are also compared with experimental life
results.
LIFE CONSUMPTION MONITORING FOR ELECTRONICS
By
Satchidananda Mishra
Thesis submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial fulfillment
of the requirements for the degree of
Master of Science
2003
Advisory Committee:
Professor Michael Pecht, Chair
Associate Professor Patrick McCluskey
Associate Professor Peter Sandborn
© Copyright by
Satchidananda Mishra
2003
PREFACE
In today’s world, increasing global competition along with consumers’
perceptions toward performance, quality, reliability, safety, and environmental
considerations are compelling manufacturers to improve their design for higher reliability
in field applications. In electronics industry, product development trends have supported
this requirement through rapid technological changes resulting in rapid market growth.
However, as manufacturers try to keep pace with performance requirements of modern
electronic industry, reliability is being traded-off at an affordable cost. Intense market
competition has reduced time-to-market for electronics products tremendously thereby
providing less time for extensive reliability testing. There has been a continuous
transition in the electronics industry from military-specification parts to commercial-off-
the-shelf (COTS) parts, many of which are now targeted for lifetimes in the 5 to 7 year
range. Wearout of electronics parts has become a relevant concern with this transition.
Hence there is a need for today’s companies to consider novel approaches to improve
design and maintain operational efficiency of their products in field applications to ensure
customer satisfaction.
Health monitoring has emerged as a promising alternative to traditional reliability
prediction, scheduled maintenance, and run-to-failure operations. Health monitoring is
the method of monitoring product reliability in terms of its health in the life cycle
environment. Life consumption monitoring (LCM) is a health monitoring method to
assess product’s reliability based on its remaining life in a given life cycle environment.
The aim of this thesis is to develop and demonstrate a life consumption monitoring
ii
methodology to determine the remaining life and reliability of an electronic product based
on monitored environmental and operational data. Life consumption monitoring is a
prognostic process unlike many other health monitoring approaches. A well-designed
maintenance procedure based on life consumption monitoring can be used to predict and
prevent system failure and hence to reduce operating costs.
Chapter 1 discusses the concept and motivation behind the reliability prediction
practices followed in industry. This chapter highlights the drawbacks of current
reliability prediction techniques and explains health monitoring as a solution to the
reliability prediction challenge. The approaches adopted for health and life consumption
monitoring are described along with various examples.
Chapter 2 describes a physics-of-failure-based life consumption monitoring to
determine the remaining life of an electronic product. The life consumption monitoring
process involves continuous or periodic measurement, sensing, recording, and
interpretation of physical parameters associated with a system’s life cycle environment to
quantify the amount of system degradation. The process is documented in the form of a
flowchart, which includes failure modes, mechanisms, and effects analysis (FMMEA),
virtual reliability assessment, monitoring product parameters in the product’s life cycle
environment, data simplification, stress and damage accumulation analysis and remaining
life estimation.
Chapter 3 describes two case studies to demonstrate the developed life
consumption monitoring methodology. For the case studies, two circuit card assemblies
were mounted under-the-hood of an automobile. Failure modes, mechanisms, and effects
analysis (FMMEA) was conducted along with a virtual reliability assessment for the
iii
circuit card assembly in the given life cycle environment, which revealed that solder joint
fatigue is the dominant failure mechanism. The environmental parameters that can cause
damage are identified as temperature, and vibration. A suitable data simplification
scheme was developed to make the sensor data suitable for input to solder joint fatigue
models. The identified environmental parameters were monitored and recorded with the
help of sensors and a battery powered data logger. The collected and simplified data was
used with stress and damage models using the calcePWA analysis software to estimate
the remaining life. The actual life of the circuit card assemblies were checked through
resistance monitoring and compared with the estimated life.
Chapter 4 presents a summary and some discussion on the life consumption
monitoring methodology described in the thesis. Chapter 5 highlights the specific
contributions made in this thesis.
iv
ACKNOWLEDGEMENTS
I wish to express my sincere gratitude to Dr. Michael Pecht, Dr. Peter Sandborn,
and Dr. Patrick McCluskey of the University of Maryland, College Park for their advice
and support during the course of this thesis and my stay at the University of Maryland. I
am grateful to Dr. Diganta Das, Dr. Miky Lee, Dr. Sanka Ganesan, Keith Rogers, and
Dan Danahoe of CALCE center at the University of Maryland, without whose help this
thesis would not have reached a fruitful completion. I also wish to express my gratitude
to Mr. Doug Goodman from Ridgetop Group Inc., Mr. Bart Feys from Dallas
Instruments, and Mr. Paul Macmillan from ACI-AppliCAD Inc.
I express my special thanks to my colleagues Yuki Fukuda, Jeremy Cunningham,
Sathyanarayan Ganesan, Niranjan Vijayragavan, Yu-Chul Hwang, Paul Casey, Anoop
Rawat, Vidyasagar Shetty, Lewis Gershan, Ji Wu, Sanjay Tiku, Subramaniam Rajagopal,
Leila Jannessari, Joseph Varghese, Arindam Goswami, Ricky Valentin, Karumbu,
Kaushik Ghosh, who have always been helpful to me during my thesis work.
v
TABLE OF CONTENTS
LIST OF TABLES viii
LIST OF FIGURES ix
1 INTRODUCTION 1
1.1 RELIABILITY PREDICTION OF ELECTRONICS 3
1.2 HEALTH MONITORING 5
1.3 APPROACHES FOR HEALTH MONITORING 7
1.3.1 Current Condition Monitoring 7
1.3.2 Life Consumption Monitoring 7
1.4 CURRENT STATE OF HEALTH MONITORING RESEARCH 8
1.5 HEALTH MONITORING EXAMPLES 10
2 LIFE CONSUMPTION MONITORING METHODOLOGY FOR ELECTRONICS13
2.1 FAILURE MODES, MECHANISMS AND EFFECTS ANALYSIS 14
2.1.1 Identification of Failure Modes and Corresponding Failure Sites 15
2.1.2 Identification of Failure Mechanisms and Models 15
2.1.3 Identification of the Life Cycle Conditions 15
2.1.4 Selection of failure mechanisms that can precipitate a failure mode 16
2.2 VIRTUAL RELIABILITY ASSESSMENT 16
2.2.1 Prioritization of the Failure Mechanisms Based on Time-to-failures 17
2.2.2 Identification of The dominant Failure Mechanisms 17
2.3 MONITORING APPROPRIATE PRODUCT PARAMETERS 18
2.4 DATA SIMPLIFICATION PROCESSES 19
2.5 STRESS AND DAMAGE ACCUMULATION ANALYSIS 20
2.5.1 Stress and Damage Models 20
2.5.2 Damage Accumulation Theories 22
2.6 ESTIMATION OF REMAINING LIFE 24
2.7 ACCEPTABLE REMAINING LIFE 24
3 EXPERIMENTAL CASE STUDIES ON LIFE CONSUMPTION MONITORING 26
3.1 FAILURE MODES, MECHANISMS AND EFFECTS ANALYSIS 27
3.2 VIRTUAL RELIABILITY ASSESSMENT 28
3.3 MONITORING PRODUCT PARAMETERS 33
3.3.1 Sampling Issues for Monitoring of Continuous Signals 35
3.4 DATA SIMPLIFICATION 38
3.4.1 Temperature Data Simplification 38
3.4.2 Vibration Data Simplification 47
3.5 STRESS AND DAMAGE ACCUMULATION ANALYSIS 48
vi
3.6 REMAINING LIFE ASSESSMENT 51
3.7 FAILURE DEFINITION AND DETECTION 51
3.8 MONITORED ENVIRONMENT AND RESULTS 52
3.8.1 Case Study-I 52
3.8.2 Case Study-II 61
3.8.3 Comparison between case study-I and II 67
4 SUMMARY AND DISCUSSION 68
4.1 USING HANDBOOKS AND SIMILARITY ANALYSIS TO DETERMINE ENVIRONMENTAL
AND OPERATIONAL LIFE CYCLE CONDITIONS 70
4.2 DETERMINING THE NUMBER OF DATA POINTS REQUIRED FOR LIFE ESTIMATION 72
5 CONTRIBUTIONS 75
APPENDIX I: DAMAGE ASSESSMENT MODEL FOR TEMPERATURE INDUCED
FATIGUE ANALYSIS 76
APPENDIX I: DAMAGE ASSESSMENT MODEL FOR TEMPERATURE INDUCED
FATIGUE ANALYSIS 76
APPENDIX II: DAMAGE ASSESSMENT MODEL FOR VIBRATION INDUCED
FATIGUE ANALYSIS 79
APPENDIX III: ANALYSIS OF CAR ACCIDENT FOR CASE STUDY-I 82
APPENDIX IV: EFFECT OF TEMPERATURE DATA REDUCTION ON
PREDICTION ACCURACY OF LIFE CONSUMPTION MONITORING 85
REFERENCES 88
vii
LIST OF TABLES
Table 3.1: Failure Modes and Effects Analysis (FMEA) for the circuit card assembly
used for the experiment..................................................................................................... 29
Table 3.2: Data used for defining the temperature environment for the virtual reliability
assessment......................................................................................................................... 31
Table 3.3: Virtual reliability assessment........................................................................... 32
Table 3.4: Comparison between case studies I and II....................................................... 67
viii
LIST OF FIGURES
Figure 2.1: Various steps in the life consumption monitoring approach...........................14
Figure 3.1: Experimental setup with the test board mounted under-the-hood of a car
(1997 Toyota 4Runner)......................................................................................................27
Figure 3.2: The power spectral density plot used for the virtual reliability assessment ....31
Figure 3.3: The data logger with external temperature and vibration sensor ....................34
Figure 3.4: Sampling of a continuous time record.............................................................36
Figure 3.5: Temperature history showing the reversals (i.e., peaks and valleys) ..............40
Figure 3.6: Identifying cycles in a load history .................................................................42
Figure 3.7: Rainflow cycle counting..................................................................................45
Figure 3.8: Loop condition and loop reaping operations...................................................46
Figure 3.9: Ratio of the response of the PCB to the excitation vs. frequency. The peaks in
this plot identify the natural frequencies............................................................................50
Figure 3.10: Monitored temperature during case study-I ..................................................53
Figure 3.11: Monitored temperature converted to the peaks and valleys for case study-I 54
Figure 3.12: Power spectral density (PSD) vs. frequency plot for case study-I ................55
Figure 3.13: Estimated board displacement due to vibration for case study-I...................56
Figure 3.14: Recorded vibration event during the car accident. The maximum
acceleration values were from +22 g to –23 g. ..................................................................57
Figure 3.15: Accumulated damage estimated using calcePWA and Miner’s rule for case
study-I ................................................................................................................................58
Figure 3.16: Resistances of the solder joints along with the intermittent resistance spikes
for case study-I...................................................................................................................59
Figure 3.17: Remaining life estimation summary for case study-I....................................60
Figure 3.18: Crack in one of the solder joints....................................................................60
Figure 3.19: Experimental setup for case study-II.............................................................61
ix
Figure 3.20: Monitored temperature for case study-II.......................................................62
Figure 3.21: Monitored temperature profile converted to the peaks and valleys for case
study-II...............................................................................................................................63
Figure 3.22: Power spectral density (PSD) vs. frequency plots for case study-II .............64
Figure 3.23: Accumulated damage for case study-II .........................................................65
Figure 3.24: Resistances of the solder joints along with the intermittent resistance spikes
for case study -II. ...............................................................................................................66
Figure 3.25: Remaining life estimation summary for case study-II ..................................66
Figure 4.1: Extension of life based on life consumption monitoring results.....................72
Figure 4.2: Cumulative past vs. near past for life estimation ............................................74
x
1 INTRODUCTION
Reliability is defined as the ability of a product to perform as intended (i.e.,
without failure and within specified performance limits) for a specified time, in its life
cycle application environment. During the past 25 years, there has been a lot of
improvement in the reliability of electronic products to keep pace with the increased
warranties and the possible liabilities of product failures. Various technology
improvements including semiconductor manufacturing processes have continuously
helped to increase device reliability. However, there has been continuous shrinking of the
feature sizes for electronic devices along with improved performance requirements. In
fact, according to Moore’s law number of transistors in a given semiconductor has been
doubling every eighteen months. This trend has decreased the device dimensions thereby
resulting in higher electric fields and higher localized heating. Surface mount technology
(SMT) has become a common practice to take into account the higher I/O requirement,
which has made interconnects more vulnerable to harsh environments. In other words, as
manufacturers try to keep pace with performance requirements, reliability of electronic
systems is getting traded-off for increased functionality at an affordable cost.
There has been intense competition among rival companies based on cost and
quality of electronics products. Increasing market competition and customers’
expectation for latest technology has reduced the allowable “time to market” in
electronics industry tremendously. With reduction in allowable time for the product
development cycles, there is less opportunity for extensive reliability testing. As a result,
outputs from the current reliability assessment schemes, which are based on extensive
reliability trials, may not satisfy the customer requirements. Failure to resolve this issue
can result in high risk of in-service availability and inflated life support costs for
electronic systems.
The modern electronics industry is more and more being driven by the consumer
electronics segment as compared to space, military, avionics and oil-drilling segment
(i.e., the low volume complex electronics or LVCES industry). This has compelled the
LVCES industry to adapt to the commercial-off-the-shelf (COTS) parts instead of
traditional military-specification parts. With this transition from military-specification
parts to COTS parts, many of which are now targeted for lifetimes in the 5 to 7 year
range, wearout of electronics parts is becoming a concern.
There has been a constant need for the maintenance strategies to be more
proactive based on in-service reliability of the products. By knowing whether and, more
importantly, when maintenance is needed, production or operation schedules can be
synchronized, and the cost of maintenance can be reduced. Hence there is a need for the
companies to consider various approaches in order to improve reliability and maintain the
operational efficiency of their products in the field applications including in-service
reliability monitoring.
In-service reliability of a product is dependent on the environmental and
operational conditions of the product in its field applications. Traditionally, reliability of
electronic products has been predicted without keeping in mind the actual environmental
and operational parameters in its life cycle environment. Current data collection schemes
for reliability assessment are often designed before in-service operational and
environmental aspects of the system are entirely understood. Further the allowable time
2
for reliability trials are reducing because of reduction in product development cycles,
thereby causing lack of suitable test data.
In summary, today’s organizations are faced with the challenge of maintaining
electronic product reliability with increased performance requirements, increased market
competition, and less allowable time-to-market. There is a continuous demand to lower
maintenance costs and to hasten operational readiness/responsiveness.
Health monitoring has emerged as a promising alternative to traditional reliability
prediction, scheduled maintenance, and run-to-failure operations. Health monitoring is
the method of monitoring product reliability in terms of its health in the life cycle
environment. Life consumption monitoring (LCM) is a health monitoring method to
assess product’s reliability based on its remaining life in a given life cycle environment.
The life consumption monitoring process involves continuous or periodic measurement,
sensing, recording, and interpretation of physical parameters associated with a product’s
life cycle environment to quantify the amount of degradation. This thesis describes the
concept and various aspects of life consumption monitoring along with two case studies.
1.1 Reliability Prediction of Electronics
Reliability is defined as the ability of a product to perform as intended (i.e.,
without failure and within specified performance limits) for a specified time, in its life
cycle application environment. An efficient reliability prediction can be used for
numerous purposes, including the following [1]
• Comparisons of the designs and products
• Methods to identify potential reliability improvement opportunities
• Logistics support
3
? Forecast warranty and life cycle costs
? Spare parts provisioning
? Availability
• Safety analysis
• Mission reliability estimation
• End item reliability estimation
• Prediction of reliability performance
IEEE standard 1413 [1], titled “Standard Methodology for Reliability Prediction
and Assessment for Electronic Systems and Equipment” presents the key parameters of
importance for reliability prediction include structural architecture, material properties,
fabrication and assembly processes, and the life cycle environment.
Defining and characterizing the product life cycle environment is often the most
uncertain input into a reliability prediction scheme. Product life cycle environment
typically includes storage, transportation, handling and application scenario of the
product. It also describes the expected severity and duration of the load conditions for
each scenario [2], [3]. Load conditions for an electronic product include temperature,
humidity, vibration or shock loads, contaminants, radiation levels, electromagnetic
interference and loads caused by operational parameters such as current, power and heat
dissipation. Life cycle environment characterization also requires knowledge of
parameters like the application length, the number of applications in the expected life of
the product, and the product utilization or non-utilization profile (storage, testing,
transportation).
4
The common practice of design has been to provide a safety margin, i.e.,
designing for a high product load and recommending operation at a lower value, due to
uncertainties regarding the actual life cycle loads for a product [4]. If the actual life cycle
loads are different from the designed ones, this design practice can lead to costly over
design or hazardous under design, and consequently, increased costs.
1.2 Health Monitoring
Health monitoring is one of the emerging and most promising developments in the
evolution of in-service reliability assessment and maintenance practices. A product’s
health is the extent of degradation or deviation from its “normal” operating state. Hence
health monitoring is based on the condition of the actual system or equipment concerned,
not on the statistical mean. By determining whether and, more importantly, when failure
can occur, procedures can be developed to mitigate, manage or maintain the product [5].
An efficient health monitoring scheme can be used to [6]:
• Reduce lost output penalties
• Reduce forced outage repair and labor costs
• Reduce spares holdings
• Reduce severity of failures
• Improve safety margins
• Reduce insurance premiums
• Extend maintenance cycles
• Maintain the effectiveness of equipment through timely repair actions
• Improve repair quality
• Increase profitability
5
Methods employed for health monitoring can include non-destructive tests (e.g.,
ultrasonic inspection, liquid penetrant inspection, and visual inspection) and operating
parameter monitoring (e.g., vibration monitoring, oil consumption monitoring and
thermography (infrared) monitoring) [7]. Predictive or prognostic health monitoring
methods involve monitoring of the life cycle environment of the product (e.g.,
temperature, humidity, shock, vibration, current, power, heat dissipation) to predict when
the product is going to fail in real life.
Health monitoring has been used for both electrical and mechanical systems for
reliability prediction and hence reduction in maintenance expenses. An example of the
monetary benefit of health monitoring was presented in the context of corrosion in a
workshop on condition-based maintenance (CBM) on November 17-18, 1998 in Atlanta,
organized by the Advanced Technology Program (ATP) of the National Institute of
Standards and Technology (NIST) [8]. It was stated that if corrosion could be measured
directly at a refinery plant, the downtime for maintenance could be reduced from every
year to potentially every 3 years. Since a typical maintenance period is two weeks to a
month (about 10% of the available operating time), an economic value can be assigned to
a reduction in downtime. However, offsetting maintenance costs is probably not the
primary economic driver for CBM. Instead, it should be looked at as an integral part of a
business strategy for profitability. In the case of CBM, it contributes to maximum up time
(capacity) with reduced operating costs.
There are some technical barriers for implementation of health monitoring, which
include the inability to continually monitor a system and accurately predict the remaining
useful life. Further the use of health monitoring in real life can be more appreciated if it
6
can help in learning and identifying impending failures for a system as well as
recommending an action.
1.3 Approaches for Health Monitoring
Health monitoring is the method of evaluating reliability in terms of product’s
health in its life cycle environment. Health monitoring methods can be broadly classified
into two categories, i.e., current condition monitoring and life consumption monitoring.
1.3.1 Current Condition Monitoring
Current condition monitoring is a method of evaluating the product’s operating
state in terms its physical degradation (e.g., cracks, corrosion, delamination), electrical
degradation (e.g., increase in resistance, increase in threshold voltage), and performance
degradation (e.g., shift of the product’s operating parameters from expected values). The
objective of condition monitoring (also called condition-based maintenance) is to
accurately detect the current state of electrical and mechanical systems and enable the
user to make a decision on whether to perform maintenance. Hence condition monitoring
is mainly a diagnostic activity. This helps to prevent operational deficiencies and failures,
eliminates costly periodic maintenance, and reduces the likelihood of machinery failures.
1.3.2 Life Consumption Monitoring
Life consumption monitoring (LCM) is a method to assess product’s reliability
based on its remaining life in a given life cycle environment. In life consumption
monitoring, product’s reliability can be assessed by comparing the remaining life of the
product with estimated total life. The life consumption monitoring process involves
continuous or periodic measurement, sensing, recording, and interpretation of physical
7
parameters associated with a system’s life cycle environment to quantify the amount of
system degradation.
1.4 Current State of Health Monitoring Research
Most of the work on health monitoring available in literature focuses on
diagnostic or condition monitoring of various mechanical (metals and composites)
structures. This is often sited as “structural health monitoring”. Typical methods used for
condition monitoring include
• Visual inspection
• Optical fibers [9], [10], [11]
• Eddy current [12]
• Acoustic emission [13]
• Vibration signatures and modal analysis [14], [15], [16]
• Piezoelectric materials [17], [18]
Several organizations, professional societies and universities are involved in
activities related to health monitoring. The following section gives a listing of some of
the leading groups involved in health monitoring research:
• Condition Monitoring and Diagnostic Engineering Management (COMADEM) [19]
- Consultancy program on different aspects of condition and diagnostic monitoring
focusing on proactive integrated maintenance management.
- Publishers of “International Journal of COMADEM” dedicated to sensor
technology, structural health monitoring and machinery/process health monitoring
• Society for Machinery Failure Prevention Technology (MFPT) is a professional
society focused on sensors technology, condition monitoring, predictive maintenance,
8
prognostics technology, condition based maintenance, nondestructive evaluation and
testing, life extension and integrated diagnostics in conjunction with the annual
meeting [20].
• The National Aeronautics and Space Administration (NASA)
- Health monitoring for aviation safety program (AvSP) which includes monitoring
fuel flow, rotor speeds, oil temperature/ pressure, engine vibration [21]
- Efficient checkout, testing and monitoring of space transportation vehicles,
subsystems and components before, during and after operation under the vehicle
health monitoring (VHM) program [22].
- Develops smart sensors for health monitoring, e.g., solenoid health monitor for
valve health monitoring and failure prediction.
• Office of Naval Research conducts research for on-board mechanical diagnostics and
vehicle health monitoring (integrated avionics) for improved operational effectiveness
of air vehicles with increased capability, range, speed, time-on-station, and carrier
suitability [23].
• Department of Defense (DoD) has a prognostics health management (PHM) program
for joint strike fighter (JSF) [24]
• QinetiQ, UK
- Vehicle health and usage monitoring system (vHUMS) program that uses various
sensor technologies, data analysis and reporting tools [25]
- Integrated engine management program using condition monitoring [26]
• UK Ministry of Defense (MoD) - Engine health monitoring systems [27]
9
• US Army Material Systems Analysis Activity (AMSAA) – Physics-of-failure
approach for reliability modeling [28]
• CALCE EPSC at University of Maryland conducts research on diagnostic and
prognostic health monitoring focusing on electronics [29]
• Pennsylvania State University Applied Research Laboratory (ARL) [30]
1.5 Health Monitoring Examples
An example of condition monitoring in mechanical systems is applied in the
ETOPS (Extended-range Twin-engine Operations) program. ETOPS restriction,
formalized under US FAA Regulations in 1953, prohibits passenger carrying aircraft with
only two engines from flying any route more that a given single-engine flying time from
a suitable and open landing site. Gradual relaxation in this rule has resulted from
improvements in health monitoring technologies, which now provide the continuous
monitoring of critical aircraft systems necessary to identify problems before they affect
aircraft operation or safety along with reductions in engine in-flight shutdown rates. The
ETOPS philosophy is a real-time approach to maintenance and includes continual
monitoring of application conditions to identify problems. Two typical examples of
ETOPS are engine condition monitoring (ECM) and oil consumption monitoring. ETOPS
operators are required to use ECM programs to monitor adverse trends in engine
performance and execute maintenance to avoid serious failures (e.g., those that could
cause in-flight shutdowns, diversions, or turnbacks). The ECM programs allow for
monitoring of engine parameters such as exhaust temperature, fuel and oil pressures, and
vibration. In some cases, oil consumption data and ECM data can be correlated to define
10
certain problems. Any engine deterioration that might affect ETOPS operations is
monitored through a disciplined data collection and analysis program [31].
Built-in-test (BIT) is a condition monitoring technique used for electronics that
uses hardware-software diagnostic mean to identify and locate faults. Two types of BIT
concepts are employed in electronic systems, interruptive BIT (I-BIT) and continuous
BIT (C-BIT). The concept of I-BIT is that normal equipment operation is suspended
during BIT operation. Such BITs are typically initiated by the operator or during a
power-up process. The concept of C-BIT is that equipment is monitored continuously and
automatically without affecting normal operation [32].
JDIS (Joint Distributed Information System), another example of health
monitoring scheme applied to mechanical systems. JDIS can anticipate maintenance and
repair needs to ensure that equipment and personnel are available precisely when needed
[33]. It can reduce the time taken to deliver aircraft replacement parts to be hours, as
compared to weeks or months under current practices. During a demonstration, Boeing
simulated how a network of computers and aircraft sensors can trigger an autonomic
response to a pending maintenance need under JDIS scheme. For instance, if a part
failure occurs or is predicted to occur, JDIS initiates a series of actions that can provide
the right information for the engineer about the replacement of parts at the right time.
This way, human interaction is minimized as data flows from the aircraft through the
maintenance infrastructure and ultimately to the supplier community.
General Motors’ research labs are using predictive equations for calculating
remaining oil based on monitoring engine usage over time [34]. Engine oil breaks down
as a function of time at temperature oxidation and engine usage related contamination.
11
On selected vehicles this algorithm is programmed into the engine control modules
(ECM) and keeps the driver aware of their oil life status and displays via the vehicle's
driver information center (DIC) display.
12
2 LIFE CONSUMPTION MONITORING METHODOLOGY FOR
ELECTRONICS
Life consumption monitoring (LCM) is a health monitoring method to assess
product’s reliability based on its remaining life in a given life cycle environment. The life
consumption monitoring process involves continuous or periodic measurement, sensing,
recording, and interpretation of physical parameters associated with a system’s life cycle
environment to quantify the amount of system degradation. This section explains a life
consumption monitoring methodology for electronics.
Life consumption monitoring methodology has six steps to estimate the remaining
life of an electronic product (Figure 2.1). These steps include failure modes, mechanisms
and effect analysis (FMMEA), virtual reliability assessment, monitoring of the critical
parameters of the product’s life cycle environment, simplification of the monitored data,
stress and damage accumulation analysis, and remaining life estimation. Each step will be
described in this section.
The life consumption monitoring methodology described in this thesis is an
improvement over the existing methodology developed by Ramakrishnan et al. [35] for
his Masters thesis. The existing methodology focused on estimation of accumulated
damage of solder joints for electronics. An assumption was made that temperature and
vibration are the dominant environmental parameters that can cause failure due to solder
joint fatigue.
The improved methodology has been extended and generalized to system level,
where there is a possibility of various other failure mechanisms. Failure modes,
mechanisms, and effects analysis (FMMEA) and virtual reliability assessment has been
13
included in the improved methodology to determine the dominant failure mechanism in a
given life cycle environment and the corresponding environmental and operational
parameters. Another step has been added to determine the remaining life of the product
based on the accumulated damage information.
Step 1: Conduct failure modes, mechanisms and effects analysis
Step 6: Estimate the remaining life of the product
Step 5: Perform stress and damage accumulation analysis
Continue
monitoring
Is the
remaining-life
acceptable?
No
Yes
Step 4: Conduct data simplification to make sensor data suitable for stress
and damage models
Schedule a maintenance action
Step 3: Monitor appropriate product parameters
environmental (e.g, shock, vibration, temperature, humidity)
operational (e.g., voltage, power, heat dissipation)
Step 2: Conduct a virtual reliability assessment to assess the failure
mechanisms with earliest time-to-failure
Step 1: Conduct failure modes, mechanisms and effects analysis
Step 6: Estimate the remaining life of the product
Step 5: Perform stress and damage accumulation analysis
Continue
monitoring
Is the
remaining-life
acceptable?
Is the
remaining-life
acceptable?
No
Yes
Step 4: Conduct data simplification to make sensor data suitable for stress
and damage models
Schedule a maintenance action
Step 3: Monitor appropriate product parameters
environmental (e.g, shock, vibration, temperature, humidity)
operational (e.g., voltage, power, heat dissipation)
Step 2: Conduct a virtual reliability assessment to assess the failure
mechanisms with earliest time-to-failure
Figure 2.1: Various steps in the life consumption monitoring approach
2.1 Failure Modes, Mechanisms and Effects Analysis
An electronic product is typically a combination of components and interconnects,
all having various failure mechanism by which they can fail in the life cycle applications.
The objective of the failure modes, mechanisms, and effects analysis (FMMEA) in life
consumption monitoring is to identify the failure mechanisms that can precipitate a
14
failure mode in the given environmental and operational conditions. Following sections
discuss about the steps involved in FMMEA.
2.1.1 Identification of Failure Modes and Corresponding Failure Sites
The failure modes, mechanisms, and effects analysis (FMMEA) starts with
identification of all possible failure modes and the corresponding failure sites. A failure
mode is defined by how a failure is observed. Hence it is closely related to the functional
and performance requirements of the product. Failure modes are identified by
determining what could possibly go wrong or how a product can fail to meet its
specifications. Typical failure modes for electronic products include electrical opens or
shorts, change in resistance, intermittent resistance change. Failure site defines the
location of failure, e.g., printed circuit board, plated through holes (PTH), components,
interconnects.
2.1.2 Identification of Failure Mechanisms and Models
The second step in FMMEA is to determine all possible failure mechanisms
followed by identification of corresponding failure models available. Failure models are
used to identify the environmental and operational parameters along with the product
geometry responsible for a specific failure mechanism. More details about various failure
models for electronic components and printed circuit boards can be found in literature
[36]-[41].
2.1.3 Identification of the Life Cycle Conditions
This step requires determination of the life cycle environment conditions for the
product. The life cycle environment of a product consists of the assembly, storage,
15
handling, and usage conditions of the product, including the severity and duration of
these conditions. Information on product usage conditions can be obtained from
environmental handbooks or data monitored in similar environments. Some times it may
be necessary to include the assembly, storage, handling and transportation conditions.
The life cycle conditions are compared with the inputs to the failure models in the next
step.
2.1.4 Selection of failure mechanisms that can precipitate a failure mode
Depending on the life cycle environment, particular failure mechanisms have the
potential to cause product failure. This step eliminates some of the failure mechanisms
based on inputs to failure models, life cycle environment conditions and product
geometry. For example, metallization corrosion models require high moisture content to
precipitate a failure mode. Hence if the moisture content is very low in a given
environment, failure due to metallization corrosion might be eliminated. The failure
mechanisms that cannot be eliminated are selected for analysis using virtual reliability
assessment.
2.2 Virtual Reliability Assessment
Virtual reliability assessment method is used to assess potential failure mechanisms
identified by the failure modes, mechanisms, and effects analysis (FMMEA). The
objective of this step to identify the dominant failure mechanisms and corresponding
environmental and operational parameters based on time-to-failures and identify the
environmental and operational parameters for monitoring.
16
2.2.1 Prioritization of the Failure Mechanisms Based on Time-to-failures
This step starts with estimation of time-to-failures based on the failure
mechanisms and models selected by FMMEA. The failure mechanisms are then ranked
based on the time-to-failures. Failure models typically require product geometry along
with life cycle environmental and operational parameters. At this stage of life
consumption monitoring the product geometry is available Life cycle environment
conditions for the analysis are taken from the sources identified during FMMEA. In cases
of new products, environmental handbooks are good sources of information.
2.2.2 Identification of The Dominant Failure Mechanisms
This step identifies the dominant failure mechanisms based on the time-to-failure
rankings. In principle, for a non-repairable unit the dominant failure mechanism is the
one by which the first failure is expected to occur. But in practice, more than one
dominant failure mechanism may need to be considered because of variability in
materials, manufacturing processes, and life cycle loads. Failure mechanisms with time-
to-failures less than the product life expectation are considered as candidate dominant
failure mechanisms. In other words, failure mechanisms with time-to-failure greater than
(20%, based on rule of thumb) the expected product life need not be considered. If
maintenance is available until end of product life expectation, all failure mechanisms
with time-to-failures below the product life expectation are considered as dominant
failure mechanisms. While choosing dominant failure mechanisms for complex systems,
tradeoff may be made based on cost, memory and processing capabilities.
17
2.3 Monitoring Appropriate Product Parameters
Monitoring product parameters involve measurement and monitoring of product
life cycle environment, which includes the environmental and operational parameters
identified by virtual reliability assessment. The life cycle environment of a product
consists of the assembly, storage, handling, and usage conditions of the product,
including the severity and duration of these conditions [2]. Specific life cycle loads on an
electronic product include environmental conditions such as temperature, humidity,
pressure, vibration or shock, radiation, contaminants, and loads due to electrical operating
conditions, such as current, power and heat dissipation. These loads can affect the
reliability of the product either individually or in combination with each other. Product
parameters can be monitored in a continuous or periodic manner using various sensors
mounted on or within the product. An ideal sensing device should be:
• Compatible with existing electronics (i.e., it should have minimal impact on the total
cost, performance, and reliability of the existing product)
• Accurate, have low response time, and self-correcting (e.g., having temperature
compensation) in operation
• Small, lightweight, and consuming system-independent and little power (preferably
self-powered)
• Easy to incorporate into the system
• Easily accessible for data acquisition, service, maintenance, and upgrades
Typical sensors include temperature sensors (thermocouples, thermistors, and
resistance thermo detector (RTD) sensors), humidity sensors, accelerometers, pressure
sensors (piezoelectric, MEMS) etc. The data measured by these sensors are recorded by a
18
data logger for further processing. The recording process needs specification of
parameters including sampling intervals for measurements, signal trigger values
1
.
2.4 Data Simplification Processes
Data simplification is the process of converting the raw sensor data into a form
suitable for the stress and damage models. Simplification of data is necessary since the
monitored data cannot be directly used with the stress and damage models in many cases.
For example, Engelmaier’s model for thermal fatigue of solder joints requires
temperature data in the form of temperature cycles and hence there is a need to convert
the temperature data from sensors to equivalent temperature cycles. The data
simplification process typically depends on the input requirements of the stress and
damage assessment model. Some examples of data simplification include
• Conversion of irregular temperature history into a regular sequence of peaks and
valleys for thermal fatigue analysis.
• Conversion of temperature reversals into relevant temperature cycle information.
• Conversion of acceleration data in time domain to power spectral density (PSD) in
frequency domain.
Data simplification process can also provide data reduction if necessary. A suitable
data reduction scheme is useful for analyzing large amounts of data by gain in computing
speed and reduction in memory requirements.
1
Some times data is recorded only if the value is more than a certain pre specified value. The pre specified
value is known as signal trigger value.
19
2.5 Stress and Damage Accumulation Analysis
Stress and damage accumulation analysis is used to estimate the accumulated
damage in the product based on simplified data. This step begins by creating numerical
models based on product geometry and material properties. For example, creation of
model for a circuit card assembly requires information on board material and dimensions,
component material, dimensions, and their respective orientations. Information on
product geometry is obtained from design specifications and manufacturer data sheets.
The numerical model is used to estimate the stress at individual failure sites based on life
cycle environment loads.
Based on the estimated stress values, the accumulated damage for the product in
its life cycle environment is estimated. Estimation of accumulated damage involves two
separate steps: 1) application of stress and damage models, 2) application of a suitable
damage accumulation theory.
2.5.1 Stress and Damage Models
The purpose of stress and damage models is to determine or predict the
occurrence of a specific wear out failure mechanism in a specific application. The
prediction process looks at each individual failure mechanism (such as solder joint
fatigue, electromigration, conductive filament formation, die cracking to name a few) to
estimate the probability of failure. This approach can be applied to electronic parts used
in military, space, telecommunication, industrial, automotive, aviation, and consumer
utilization applications, and is applicable to the entire life cycle of the product.
Selecting the proper damage model is the key to the accuracy of the stress and
damage accumulation analysis. Specific models for each individual failure mechanism
20
are available from a variety of reference books. These models can either be in the time
domain or in the frequency domain. Stress and damage models can be divided into two
major classes depending on the input requirements, i.e., models requiring cyclic inputs
and models requiring non-cyclic inputs.
Models requiring cyclic inputs are typically used to determine the fatigue life or
damage for a part. Examples of this class of models include Coffin-Manson’s model for
cyclic fatigue, Suhir’s model for die fracture, and Pecht and Lall’s model for wire fatigue.
The following equation shows Coffin-Manson’s model for thermal fatigue of solder
joints.
c
f
F
N
1
2
2
1
|
|
.
|
\
|
?
?
=
?
?
(2.1)
where N
F
is the number of cycles to failure, ?? is the cyclic strain, and c, ??
f
are material
constants. A cycle is identified “when a material remembers its prior deformation history
and changes its tangent stiffness to follow the original loading path”. Since the data
measured by a sensor is typically in the time domain (or in the frequency domain for
vibration sensors), cycle counting methods are used to transform the original history into
an equivalent cyclic history that can be directly incorporated into a fatigue damage
model.
Models requiring non-cyclic inputs usually require time varying value of the
independent variable as an input. Examples of this class of models include Black’s model
for electromigration, Kidson’s model for intermetallic formation, and Howard’s model
for metallization corrosion. The following equation shows Black’s model for
electromigration.
21
kT
m
E
e Aj t
2
F
?
=
(2.2)
where t
F
is the time to failure, j is the current density (the time varying independent
variable), T is the absolute temperature and A, E
m
, k are constants.
2.5.2 Damage Accumulation Theories
Failures can be classified in two types – overstress and wear out. Overstress
failures are catastrophic failures occurring due to single occurrence of a stress event that
can exceed the intrinsic strength of the material. On the other hand, failures due to
gradual accumulation of damage beyond the endurance limit of the material are known as
wear out mechanisms. In well-designed and high-quality hardware, the accumulated
damage should not exceed the damage threshold within the usage life of the product.
Failure models describing wear out failures are usually based on values of environmental
or operational variables.
Damage is defined as the extent of a system’s degradation or deviation from a
defect-free normal operating state. The basic postulate adopted by most fatigue engineers
is that operation at a given cyclic stress amplitude will produce fatigue damage in a
certain number of operation cycles. It is further postulated that the damage incurred is
permanent, and operation at several different stress amplitudes in sequence will result in
an accumulated damage equal to the sum of the damage accrued at each individual stress
level. When the total accumulated damage reaches a threshold level, fatigue failure
occurs. Many different damage models have been proposed to quantify damage caused
by operation at varying stress levels. The Palmgren-Miner cumulative damage theory or
the linear damage theory is the most common among these theories because of its
22
simplicity. This damage theory was proposed by Palmgren in 1924 and later developed
by Miner in 1945 [42].
According to the classic S-N curve, operation at constant stress amplitude S
produces complete damage in N cycles. Operation at the same stress amplitude (S) for
number of cycles smaller than N, will produce a fractional damage. In the same way,
operations over different stress levels S
i
result in different damage fractions D
i
. Failure is
predicted to occur when the sum of the damage fractions equal or exceed unity, i.e.,
1
1 3 2 1
? + + + + +
? i i
D D D D D L L (2.3)
The Palmgren-Miner hypothesis states that the damage fraction at any stress level
S
i
is linearly proportional to the ratio of the number of cycles of operation to the total
number of cycles that would produce failure at that stress level, thus,
i
i
i
N
n
D = (2.4)
Similarly, for damage models that estimate time-to-failure (e.g., Black’s model
for electromigration), damage is defined as
i
i
i
TTF
t
D = (2.5)
where t
i
is the time of operation and TTF
i
is the time-to-failure estimated by the model.
The Palmgren-Miner hypothesis is the most widely used model in industry,
mainly due to its simplicity. However the hypothesis does not recognize the influence of
the order of application of various stress levels. Damage is assumed to accumulate at the
same rate at a given stress level without regard to past history. In applying the Palmgren-
23
Miner rule to an irregular load history, care should be taken that cycles are defined in a
rational manner.
2.6 Estimation of Remaining Life
Remaining life estimation is the process of estimating the remaining life of the
product based on accumulated damage information. Sometimes, it is more useful to
quantify product degradation in terms of physical parameters (e.g., time in days, distance
in miles) than in terms of accumulated damage. Accumulated damage of the product is
combined with product usage history to estimate the remaining life. This process assumes
that there is no abnormality in product usage pattern in future. In other words, this step
converts the accumulated damage in the electronic product into an equivalent amount of
time that the product can continue to function before the start of wear out failures.
The remaining life estimation is updated regularly at the end of a pre selected time
period. Hence the remaining life estimation process can take in to account any sudden
change in the life cycle environment or usage of the product. The time interval between
two updates is decided based on the product usage and its estimated lifetime based on
virtual reliability analysis. Sometimes the safety level associated with the product can
play an important role in determining the time interval.
2.7 Acceptable Remaining Life
The life consumption monitoring methodology described above concludes with an
estimation of the useful remaining life of the product. At this point, the user is required to
decide whether to keep the product in operation and continue monitoring or to abandon
the mission and schedule a maintenance action. The choice of acceptable amount of
remaining life depends on a variety of factors, such as the user’s application and the
24
safety level associated with it. For example, if the application is known to be fairly
reliable with multiple redundancies, a higher limit of acceptable remaining life may be
chosen but if the application involves human participation or may compromise the safety
of personnel, a lower acceptable limit of remaining life may is required.
25
3 EXPERIMENTAL CASE STUDIES ON LIFE CONSUMPTION
MONITORING
The life consumption monitoring methodology described in chapter 2 was
demonstrated in a real-time environment through two case studies. The life cycle
environment for the case studies was chosen to be the underhood of an automobile. The
case studies involved the following steps:
1. Mounting test boards under the hood of a car (1997 Toyota 4Runner) (Figure 3.1)
2. Conducting failure modes and effects analysis (FMEA) and virtual reliability
assessment to determine and assess the dominant failure mechanisms and the
corresponding environmental parameters.
3. Monitoring the underhood thermal, shock and vibration environment of the test board
in the car.
4. Simplifying the monitored environment and performing a physics-of-failure-based
reliability analysis to estimate the life consumption for the test board.
5. Monitoring resistance of the solder joints in real time to find out the actual life.
6. Comparing the estimated and the actual life results.
The test board was a FR-4 printed circuit board (PCB) consisting of eight surface
mount leadless inductors manufactured by ACI AppliCAD Inc. The inductors were
soldered to the PCB with Pb-Sn eutectic solder. The board was bolted at its two corners
to an aluminum bracket, which made the board act like a cantilever to vibrations
2
.
2
Cantilever mounting was designed in order to accelerate the effect of road vibration and was not planned
26
FR-4 PCB with 8 surface mount
inductors
Clamping points
Figure 3.1: Experimental setup with the test board mounted under-the-hood of a
car (1997 Toyota 4Runner).
3.1 Failure Modes, Mechanisms and Effects Analysis
Failure modes, mechanisms, and effects analysis (FMMEA) was conducted for
the test board assembly to assess all possible failure modes and mechanisms in the
automobile underhood environment. Environmental and operational parameters of
interest were identified based on inputs to the available failure models. When a failure
model is not available for a particular failure mechanism, environmental and operational
parameters were identified based on prior experience and literature. Identified
to be representative of time-to-failure in automobile electronic modules.
27
environmental and operational requirements were compared with the existing loading
conditions in the underhood environment to determine the potential failure mechanisms.
The potential failure mechanisms were used for analysis by virtual reliability assessment.
Table 3.1 shows the failure modes, mechanisms, and effects analysis for the circuit card
assembly. Plated through hole (PTH) fatigue, conductive filament formation (CFF),
electromigration, metallization corrosion, and solder joint fatigue were identified as the
potential failure mechanisms by this analysis.
3.2 Virtual Reliability Assessment
Virtual reliability assessment was conducted to assess the time-to-failure using the
failure mechanisms and models identified by failure modes, mechanisms, and effects
analysis (FMMEA) including plated through hole fatigue, conductive filament formation,
electromigration, metallization corrosion, and solder joint fatigue. Information about
product dimensions and geometry were obtained from design specification, board layout
drawing and component manufacturer data sheets. Environmental data for analysis
including temperature, vibration and humidity were obtained from the Society of
Automotive Engineers (SAE) environmental handbook and Washington DC area weather
reports. Figure 3.2 shows the average power spectral density (PSD) plot for the vibration
on a car frame from SAE handbook [43]. The car was assumed to run average 3 hours per
day. Table 3.2 shows the temperature data used for defining the underhood environment.
The maximum relative humidity for the underhood environment was 98 % at 38
o
C [43].
Humidity conditions were used to estimate time-to-failure for corrosion and conductive
filament formation.
28
Table 3.1: Failure Modes and Effects Analysis (FMEA) for the circuit card assembly
used for the experiment.
Item name/
failure site
Failure
mode
Failure
effect
Failure
mechanism
Failure
model
Cause of
failure
Comments
Electrical
open in
PTH
Change in
resistance of
PCB
assembly
PTH fatigue
CALCE
PTH
barrel
thermal
fatigue
model
Temperatu
re cycling
Virtual
reliability
assessment
required
Electrical
short
between
PTHs
No current
flow
through
components
Conductive
filament
formation
(CFF)
Rudra and
Pecht
model
Voltage,
high RH,
and tighter
PTH
spacing
Virtual
reliability
assessment
required
Electro-
migration
Black's
model
High
current
density
and
temperatur
e
Virtual
reliability
assessment
required
Change in
resistance of
PCB
assembly
Corrosion in
metallizatio
n traces
Howard's
model
High RH,
electrical
bias, ionic
contamina
tion
Virtual
reliability
assessment
required
Electrical
short/
open,
change in
resistance
in the
metallizati
on traces
Open EOS/ ESD
No model
available
for EOS/
ESD of
board
metallizati
on traces
Discharge
of high
potential
through
dielectric
material
Too high
conductor
spacing (in
the order of
centimeters)
to cause
EOS/ESD
Printed
circuit board
(PCB)
Fracture in
the PCB
Crack/
breaking of
PCB
Buckling
Overstress
failure
dependent
on critical
load
Compressi
ve loads
on the
PCB
No
compressive
loads applied
to the PCB
29
Item name/
failure site
Failure
mode
Failure
effect
Failure
mechanism
Failure
model
Cause of
failure
Comments
Short
between
windings
Change in
inductance
of PCB
assembly
Wearout of
winding
insulation
No model
available
for
Inductors
Overheati
ng due to
excessive
current
and
prolonged
use at high
temperatur
e
Short
between
windings
and the
core
Change in
resistance of
PCB
assembly
Wearout of
winding
insulation
No model
available
for
Inductors
Overheati
ng due to
excessive
current
and
prolonged
use at high
temperatur
es
Components
(Inductors)
Open
circuit
inside the
inductor
No current
flow
through
PCB
assembly
Breaking of
winding
No model
available
for
Inductors
Prolonged
use at high
temperatur
es
Maximum
operating
temperature is
low compared
to rated
temperature of
the inductors
(125 C).
Current
passing
through the
inductors (~50
mA) is much
below the
maximum
rated current
(9 Amps)
Solder joints
Intermitte
nt change
in
electrical
resistance
Intermittent
malfunctioni
ng of PCB
assembly
Solder joint
fatigue
Engelmaie
r's thermal
fatigue/
Steinberg'
s vibration
Temperatu
re cycling
and
vibration
Virtual
reliability
assessment
required
30
1.E-06
1.E-05
1.E-04
1.E-03
1.E-02
1.E-01
1 10 100 1000 10000
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure 3.2: The power spectral density plot used for the virtual reliability
assessment
Table 3.2: Data used for defining the temperature environment for the virtual
reliability assessment
Maximum under hood temperature (near the frame)
121
o
C
Average daily maximum temperature [44]
27
o
C
Average daily minimum temperature [44]
16
o
C
Table 3.3 shows the time-to-failures for different failure mechanisms obtained from
virtual reliability assessment. It is clear from the table that solder joint fatigue is the
dominant mechanism in the given life cycle environment. The environmental factors that
can cause solder joint fatigue were found to include temperature cycling and vibration.
The virtual reliability assessment predicted. Virtual reliability assessment predicted 34
days to failure based on solder joint fatigue.
31
Table 3.3: Virtual reliability assessment
Failure
mechanism
Failure model
Time-to-
failure
Probability of
failure
Plated through hole
(PTH) fatigue
CALCE PTH barrel thermal
fatigue model (calcePWA)
> 10 years Low
Conductive
filament formation
(CFF)
Rudra and Pecht model
(calceFAST)
4.6 years Low
Electromigration Black’s model (calceFAST) >10 years Low
Corrosion in board
metallization traces
Howard’s model 1 year
3
Low
Solder joint fatigue
Engelmaier’s thermal fatigue
and Steinberg’s vibration
model (calcePWA)
34 days High
A monitoring and data simplification schemes were developed for monitoring and
analyzing the automobile underhood environment for solder joint fatigue analysis. The
electrical indications of failure in case of solder joint fatigue are characteristically
intermittent because the fractured solder joint surfaces do not separate physically as long
as the component is attached to the substrate by other solder joints [45]. For the case
study, product malfunction was monitored through intermittent change in resistances,
which is consistent with characteristics of solder joint failure.
3
Time-to-failure was obtained in the worst case conditions with the presence of an electrolyte. The actual
time-to-failure will be much higher than one year.
32
3.3 Monitoring Product Parameters
A battery powered data logging device equipped with an internal tri-axial
accelerometer and an integrated temperature and humidity sensor (Figure 3.3) was used
to record the shock, vibration and temperature environment of the test board assembly at
programmed intervals [46]. The data logger is capable of recording both static and
dynamic type of data. Static data can be completely characterized by a single reading,
where as dynamic data varies rapidly with time. In addition to integrated sensors, the
data logger also provides extra static and dynamic channels for connection to external
sensors. For the experiment temperature was monitored as static data and vibration was
monitored as dynamic data. The recorder’s memory (8 MB) was divided into two
partitions:
Time-triggered: This allows the user to specify the minimum time interval to elapse
before recording begins. This feature is primarily for measuring static data.
•
Signal-triggered: This allows the user to specify the minimum value of a parameter
(i.e., signal trigger) that should be exceeded before the recording begins. An event is
defined when the measured parameter exceeds the signal trigger value and a
predefined number of samples are recorded. This feature is primarily for measuring
dynamic data.
•
For the experimental set up, the data logger was programmed in such a way that it
required the specification of the following parameters:
• Time interval between temperature measurements
• Sampling rate for shock and vibration (i.e., the number of samples counted per
second)
• Sample size for shock and vibration (i.e., the number of samples per event)
33
• Signal trigger values
• Filter frequency for vibration
Temperature sensor
Vibration
sensor
Channel 4
T/H channel
Temperature sensor
Vibration
sensor
Channel 4
T/H channel
Figure 3.3: The data logger with external temperature and vibration sensor
The data recorder could not be placed directly under the hood of a car because its
internal sensors had a maximum rated temperature of 55 °C. Further its size and weight
precluded its installation anywhere under-the-hood. Hence external temperature and
vibration sensors were used for monitoring the underhood environment.
An external RTD (resistance thermo-detector) temperature sensor was taped on
the test board with the help of high temperature resistant tape to monitor the temperature.
The sampling interval for temperature measurement was chosen to be one minute so that
the data recording device can capture even quick temperature changes arising due to
starting of the engine.
34
A piezoelectric accelerometer was mounted on one of the clamping points of the
test board to monitor vibration input. The vibration of the circuit card assembly for this
experimental setup was mainly due to the road and driving condition and hence could be
categorized as random vibration. A random vibration signal typically consists of several
frequencies. The range of frequencies that can be accurately monitored is dependent on
the chosen sampling scheme (i.e., sample rate, sample size, filter frequency).
3.3.1 Sampling Issues for Monitoring of Continuous Signals
Most signals behave as continuous phenomena over their period of acquisition,
and provide a history of the parameter being measured as a function of time. Sampling is
the process of obtaining a series of discrete numerical values from a continuous function.
There are electronic circuits associated with sensors that observe an instantaneous source
signal at regular intervals and convert it into an electrical signal with a numerical value
analogous to the source signal.
Signals are made suitable for digital processing by converting their sampled value
into an equivalent numerical value by an analog-to-digital converter. Each value is
represented by a finite number of on/off states of electronic elements. The converter has
an input-output relationship, which is a series of steps. The size of the steps depends on
the total range to be covered and the number of steps available.
The most important consideration in sampling is the selection of the sampling
interval. Sampling points that are too close together will yield redundant data, while
sampling points that are too far apart will lead to confusion between the low and high
frequency components of the signal. For example, consider the time record shown in
Figure 3.4. Let the record be sampled such that the time interval between adjacent
35
sampling points is h seconds. The sampling rate is hence 1/h samples per second. Since
at least two sample points are required to define a cycle of given frequency (i.e., one
point each for the start and the end of a cycle), the number of cycles per second (or the
frequency of sampling) is 1/2h. Thus, the highest frequency component that can be
defined by sampling at the rate of 1/h samples per second is 1/2h. This cutoff frequency
f
C
(equal to 1/2h) is called the “Nyquist frequency”, and the corresponding time between
samples h is called the “Nyquist interval”.
Time
x(t)
h
Time
x(t)
h
Figure 3.4: Sampling of a continuous time record
Any frequency f above f
C
contained in the signal will be superimposed or
“folded” back into the frequency range from 0 to f
C
and be confused with data in the low-
frequency range. This problem, which is called aliasing, is a potential source of error in
sampling. For any frequency f in the range 0 ? f ? f
C
, the frequencies that will be aliased
with f are 2nf
C
± f, where n is a natural number from 1 to N. To prove this, consider a
sampling interval t = 1/2f
C
. Then,
( ) ( ) ?ft
f
?f
f
?f
n?
f
f nf ? t f nf ?
C C C
C C
2 cos
2
1
2 cos
2
2
2 cos
2
1
2 2 cos 2 2 cos = =
|
|
.
|
\
|
± = ± = ±
(3.1)
36
Thus, all data at frequencies (2nf
C
± f) have the same cosine function amplitude as
the data at frequency f when sampled at times 1/2f
C
apart, and hence all data at the higher
frequencies will be aliased (or superimposed) on data at frequency f. For example, data
at frequencies 170 Hz, 230 Hz, 370 Hz, 430 Hz, and so on will be aliased with data at 30
Hz if f
C
= 100 Hz. Hence, the sampling interval h should be carefully chosen to prevent
aliasing.
Two methods are available to prevent aliasing: 1) to choose h sufficiently small so
that it is physically unreasonable for data to exist beyond the cutoff frequency f
C
or 2) to
filter the original data prior to sampling so that information beyond a maximum
frequency is no longer contained in the filtered data. The second method, which involves
the use of a low-pass filter circuit that only allows passage of frequencies below the
frequency of the filter, is often preferred over the first method to save on computing time
and costs [47]. Hence a low-pass filter was used in this thesis to screen out all frequencies
above the Nyquist frequency to prevent aliasing.
In order to select the vibration analysis range, the data recorder was first
programmed to collect data at its maximum sampling rate
4
, and the resulting frequency
spectrum was analyzed. It was observed that the power spectral density (PSD) of the
vibrations was concentrated below 400 Hz and the PSD was negligible above 600 Hz
(below 10
-6
g
2
/Hz.). Hence, the frequency analysis range was conservatively selected to
be 1 to 700 Hz. Accordingly the sampling rate and sample size were selected to be 1800
4
This was done to find the frequency range of interest and to ensure that any high-frequency vibrations
would not be eliminated in the actual experiment.
37
samples/second and 1024 samples. The anti-alising filter frequency was chosen to be 700
Hz.
3.4 Data Simplification
Almost all monitoring systems use sensors to measure various loads present in a
product’s life cycle environment. Sensors mounted either near or within the product to
be monitored provide electrical output signal in response to a specified measurand. Most
signals from sensors behave as continuous phenomena over their period of acquisition,
and provide a time history of the parameter being measured. This data in time domain
cannot be directly used with the physics-of-failure models. This section describes the
method to make temperature and vibration data compatible to the physics-of-failure
models.
3.4.1 Temperature Data Simplification
Temperature based damage estimation models require temperature data in terms
of cycles. The physics-of-failure definition of cycles includes the maximum, minimum
temperature, ramp times and dwell times at maximum and minimum. Rainflow cycle
counting algorithms are usually used to identify the cycles based on a given loading
profile [48]. The temperature based reliability assessment models require cycle
information that includes cycle maximum, minimum temperature, dwell and ramp times.
The 3-parameter rainflow cycle counting method can identify cycles in a manner
consistent with the PoF definition of cycles [49], [50]. The input to the 3-parameter
rainflow cycle counting algorithm is a time history consisting of several reversals
5
.
5
A reversal is defined as a point where the first derivative changes its sign, i. e., a peak or a valley.
38
3.4.1.1 Ordered Overall Range (OOR) method
The data is first converted to a sequence of reversals using the ordered overall
range (OOR) method. The OOR method allows the user to convert an irregular history in
time domain into a regular sequence of peaks and valleys. The OOR method can be
described as follows [51]
• The largest peak
6
of the temperature history is selected as the first candidate
• The next valley that differs from the largest peak by more than a cut off level is
selected as the tentative candidate.
• The cut off level is defined as a fraction of the difference between the largest peak
and the lowest valley. The fraction is known as the reversal elimination index (s).
• Peaks are checked to see if they differ from the new candidate by more than the
screening level (event ‘x’), and valleys are checked to see if they are lower than the
candidate (event ‘y’). If event ‘y’ occurs first (i.e., before event ‘x’), then the
candidate is rejected and the new valley becomes a candidate. If event ‘x’ occurs
first, the candidate is validated and the newly found peak becomes the next candidate.
• The next peak that differs from the new candidate by more than a cut off level is
selected as the tentative candidate.
• Valleys are checked to see if they differ from the candidate by more than the
screening level (event ‘x’), and peaks are checked to see if they are higher than the
candidate (event ‘y’). If event ‘y’ occurs first (before event ‘x’), then the candidate is
6
The algorithm requires selection of either of the extreme reversals in the history (either the largest peak or
the smallest valley) as the first candidate. In this article the algorithm is explained taking the largest peak as
the first candidate.
39
rejected and the new peak becomes a candidate. If event ‘x’ occurs first, the
candidate is validated and the newly found valley becomes the next candidate.
• This process continues until the last reversal is counted.
• Since the counting process starts from the largest peak (which may not be the first
reversal in the history), the method has to be applied to both sides of the starting
reversal to take the entire load history into account.
When the reversal elimination index is equal to zero, all the reversals in the load
sequence are preserved. The OOR algorithm has the capability of eliminating some of the
temperature reversals, which are potentially less damaging, there by achieving data
reduction. Data reduction can be achieved by specifying a non-zero reversal elimination
index. Figure 3.5 explains the underlying concept of the OOR algorithm pictorially. In
the figure, the highlighted points are selected only if L s l × ? .
Time
T
e
m
p
e
r
a
t
u
r
e
Highest peak
Lowest valley
L
l
Time
T
e
m
p
e
r
a
t
u
r
e
Highest peak
Lowest valley
L
l
Figure 3.5: Temperature history showing the reversals (i.e., peaks and valleys)
40
3.4.1.2 Cycle Counting
Cycle counting methods [48] are used to transform a time history consisting of
several reversals (peaks and valleys) into an equivalent cyclic history. Cycle counting
methods are used when a fatigue analysis needs to be performed.
The physical interpretation of a cycle is a condition when the applied load returns
the material to the state it was before the load excursion occurred. If the applied load is
of a mechanical nature (such as force or torque), the material forms a closed stress-strain
hysteresis loop when this condition is satisfied. For a repeatedly applied load history, the
following two rules apply:
• When the load reaches a value at which loading was previously in the reverse
direction, a stress-strain hysteresis loop is closed, defining a cycle. The stress-strain
path beyond this point is the same as if the loading had not been reversed.
• Once a load sequence forms a closed loop, this sequence does not affect the
subsequent behavior.
For the load history shown in Figure 3.6, the first rule is invoked at points 2', 7',
5', and 1'. The first rule is also satisfied just beyond 5', where the load reaches the same
value it had at point 3. But the second rule also applies, and since excursion 2-3-2' has
already formed a cycle, there is no additional closed cycle.
41
Load
T
i
m
e
½cycle
1
2
3
4
5
6
7
0
2'
8
7'
1'
½cycle
1 cycle
1 cycle
5'
1 cycle
½cycle
½cycle
Load
T
i
m
e
½cycle
1
2
3
4
5
6
7
0
2'
8
7'
1'
½cycle
1 cycle
1 cycle
5'
1 cycle
½cycle
½cycle
Figure 3.6: Identifying cycles in a load history
For non-repeating and open-ended load histories, the rules stated above are
incomplete if the absolute value of the load at any point during the history exceeds its
value at the first peak. Of the various cycle counting methods available (peak counting,
simple range counting, peak-between mean counting, level crossing counting, fatigue
meter counting, range-pair counting, and rainflow counting), only the rainflow and the
range-pair counting methods are capable of handling this more general situation (of non-
repeating histories). However, no damage is calculated for some parts of the original
history if the range pair method is used, whereas the rainflow method accounts for every
part of the history. Hence, the rainflow method was used in this thesis for counting
cycles.
42
In the rainflow cycle counting method, the load-time history is plotted in such a
way that the time axis is vertically downward, and the lines connecting the load peaks are
imagined to be a series of sloping roofs. The rain flow is initiated by placing drops
successively at the inside of each reversal. The method considers cycles as closed
hysteresis loops formed during a history, which is consistent with the definition of a cycle
described in the previous section. Following rules are applied on the rain dripping down
the roofs to identify cycles and half cycles:
• The rain is allowed to flow on the roof and drip down to the next slope except that, if
it initiates at a valley, it must be terminated when it comes opposite a valley equal to
or more negative than the valley from which it initiated. For example, in Figure 3.7,
the flow begins at valley 1 and stops opposite valley 9, valley 9 being more negative
than valley 1. A half cycle is thus defined between valley 1 and peak 8.
• Similarly, if the rain flow is initiated at a peak, it must be terminated when it comes
opposite a peak equal to or more positive than the peak from which it initiated. In
Figure 3.7, the flow begins from peak 2 and stops opposite peak 4, peak 4 being more
positive than peak 2. A half cycle is thus counted between peak 2 and valley 3.
• The rain flow must also stop if it meets rain from a roof above. In Figure 3.7, the
flow beginning at valley 3 ends beneath peak 2. This ensures that every part of the
load history is counted once and only once.
• Cycles are counted when a counted range can be paired with a subsequent range of
equal magnitude in the opposite direction. If cycles are to be counted over the
duration of a profile that is to be repeated block by block, cycle counting should be
started by initiating the first raindrop either at the most negative valley or at the most
43
positive peak, and continuing until all cycles in one block are counted in sequence.
This ensures that a complete cycle will be counted between the most positive peak
and the most negative valley.
The simple rainflow method does not provide any information about the mean
load or the cycle time. A modified method called 3-parameter rainflow cycle counting is
used to handle this situation. This method accepts a sequence of successive differences
between peak and valley values (P/V ranges) in the time history as an input, and
determines the range of the cycle, the mean of the cycle, and the cycle time. The
modified method identifies cycles as follows [50]: Consider three successive P/V
differences d
1
, d
2
, and d
3
, as shown in Figure 3.8. A cycle is identified only if the
following condition is true:
1 2
d d d > ?
3
(3.2)
The condition of the above equation is called the ‘loop condition.’ For a given
sequence of P/V ranges, if the loop condition exists, the method picks the loop
corresponding to size d
2
off the cycle, leaving only the residual wave 1-2-4-5,
corresponding to a half-loop size (d
3
-d
2
+d
1
) in the load plot. This operation is called
‘loop-reaping.’
44
1
2
3
4
5
6
8
9
10
12
11
13
14
15
16
17
18
19
21
23
24
25
27
28
26
29
30
20
22
Load
T
i
m
e
7
Counting
terminated
1
2
3
4
5
6
8
9
10
12
11
13
14
15
16
17
18
19
21
23
24
25
27
28
26
29
30
20
22
Load
T
i
m
e
7
Counting
terminated
Figure 3.7: Rainflow cycle counting
45
Load
T
i
m
e
d
1
d
2
d
3
1
2
3
4
5
Load
T
i
m
e
d
1
d
2
d
3
1
2
3
4
5
Figure 3.8: Loop condition and loop reaping operations
For a given sequence of P/V ranges, the 3-parameter rainflow method “reaps” the
smaller cycles that occur during a larger cycle. The range, mean, and half-cycle time of
the residual half cycle is adjusted according to the loop-reaping condition, and the
process is applied until the last P/V range is read.
Solder joint fatigue models are based on total possible thermal expansion
mismatch and creep at extreme temperature [45]. Stress relaxation in solder joints
(viscoplastic materials) is a time dependent creep phenomenon and requires sufficient
dwell time at the extreme temperature. However, once the stress relaxation is complete
for a given cycle, there is no more damage in the solder joints due to creep in that cycle.
The damage due to the total thermal expansion mismatch remains the same as long as the
extreme temperatures are the same. Hence the data simplification algorithm should
provide enough dwell time at the extreme temperatures for a conservative estimation. To
account for this, the data simplification algorithm assumes one fourth of the half cycle
46
time at temperature extremes as the dwell time, where half cycle time is defined as the
time for temperature transition from valley to peak or peak to valley [49].
3.4.2 Vibration Data Simplification
Vibration data is typically measured as acceleration with the help of
accelerometers. In general the data collected from accelerometers represent random
vibration (i.e., it cannot be described by an explicit mathematical relationship). The
physics-of-failure based reliability assessment models require the random vibration data
to be described in terms of its power spectral density (PSD). The PSD describes the
frequency composition of the vibration in terms of its mean square value over a
frequency range. Fourier Transform analysis is used to transform the acceleration data
from time domain to the frequency domain, and vice versa. The result of the Fourier
Transform analysis is usually a plot of amplitude/ power as a function of frequency. In
real life collected data is sampled and not continuous. Hence Fast Fourier Transform
(FFT) is employed to analyze discrete (or sampled) data. The power spectral density was
calculated from the sampled data (i.e., the type of data recorded by the data logger) using
the Cooley-Tukey method, which is based on fast fourier transform (FFT) of the original
sampled acceleration data [47]. For a sequence of acceleration values h
k
sampled over a
record length T, the Cooley-Tukey method defines the PSD function at any frequency f
as
2
k
X
N
2h
G(f) = (3.3)
where X
k
are the FFT components of the N sampled acceleration values of amplitude h
k
averaged over the record length T. The expression for X
k
is given by
47
?
?
=
=
1 N
0 k
n
kN 2
i
k k
e h
X
?
(3.4)
where N is the sample size. The independent variable n can be related to the frequency by
the relation f
n
= (n/Nh), where h is the sampling interval between adjacent points and T is
the total record length.
For this thesis the acceleration data from the piezoelectric accelerometers were
converted to respective PSD using the SAVER PSD analysis software [46].
3.5 Stress and Damage Accumulation Analysis
The objective of a physics-of-failure stress and damage accumulation analysis in
life consumption monitoring is to determine the accumulated damage due to various
failure mechanisms for the electronic product in the given environment.
For this thesis, the circuit card assembly was modeled in calcePWA
7
reliability
assessment software. The software creates a finite element model of the circuit board
assembly based on the various material properties, board dimension, component type,
component dimensions and their respective orientations. Material properties for the
model were taken from calcePWA material database
8
. Component dimensions were
obtained from the respective part data sheet from Vishay Dale. Board dimensions and the
component orientations were taken from the board layout drawing.
7
calcePWA is a physics-of-failure based virtual reliability assessment tool for circuit card assemblies
developed by CALCE Electronic Products and Systems Center, University of Maryland, College Park. The
software makes use of numerical analysis and failure mechanisms) to estimate time-to-failure.
8
calcePWA software has material property database for a number of materials commonly used in the
electronic industry.
48
Once a model of the board is created, the software uses the environmental data to
estimate the stress near each potential failure site (solder joint in this case) using the finite
element model. For determining the stress under each solder joint it is important to
provide the correct boundary condition to the software. The boundary conditions of the
test board were defined as follows:
• Temperature on the test board was considered to be uniform with conduction and
natural convection. This is a reasonable assumption as there is no power generation
by the components.
• For vibration analysis two corners of the board were modeled as clamped supports.
Boundary conditions assumed for the vibration analysis were verified by
comparing the experimentally evaluated natural frequencies of the board with the
modeling results. The modeling predicted natural frequencies of the circuit card assembly
as 25.7 Hz, 77.5 Hz, 145.4 Hz, and 274.8 Hz. To check this, another external
accelerometer was mounted on the PCB. The ratio of the response of the PCB to the
excitation given to the PCB at various frequencies was plotted against the frequency.
Peaks of this plot give the experimental natural frequencies. Figure 3.9 shows that the
natural frequencies occur at 19.3 Hz, 79.1 Hz, 149 Hz, and 262 Hz., which is in close
agreement with the modeling prediction.
49
1.E-01
1.E+00
1.E+01
1.E+02
1.E+03
1 10 100 1000
Frequency (Hz.)
P
S
D
R
e
s
p
o
n
s
e
/
P
S
D
E
x
c
i
t
a
t
i
o
n
19.3 Hz.
79.1 Hz.
149 Hz.
262 Hz.
1.E-01
1.E+00
1.E+01
1.E+02
1.E+03
1 10 100 1000
Frequency (Hz.)
P
S
D
R
e
s
p
o
n
s
e
/
P
S
D
E
x
c
i
t
a
t
i
o
n
19.3 Hz.
79.1 Hz.
149 Hz.
262 Hz.
Figure 3.9: Ratio of the response of the PCB to the excitation vs. frequency.
The peaks in this plot identify the natural frequencies.
The computed stress is then used to estimate the damage fraction based on
physics-of-failure models for selected failure mechanisms. The basis of damage
assessment is that operation of the product at a given stress amplitude will produce some
amount of damage, the magnitude of which will be related to the total time of operation
at that stress amplitude and the total time that would be required to produce failure of an
undamaged part at that stress amplitude. When the total accumulated damage reaches a
critical level, failure is predicted to occur. For this thesis, solder joint failure models for
temperature cycling, shock and vibration were used to estimate the accumulated damage.
The software used Engelmaier’s first order model for thermal fatigue and Steinberg’s
equation for vibration induced fatigue on the solder joints. More details about the damage
models can be found in the appendix.
50
3.6 Remaining Life Assessment
Remaining life estimation step calculates the useful life of the product (e.g., the
time in days, distance in miles) through which the product can function reliably, based on
the damage accumulation information. The remaining life was calculated on a daily basis
by subtracting the life consumed on that day from the estimated remaining life on the
previous day. This approach used an iterative formula to find out the remaining life [52].
1 1
*
? ?
? =
N N N N
TL D RL RL
(3.5)
where RL
N
is the remaining life at the end of day N, TL
N
is estimated total life at the end
of day N and D
N
is the damage ratio accumulated for day N.
3.7 Failure Definition and Detection
Solder joint failure due to fatigue is defined as the complete fracture through the
cross section of the solder joint with solder joint parts having no adhesion to each other.
A solder joint that fails fully by fracturing does not necessarily exhibit an electrical open
or even a very noticeable increase in electrical resistance. Electrically, the solder joint
failure manifests itself only during thermal and mechanical transients or disturbances in
the form of short duration resistance spikes. The thermal and vibration fatigue models
used for the analysis are also based on intermittent resistance spikes, i.e., interruption of
electrical discontinuity for small periods of time (more than 1 µs) [45].
For the case studies in the thesis, the functional degradation of the circuit card
assembly was monitored experimentally in terms of resistance change of the solder joints.
An event detector circuit was connected in series with all the components and solder
joints to indicate intermittent resistance increase. The event detector was connected to
51
the data logger to record the time when there was an increase in resistance. The
intermittent increases in resistances were termed as “resistance spikes.” Resistance spikes
for the experiment were defined to be intermittent increase in resistance by 100 ohms for
each solder joint [54], [55]. Failure was defined as occurrence of fifteen such resistance
spikes.
The event detector circuit sends a continuous direct current signal through the
daisy-chained circuit containing the inductors and the solder joints in series. Since the
inductors offer zero resistance to the direct current, the resistance of the daisy-chained
circuit is dependent on the resistance of the solder joints. The resistance offered to the
direct current was compared with the preset value (100 ohms increase for each solder
joint). The comparison results were logged at the end of every second. This time interval
was limited by the capability of the data logger.
To determine change in resistance of the individual components and solder joints,
the resistances were measured during the experiment on a regular basis with the car
engine off.
3.8 Monitored Environment and Results
This section describes the monitored environmental parameters (e.g., temperature,
shock and vibration) for the test board assembly for the case studies. The monitored data
was simplified and used with the physics-of-failure models to estimate the remaining life
of the test board assembly.
3.8.1 Case Study-I
The temperature sensor used for the case study was Kele’s Model STR-91S two-
wire strap-on RTD sensor with a temperature range up to 200
o
C. A single axis
52
piezoelectric (Endevco’s model 2226C) accelerometer was mounted on one of the
clamping points of the test board to measure the out-of-plane acceleration for the board.
Figure 3.1 shows the experimental setup for case study-I.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40
Time (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.10: Monitored temperature during case study-I
The temperature on the test board in the underhood environment was monitored
for a period of 42 days. Figure 3.10 shows the monitored temperature for the experiment
period. Each data point is separated at an interval of one minute. The average temperature
and the maximum temperature seen by the test board assembly during case study-I are 30
o
C and 61
o
C respectively.
53
The temperature vs. time history for 42 days was converted to an equivalent
sequence of peaks and valleys (for thermal fatigue analysis) using the ordered overall
range (OOR) method. For this analysis the reversal elimination index was chosen to be
0%, i.e., all the reversals were chosen as input to the cycle counting algorithm. Figure
3.11 shows the acquired temperature data converted to peaks and valleys. The sequence
was converted to temperature cycles using the 3-parameter rainflow cycle counting
method. 275 temperature cycles were identified for case study-I. The cycle information
obtained from the rain flow cycle counting method was used as input to the calcePWA
thermal module.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40
Time (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.11: Monitored temperature converted to the peaks and valleys for case
study-I
54
Figure 3.12 shows the power spectral density (PSD) vs. frequency for the out-of-
plane vibration of the board. The PSD shown in the figure is the averaged value over the
experiment duration. The frequency analysis range was 1 to 700 Hz. Accordingly the
sampling rate and sample size were selected to be 1800 samples/second and 1024
samples.
1.0E-07
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E+00 1.0E+01 1.0E+02 1.0E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure 3.12: Power spectral density (PSD) vs. frequency plot for case study-I
The damage accumulation in the circuit card assembly was determined using
calcePWA software. The output of the 3-parameter rainflow cycle counting algorithm
and the power spectral density (PSD) vs. frequency data described above were used as
input to the software. The displacements of the board and each component due to
vibration were estimated through finite element analysis. Figure 3.13 shows the estimated
displacement of the board. The reference point for the shown displacements was chosen
to be the clamping points. The curvature along the horizontal axis of the board was found
55
to be constant (1.1 x 10
-3
/ inch), which indicated the equal amount of damage for all
solder joints due to vibration.
Clamping
Points
Clamping
Points
Figure 3.13: Estimated board displacement due to vibration for case study-I
An accident occurred during case study-I, where the car used for experiment was
hit by another car. Vibration events with high g-values were recorded during the crash
and during dis-engagement of the cars. The maximum g-levels for these events were an
order of magnitude higher than normal conditions. Figure 3.14 shows the vibration event
with highest g-level recorded during the crash. In this case the maximum value of
acceleration was from +22g to –23 g (45 g peak-to-peak) as compared to 2g in case of
normal random vibration. The highest g-level recorded during dis-engagement of the cars
was + 9g to –9g (18 g peak-to-peak).
Under high levels of vibration, there is a chance of failure of the circuit card
assembly if the stress under maximum acceleration exceeds the material strength of the
solder joints. This failure due to shock is considered to be due to overstress mechanism.
56
Overstress models are usually used to find out whether the board can sustain the impact.
An overstress analysis was conducted using calcePWA for the maximum acceleration
value (45 g peak to peak), which showed no overstress failure.
-25.0
-20.0
-15.0
-10.0
-5.0
0.0
5.0
10.0
15.0
20.0
25.0
0 10 20 30 40 50 60 70 80 90 100 110
Time (m. sec.)
A
c
c
e
l
e
r
a
t
i
o
n
(
g
)
Figure 3.14: Recorded vibration event during the car accident. The maximum
acceleration values were from +22 g to –23 g.
Hence a random vibration analysis was conducted with the PSD data obtained
from all the events recorded during the crash and dis-engagement of the cars. The random
vibration analysis resulted in maximum of 15% accumulated damage of the solder joints
of the board. A more detailed section on the random vibration analysis is given in
appendix-3.
Accumulated damage
9
was estimated using the physics-of-failure models and
Palmgren-Miner theory on a daily basis. The results obtained are shown in the form of a
bar chart (Figure 3.15).
9
Estimated damage of 100% corresponds to the predicted end-of-life of the board.
57
0
20
40
60
80
100
0 5 10 15 20 25 30 35 40
Time in Use (days)
A
c
c
u
m
u
l
a
t
e
d
D
a
m
a
g
e
(
%
)
Damage due to temperature cycling
Damage due to vibration
Figure 3.15: Accumulated damage estimated using calcePWA and Miner’s rule for
case study-I
Figure 3.16 shows the measured resistances of solder joints as a function of time.
These resistance values were measured on a daily basis with the car engine off. The plot
also shows the occurrences of the resistance spikes. The actual life of the circuit card
assembly found to be 39 days according to the failure criteria.
58
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
Figure 3.16: Resistances of the solder joints along with the intermittent
resistance spikes for case study-I
Figure 3.25 shows the estimated remaining life and the actual life for the
experiment. Initial predictions based on the similarity analysis and SAE environmental
handbook data were 25 days and 34 days respectively. The estimated life based on life
consumption monitoring with out taking into account the accident is 46 days. There was a
drop in estimated life of the circuit card assembly by 6 days because of the accident.
Hence the final estimated life is 40 days. The actual life based on resistance monitoring is
39 days, which is close to the estimated life based on life consumption monitoring.
59
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45 50
Time in Use (days)
E
s
t
i
m
a
t
e
d
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45 50
Time in Use (days)
E
s
t
i
m
a
t
e
d
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
Figure 3.17: Remaining life estimation summary for case study-I
All solder joints of the test board assembly were photographed with the help of
optical microscope from time to time during the experiment. Two of the solder joints
showed cracks at a magnification of 50.
0.4 mm. 0.4 mm. 0.4 mm. 0.4 mm.
Figure 3.18: Crack in one of the solder joints
60
3.8.2 Case Study-II
A 2
nd
case study was conducted to demonstrate the life consumption monitoring
methodology. There were some changes in the experimental setup in order to compare
the in-plane and out-of-plane accelerations of the test board during the experiment.
Figure 3.19: Experimental setup for case study-II
The temperature sensor used for the case study was the same as case study-I
(Kele’s Model STR-91S two-wire strap-on RTD sensor). A 3-D piezoelectric (Endevco’s
Model 2228C) accelerometer was mounted on one of the clamping points of the test
board to measure accelerations in all 3 directions (out-of-plane acceleration for the board,
acceleration along the car motion and the transverse direction). Figure 3.19 shows the
experimental setup for case study- II.
61
Figure 3.20 shows the temperature variations on the test board for 66 days. Each
data point is separated at an interval of one minute. The figure shows that the maximum
temperature seen by the circuit card assembly is 89
o
C, which is well below the glass
transition temperature of FR-4 (130
o
C) and the rated temperature of the inductors (125
o
C). The minimum temperature seen by the circuit card assembly is 4
o
C.
0
10
20
30
40
50
60
70
80
90
100
0 5 10 15 20 25 30 35 40 45 50 55 60 65
Time in Use (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.20: Monitored temperature for case study-II
The temperature vs. time history for 66 days was converted to an equivalent
sequence of peaks and valleys using the ordered overall range (OOR) method. For this
analysis the reversal elimination index was chosen to be 0% like case study-I. Figure 3.21
shows the acquired temperature data converted to peaks and valleys. The temperature-
time history was converted to temperature cycles using the 3-parameter rainflow cycle
counting method. 423 temperature cycles were identified for case study-II.
62
0
10
20
30
40
50
60
70
80
90
100
0 5 10 15 20 25 30 35 40 45 50 55 60 65
Time in Use (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.21: Monitored temperature profile converted to the peaks and
valleys for case study-II
Figure 3.22 shows the power spectral density (PSD) vs. frequency plots for three
different directions (out-of-plane acceleration for the board, acceleration along the car
motion and the transverse direction) averaged over the experiment duration for frequency
range of 1 to 700 Hz. The out-of-plane vibration was found to be at least 2 orders
magnitude higher than the other directions. Further according to studies conducted by
Steinberg, stress in the solder joints can be related to the out-of-plane displacement of the
board [60]. Hence only out-of-plane (z-direction) vibration was used for vibration
analysis.
63
1.0E-08
1.0E-07
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E-01
1.0E+00
1.0E+00 1.0E+01 1.0E+02 1.0E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
X-Acceleration Y-Acceleration Z-Acceleration
Figure 3.22: Power spectral density (PSD) vs. frequency plots for case study-II
The damage accumulation in the circuit card assembly was determined using
calcePWA. The output of the 3-parameter rainflow cycle counting algorithm and the
power spectral density (PSD) vs. frequency data were used estimate the damage. The
displacements and radius of curvatures under each component were estimated through
numerical analysis. The radius of curvature along the horizontal axis of the board is
constant which predicts the same amount of damage accumulation in all solder joints as
in case of case study-I. Figure 3.23 shows the results of the damage analysis in the form
of a bar chart.
64
0
20
40
60
80
100
0 5 10 15 20 25 30 35 40 45 50 55 60
Time in Use (days)
A
c
c
u
m
u
l
a
t
e
d
D
a
m
a
g
e
(
%
)
Damage due to temperature cycling
Damage due to vibration
Figure 3.23: Accumulated damage for case study-II
Figure 3.24 shows the measured resistances of solder joints as a function of time
with the corresponding ranges. These resistance values were measured on a daily basis. It
can be seen from the plot that there is a gradual change in resistance through out the
experiment starting from an average value of 12.8 ohms to a value of 14.6 ohms. The plot
also shows the occurrences of the resistance spikes. The actual life of the circuit card
assembly found to be 66 days according to the defined failure criteria, i.e., fifteen
consecutive resistance spikes were observed on the 66
th
day.
Figure 3.25 shows the estimated remaining life and the actual life for the
experiment. Initial predictions based on SAE environmental handbook data and the 1
st
case study were 33 days and 46 days respectively. The predicted remaining life based on
65
the life consumption monitoring methodology is 61 days, which is a conservative
estimation of the failure (66 days).
10.0
11.0
12.0
13.0
14.0
15.0
16.0
17.0
18.0
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
10.0
11.0
12.0
13.0
14.0
15.0
16.0
17.0
18.0
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Figure 3.24: Resistances of the solder joints along with the intermittent resistance
spikes for case study -II.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Prediction based on
environmental data from
case study-I (46 days)
Prediction based on
handbook data (33 days)
Predicted life based on
environmental
monitoring (61 days)
Actual life from
resistance monitoring
(66 days)
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Prediction based on
environmental data from
case study-I (46 days)
Prediction based on
handbook data (33 days)
Predicted life based on
environmental
monitoring (61 days)
Actual life from
resistance monitoring
(66 days)
Figure 3.25: Remaining life estimation summary for case study-II
66
All solder joints of the test board assembly were photographed with the help of
optical microscope at the end of the experiment as in case study-I. A visible crack was
observed at the knee of a solder joint at a magnification of 50.
3.8.3 Comparison between case study-I and II
A comparison between the case study-I and case study-II is given in Table 3.4.
Environmental temperature values are higher for case study-II compared to case study-I.
This difference in temperature is because case study-I was conducted in winter and case
study-II was conducted in summer. Power spectral density (PSD) values for case study-II
are lower compared to case study-I. Difference in PSD values arises because of road
conditions
Table 3.4: Comparison between case studies I and II
Case Study I Case Study II
Temperature
Avg. 30
o
C
Max. 61
o
C
Avg. 41
o
C
Max. 89
o
C
PSD value
8.73e-3 g
2
/Hz. (max.) 6.09e-3 g
2
/Hz. (max.)
Predicted life using life
consumption monitoring
40 days 61 days
Actual life based on
resistance monitoring
39 days 66 days
% Damage due to
temperature
5 % 13%
% Damage due to vibration 95% 87%
Type of failure
Cracks in solder joints
Cracks in solder joints
Electrical indication of
failure
• Intermittent resistance
spikes
• Permanent resistance
increase of the solder
joints (16 %)
• Intermittent
resistance spikes
• Permanent
resistance increase
of the solder joints
(14 %)
67
4 SUMMARY AND DISCUSSION
This thesis describes a life consumption monitoring methodology for electronic
products. Steps involved in the methodology have been explained in the thesis. Two
example case studies were conducted to demonstrate the methodology for real life
applications. The case studies were conducted in automobile underhood environment.
Solder joint fatigue was identified as the dominant failure mechanism for the chosen
environment based on failure modes, mechanisms, and effects analysis (FMMEA) and
virtual reliability assessment. Accordingly, the following steps in the life consumption
monitoring were customized. Ordered overall range algorithm and 3-parameter rainflow
cycle counting algorithms were combined to develop a suitable data simplification
scheme for the chosen conditions. Proper stress and damage models were identified and
used along with the data simplification scheme for the case study conditions. An
algorithm was developed to estimate the remaining life of the electronics for the case
studies based on the results from stress and damage models. The steps followed for case
studies can be summarized as follows:
• Two identical test board assemblies were mounted in a cantilever fashion under-the-
hood of an automobile.
• Failure modes, mechanisms, and effects analysis (FMMEA) was conducted along
with virtual reliability assessment to determine the dominant failure mechanism for
the given life cycle environment. Solder joint fatigue was identified as the dominant
failure mechanism for the given environment.
68
• Temperature and vibration were identified as the environmental parameters for
monitoring were identified based on inputs to the solder joint stress and damage
models.
• Vibration and temperature data were monitored and simplified for compatibility with
stress and damage models.
• Stress and damage models were used along with the remaining life estimation
algorithm to determine the remaining life of the circuit card assemblies.
• Electrical performances of the circuit card assemblies were checked through
resistance monitoring to determine their actual life.
• The predicted life of the test board assemblies were found to be in agreement with
the experimental life.
• Remaining life of the test board assemblies were predicted with information from
various sources. All estimation results were compared.
The remaining life values for the case studies mentioned in the paper were found
to be very close to the experimental values, which can be explained by the following
reasons. The circuit card assemblies were designed intentionally with large surface mount
leadless inductors to precipitate failure solder joint fatigue much before other failure
modes. Further the stress and damage models used for analysis using calcePWA are well
calibrated for leadless components on a FR-4 printed circuit board. However, in more
complex circuit card assemblies, the life estimation results are dependent on various other
failure sites and mechanisms, which might result in higher amounts of error in the
estimation.
69
4.1 Using Handbooks and Similarity Analysis to Determine Environmental and
Operational Life Cycle Conditions
Life consumption monitoring process requires information on product geometry,
material properties along with actual environmental and operational parameters in its life
cycle environment. Once the product design is complete, its geometry and material
properties can be obtained from various sources including design layout, manufacturer
data sheets. However, until the beginning of product parameter monitoring, no
information on environmental and operational parameters is available. For life estimation
at this stage, life cycle information is taken either from environmental handbooks or data
monitored in similar applications. Environmental handbooks are good sources of
information for newly designed products. An example is data from SAE environmental
handbook for automobile electronics. For the case studies mentioned in the paper, data
from SAE environmental handbook resulted in 34 days of total life as compared to 66
days from the experiment in case study-II. This difference in the results can be because
the SAE environmental handbook provides a generic set of data not pertaining to a
specific car manufacturer or geographical location. Further the data available in SAE
environmental handbook represents twenty year old information [43].
Another source of information is the data collected from sensors for same or
similar products in a similar environment. This is possible only if the same product or a
similar product is already in production and monitored data from sensors are available.
This is called “using similarity to determine environmental and operational conditions”.
A virtual reliability assessment based on case study-I gives a total life of 46 days, which
70
is different from a total life of 66 days in case of case study-II. This difference can be
because of higher levels of vibration along with the car accident.
The process of using environmental data from handbooks or similarity analysis
with stress and damage models is often very useful because it can provide a quick
estimate of product life before spending a lot of time and money in environmental
monitoring. However, as shown in the case studies, the results may not be very accurate
as the process takes into account an approximate life cycle environment and does not
account for any change in life cycle conditions. Sometimes a sudden change in product
life cycle environment (e.g., accident of the automobile mentioned in case study-II) can
have a catastrophic impact on its life.
In general, the actual life cycle environment that a product encounters is different
from the statistically averaged values from handbooks. For automobile electronics, life
cycle environment depends on road conditions (for vibration), geographical location and
part of the year (for temperature, humidity). Hence after placing the product in the field
conditions and obtaining information about the life cycle environment more precise
information about remaining life can be obtained using the life consumption monitoring
approach. If there is any change in the life cycle environment (e.g., the car accident
mentioned in one of the case studies), there is an abrupt change in remaining life. By
knowing the remaining life of the product after an accident, the product mission can be
rescheduled to get intended life. This concept is known as “extension of life”, which
explains how life consumption monitoring results can be used in real life.
Figure 4.1 shows the idealized remaining life vs. time plot for an electronic
product. The time corresponding to 0% remaining life of the product gives the time-to-
71
failure of the product at a given load condition. This illustrates change of mission profile
to achieve extension of life.
R
e
m
a
i
n
i
n
g
L
i
f
e
(
%
)
Time
0 %
Change in
mission profile
Extension of life
Effect of accident
Expected plot for
remaining life
100%
R
e
m
a
i
n
i
n
g
L
i
f
e
(
%
)
Time
0 %
Change in
mission profile
Extension of life
Effect of accident
Expected plot for
remaining life
100%
Figure 4.1: Extension of life based on life consumption monitoring results
4.2 Determining the Number of Data Points Required for Life Estimation
Total useful life of a product at any point on the remaining life vs. time plot can
be estimated by adding the time in use (or x-coordinate) and the remaining life (or y-
coordinate) at that point. However, this analysis is a one-point estimation and does not
take into account the product usage trend. The product usage trend can be taken in
account by extrapolating a trend line using the available remaining life data points. The
intersection of the trend line with the time axis gives the total useful life of the product
(see Figure 4.2).
72
The number of data points that can be considered for the trend line varies from
two data points to all data points. “Cumulative past analysis for life estimation” is the life
estimation process where all the available data points in the remaining life plot are used.
On the other hand, “near past analysis for life estimation” is the life estimation process
where only some of the data points are used. Results obtained can be different by using
cumulative past and near past data if there is a large variation in the results from the
damage models. This variation can occur from variation in environmental and operational
parameters that are inputs to the damage models. An example of such variation is the
result due to the car accident mentioned in one of the case studies.
Figure 4.2 explains the concept of cumulative past and near past analysis with the
help of 24 data points from Figure 3.17 including the car accident. The plot shows that
life estimation results (at the end of 24
th
day) based on near past (i.e., three data points)
and cumulative past (i.e., all data points). The near past analysis results differ from the
cumulative past analysis and the actual results by 26%. This error reduces by using more
number of data points for near past analysis. This explains the requirements to compare
variations in results obtained using different number of data points.
73
Trend line based on four
points (useful life: 29 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
Trend line based on four
points (useful life: 29 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
Figure 4.2: Cumulative past vs. near past for life estimation
As a rule of thumb, if the result using “n” data points (“n” starts from 2) and
“n+1” data points differ by less than the acceptable remaining life (see section 2.7), “n”
data points can be used for the analysis. More data points need to be considered if the
variation is more than the acceptable remaining life. This concept takes into if the
changes in damage model results (e.g., due to change in geographical location of the
product or use in a different application) are expected to continue in future.
74
5 CONTRIBUTIONS
This thesis describes the development of a life consumption monitoring
methodology for remaining life estimation of electronic products. The life consumption
monitoring methodology has been described in a general way so that it can be extended
to various electronics products. Failure modes, mechanisms, and effects analysis
(FMMEA) and virtual reliability assessment are included in the methodology to identify
the dominant failure mechanisms in a given life cycle environment. A process has been
explained to identify the environmental and operational parameters based on the failure
models for the dominant failure mechanism. An iterative method was described to
estimate the remaining life based on the accumulated damage information obtained from
stress and damage accumulation analysis.
Two case studies were conducted to demonstrate the life consumption monitoring
methodology in automotive underhood applications. Monitored environmental data was
used on a daily basis to estimate remaining life of two circuit card assemblies. The
electrical performances of the circuit card assemblies were monitored throughout the
experiments to determine field failure. The estimated life results were found to be in
agreement with the actual life results.
75
Appendix I: Damage assessment model for temperature induced fatigue analysis
Failure of solder interconnects due to temperature cycling is a common problem
in electronic hardware. Solder joint failures typically arise from fatigue due to thermal
expansion mismatch
• Between the package and the board (global mismatch)
• Between interconnect, solder, board (local mismatch)
A common model for solder joint fatigue due to thermal fatigue is based on the
work of Werner Engelmaier. The model uses a strain range as the metric for calculating
cycles to failure. The calculation of strain range considers both global mismatch and the
local mismatch in thermal expansion.
Assumptions associated with the model are:
• Fatigue failure of solder joints can be described as a power law similar to the Coffin-
Mansion low cycle fatigue equation.
• Strain in the solder arises from global as well as local thermal expansion mismatch
and the strain arising from local mismatch may be added to strain produced by global
strain to get the global strain (worst case).
• In-plane deformations are large compared to out-of-plane warping.
• Complete stress relaxation occurs during the thermal cycle.
The Engelmaier’s thermal fatigue model can be written in equation form as [45]:
c
f
p
f
N
1
2 2
1
|
|
.
|
\
| ?
=
?
?
76
where N
f
is the mean number of cycles to failure, ?
p
is the inelastic strain range and ?
f
is
a material constant (0.325 for eutectic solder).
The exponent c is known as the fatigue ductility coefficient and given by the
following relation
( ) ( )
|
|
.
|
\
|
+ × + × ? ? =
? ?
d
m
t
T c
360
1 ln 10 74 . 1 10 6 422 . 0
2 4
where T
m
is mean cyclic temperature of the solder in
o
C and t
d
is the dwell time in
minutes at the maximum temperature
The inelastic strain range ??
p
for the fatigue relationship is calculated by
considering the response of the package assembly to the change in temperature. The
stress state (stress and strain) is a function of the package, interconnect, and board
geometry and material.
l g t
? ? ? ? + ? = ?
The strain range due to global mismatch for leadless interconnects with eutectic can
be approximated to be [45], [59].
( )
c b g
LT LT
h
FI
? ? ? ? ? ? = ?
5 . 0
where F is the user defined calibration factor
10
I is the calibration factor
11
, h = height of
solder joint (mils) and T
c
, T
s
= temperatures of component and printed circuit board (
o
C).
10
This empirical correction factor accounts for idealized assumptions (F varies from 0.5 to 1.5, typical
values are around 1.0 and are determined by fitting fatigue life results to predicted life)
11
This factor is calibrated in calcePWA software based on various experimental results
77
( ) ( ) ( )( )
min max 2 2
c c cx y cx x c
T T L L LT ? + = ? ? ? ?
( ) ( ) ( )
( )
min max 2 2
b b by y bx x b
T T L L LT ? + = ? ? ? ?
where ?
cx
, ?
cy
, ?
bx and
?
by
are the coefficients of linear thermal expansion for
component and board, in x and y directions respectively (ppm/
o
C); L
x
and L
y
are the span
of interconnect in x and y directions (inch).The strain range due to local thermal
expansion mismatch can be approximated to be
( )
( )
eff
eff
l
Al bA
Al T
cosh
sinh ?
?
? ?
= ?
where G is the shear modulus of the solder, b is the solder thickness, E
l
is the modulus of
elasticity of lead, E
b
is the modulus of elasticity of the board, t
l
is the lead thickness, and
t
b
is the board thickness. The factor A is given by
|
|
.
|
\
|
+ =
b b l l
t E t E b
G
A
1 1
For leadless packages, l
eff
= 0 and hence ??
l
=0.
78
Appendix II: Damage assessment model for vibration induced fatigue analysis
For harmonic vibrations, maximum acceleration of the printed circuit board
(PCB) can be written in terms of maximum displacement, Z
o
as
o
Z a
2
max
? =
In terms of g’s
( )
g
Z f
g
Z
g
a
G
o o
2 2
max
max
2? ?
= = =
The maximum displacement can be written in terms of natural frequency, f
n
as
2
8 . 9
n
in
o
f
Q G
Z =
where G
in
is the input acceleration to the PCB and Q is the transmissibility of the PCB at
its natural frequency. For natural frequencies between 200-400 Hz., a good
approximation is
n
f = Q [60].
Failures due to high cycle fatigue (vibration induced) in solder joints can be
described by the following relationship:
=
b
f
N ? Constant
where ? is the solder joint maximum stress amplitude, N
f
is the cycles to failure and b is a
material property. Steinberg assumes that the stress in the solder joints can be directly
related to the out of plane displacement of the board, i.e.,
=
b
f
ZN Constant
Hence, Z
b b
N Z N
2 2 1 1
=
79
Solving for N
2
,
b
Z
Z
N N
1
2
1
1 2
|
|
.
|
\
|
=
where N
1
and Z
1
are the life and displacement from Steinberg’s equation, N
1
= 10,000,000
for sinusoidal and 20,000,000 cycles for random vibration [60], and Z
2
is the maximum
board displacement amplitude as given by equation 9. Through extensive testing and
design experience of PCB assemblies, Steinberg developed an empirical equation for
maximum allowable displacement, Z
1
L ct
B
Z
00022 . 0
1
=
The life of the components also depends on where they are placed. This variation
in life is a function of radius of curvature under the components and can be included in
the relationship as:
b
y x Z
Z
N N
1
2
1
1 2
) sin( ) sin(
|
|
.
|
\
|
=
? ?
where x, and y are the non-dimensional board co-ordinates of the component center.
Random vibration response is usually discussed in terms of the root mean square
(RMS) acceleration. When the distribution is normal, the RMS value is the mean of the
distribution. To account for the 3? extremes for random vibration [60]
(
(
¸
(
¸
=
2
8 . 9
3
n
rms
o
f
G
Z
80
RMS acceleration of the board is approximated as
Q f PSD G
n rms
2
?
=
For the case study, the board displacement under each solder joint is estimated by
finite element method using the calcePWA software.
In shock environment, Steinberg [60] uses a simple rule of thumb that in shock
environment of less than few thousand total cycles, the maximum allowable displacement
of the PWB is:
L ct
B
d
00132 . 0
=
where B is the length of the PWB edge parallel to the component, L is the length of the
components (inches), c is a constant depending on the package type.
81
Appendix III: Analysis of Car Accident for Case Study-I
The car accident mentioned in one of the case studies, resulted in high levels of
vibration during and after the accident. The data recording device recorded a number of
high vibration events. The maximum g-levels for these events were an order of
magnitude higher than normal conditions. The event with highest acceleration was
selected for input to calcePWA shock analysis module.
Shock analysis is typically conducted with the help of an overstress model.
Overstress failure of solder joints is defined as failure due to stresses that exceed the
ultimate strength of the solder. Shock analysis of solder joints can be conducted with
calcePWA software for an individual shock pulse. A shock pulse can be specified in
calcePWA by the type of pulse (e.g., half sine, unit impulse, terminal sawtooth),
maximum acceleration value and its time duration. The shock analysis can only
determine whether there is an overstress failure.
The identified event was found to contain a number of acceleration peaks. The
maximum peak-to-peak acceleration (45 g) was modeled in calcePWA as a half sine
pulse for shock analysis. The time duration of the 45 g pulse was determined as 3
milliseconds from the recorded data. calcePWA shock analysis showed no overstress
failure. Since the analysis was conducted at the worst case conditions, it was concluded
that no overstress failure occurred during the car accident. The electrical resistances of
the solder joints also confirmed no failure. No information on the accumulated damage
could be obtained from the shock analysis. However, this does not mean that there is no
accumulated damage in the circuit card assembly because of the accident.
82
Since shock analysis was conducted for a duration of 3 milliseconds with a single
pulse, the possibility of accumulated damage could not be ruled out. Further, high
vibration levels lasted for half an hour
12
, which confirmed the fact that there can be
accumulated damage. Hence an attempt was made to analyze the car accident using the
existing random vibration models.
Random vibration damage analysis is typically conducted using high cycle fatigue
models, which require specification of more than 10
4
cycles. The fundamental frequency
of the circuit card assembly for the given mounting system was estimated to be 25.7 Hz
from calcePWA. It was found that thirteen vibration cycles were possible per event (time
duration of approximately 500 milliseconds) at this fundamental frequency. This
precluded the possibility of random vibration analysis of individual events. To overcome
this limitation, all events recorded over a duration of 30 minutes were used to estimate
the power spectral density (PSD). The estimated PSD vs. frequency is shown in Figure.
1. A random vibration analysis was conducted using calcePWA with this PSD
information for a time duration of 30 minutes. Table. 1 gives the power spectral density
vs. frequency input to calcePWA. Random vibration analysis of the shock showed
maximum 15% damage on the solder joints and reduction of life by 6 days. Analysis of
the car accident was possible because of the availability of all the data points over the
interval of time.
12
High vibration events were observed from the moment of the car accident to the moment when the circuit
card assembly and the sensors were removed to repair the car. The time frame also included the vibration
event during disengagement of the cars.
83
Table. 1: Power spectral density vs. frequency input to calcePWA
Frequency
(Hz.)
1.76 12.3 17.6 26.4 76 141 248 378 527 693
Power
Spectral
Density
(g
2
/Hz.)
0.225 0.39 0.632 0.029 0.034 0.645 0.52 0.024 0.007 0.15
er spectral density (PSD) vs. frequency for the car accident
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E-01
1.0E+00
1.00E+00 1.00E+01 1.00E+02 1.00E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure. 1: Pow
84
Appendix IV: Effect of temperature data reduction on prediction accuracy of life
consumption monitoring
To estimate the effect of temperature data reduction on the prediction accuracy of
life consumption monitoring, the temperature data collected from case study-II was
analyzed. For this analysis, the data was sampled at the rate of one data point per ten
minutes. The collected temperature data was simplified using the ordered overall range
(OOR) method and the 3-parameter rainflow cycle counting algorithm. A program was
developed to combine the OOR method, 3-parameter rainflow cycle counting algorithms
and Engelmaier’s model for solder joint fatigue to estimate the accumulated damage. The
accumulated damage of an individual solder joint was simulated with different values of
reversal elimination indices, ‘s’ (from 0 to 90%). The geometry and the properties of the
inductors and solder joints were used for the analysis.
For each value of reversal elimination indices, error was estimated. The error
values are compared to a situation where all reversals are preserved (i.e., reversal
elimination index, S = 0.0). Figure. 2 shows a plot of error % as a function of reversal
elimination index.
100 *
0.0) (S on Accumulati Damage
on Accumulati Damage
1 % Error
=
? =
It can be seen from Figure. 2, the value of error is very low for a reversal
elimination index (s) values up to 0.2 (i.e., a peak valley sequence is selected only if their
difference is greater than 0.2 * difference between the highest peak and the lowest
valley). For ‘s’ values greater than 0.2, the error value increases rapidly causing almost
85% error at s=0.9.
85
0
10
20
30
40
50
60
70
80
90
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Reversal Elimination Index
E
r
r
o
r
%
Figure. 2: Estimated error as a function of reversal elimination index
The number of remaining data points (reversals) in the load history decreases with
increase in ‘S’. This is because the cut off value (i.e., ‘S’ times the difference between
highest peak and the lowest valley) increases with increase in ‘S’ given the same load
history (see Figure 3.5). The fraction of data reduction is defined as:
Points Data f o no. Total
Points Data Remaining of No.
Reduction Data Fraction =
Figure. 3 shows the plot of error % as a function of fraction data reduction. It can
be seen that less than 5% error is caused by 96 % data reduction in this case. In other
words, only 5 % of the data points constitute the most damaging reversals for thermal
fatigue analysis in case of solder joints in electronics. Hence the data reduction method
developed can be effectively used for eliminating less damaging reversals.
86
0
10
20
30
40
50
60
70
80
90
0.86 0.88 0.9 0.92 0.94 0.96 0.98 1
Fraction Data Reduction
E
r
r
o
r
%
Figure. 3: Estimated error as a function of data reduction
Different sampling rates result in different number of data points for a given
interval of time. However, a lot of data points are not useful for the temperature analysis
as the OOR algorithm takes into account only those data points that are either peaks or
valleys. Hence the number of data points that are neither peaks nor valleys increase with
sampling rate. As a result, even with reversal elimination index of 0.0 (i.e., all peaks and
valleys are counted), there is high amount of data reduction in case of higher sampling
rate. As long as the sampling rate is above certain limit where all the peaks and valleys
can be recorded, the maximum number of useful data points is independent of sampling
rate (i.e., all peaks and valleys). Specification of a reversal elimination index in such a
case will result in equal number of data points remaining after the OOR analysis. This
will result in equal amounts of damage and hence equal amounts of error for all sampling
rates. In other words, estimated error is independent of reversal elimination index beyond
a certain sampling rate. However, the data reduction will be more for higher sampling
rate for a given amount of error because of higher data reduction at reversal elimination
index of 0.0.
87
REFERENCES
[1] IEEE Reliability Society, “ IEEE Std 1413” IEEE Standard Methodology for
Reliability Prediction and Assessment for Electronic Systems and Equipment,”
New York, NY, January 1999.
[2] Ramakrishnan, A., Syrus, T., and M. Pecht, “Electronic Hardware Reliability,” The
Modern Microwave and RF Handbook, pp. 3-102 to 3-121, CRC Press, Boca
Raton, FL, 2000.
[3] Pecht, M., Product Reliability, Maintainability, and Supportability Handbook, CRC
Press, New York, NY, 1995.
[4] Beder, S., “Making Engineering Design Sustainable,” Transactions of Multi-
Disciplinary Engineering Australia, Vol. GE17, No. 1, pp. 31-35, June 1993.
[5] Kelkar, N., Dasgupta, D., Pecht, M., Knowles, I., Hawley, M. and D. Jennings,
“Smart Electronic Systems for Condition-Based Health Management,” Quality and
Reliability Engineering International, Vol. 13, pp. 3-7, 1997.
[6] Rao, B.K.N., Handbook of Condition Monitoring, Elsevier Science Publishers Ltd.,
Oxford 1996.
[7] Mobley, R.K., An Introduction to Preventive Maintenance, Van Nostrand
Reinhold, New York, NY 1990.
[8] NIST, “Condition-based Maintenance,” Advanced Technology Program Position
Paper,http://www.atp.nist.gov/files/cbm_wp1.pdf (08/01/2003)
88
[9] Borinski, J.W., Meller, S.A., Pulliam, W.J., Murphy, K.A.,and J. Schetz, “Aircraft
health monitoring using optical fiber sensors,” Proceedings of the 19
th
Digital
Avionics Systems Conference, Vol. 2, pp. 6D1/1 –6D1/8, 2000.
[10] Cooper, K.R., Elster, J., Jones, M., and R. G. Kelly, “Optical fiber-based corrosion
sensor systems for health monitoring of aging aircraft,” AUTOTESTCON
Proceedings, IEEE Systems Readiness Technology Conference, pp. 847-856, 2001.
[11] Borinski, J.W., Boyd, C.D., Dietz, J.A., Duke, J.C., and M. R. Home, “Fiber optic
sensors for predictive health monitoring,” AUTOTESTCON Proceedings, IEEE
Systems Readiness Technology Conference, pp. 250-262, 2001.
[12] Goldfine, N., Schlicker, D., Sheiretov, Y., Wailiabaugh, A., Zilberstein, V., and D.
Grundy, “Surface mounted and scanning periodic field eddy-current sensors for
structural health monitoring,” IEEE Aerospace Conference Proceedings, Vol. 6,
pp. 3141-3152, 2002.
[13] Kent, R.M, “Fiber ultrasonics for health monitoring of composites,” Proceedings of
the 19
th
Digital Avionics Systems Conference, Vol.2, pp. 6D3/1 -6D3/6, 2000.
[14] Larson, E.C., and B. E. Parker Jr., “A subspace-based approach to structural health
monitoring,” Proceedings of 19
th
Digital Avionics Systems Conference, Vol.2, pp.
6C5/1 -6C5/8, 2000.
[15] Yen, G. and Tuang Bui “Health monitoring of vibration signatures in rotorcraft
wings,” Proceedings of IEEE Aerospace Conference, Vol.1, pp. 279 –288, Feb
1997.
89
[16] Yen, G., “Health monitoring of vibration signatures,” 23rd International
Conference on Industrial Electronics, Control and Instrumentation, Vol. 3, pp.
1124-1129, Nov 1997.
[17] Hailu, B., Gachagan, A., Hayward, G., and A. McNab, “Embedded piezoelectric
transducers for structural health monitoring,” Proceedings of IEEE Ultrasonics
Symposium, Vol.1, pp. 735 –738, 1999.
[18] Hailu, B., Hayward, G., Gachagan, A., McNab, A., and R. Farlow, “Comparison of
different piezoelectric materials for the design of embedded transducers for
structural health monitoring applications,” IEEE Ultrasonics Symposium, Vol. 2,
pp. 1019-1012, 2000.
[19] Condition Monitoring and Diagnostic Engineering Management (COMADEM)
International,http://www.comadem.com/frameset.htm (08/01/2003)
[20] Society of Machinery Failure Prevention Technology (MFPT),http://www.mfpt.org/ (08/01/2003)
[21] National Aeronautics and Space Administration (NASA), “Aviation Safety
Program (AvSP),”http://www.grc.nasa.gov/WWW/avsp/about.htm (08/01/2003)
[22] National Aeronautics and Space Administration (NASA), “Vehicle Health
Management (VHM),”http://www.grc.nasa.gov/WWW/cdtb/projects/vehiclehealth/index.html
(08/01/2003)
90
[23] Office of Naval Research, “Science & Technology – Aircraft Technology,”http://www.onr.navy.mil/sci_tech/special/351_strike/prog_aircraft.htm
(08/01/2003)
[24] Department of Defense, “The Joint Strike Fighter Prognostics and Health
Management,” www.dtic.mil/ndia/2001systems/hess.pdf (08/01/2003)
[25] QinetiQ, “Vehicle Health and Usage Monitoring System,”http://www.qinetiq.com/etc/medialib/docs/news_room/press_packs/defence.Par.00
04.File.dat/DVD-TSSprel-D5(HK)_latest.doc (08/01/2003)
[26] QinetiQ, “Integrated Engine Management,”http://www.qinetiq.com/news_room/newsreleases/2003/2nd_quarter/integrated0.ht
ml (08/01/2003)
[27] UK Department of Defense,http://www.mod.uk/business/excel/projects/rcsc05.htm
(08/01/2003)
[28] US Army Material Systems Analysis Activity (AMSAA), “AMSAA Capabilities
To Support Simulation-Based Acquisition (SBA),”http://www.amsaa.army.mil/sba/Sba_doc2.html (08/01/2003)
[29] CALCE Electronic Products and Systems Center, University of Maryland, College
Park,http://www.calce.umd.edu/ (08/01/2003)
[30] Pennsylvania State University Applied Research Laboratory,http://www.arl.psu.edu/ (08/01/2003)
[31] The Boeing Company, “ETOPS Maintenance,” Aero Magazine, No. 7, July 1999.
91
[32] Pecht, M., Dube, M., Natishan, M., and I. Knowles, “An Evaluation of Built-In
Test,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 37, No. 1, pp.
266-272, January 2001.
[33] The Boeing Company, “New “Smart” Network to Reduce Boeing JSF Life-Cycle
Costs,” Boeing News Release, Washington, D.C., September 13, 1999.http://www.boeing.com/news/releases/1999/news_release_990913o.htm
(08/01/2003)
[34] General Motors (Mcleish, James. G.), "Email Communication," Warren, Michigan,
9
th
April 2002.
[35] Ramakrishnan, A. and M. Pecht, “Implementing a Life Consumption Monitoring
Process for Electronic Product, IEEE Transactions on Components and Packaging
Technologies, Vol. 26, No. 3, pp. 625-634, September, 2003.
[36] Pecht, M., Radojcic, R., and G. Rao, Guidebook for Managing Silicon Chip
Reliability, CRC Press, Boca Raton, FL, 1999.
[37] Young, D. and A Christou, “Failure Mechanism Models for Electromigration,”
IEEE Transactions on Reliability, Vol. 43, No. 2, June 1994.
[38] Li., J., and A. Dasgupta, Failure Mechanism Models for Material Aging Due to
Inter-Diffusion IEEE Transactions on Reliability, Vol. 43, No. 1, March 1994.
[39] Rudra, B., and D. Jennings, “Tutorial: Failure-Mechanism Models for Conductive-
Filament Formation,” IEEE Transactions on Reliability, Vol. 43, No. 3, September
1994.
92
[40] Dasgupta, A., “Failure Mechanism Models For Cyclic Fatigue,” IEEE Transactions
on Reliability., Vol. 42, No. 4, pp. 548-555, December 1993.
[41] Pecht, M., Lall, P., and E. Hakim, “The Influence of Temperature on Integrated
Circuit Failure Mechanisms,” Quality and Reliability Engineering Intl, Vol. 8, pp.
167-175, 1992.
[42] Miner, M.A., “Cumulative Damage in Fatigue,” Journal of Applied Mechanics, pp.
A-159 to A-164, September 1945.
[43] SAE J1211, Recommended Environmental Practices for Electronic Equipment
Design, Rev. November 78.
[44] Monthly Temperature Averages for the Washington, DC Area,http://www.weather.com/weather/climatology/monthly/USDC0001 (08/01/2003)
[45] Engelmaier, W., "Generic Reliability Figures of Merit Design Tools for Surface
Mount Solder Attachments,” IEEE Transactions of CHMT, Vol. 16, No. 1., pp.
103-112, 1993.
[46] Dallas Instruments, SAVER' User’s Manual, Dallas, Texas, February 2000.
[47] Bendat, S.J. and A.G. Piersol, Random Data: Analysis and Measurement
Procedures, Wiley-Interscience, New York, NY, 1971.
[48] Collins, J., Failure of Materials in Mechanical Design, John Wiley & Sons, New
York, NY, 1993.
93
[49] Cluff, K.D., “Characterizing the Commercial Avionics Thermal Environment for
Field Reliability Assessment,” Journal of the Institute of Environmental Sciences,
Vol. 40, No. 4, pp. 22-28, Jul.-Aug. 1997.
[50] Anzai, H., “Algorithm of the Rainflow Method,” pp. 11-20, The Rainflow Method
in Fatigue, Butterworth-Heinemann, Oxford, 1991.
[51] Fuchs, H.O., Nelson, D.V., Burke, M.A., and T.L. Toomay, “Shortcuts in
Cumulative Damage Analysis,” SAE National Automobile Engineering Meeting,
Detroit, 1973.
[52] Ramakrishnan, A., “Health and Life Consumption Monitoring Using Sensor
Technologies,” Masters’ Thesis, University of Maryland, 2001
[53] Mishra, S., Pecht, M., Smith, T., McNee, I., and R. Harris, “Life Consumption
Monitoring Approach for Remaining Life Estimation”, European Microelectronics
Packaging and Interconnection Symposium, IMAPS, pp. 136-142, Cracow, Poland,
June 2002.
[54] IPC, “IPC J-STD-029” Performance and Reliability Test Methods for Flip Chip,
Chip Scale, BGA, and other Surface Mount Array Package Applications,” February
2000.
[55] MEG-Array, “Solder Joint Reliability Testing Results Summary, IPC-SM-785,”http://www.fciconnect.com/pdffiles/highspeed/MEG-Array_IPC-SM-
785_Results.pdf (08/01/2003)
94
95
[56] Collins, J., Failure of Materials in Mechanical Design, John Wiley & Sons, New
York, NY, 1993.
[57] Anzai, H., “Algorithm of the Rainflow Method,” pp. 11-20, The Rainflow Method
in Fatigue, Butterworth-Heinemann, Oxford, 1991.
[58] Constable, J. H., “Electrical Resistance as an Indicator of Fatigue,” IEEE
Transactions on Components, Hybrid and Manufacturing Technology, Vol. 15, No.
6, December 1992.
[59] Institute for Interconnecting and Packaging Electronic Circuits, “IPC SM-785”
Guidelines for Accelerated Reliability Testing of Surface Mount Solder
Attachments," Lincolnwood, IL, July 1992.
[60] D. S. Steinberg, Vibration Analysis for Electronic Equipment, John Willey & Sons,
New York, NY, 1988.
doc_700786555.pdf
Consumption is a major concept in economics and is also studied by many other social sciences. Economists are particularly interested in the relationship between consumption and income, and therefore in economics the consumption function plays a major role.
ABSTRACT
Title of the Thesis: LIFE CONSUMPTION MONITORING FOR ELECTRONICS
Degree Candidate: Satchidananda Mishra, Master of Science, 2003
Thesis directed by: Professor Michael Pecht
Department of Mechanical Engineering
Life consumption monitoring is a method to assess product’s reliability based on
its remaining life in a given life cycle environment. The life consumption monitoring
process involves continuous or periodic measurement, sensing, recording, and
interpretation of physical parameters associated with a system’s life cycle environment to
quantify the amount of degradation.
This thesis explains a life consumption monitoring methodology for electronic
products, which includes failure modes, mechanisms and effects analysis (FMMEA),
virtual reliability assessment, monitoring product parameters, data simplification, stress
and damage accumulation analysis and remaining life estimation. It presents two case
studies to estimate the remaining life of identical circuit card assemblies in an automobile
underhood environment using the life consumption monitoring methodology. Failure
modes, mechanisms, and effects analysis along with virtual reliability assessment is used
to determine the dominant failure mechanism in the given life cycle environment.
Temperature and vibration are found to be the environmental factors, which could
potentially cause malfunction of the circuit card assembly through solder joint fatigue.
Temperature sensor and accelerometers are used along with a data logger to monitor and
record the environmental loads during the experiment. A data simplification scheme is
used to make the raw sensor data suitable for further processing. Stress and damage
models are used to estimate the remaining life of the circuit card assembly based on the
simplified data. Performances of the test board assemblies are monitored through
resistance monitoring. The life cycle environment and results for the case studies are
compared with each other. The estimated results are also compared with experimental life
results.
LIFE CONSUMPTION MONITORING FOR ELECTRONICS
By
Satchidananda Mishra
Thesis submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial fulfillment
of the requirements for the degree of
Master of Science
2003
Advisory Committee:
Professor Michael Pecht, Chair
Associate Professor Patrick McCluskey
Associate Professor Peter Sandborn
© Copyright by
Satchidananda Mishra
2003
PREFACE
In today’s world, increasing global competition along with consumers’
perceptions toward performance, quality, reliability, safety, and environmental
considerations are compelling manufacturers to improve their design for higher reliability
in field applications. In electronics industry, product development trends have supported
this requirement through rapid technological changes resulting in rapid market growth.
However, as manufacturers try to keep pace with performance requirements of modern
electronic industry, reliability is being traded-off at an affordable cost. Intense market
competition has reduced time-to-market for electronics products tremendously thereby
providing less time for extensive reliability testing. There has been a continuous
transition in the electronics industry from military-specification parts to commercial-off-
the-shelf (COTS) parts, many of which are now targeted for lifetimes in the 5 to 7 year
range. Wearout of electronics parts has become a relevant concern with this transition.
Hence there is a need for today’s companies to consider novel approaches to improve
design and maintain operational efficiency of their products in field applications to ensure
customer satisfaction.
Health monitoring has emerged as a promising alternative to traditional reliability
prediction, scheduled maintenance, and run-to-failure operations. Health monitoring is
the method of monitoring product reliability in terms of its health in the life cycle
environment. Life consumption monitoring (LCM) is a health monitoring method to
assess product’s reliability based on its remaining life in a given life cycle environment.
The aim of this thesis is to develop and demonstrate a life consumption monitoring
ii
methodology to determine the remaining life and reliability of an electronic product based
on monitored environmental and operational data. Life consumption monitoring is a
prognostic process unlike many other health monitoring approaches. A well-designed
maintenance procedure based on life consumption monitoring can be used to predict and
prevent system failure and hence to reduce operating costs.
Chapter 1 discusses the concept and motivation behind the reliability prediction
practices followed in industry. This chapter highlights the drawbacks of current
reliability prediction techniques and explains health monitoring as a solution to the
reliability prediction challenge. The approaches adopted for health and life consumption
monitoring are described along with various examples.
Chapter 2 describes a physics-of-failure-based life consumption monitoring to
determine the remaining life of an electronic product. The life consumption monitoring
process involves continuous or periodic measurement, sensing, recording, and
interpretation of physical parameters associated with a system’s life cycle environment to
quantify the amount of system degradation. The process is documented in the form of a
flowchart, which includes failure modes, mechanisms, and effects analysis (FMMEA),
virtual reliability assessment, monitoring product parameters in the product’s life cycle
environment, data simplification, stress and damage accumulation analysis and remaining
life estimation.
Chapter 3 describes two case studies to demonstrate the developed life
consumption monitoring methodology. For the case studies, two circuit card assemblies
were mounted under-the-hood of an automobile. Failure modes, mechanisms, and effects
analysis (FMMEA) was conducted along with a virtual reliability assessment for the
iii
circuit card assembly in the given life cycle environment, which revealed that solder joint
fatigue is the dominant failure mechanism. The environmental parameters that can cause
damage are identified as temperature, and vibration. A suitable data simplification
scheme was developed to make the sensor data suitable for input to solder joint fatigue
models. The identified environmental parameters were monitored and recorded with the
help of sensors and a battery powered data logger. The collected and simplified data was
used with stress and damage models using the calcePWA analysis software to estimate
the remaining life. The actual life of the circuit card assemblies were checked through
resistance monitoring and compared with the estimated life.
Chapter 4 presents a summary and some discussion on the life consumption
monitoring methodology described in the thesis. Chapter 5 highlights the specific
contributions made in this thesis.
iv
ACKNOWLEDGEMENTS
I wish to express my sincere gratitude to Dr. Michael Pecht, Dr. Peter Sandborn,
and Dr. Patrick McCluskey of the University of Maryland, College Park for their advice
and support during the course of this thesis and my stay at the University of Maryland. I
am grateful to Dr. Diganta Das, Dr. Miky Lee, Dr. Sanka Ganesan, Keith Rogers, and
Dan Danahoe of CALCE center at the University of Maryland, without whose help this
thesis would not have reached a fruitful completion. I also wish to express my gratitude
to Mr. Doug Goodman from Ridgetop Group Inc., Mr. Bart Feys from Dallas
Instruments, and Mr. Paul Macmillan from ACI-AppliCAD Inc.
I express my special thanks to my colleagues Yuki Fukuda, Jeremy Cunningham,
Sathyanarayan Ganesan, Niranjan Vijayragavan, Yu-Chul Hwang, Paul Casey, Anoop
Rawat, Vidyasagar Shetty, Lewis Gershan, Ji Wu, Sanjay Tiku, Subramaniam Rajagopal,
Leila Jannessari, Joseph Varghese, Arindam Goswami, Ricky Valentin, Karumbu,
Kaushik Ghosh, who have always been helpful to me during my thesis work.
v
TABLE OF CONTENTS
LIST OF TABLES viii
LIST OF FIGURES ix
1 INTRODUCTION 1
1.1 RELIABILITY PREDICTION OF ELECTRONICS 3
1.2 HEALTH MONITORING 5
1.3 APPROACHES FOR HEALTH MONITORING 7
1.3.1 Current Condition Monitoring 7
1.3.2 Life Consumption Monitoring 7
1.4 CURRENT STATE OF HEALTH MONITORING RESEARCH 8
1.5 HEALTH MONITORING EXAMPLES 10
2 LIFE CONSUMPTION MONITORING METHODOLOGY FOR ELECTRONICS13
2.1 FAILURE MODES, MECHANISMS AND EFFECTS ANALYSIS 14
2.1.1 Identification of Failure Modes and Corresponding Failure Sites 15
2.1.2 Identification of Failure Mechanisms and Models 15
2.1.3 Identification of the Life Cycle Conditions 15
2.1.4 Selection of failure mechanisms that can precipitate a failure mode 16
2.2 VIRTUAL RELIABILITY ASSESSMENT 16
2.2.1 Prioritization of the Failure Mechanisms Based on Time-to-failures 17
2.2.2 Identification of The dominant Failure Mechanisms 17
2.3 MONITORING APPROPRIATE PRODUCT PARAMETERS 18
2.4 DATA SIMPLIFICATION PROCESSES 19
2.5 STRESS AND DAMAGE ACCUMULATION ANALYSIS 20
2.5.1 Stress and Damage Models 20
2.5.2 Damage Accumulation Theories 22
2.6 ESTIMATION OF REMAINING LIFE 24
2.7 ACCEPTABLE REMAINING LIFE 24
3 EXPERIMENTAL CASE STUDIES ON LIFE CONSUMPTION MONITORING 26
3.1 FAILURE MODES, MECHANISMS AND EFFECTS ANALYSIS 27
3.2 VIRTUAL RELIABILITY ASSESSMENT 28
3.3 MONITORING PRODUCT PARAMETERS 33
3.3.1 Sampling Issues for Monitoring of Continuous Signals 35
3.4 DATA SIMPLIFICATION 38
3.4.1 Temperature Data Simplification 38
3.4.2 Vibration Data Simplification 47
3.5 STRESS AND DAMAGE ACCUMULATION ANALYSIS 48
vi
3.6 REMAINING LIFE ASSESSMENT 51
3.7 FAILURE DEFINITION AND DETECTION 51
3.8 MONITORED ENVIRONMENT AND RESULTS 52
3.8.1 Case Study-I 52
3.8.2 Case Study-II 61
3.8.3 Comparison between case study-I and II 67
4 SUMMARY AND DISCUSSION 68
4.1 USING HANDBOOKS AND SIMILARITY ANALYSIS TO DETERMINE ENVIRONMENTAL
AND OPERATIONAL LIFE CYCLE CONDITIONS 70
4.2 DETERMINING THE NUMBER OF DATA POINTS REQUIRED FOR LIFE ESTIMATION 72
5 CONTRIBUTIONS 75
APPENDIX I: DAMAGE ASSESSMENT MODEL FOR TEMPERATURE INDUCED
FATIGUE ANALYSIS 76
APPENDIX I: DAMAGE ASSESSMENT MODEL FOR TEMPERATURE INDUCED
FATIGUE ANALYSIS 76
APPENDIX II: DAMAGE ASSESSMENT MODEL FOR VIBRATION INDUCED
FATIGUE ANALYSIS 79
APPENDIX III: ANALYSIS OF CAR ACCIDENT FOR CASE STUDY-I 82
APPENDIX IV: EFFECT OF TEMPERATURE DATA REDUCTION ON
PREDICTION ACCURACY OF LIFE CONSUMPTION MONITORING 85
REFERENCES 88
vii
LIST OF TABLES
Table 3.1: Failure Modes and Effects Analysis (FMEA) for the circuit card assembly
used for the experiment..................................................................................................... 29
Table 3.2: Data used for defining the temperature environment for the virtual reliability
assessment......................................................................................................................... 31
Table 3.3: Virtual reliability assessment........................................................................... 32
Table 3.4: Comparison between case studies I and II....................................................... 67
viii
LIST OF FIGURES
Figure 2.1: Various steps in the life consumption monitoring approach...........................14
Figure 3.1: Experimental setup with the test board mounted under-the-hood of a car
(1997 Toyota 4Runner)......................................................................................................27
Figure 3.2: The power spectral density plot used for the virtual reliability assessment ....31
Figure 3.3: The data logger with external temperature and vibration sensor ....................34
Figure 3.4: Sampling of a continuous time record.............................................................36
Figure 3.5: Temperature history showing the reversals (i.e., peaks and valleys) ..............40
Figure 3.6: Identifying cycles in a load history .................................................................42
Figure 3.7: Rainflow cycle counting..................................................................................45
Figure 3.8: Loop condition and loop reaping operations...................................................46
Figure 3.9: Ratio of the response of the PCB to the excitation vs. frequency. The peaks in
this plot identify the natural frequencies............................................................................50
Figure 3.10: Monitored temperature during case study-I ..................................................53
Figure 3.11: Monitored temperature converted to the peaks and valleys for case study-I 54
Figure 3.12: Power spectral density (PSD) vs. frequency plot for case study-I ................55
Figure 3.13: Estimated board displacement due to vibration for case study-I...................56
Figure 3.14: Recorded vibration event during the car accident. The maximum
acceleration values were from +22 g to –23 g. ..................................................................57
Figure 3.15: Accumulated damage estimated using calcePWA and Miner’s rule for case
study-I ................................................................................................................................58
Figure 3.16: Resistances of the solder joints along with the intermittent resistance spikes
for case study-I...................................................................................................................59
Figure 3.17: Remaining life estimation summary for case study-I....................................60
Figure 3.18: Crack in one of the solder joints....................................................................60
Figure 3.19: Experimental setup for case study-II.............................................................61
ix
Figure 3.20: Monitored temperature for case study-II.......................................................62
Figure 3.21: Monitored temperature profile converted to the peaks and valleys for case
study-II...............................................................................................................................63
Figure 3.22: Power spectral density (PSD) vs. frequency plots for case study-II .............64
Figure 3.23: Accumulated damage for case study-II .........................................................65
Figure 3.24: Resistances of the solder joints along with the intermittent resistance spikes
for case study -II. ...............................................................................................................66
Figure 3.25: Remaining life estimation summary for case study-II ..................................66
Figure 4.1: Extension of life based on life consumption monitoring results.....................72
Figure 4.2: Cumulative past vs. near past for life estimation ............................................74
x
1 INTRODUCTION
Reliability is defined as the ability of a product to perform as intended (i.e.,
without failure and within specified performance limits) for a specified time, in its life
cycle application environment. During the past 25 years, there has been a lot of
improvement in the reliability of electronic products to keep pace with the increased
warranties and the possible liabilities of product failures. Various technology
improvements including semiconductor manufacturing processes have continuously
helped to increase device reliability. However, there has been continuous shrinking of the
feature sizes for electronic devices along with improved performance requirements. In
fact, according to Moore’s law number of transistors in a given semiconductor has been
doubling every eighteen months. This trend has decreased the device dimensions thereby
resulting in higher electric fields and higher localized heating. Surface mount technology
(SMT) has become a common practice to take into account the higher I/O requirement,
which has made interconnects more vulnerable to harsh environments. In other words, as
manufacturers try to keep pace with performance requirements, reliability of electronic
systems is getting traded-off for increased functionality at an affordable cost.
There has been intense competition among rival companies based on cost and
quality of electronics products. Increasing market competition and customers’
expectation for latest technology has reduced the allowable “time to market” in
electronics industry tremendously. With reduction in allowable time for the product
development cycles, there is less opportunity for extensive reliability testing. As a result,
outputs from the current reliability assessment schemes, which are based on extensive
reliability trials, may not satisfy the customer requirements. Failure to resolve this issue
can result in high risk of in-service availability and inflated life support costs for
electronic systems.
The modern electronics industry is more and more being driven by the consumer
electronics segment as compared to space, military, avionics and oil-drilling segment
(i.e., the low volume complex electronics or LVCES industry). This has compelled the
LVCES industry to adapt to the commercial-off-the-shelf (COTS) parts instead of
traditional military-specification parts. With this transition from military-specification
parts to COTS parts, many of which are now targeted for lifetimes in the 5 to 7 year
range, wearout of electronics parts is becoming a concern.
There has been a constant need for the maintenance strategies to be more
proactive based on in-service reliability of the products. By knowing whether and, more
importantly, when maintenance is needed, production or operation schedules can be
synchronized, and the cost of maintenance can be reduced. Hence there is a need for the
companies to consider various approaches in order to improve reliability and maintain the
operational efficiency of their products in the field applications including in-service
reliability monitoring.
In-service reliability of a product is dependent on the environmental and
operational conditions of the product in its field applications. Traditionally, reliability of
electronic products has been predicted without keeping in mind the actual environmental
and operational parameters in its life cycle environment. Current data collection schemes
for reliability assessment are often designed before in-service operational and
environmental aspects of the system are entirely understood. Further the allowable time
2
for reliability trials are reducing because of reduction in product development cycles,
thereby causing lack of suitable test data.
In summary, today’s organizations are faced with the challenge of maintaining
electronic product reliability with increased performance requirements, increased market
competition, and less allowable time-to-market. There is a continuous demand to lower
maintenance costs and to hasten operational readiness/responsiveness.
Health monitoring has emerged as a promising alternative to traditional reliability
prediction, scheduled maintenance, and run-to-failure operations. Health monitoring is
the method of monitoring product reliability in terms of its health in the life cycle
environment. Life consumption monitoring (LCM) is a health monitoring method to
assess product’s reliability based on its remaining life in a given life cycle environment.
The life consumption monitoring process involves continuous or periodic measurement,
sensing, recording, and interpretation of physical parameters associated with a product’s
life cycle environment to quantify the amount of degradation. This thesis describes the
concept and various aspects of life consumption monitoring along with two case studies.
1.1 Reliability Prediction of Electronics
Reliability is defined as the ability of a product to perform as intended (i.e.,
without failure and within specified performance limits) for a specified time, in its life
cycle application environment. An efficient reliability prediction can be used for
numerous purposes, including the following [1]
• Comparisons of the designs and products
• Methods to identify potential reliability improvement opportunities
• Logistics support
3
? Forecast warranty and life cycle costs
? Spare parts provisioning
? Availability
• Safety analysis
• Mission reliability estimation
• End item reliability estimation
• Prediction of reliability performance
IEEE standard 1413 [1], titled “Standard Methodology for Reliability Prediction
and Assessment for Electronic Systems and Equipment” presents the key parameters of
importance for reliability prediction include structural architecture, material properties,
fabrication and assembly processes, and the life cycle environment.
Defining and characterizing the product life cycle environment is often the most
uncertain input into a reliability prediction scheme. Product life cycle environment
typically includes storage, transportation, handling and application scenario of the
product. It also describes the expected severity and duration of the load conditions for
each scenario [2], [3]. Load conditions for an electronic product include temperature,
humidity, vibration or shock loads, contaminants, radiation levels, electromagnetic
interference and loads caused by operational parameters such as current, power and heat
dissipation. Life cycle environment characterization also requires knowledge of
parameters like the application length, the number of applications in the expected life of
the product, and the product utilization or non-utilization profile (storage, testing,
transportation).
4
The common practice of design has been to provide a safety margin, i.e.,
designing for a high product load and recommending operation at a lower value, due to
uncertainties regarding the actual life cycle loads for a product [4]. If the actual life cycle
loads are different from the designed ones, this design practice can lead to costly over
design or hazardous under design, and consequently, increased costs.
1.2 Health Monitoring
Health monitoring is one of the emerging and most promising developments in the
evolution of in-service reliability assessment and maintenance practices. A product’s
health is the extent of degradation or deviation from its “normal” operating state. Hence
health monitoring is based on the condition of the actual system or equipment concerned,
not on the statistical mean. By determining whether and, more importantly, when failure
can occur, procedures can be developed to mitigate, manage or maintain the product [5].
An efficient health monitoring scheme can be used to [6]:
• Reduce lost output penalties
• Reduce forced outage repair and labor costs
• Reduce spares holdings
• Reduce severity of failures
• Improve safety margins
• Reduce insurance premiums
• Extend maintenance cycles
• Maintain the effectiveness of equipment through timely repair actions
• Improve repair quality
• Increase profitability
5
Methods employed for health monitoring can include non-destructive tests (e.g.,
ultrasonic inspection, liquid penetrant inspection, and visual inspection) and operating
parameter monitoring (e.g., vibration monitoring, oil consumption monitoring and
thermography (infrared) monitoring) [7]. Predictive or prognostic health monitoring
methods involve monitoring of the life cycle environment of the product (e.g.,
temperature, humidity, shock, vibration, current, power, heat dissipation) to predict when
the product is going to fail in real life.
Health monitoring has been used for both electrical and mechanical systems for
reliability prediction and hence reduction in maintenance expenses. An example of the
monetary benefit of health monitoring was presented in the context of corrosion in a
workshop on condition-based maintenance (CBM) on November 17-18, 1998 in Atlanta,
organized by the Advanced Technology Program (ATP) of the National Institute of
Standards and Technology (NIST) [8]. It was stated that if corrosion could be measured
directly at a refinery plant, the downtime for maintenance could be reduced from every
year to potentially every 3 years. Since a typical maintenance period is two weeks to a
month (about 10% of the available operating time), an economic value can be assigned to
a reduction in downtime. However, offsetting maintenance costs is probably not the
primary economic driver for CBM. Instead, it should be looked at as an integral part of a
business strategy for profitability. In the case of CBM, it contributes to maximum up time
(capacity) with reduced operating costs.
There are some technical barriers for implementation of health monitoring, which
include the inability to continually monitor a system and accurately predict the remaining
useful life. Further the use of health monitoring in real life can be more appreciated if it
6
can help in learning and identifying impending failures for a system as well as
recommending an action.
1.3 Approaches for Health Monitoring
Health monitoring is the method of evaluating reliability in terms of product’s
health in its life cycle environment. Health monitoring methods can be broadly classified
into two categories, i.e., current condition monitoring and life consumption monitoring.
1.3.1 Current Condition Monitoring
Current condition monitoring is a method of evaluating the product’s operating
state in terms its physical degradation (e.g., cracks, corrosion, delamination), electrical
degradation (e.g., increase in resistance, increase in threshold voltage), and performance
degradation (e.g., shift of the product’s operating parameters from expected values). The
objective of condition monitoring (also called condition-based maintenance) is to
accurately detect the current state of electrical and mechanical systems and enable the
user to make a decision on whether to perform maintenance. Hence condition monitoring
is mainly a diagnostic activity. This helps to prevent operational deficiencies and failures,
eliminates costly periodic maintenance, and reduces the likelihood of machinery failures.
1.3.2 Life Consumption Monitoring
Life consumption monitoring (LCM) is a method to assess product’s reliability
based on its remaining life in a given life cycle environment. In life consumption
monitoring, product’s reliability can be assessed by comparing the remaining life of the
product with estimated total life. The life consumption monitoring process involves
continuous or periodic measurement, sensing, recording, and interpretation of physical
7
parameters associated with a system’s life cycle environment to quantify the amount of
system degradation.
1.4 Current State of Health Monitoring Research
Most of the work on health monitoring available in literature focuses on
diagnostic or condition monitoring of various mechanical (metals and composites)
structures. This is often sited as “structural health monitoring”. Typical methods used for
condition monitoring include
• Visual inspection
• Optical fibers [9], [10], [11]
• Eddy current [12]
• Acoustic emission [13]
• Vibration signatures and modal analysis [14], [15], [16]
• Piezoelectric materials [17], [18]
Several organizations, professional societies and universities are involved in
activities related to health monitoring. The following section gives a listing of some of
the leading groups involved in health monitoring research:
• Condition Monitoring and Diagnostic Engineering Management (COMADEM) [19]
- Consultancy program on different aspects of condition and diagnostic monitoring
focusing on proactive integrated maintenance management.
- Publishers of “International Journal of COMADEM” dedicated to sensor
technology, structural health monitoring and machinery/process health monitoring
• Society for Machinery Failure Prevention Technology (MFPT) is a professional
society focused on sensors technology, condition monitoring, predictive maintenance,
8
prognostics technology, condition based maintenance, nondestructive evaluation and
testing, life extension and integrated diagnostics in conjunction with the annual
meeting [20].
• The National Aeronautics and Space Administration (NASA)
- Health monitoring for aviation safety program (AvSP) which includes monitoring
fuel flow, rotor speeds, oil temperature/ pressure, engine vibration [21]
- Efficient checkout, testing and monitoring of space transportation vehicles,
subsystems and components before, during and after operation under the vehicle
health monitoring (VHM) program [22].
- Develops smart sensors for health monitoring, e.g., solenoid health monitor for
valve health monitoring and failure prediction.
• Office of Naval Research conducts research for on-board mechanical diagnostics and
vehicle health monitoring (integrated avionics) for improved operational effectiveness
of air vehicles with increased capability, range, speed, time-on-station, and carrier
suitability [23].
• Department of Defense (DoD) has a prognostics health management (PHM) program
for joint strike fighter (JSF) [24]
• QinetiQ, UK
- Vehicle health and usage monitoring system (vHUMS) program that uses various
sensor technologies, data analysis and reporting tools [25]
- Integrated engine management program using condition monitoring [26]
• UK Ministry of Defense (MoD) - Engine health monitoring systems [27]
9
• US Army Material Systems Analysis Activity (AMSAA) – Physics-of-failure
approach for reliability modeling [28]
• CALCE EPSC at University of Maryland conducts research on diagnostic and
prognostic health monitoring focusing on electronics [29]
• Pennsylvania State University Applied Research Laboratory (ARL) [30]
1.5 Health Monitoring Examples
An example of condition monitoring in mechanical systems is applied in the
ETOPS (Extended-range Twin-engine Operations) program. ETOPS restriction,
formalized under US FAA Regulations in 1953, prohibits passenger carrying aircraft with
only two engines from flying any route more that a given single-engine flying time from
a suitable and open landing site. Gradual relaxation in this rule has resulted from
improvements in health monitoring technologies, which now provide the continuous
monitoring of critical aircraft systems necessary to identify problems before they affect
aircraft operation or safety along with reductions in engine in-flight shutdown rates. The
ETOPS philosophy is a real-time approach to maintenance and includes continual
monitoring of application conditions to identify problems. Two typical examples of
ETOPS are engine condition monitoring (ECM) and oil consumption monitoring. ETOPS
operators are required to use ECM programs to monitor adverse trends in engine
performance and execute maintenance to avoid serious failures (e.g., those that could
cause in-flight shutdowns, diversions, or turnbacks). The ECM programs allow for
monitoring of engine parameters such as exhaust temperature, fuel and oil pressures, and
vibration. In some cases, oil consumption data and ECM data can be correlated to define
10
certain problems. Any engine deterioration that might affect ETOPS operations is
monitored through a disciplined data collection and analysis program [31].
Built-in-test (BIT) is a condition monitoring technique used for electronics that
uses hardware-software diagnostic mean to identify and locate faults. Two types of BIT
concepts are employed in electronic systems, interruptive BIT (I-BIT) and continuous
BIT (C-BIT). The concept of I-BIT is that normal equipment operation is suspended
during BIT operation. Such BITs are typically initiated by the operator or during a
power-up process. The concept of C-BIT is that equipment is monitored continuously and
automatically without affecting normal operation [32].
JDIS (Joint Distributed Information System), another example of health
monitoring scheme applied to mechanical systems. JDIS can anticipate maintenance and
repair needs to ensure that equipment and personnel are available precisely when needed
[33]. It can reduce the time taken to deliver aircraft replacement parts to be hours, as
compared to weeks or months under current practices. During a demonstration, Boeing
simulated how a network of computers and aircraft sensors can trigger an autonomic
response to a pending maintenance need under JDIS scheme. For instance, if a part
failure occurs or is predicted to occur, JDIS initiates a series of actions that can provide
the right information for the engineer about the replacement of parts at the right time.
This way, human interaction is minimized as data flows from the aircraft through the
maintenance infrastructure and ultimately to the supplier community.
General Motors’ research labs are using predictive equations for calculating
remaining oil based on monitoring engine usage over time [34]. Engine oil breaks down
as a function of time at temperature oxidation and engine usage related contamination.
11
On selected vehicles this algorithm is programmed into the engine control modules
(ECM) and keeps the driver aware of their oil life status and displays via the vehicle's
driver information center (DIC) display.
12
2 LIFE CONSUMPTION MONITORING METHODOLOGY FOR
ELECTRONICS
Life consumption monitoring (LCM) is a health monitoring method to assess
product’s reliability based on its remaining life in a given life cycle environment. The life
consumption monitoring process involves continuous or periodic measurement, sensing,
recording, and interpretation of physical parameters associated with a system’s life cycle
environment to quantify the amount of system degradation. This section explains a life
consumption monitoring methodology for electronics.
Life consumption monitoring methodology has six steps to estimate the remaining
life of an electronic product (Figure 2.1). These steps include failure modes, mechanisms
and effect analysis (FMMEA), virtual reliability assessment, monitoring of the critical
parameters of the product’s life cycle environment, simplification of the monitored data,
stress and damage accumulation analysis, and remaining life estimation. Each step will be
described in this section.
The life consumption monitoring methodology described in this thesis is an
improvement over the existing methodology developed by Ramakrishnan et al. [35] for
his Masters thesis. The existing methodology focused on estimation of accumulated
damage of solder joints for electronics. An assumption was made that temperature and
vibration are the dominant environmental parameters that can cause failure due to solder
joint fatigue.
The improved methodology has been extended and generalized to system level,
where there is a possibility of various other failure mechanisms. Failure modes,
mechanisms, and effects analysis (FMMEA) and virtual reliability assessment has been
13
included in the improved methodology to determine the dominant failure mechanism in a
given life cycle environment and the corresponding environmental and operational
parameters. Another step has been added to determine the remaining life of the product
based on the accumulated damage information.
Step 1: Conduct failure modes, mechanisms and effects analysis
Step 6: Estimate the remaining life of the product
Step 5: Perform stress and damage accumulation analysis
Continue
monitoring
Is the
remaining-life
acceptable?
No
Yes
Step 4: Conduct data simplification to make sensor data suitable for stress
and damage models
Schedule a maintenance action
Step 3: Monitor appropriate product parameters
environmental (e.g, shock, vibration, temperature, humidity)
operational (e.g., voltage, power, heat dissipation)
Step 2: Conduct a virtual reliability assessment to assess the failure
mechanisms with earliest time-to-failure
Step 1: Conduct failure modes, mechanisms and effects analysis
Step 6: Estimate the remaining life of the product
Step 5: Perform stress and damage accumulation analysis
Continue
monitoring
Is the
remaining-life
acceptable?
Is the
remaining-life
acceptable?
No
Yes
Step 4: Conduct data simplification to make sensor data suitable for stress
and damage models
Schedule a maintenance action
Step 3: Monitor appropriate product parameters
environmental (e.g, shock, vibration, temperature, humidity)
operational (e.g., voltage, power, heat dissipation)
Step 2: Conduct a virtual reliability assessment to assess the failure
mechanisms with earliest time-to-failure
Figure 2.1: Various steps in the life consumption monitoring approach
2.1 Failure Modes, Mechanisms and Effects Analysis
An electronic product is typically a combination of components and interconnects,
all having various failure mechanism by which they can fail in the life cycle applications.
The objective of the failure modes, mechanisms, and effects analysis (FMMEA) in life
consumption monitoring is to identify the failure mechanisms that can precipitate a
14
failure mode in the given environmental and operational conditions. Following sections
discuss about the steps involved in FMMEA.
2.1.1 Identification of Failure Modes and Corresponding Failure Sites
The failure modes, mechanisms, and effects analysis (FMMEA) starts with
identification of all possible failure modes and the corresponding failure sites. A failure
mode is defined by how a failure is observed. Hence it is closely related to the functional
and performance requirements of the product. Failure modes are identified by
determining what could possibly go wrong or how a product can fail to meet its
specifications. Typical failure modes for electronic products include electrical opens or
shorts, change in resistance, intermittent resistance change. Failure site defines the
location of failure, e.g., printed circuit board, plated through holes (PTH), components,
interconnects.
2.1.2 Identification of Failure Mechanisms and Models
The second step in FMMEA is to determine all possible failure mechanisms
followed by identification of corresponding failure models available. Failure models are
used to identify the environmental and operational parameters along with the product
geometry responsible for a specific failure mechanism. More details about various failure
models for electronic components and printed circuit boards can be found in literature
[36]-[41].
2.1.3 Identification of the Life Cycle Conditions
This step requires determination of the life cycle environment conditions for the
product. The life cycle environment of a product consists of the assembly, storage,
15
handling, and usage conditions of the product, including the severity and duration of
these conditions. Information on product usage conditions can be obtained from
environmental handbooks or data monitored in similar environments. Some times it may
be necessary to include the assembly, storage, handling and transportation conditions.
The life cycle conditions are compared with the inputs to the failure models in the next
step.
2.1.4 Selection of failure mechanisms that can precipitate a failure mode
Depending on the life cycle environment, particular failure mechanisms have the
potential to cause product failure. This step eliminates some of the failure mechanisms
based on inputs to failure models, life cycle environment conditions and product
geometry. For example, metallization corrosion models require high moisture content to
precipitate a failure mode. Hence if the moisture content is very low in a given
environment, failure due to metallization corrosion might be eliminated. The failure
mechanisms that cannot be eliminated are selected for analysis using virtual reliability
assessment.
2.2 Virtual Reliability Assessment
Virtual reliability assessment method is used to assess potential failure mechanisms
identified by the failure modes, mechanisms, and effects analysis (FMMEA). The
objective of this step to identify the dominant failure mechanisms and corresponding
environmental and operational parameters based on time-to-failures and identify the
environmental and operational parameters for monitoring.
16
2.2.1 Prioritization of the Failure Mechanisms Based on Time-to-failures
This step starts with estimation of time-to-failures based on the failure
mechanisms and models selected by FMMEA. The failure mechanisms are then ranked
based on the time-to-failures. Failure models typically require product geometry along
with life cycle environmental and operational parameters. At this stage of life
consumption monitoring the product geometry is available Life cycle environment
conditions for the analysis are taken from the sources identified during FMMEA. In cases
of new products, environmental handbooks are good sources of information.
2.2.2 Identification of The Dominant Failure Mechanisms
This step identifies the dominant failure mechanisms based on the time-to-failure
rankings. In principle, for a non-repairable unit the dominant failure mechanism is the
one by which the first failure is expected to occur. But in practice, more than one
dominant failure mechanism may need to be considered because of variability in
materials, manufacturing processes, and life cycle loads. Failure mechanisms with time-
to-failures less than the product life expectation are considered as candidate dominant
failure mechanisms. In other words, failure mechanisms with time-to-failure greater than
(20%, based on rule of thumb) the expected product life need not be considered. If
maintenance is available until end of product life expectation, all failure mechanisms
with time-to-failures below the product life expectation are considered as dominant
failure mechanisms. While choosing dominant failure mechanisms for complex systems,
tradeoff may be made based on cost, memory and processing capabilities.
17
2.3 Monitoring Appropriate Product Parameters
Monitoring product parameters involve measurement and monitoring of product
life cycle environment, which includes the environmental and operational parameters
identified by virtual reliability assessment. The life cycle environment of a product
consists of the assembly, storage, handling, and usage conditions of the product,
including the severity and duration of these conditions [2]. Specific life cycle loads on an
electronic product include environmental conditions such as temperature, humidity,
pressure, vibration or shock, radiation, contaminants, and loads due to electrical operating
conditions, such as current, power and heat dissipation. These loads can affect the
reliability of the product either individually or in combination with each other. Product
parameters can be monitored in a continuous or periodic manner using various sensors
mounted on or within the product. An ideal sensing device should be:
• Compatible with existing electronics (i.e., it should have minimal impact on the total
cost, performance, and reliability of the existing product)
• Accurate, have low response time, and self-correcting (e.g., having temperature
compensation) in operation
• Small, lightweight, and consuming system-independent and little power (preferably
self-powered)
• Easy to incorporate into the system
• Easily accessible for data acquisition, service, maintenance, and upgrades
Typical sensors include temperature sensors (thermocouples, thermistors, and
resistance thermo detector (RTD) sensors), humidity sensors, accelerometers, pressure
sensors (piezoelectric, MEMS) etc. The data measured by these sensors are recorded by a
18
data logger for further processing. The recording process needs specification of
parameters including sampling intervals for measurements, signal trigger values
1
.
2.4 Data Simplification Processes
Data simplification is the process of converting the raw sensor data into a form
suitable for the stress and damage models. Simplification of data is necessary since the
monitored data cannot be directly used with the stress and damage models in many cases.
For example, Engelmaier’s model for thermal fatigue of solder joints requires
temperature data in the form of temperature cycles and hence there is a need to convert
the temperature data from sensors to equivalent temperature cycles. The data
simplification process typically depends on the input requirements of the stress and
damage assessment model. Some examples of data simplification include
• Conversion of irregular temperature history into a regular sequence of peaks and
valleys for thermal fatigue analysis.
• Conversion of temperature reversals into relevant temperature cycle information.
• Conversion of acceleration data in time domain to power spectral density (PSD) in
frequency domain.
Data simplification process can also provide data reduction if necessary. A suitable
data reduction scheme is useful for analyzing large amounts of data by gain in computing
speed and reduction in memory requirements.
1
Some times data is recorded only if the value is more than a certain pre specified value. The pre specified
value is known as signal trigger value.
19
2.5 Stress and Damage Accumulation Analysis
Stress and damage accumulation analysis is used to estimate the accumulated
damage in the product based on simplified data. This step begins by creating numerical
models based on product geometry and material properties. For example, creation of
model for a circuit card assembly requires information on board material and dimensions,
component material, dimensions, and their respective orientations. Information on
product geometry is obtained from design specifications and manufacturer data sheets.
The numerical model is used to estimate the stress at individual failure sites based on life
cycle environment loads.
Based on the estimated stress values, the accumulated damage for the product in
its life cycle environment is estimated. Estimation of accumulated damage involves two
separate steps: 1) application of stress and damage models, 2) application of a suitable
damage accumulation theory.
2.5.1 Stress and Damage Models
The purpose of stress and damage models is to determine or predict the
occurrence of a specific wear out failure mechanism in a specific application. The
prediction process looks at each individual failure mechanism (such as solder joint
fatigue, electromigration, conductive filament formation, die cracking to name a few) to
estimate the probability of failure. This approach can be applied to electronic parts used
in military, space, telecommunication, industrial, automotive, aviation, and consumer
utilization applications, and is applicable to the entire life cycle of the product.
Selecting the proper damage model is the key to the accuracy of the stress and
damage accumulation analysis. Specific models for each individual failure mechanism
20
are available from a variety of reference books. These models can either be in the time
domain or in the frequency domain. Stress and damage models can be divided into two
major classes depending on the input requirements, i.e., models requiring cyclic inputs
and models requiring non-cyclic inputs.
Models requiring cyclic inputs are typically used to determine the fatigue life or
damage for a part. Examples of this class of models include Coffin-Manson’s model for
cyclic fatigue, Suhir’s model for die fracture, and Pecht and Lall’s model for wire fatigue.
The following equation shows Coffin-Manson’s model for thermal fatigue of solder
joints.
c
f
F
N
1
2
2
1
|
|
.
|
\
|
?
?
=
?
?
(2.1)
where N
F
is the number of cycles to failure, ?? is the cyclic strain, and c, ??
f
are material
constants. A cycle is identified “when a material remembers its prior deformation history
and changes its tangent stiffness to follow the original loading path”. Since the data
measured by a sensor is typically in the time domain (or in the frequency domain for
vibration sensors), cycle counting methods are used to transform the original history into
an equivalent cyclic history that can be directly incorporated into a fatigue damage
model.
Models requiring non-cyclic inputs usually require time varying value of the
independent variable as an input. Examples of this class of models include Black’s model
for electromigration, Kidson’s model for intermetallic formation, and Howard’s model
for metallization corrosion. The following equation shows Black’s model for
electromigration.
21
kT
m
E
e Aj t
2
F
?
=
(2.2)
where t
F
is the time to failure, j is the current density (the time varying independent
variable), T is the absolute temperature and A, E
m
, k are constants.
2.5.2 Damage Accumulation Theories
Failures can be classified in two types – overstress and wear out. Overstress
failures are catastrophic failures occurring due to single occurrence of a stress event that
can exceed the intrinsic strength of the material. On the other hand, failures due to
gradual accumulation of damage beyond the endurance limit of the material are known as
wear out mechanisms. In well-designed and high-quality hardware, the accumulated
damage should not exceed the damage threshold within the usage life of the product.
Failure models describing wear out failures are usually based on values of environmental
or operational variables.
Damage is defined as the extent of a system’s degradation or deviation from a
defect-free normal operating state. The basic postulate adopted by most fatigue engineers
is that operation at a given cyclic stress amplitude will produce fatigue damage in a
certain number of operation cycles. It is further postulated that the damage incurred is
permanent, and operation at several different stress amplitudes in sequence will result in
an accumulated damage equal to the sum of the damage accrued at each individual stress
level. When the total accumulated damage reaches a threshold level, fatigue failure
occurs. Many different damage models have been proposed to quantify damage caused
by operation at varying stress levels. The Palmgren-Miner cumulative damage theory or
the linear damage theory is the most common among these theories because of its
22
simplicity. This damage theory was proposed by Palmgren in 1924 and later developed
by Miner in 1945 [42].
According to the classic S-N curve, operation at constant stress amplitude S
produces complete damage in N cycles. Operation at the same stress amplitude (S) for
number of cycles smaller than N, will produce a fractional damage. In the same way,
operations over different stress levels S
i
result in different damage fractions D
i
. Failure is
predicted to occur when the sum of the damage fractions equal or exceed unity, i.e.,
1
1 3 2 1
? + + + + +
? i i
D D D D D L L (2.3)
The Palmgren-Miner hypothesis states that the damage fraction at any stress level
S
i
is linearly proportional to the ratio of the number of cycles of operation to the total
number of cycles that would produce failure at that stress level, thus,
i
i
i
N
n
D = (2.4)
Similarly, for damage models that estimate time-to-failure (e.g., Black’s model
for electromigration), damage is defined as
i
i
i
TTF
t
D = (2.5)
where t
i
is the time of operation and TTF
i
is the time-to-failure estimated by the model.
The Palmgren-Miner hypothesis is the most widely used model in industry,
mainly due to its simplicity. However the hypothesis does not recognize the influence of
the order of application of various stress levels. Damage is assumed to accumulate at the
same rate at a given stress level without regard to past history. In applying the Palmgren-
23
Miner rule to an irregular load history, care should be taken that cycles are defined in a
rational manner.
2.6 Estimation of Remaining Life
Remaining life estimation is the process of estimating the remaining life of the
product based on accumulated damage information. Sometimes, it is more useful to
quantify product degradation in terms of physical parameters (e.g., time in days, distance
in miles) than in terms of accumulated damage. Accumulated damage of the product is
combined with product usage history to estimate the remaining life. This process assumes
that there is no abnormality in product usage pattern in future. In other words, this step
converts the accumulated damage in the electronic product into an equivalent amount of
time that the product can continue to function before the start of wear out failures.
The remaining life estimation is updated regularly at the end of a pre selected time
period. Hence the remaining life estimation process can take in to account any sudden
change in the life cycle environment or usage of the product. The time interval between
two updates is decided based on the product usage and its estimated lifetime based on
virtual reliability analysis. Sometimes the safety level associated with the product can
play an important role in determining the time interval.
2.7 Acceptable Remaining Life
The life consumption monitoring methodology described above concludes with an
estimation of the useful remaining life of the product. At this point, the user is required to
decide whether to keep the product in operation and continue monitoring or to abandon
the mission and schedule a maintenance action. The choice of acceptable amount of
remaining life depends on a variety of factors, such as the user’s application and the
24
safety level associated with it. For example, if the application is known to be fairly
reliable with multiple redundancies, a higher limit of acceptable remaining life may be
chosen but if the application involves human participation or may compromise the safety
of personnel, a lower acceptable limit of remaining life may is required.
25
3 EXPERIMENTAL CASE STUDIES ON LIFE CONSUMPTION
MONITORING
The life consumption monitoring methodology described in chapter 2 was
demonstrated in a real-time environment through two case studies. The life cycle
environment for the case studies was chosen to be the underhood of an automobile. The
case studies involved the following steps:
1. Mounting test boards under the hood of a car (1997 Toyota 4Runner) (Figure 3.1)
2. Conducting failure modes and effects analysis (FMEA) and virtual reliability
assessment to determine and assess the dominant failure mechanisms and the
corresponding environmental parameters.
3. Monitoring the underhood thermal, shock and vibration environment of the test board
in the car.
4. Simplifying the monitored environment and performing a physics-of-failure-based
reliability analysis to estimate the life consumption for the test board.
5. Monitoring resistance of the solder joints in real time to find out the actual life.
6. Comparing the estimated and the actual life results.
The test board was a FR-4 printed circuit board (PCB) consisting of eight surface
mount leadless inductors manufactured by ACI AppliCAD Inc. The inductors were
soldered to the PCB with Pb-Sn eutectic solder. The board was bolted at its two corners
to an aluminum bracket, which made the board act like a cantilever to vibrations
2
.
2
Cantilever mounting was designed in order to accelerate the effect of road vibration and was not planned
26
FR-4 PCB with 8 surface mount
inductors
Clamping points
Figure 3.1: Experimental setup with the test board mounted under-the-hood of a
car (1997 Toyota 4Runner).
3.1 Failure Modes, Mechanisms and Effects Analysis
Failure modes, mechanisms, and effects analysis (FMMEA) was conducted for
the test board assembly to assess all possible failure modes and mechanisms in the
automobile underhood environment. Environmental and operational parameters of
interest were identified based on inputs to the available failure models. When a failure
model is not available for a particular failure mechanism, environmental and operational
parameters were identified based on prior experience and literature. Identified
to be representative of time-to-failure in automobile electronic modules.
27
environmental and operational requirements were compared with the existing loading
conditions in the underhood environment to determine the potential failure mechanisms.
The potential failure mechanisms were used for analysis by virtual reliability assessment.
Table 3.1 shows the failure modes, mechanisms, and effects analysis for the circuit card
assembly. Plated through hole (PTH) fatigue, conductive filament formation (CFF),
electromigration, metallization corrosion, and solder joint fatigue were identified as the
potential failure mechanisms by this analysis.
3.2 Virtual Reliability Assessment
Virtual reliability assessment was conducted to assess the time-to-failure using the
failure mechanisms and models identified by failure modes, mechanisms, and effects
analysis (FMMEA) including plated through hole fatigue, conductive filament formation,
electromigration, metallization corrosion, and solder joint fatigue. Information about
product dimensions and geometry were obtained from design specification, board layout
drawing and component manufacturer data sheets. Environmental data for analysis
including temperature, vibration and humidity were obtained from the Society of
Automotive Engineers (SAE) environmental handbook and Washington DC area weather
reports. Figure 3.2 shows the average power spectral density (PSD) plot for the vibration
on a car frame from SAE handbook [43]. The car was assumed to run average 3 hours per
day. Table 3.2 shows the temperature data used for defining the underhood environment.
The maximum relative humidity for the underhood environment was 98 % at 38
o
C [43].
Humidity conditions were used to estimate time-to-failure for corrosion and conductive
filament formation.
28
Table 3.1: Failure Modes and Effects Analysis (FMEA) for the circuit card assembly
used for the experiment.
Item name/
failure site
Failure
mode
Failure
effect
Failure
mechanism
Failure
model
Cause of
failure
Comments
Electrical
open in
PTH
Change in
resistance of
PCB
assembly
PTH fatigue
CALCE
PTH
barrel
thermal
fatigue
model
Temperatu
re cycling
Virtual
reliability
assessment
required
Electrical
short
between
PTHs
No current
flow
through
components
Conductive
filament
formation
(CFF)
Rudra and
Pecht
model
Voltage,
high RH,
and tighter
PTH
spacing
Virtual
reliability
assessment
required
Electro-
migration
Black's
model
High
current
density
and
temperatur
e
Virtual
reliability
assessment
required
Change in
resistance of
PCB
assembly
Corrosion in
metallizatio
n traces
Howard's
model
High RH,
electrical
bias, ionic
contamina
tion
Virtual
reliability
assessment
required
Electrical
short/
open,
change in
resistance
in the
metallizati
on traces
Open EOS/ ESD
No model
available
for EOS/
ESD of
board
metallizati
on traces
Discharge
of high
potential
through
dielectric
material
Too high
conductor
spacing (in
the order of
centimeters)
to cause
EOS/ESD
Printed
circuit board
(PCB)
Fracture in
the PCB
Crack/
breaking of
PCB
Buckling
Overstress
failure
dependent
on critical
load
Compressi
ve loads
on the
PCB
No
compressive
loads applied
to the PCB
29
Item name/
failure site
Failure
mode
Failure
effect
Failure
mechanism
Failure
model
Cause of
failure
Comments
Short
between
windings
Change in
inductance
of PCB
assembly
Wearout of
winding
insulation
No model
available
for
Inductors
Overheati
ng due to
excessive
current
and
prolonged
use at high
temperatur
e
Short
between
windings
and the
core
Change in
resistance of
PCB
assembly
Wearout of
winding
insulation
No model
available
for
Inductors
Overheati
ng due to
excessive
current
and
prolonged
use at high
temperatur
es
Components
(Inductors)
Open
circuit
inside the
inductor
No current
flow
through
PCB
assembly
Breaking of
winding
No model
available
for
Inductors
Prolonged
use at high
temperatur
es
Maximum
operating
temperature is
low compared
to rated
temperature of
the inductors
(125 C).
Current
passing
through the
inductors (~50
mA) is much
below the
maximum
rated current
(9 Amps)
Solder joints
Intermitte
nt change
in
electrical
resistance
Intermittent
malfunctioni
ng of PCB
assembly
Solder joint
fatigue
Engelmaie
r's thermal
fatigue/
Steinberg'
s vibration
Temperatu
re cycling
and
vibration
Virtual
reliability
assessment
required
30
1.E-06
1.E-05
1.E-04
1.E-03
1.E-02
1.E-01
1 10 100 1000 10000
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure 3.2: The power spectral density plot used for the virtual reliability
assessment
Table 3.2: Data used for defining the temperature environment for the virtual
reliability assessment
Maximum under hood temperature (near the frame)
121
o
C
Average daily maximum temperature [44]
27
o
C
Average daily minimum temperature [44]
16
o
C
Table 3.3 shows the time-to-failures for different failure mechanisms obtained from
virtual reliability assessment. It is clear from the table that solder joint fatigue is the
dominant mechanism in the given life cycle environment. The environmental factors that
can cause solder joint fatigue were found to include temperature cycling and vibration.
The virtual reliability assessment predicted. Virtual reliability assessment predicted 34
days to failure based on solder joint fatigue.
31
Table 3.3: Virtual reliability assessment
Failure
mechanism
Failure model
Time-to-
failure
Probability of
failure
Plated through hole
(PTH) fatigue
CALCE PTH barrel thermal
fatigue model (calcePWA)
> 10 years Low
Conductive
filament formation
(CFF)
Rudra and Pecht model
(calceFAST)
4.6 years Low
Electromigration Black’s model (calceFAST) >10 years Low
Corrosion in board
metallization traces
Howard’s model 1 year
3
Low
Solder joint fatigue
Engelmaier’s thermal fatigue
and Steinberg’s vibration
model (calcePWA)
34 days High
A monitoring and data simplification schemes were developed for monitoring and
analyzing the automobile underhood environment for solder joint fatigue analysis. The
electrical indications of failure in case of solder joint fatigue are characteristically
intermittent because the fractured solder joint surfaces do not separate physically as long
as the component is attached to the substrate by other solder joints [45]. For the case
study, product malfunction was monitored through intermittent change in resistances,
which is consistent with characteristics of solder joint failure.
3
Time-to-failure was obtained in the worst case conditions with the presence of an electrolyte. The actual
time-to-failure will be much higher than one year.
32
3.3 Monitoring Product Parameters
A battery powered data logging device equipped with an internal tri-axial
accelerometer and an integrated temperature and humidity sensor (Figure 3.3) was used
to record the shock, vibration and temperature environment of the test board assembly at
programmed intervals [46]. The data logger is capable of recording both static and
dynamic type of data. Static data can be completely characterized by a single reading,
where as dynamic data varies rapidly with time. In addition to integrated sensors, the
data logger also provides extra static and dynamic channels for connection to external
sensors. For the experiment temperature was monitored as static data and vibration was
monitored as dynamic data. The recorder’s memory (8 MB) was divided into two
partitions:
Time-triggered: This allows the user to specify the minimum time interval to elapse
before recording begins. This feature is primarily for measuring static data.
•
Signal-triggered: This allows the user to specify the minimum value of a parameter
(i.e., signal trigger) that should be exceeded before the recording begins. An event is
defined when the measured parameter exceeds the signal trigger value and a
predefined number of samples are recorded. This feature is primarily for measuring
dynamic data.
•
For the experimental set up, the data logger was programmed in such a way that it
required the specification of the following parameters:
• Time interval between temperature measurements
• Sampling rate for shock and vibration (i.e., the number of samples counted per
second)
• Sample size for shock and vibration (i.e., the number of samples per event)
33
• Signal trigger values
• Filter frequency for vibration
Temperature sensor
Vibration
sensor
Channel 4
T/H channel
Temperature sensor
Vibration
sensor
Channel 4
T/H channel
Figure 3.3: The data logger with external temperature and vibration sensor
The data recorder could not be placed directly under the hood of a car because its
internal sensors had a maximum rated temperature of 55 °C. Further its size and weight
precluded its installation anywhere under-the-hood. Hence external temperature and
vibration sensors were used for monitoring the underhood environment.
An external RTD (resistance thermo-detector) temperature sensor was taped on
the test board with the help of high temperature resistant tape to monitor the temperature.
The sampling interval for temperature measurement was chosen to be one minute so that
the data recording device can capture even quick temperature changes arising due to
starting of the engine.
34
A piezoelectric accelerometer was mounted on one of the clamping points of the
test board to monitor vibration input. The vibration of the circuit card assembly for this
experimental setup was mainly due to the road and driving condition and hence could be
categorized as random vibration. A random vibration signal typically consists of several
frequencies. The range of frequencies that can be accurately monitored is dependent on
the chosen sampling scheme (i.e., sample rate, sample size, filter frequency).
3.3.1 Sampling Issues for Monitoring of Continuous Signals
Most signals behave as continuous phenomena over their period of acquisition,
and provide a history of the parameter being measured as a function of time. Sampling is
the process of obtaining a series of discrete numerical values from a continuous function.
There are electronic circuits associated with sensors that observe an instantaneous source
signal at regular intervals and convert it into an electrical signal with a numerical value
analogous to the source signal.
Signals are made suitable for digital processing by converting their sampled value
into an equivalent numerical value by an analog-to-digital converter. Each value is
represented by a finite number of on/off states of electronic elements. The converter has
an input-output relationship, which is a series of steps. The size of the steps depends on
the total range to be covered and the number of steps available.
The most important consideration in sampling is the selection of the sampling
interval. Sampling points that are too close together will yield redundant data, while
sampling points that are too far apart will lead to confusion between the low and high
frequency components of the signal. For example, consider the time record shown in
Figure 3.4. Let the record be sampled such that the time interval between adjacent
35
sampling points is h seconds. The sampling rate is hence 1/h samples per second. Since
at least two sample points are required to define a cycle of given frequency (i.e., one
point each for the start and the end of a cycle), the number of cycles per second (or the
frequency of sampling) is 1/2h. Thus, the highest frequency component that can be
defined by sampling at the rate of 1/h samples per second is 1/2h. This cutoff frequency
f
C
(equal to 1/2h) is called the “Nyquist frequency”, and the corresponding time between
samples h is called the “Nyquist interval”.
Time
x(t)
h
Time
x(t)
h
Figure 3.4: Sampling of a continuous time record
Any frequency f above f
C
contained in the signal will be superimposed or
“folded” back into the frequency range from 0 to f
C
and be confused with data in the low-
frequency range. This problem, which is called aliasing, is a potential source of error in
sampling. For any frequency f in the range 0 ? f ? f
C
, the frequencies that will be aliased
with f are 2nf
C
± f, where n is a natural number from 1 to N. To prove this, consider a
sampling interval t = 1/2f
C
. Then,
( ) ( ) ?ft
f
?f
f
?f
n?
f
f nf ? t f nf ?
C C C
C C
2 cos
2
1
2 cos
2
2
2 cos
2
1
2 2 cos 2 2 cos = =
|
|
.
|
\
|
± = ± = ±
(3.1)
36
Thus, all data at frequencies (2nf
C
± f) have the same cosine function amplitude as
the data at frequency f when sampled at times 1/2f
C
apart, and hence all data at the higher
frequencies will be aliased (or superimposed) on data at frequency f. For example, data
at frequencies 170 Hz, 230 Hz, 370 Hz, 430 Hz, and so on will be aliased with data at 30
Hz if f
C
= 100 Hz. Hence, the sampling interval h should be carefully chosen to prevent
aliasing.
Two methods are available to prevent aliasing: 1) to choose h sufficiently small so
that it is physically unreasonable for data to exist beyond the cutoff frequency f
C
or 2) to
filter the original data prior to sampling so that information beyond a maximum
frequency is no longer contained in the filtered data. The second method, which involves
the use of a low-pass filter circuit that only allows passage of frequencies below the
frequency of the filter, is often preferred over the first method to save on computing time
and costs [47]. Hence a low-pass filter was used in this thesis to screen out all frequencies
above the Nyquist frequency to prevent aliasing.
In order to select the vibration analysis range, the data recorder was first
programmed to collect data at its maximum sampling rate
4
, and the resulting frequency
spectrum was analyzed. It was observed that the power spectral density (PSD) of the
vibrations was concentrated below 400 Hz and the PSD was negligible above 600 Hz
(below 10
-6
g
2
/Hz.). Hence, the frequency analysis range was conservatively selected to
be 1 to 700 Hz. Accordingly the sampling rate and sample size were selected to be 1800
4
This was done to find the frequency range of interest and to ensure that any high-frequency vibrations
would not be eliminated in the actual experiment.
37
samples/second and 1024 samples. The anti-alising filter frequency was chosen to be 700
Hz.
3.4 Data Simplification
Almost all monitoring systems use sensors to measure various loads present in a
product’s life cycle environment. Sensors mounted either near or within the product to
be monitored provide electrical output signal in response to a specified measurand. Most
signals from sensors behave as continuous phenomena over their period of acquisition,
and provide a time history of the parameter being measured. This data in time domain
cannot be directly used with the physics-of-failure models. This section describes the
method to make temperature and vibration data compatible to the physics-of-failure
models.
3.4.1 Temperature Data Simplification
Temperature based damage estimation models require temperature data in terms
of cycles. The physics-of-failure definition of cycles includes the maximum, minimum
temperature, ramp times and dwell times at maximum and minimum. Rainflow cycle
counting algorithms are usually used to identify the cycles based on a given loading
profile [48]. The temperature based reliability assessment models require cycle
information that includes cycle maximum, minimum temperature, dwell and ramp times.
The 3-parameter rainflow cycle counting method can identify cycles in a manner
consistent with the PoF definition of cycles [49], [50]. The input to the 3-parameter
rainflow cycle counting algorithm is a time history consisting of several reversals
5
.
5
A reversal is defined as a point where the first derivative changes its sign, i. e., a peak or a valley.
38
3.4.1.1 Ordered Overall Range (OOR) method
The data is first converted to a sequence of reversals using the ordered overall
range (OOR) method. The OOR method allows the user to convert an irregular history in
time domain into a regular sequence of peaks and valleys. The OOR method can be
described as follows [51]
• The largest peak
6
of the temperature history is selected as the first candidate
• The next valley that differs from the largest peak by more than a cut off level is
selected as the tentative candidate.
• The cut off level is defined as a fraction of the difference between the largest peak
and the lowest valley. The fraction is known as the reversal elimination index (s).
• Peaks are checked to see if they differ from the new candidate by more than the
screening level (event ‘x’), and valleys are checked to see if they are lower than the
candidate (event ‘y’). If event ‘y’ occurs first (i.e., before event ‘x’), then the
candidate is rejected and the new valley becomes a candidate. If event ‘x’ occurs
first, the candidate is validated and the newly found peak becomes the next candidate.
• The next peak that differs from the new candidate by more than a cut off level is
selected as the tentative candidate.
• Valleys are checked to see if they differ from the candidate by more than the
screening level (event ‘x’), and peaks are checked to see if they are higher than the
candidate (event ‘y’). If event ‘y’ occurs first (before event ‘x’), then the candidate is
6
The algorithm requires selection of either of the extreme reversals in the history (either the largest peak or
the smallest valley) as the first candidate. In this article the algorithm is explained taking the largest peak as
the first candidate.
39
rejected and the new peak becomes a candidate. If event ‘x’ occurs first, the
candidate is validated and the newly found valley becomes the next candidate.
• This process continues until the last reversal is counted.
• Since the counting process starts from the largest peak (which may not be the first
reversal in the history), the method has to be applied to both sides of the starting
reversal to take the entire load history into account.
When the reversal elimination index is equal to zero, all the reversals in the load
sequence are preserved. The OOR algorithm has the capability of eliminating some of the
temperature reversals, which are potentially less damaging, there by achieving data
reduction. Data reduction can be achieved by specifying a non-zero reversal elimination
index. Figure 3.5 explains the underlying concept of the OOR algorithm pictorially. In
the figure, the highlighted points are selected only if L s l × ? .
Time
T
e
m
p
e
r
a
t
u
r
e
Highest peak
Lowest valley
L
l
Time
T
e
m
p
e
r
a
t
u
r
e
Highest peak
Lowest valley
L
l
Figure 3.5: Temperature history showing the reversals (i.e., peaks and valleys)
40
3.4.1.2 Cycle Counting
Cycle counting methods [48] are used to transform a time history consisting of
several reversals (peaks and valleys) into an equivalent cyclic history. Cycle counting
methods are used when a fatigue analysis needs to be performed.
The physical interpretation of a cycle is a condition when the applied load returns
the material to the state it was before the load excursion occurred. If the applied load is
of a mechanical nature (such as force or torque), the material forms a closed stress-strain
hysteresis loop when this condition is satisfied. For a repeatedly applied load history, the
following two rules apply:
• When the load reaches a value at which loading was previously in the reverse
direction, a stress-strain hysteresis loop is closed, defining a cycle. The stress-strain
path beyond this point is the same as if the loading had not been reversed.
• Once a load sequence forms a closed loop, this sequence does not affect the
subsequent behavior.
For the load history shown in Figure 3.6, the first rule is invoked at points 2', 7',
5', and 1'. The first rule is also satisfied just beyond 5', where the load reaches the same
value it had at point 3. But the second rule also applies, and since excursion 2-3-2' has
already formed a cycle, there is no additional closed cycle.
41
Load
T
i
m
e
½cycle
1
2
3
4
5
6
7
0
2'
8
7'
1'
½cycle
1 cycle
1 cycle
5'
1 cycle
½cycle
½cycle
Load
T
i
m
e
½cycle
1
2
3
4
5
6
7
0
2'
8
7'
1'
½cycle
1 cycle
1 cycle
5'
1 cycle
½cycle
½cycle
Figure 3.6: Identifying cycles in a load history
For non-repeating and open-ended load histories, the rules stated above are
incomplete if the absolute value of the load at any point during the history exceeds its
value at the first peak. Of the various cycle counting methods available (peak counting,
simple range counting, peak-between mean counting, level crossing counting, fatigue
meter counting, range-pair counting, and rainflow counting), only the rainflow and the
range-pair counting methods are capable of handling this more general situation (of non-
repeating histories). However, no damage is calculated for some parts of the original
history if the range pair method is used, whereas the rainflow method accounts for every
part of the history. Hence, the rainflow method was used in this thesis for counting
cycles.
42
In the rainflow cycle counting method, the load-time history is plotted in such a
way that the time axis is vertically downward, and the lines connecting the load peaks are
imagined to be a series of sloping roofs. The rain flow is initiated by placing drops
successively at the inside of each reversal. The method considers cycles as closed
hysteresis loops formed during a history, which is consistent with the definition of a cycle
described in the previous section. Following rules are applied on the rain dripping down
the roofs to identify cycles and half cycles:
• The rain is allowed to flow on the roof and drip down to the next slope except that, if
it initiates at a valley, it must be terminated when it comes opposite a valley equal to
or more negative than the valley from which it initiated. For example, in Figure 3.7,
the flow begins at valley 1 and stops opposite valley 9, valley 9 being more negative
than valley 1. A half cycle is thus defined between valley 1 and peak 8.
• Similarly, if the rain flow is initiated at a peak, it must be terminated when it comes
opposite a peak equal to or more positive than the peak from which it initiated. In
Figure 3.7, the flow begins from peak 2 and stops opposite peak 4, peak 4 being more
positive than peak 2. A half cycle is thus counted between peak 2 and valley 3.
• The rain flow must also stop if it meets rain from a roof above. In Figure 3.7, the
flow beginning at valley 3 ends beneath peak 2. This ensures that every part of the
load history is counted once and only once.
• Cycles are counted when a counted range can be paired with a subsequent range of
equal magnitude in the opposite direction. If cycles are to be counted over the
duration of a profile that is to be repeated block by block, cycle counting should be
started by initiating the first raindrop either at the most negative valley or at the most
43
positive peak, and continuing until all cycles in one block are counted in sequence.
This ensures that a complete cycle will be counted between the most positive peak
and the most negative valley.
The simple rainflow method does not provide any information about the mean
load or the cycle time. A modified method called 3-parameter rainflow cycle counting is
used to handle this situation. This method accepts a sequence of successive differences
between peak and valley values (P/V ranges) in the time history as an input, and
determines the range of the cycle, the mean of the cycle, and the cycle time. The
modified method identifies cycles as follows [50]: Consider three successive P/V
differences d
1
, d
2
, and d
3
, as shown in Figure 3.8. A cycle is identified only if the
following condition is true:
1 2
d d d > ?
3
(3.2)
The condition of the above equation is called the ‘loop condition.’ For a given
sequence of P/V ranges, if the loop condition exists, the method picks the loop
corresponding to size d
2
off the cycle, leaving only the residual wave 1-2-4-5,
corresponding to a half-loop size (d
3
-d
2
+d
1
) in the load plot. This operation is called
‘loop-reaping.’
44
1
2
3
4
5
6
8
9
10
12
11
13
14
15
16
17
18
19
21
23
24
25
27
28
26
29
30
20
22
Load
T
i
m
e
7
Counting
terminated
1
2
3
4
5
6
8
9
10
12
11
13
14
15
16
17
18
19
21
23
24
25
27
28
26
29
30
20
22
Load
T
i
m
e
7
Counting
terminated
Figure 3.7: Rainflow cycle counting
45
Load
T
i
m
e
d
1
d
2
d
3
1
2
3
4
5
Load
T
i
m
e
d
1
d
2
d
3
1
2
3
4
5
Figure 3.8: Loop condition and loop reaping operations
For a given sequence of P/V ranges, the 3-parameter rainflow method “reaps” the
smaller cycles that occur during a larger cycle. The range, mean, and half-cycle time of
the residual half cycle is adjusted according to the loop-reaping condition, and the
process is applied until the last P/V range is read.
Solder joint fatigue models are based on total possible thermal expansion
mismatch and creep at extreme temperature [45]. Stress relaxation in solder joints
(viscoplastic materials) is a time dependent creep phenomenon and requires sufficient
dwell time at the extreme temperature. However, once the stress relaxation is complete
for a given cycle, there is no more damage in the solder joints due to creep in that cycle.
The damage due to the total thermal expansion mismatch remains the same as long as the
extreme temperatures are the same. Hence the data simplification algorithm should
provide enough dwell time at the extreme temperatures for a conservative estimation. To
account for this, the data simplification algorithm assumes one fourth of the half cycle
46
time at temperature extremes as the dwell time, where half cycle time is defined as the
time for temperature transition from valley to peak or peak to valley [49].
3.4.2 Vibration Data Simplification
Vibration data is typically measured as acceleration with the help of
accelerometers. In general the data collected from accelerometers represent random
vibration (i.e., it cannot be described by an explicit mathematical relationship). The
physics-of-failure based reliability assessment models require the random vibration data
to be described in terms of its power spectral density (PSD). The PSD describes the
frequency composition of the vibration in terms of its mean square value over a
frequency range. Fourier Transform analysis is used to transform the acceleration data
from time domain to the frequency domain, and vice versa. The result of the Fourier
Transform analysis is usually a plot of amplitude/ power as a function of frequency. In
real life collected data is sampled and not continuous. Hence Fast Fourier Transform
(FFT) is employed to analyze discrete (or sampled) data. The power spectral density was
calculated from the sampled data (i.e., the type of data recorded by the data logger) using
the Cooley-Tukey method, which is based on fast fourier transform (FFT) of the original
sampled acceleration data [47]. For a sequence of acceleration values h
k
sampled over a
record length T, the Cooley-Tukey method defines the PSD function at any frequency f
as
2
k
X
N
2h
G(f) = (3.3)
where X
k
are the FFT components of the N sampled acceleration values of amplitude h
k
averaged over the record length T. The expression for X
k
is given by
47
?
?
=
=
1 N
0 k
n
kN 2
i
k k
e h

?
(3.4)
where N is the sample size. The independent variable n can be related to the frequency by
the relation f
n
= (n/Nh), where h is the sampling interval between adjacent points and T is
the total record length.
For this thesis the acceleration data from the piezoelectric accelerometers were
converted to respective PSD using the SAVER PSD analysis software [46].
3.5 Stress and Damage Accumulation Analysis
The objective of a physics-of-failure stress and damage accumulation analysis in
life consumption monitoring is to determine the accumulated damage due to various
failure mechanisms for the electronic product in the given environment.
For this thesis, the circuit card assembly was modeled in calcePWA
7
reliability
assessment software. The software creates a finite element model of the circuit board
assembly based on the various material properties, board dimension, component type,
component dimensions and their respective orientations. Material properties for the
model were taken from calcePWA material database
8
. Component dimensions were
obtained from the respective part data sheet from Vishay Dale. Board dimensions and the
component orientations were taken from the board layout drawing.
7
calcePWA is a physics-of-failure based virtual reliability assessment tool for circuit card assemblies
developed by CALCE Electronic Products and Systems Center, University of Maryland, College Park. The
software makes use of numerical analysis and failure mechanisms) to estimate time-to-failure.
8
calcePWA software has material property database for a number of materials commonly used in the
electronic industry.
48
Once a model of the board is created, the software uses the environmental data to
estimate the stress near each potential failure site (solder joint in this case) using the finite
element model. For determining the stress under each solder joint it is important to
provide the correct boundary condition to the software. The boundary conditions of the
test board were defined as follows:
• Temperature on the test board was considered to be uniform with conduction and
natural convection. This is a reasonable assumption as there is no power generation
by the components.
• For vibration analysis two corners of the board were modeled as clamped supports.
Boundary conditions assumed for the vibration analysis were verified by
comparing the experimentally evaluated natural frequencies of the board with the
modeling results. The modeling predicted natural frequencies of the circuit card assembly
as 25.7 Hz, 77.5 Hz, 145.4 Hz, and 274.8 Hz. To check this, another external
accelerometer was mounted on the PCB. The ratio of the response of the PCB to the
excitation given to the PCB at various frequencies was plotted against the frequency.
Peaks of this plot give the experimental natural frequencies. Figure 3.9 shows that the
natural frequencies occur at 19.3 Hz, 79.1 Hz, 149 Hz, and 262 Hz., which is in close
agreement with the modeling prediction.
49
1.E-01
1.E+00
1.E+01
1.E+02
1.E+03
1 10 100 1000
Frequency (Hz.)
P
S
D
R
e
s
p
o
n
s
e
/
P
S
D
E
x
c
i
t
a
t
i
o
n
19.3 Hz.
79.1 Hz.
149 Hz.
262 Hz.
1.E-01
1.E+00
1.E+01
1.E+02
1.E+03
1 10 100 1000
Frequency (Hz.)
P
S
D
R
e
s
p
o
n
s
e
/
P
S
D
E
x
c
i
t
a
t
i
o
n
19.3 Hz.
79.1 Hz.
149 Hz.
262 Hz.
Figure 3.9: Ratio of the response of the PCB to the excitation vs. frequency.
The peaks in this plot identify the natural frequencies.
The computed stress is then used to estimate the damage fraction based on
physics-of-failure models for selected failure mechanisms. The basis of damage
assessment is that operation of the product at a given stress amplitude will produce some
amount of damage, the magnitude of which will be related to the total time of operation
at that stress amplitude and the total time that would be required to produce failure of an
undamaged part at that stress amplitude. When the total accumulated damage reaches a
critical level, failure is predicted to occur. For this thesis, solder joint failure models for
temperature cycling, shock and vibration were used to estimate the accumulated damage.
The software used Engelmaier’s first order model for thermal fatigue and Steinberg’s
equation for vibration induced fatigue on the solder joints. More details about the damage
models can be found in the appendix.
50
3.6 Remaining Life Assessment
Remaining life estimation step calculates the useful life of the product (e.g., the
time in days, distance in miles) through which the product can function reliably, based on
the damage accumulation information. The remaining life was calculated on a daily basis
by subtracting the life consumed on that day from the estimated remaining life on the
previous day. This approach used an iterative formula to find out the remaining life [52].
1 1
*
? ?
? =
N N N N
TL D RL RL
(3.5)
where RL
N
is the remaining life at the end of day N, TL
N
is estimated total life at the end
of day N and D
N
is the damage ratio accumulated for day N.
3.7 Failure Definition and Detection
Solder joint failure due to fatigue is defined as the complete fracture through the
cross section of the solder joint with solder joint parts having no adhesion to each other.
A solder joint that fails fully by fracturing does not necessarily exhibit an electrical open
or even a very noticeable increase in electrical resistance. Electrically, the solder joint
failure manifests itself only during thermal and mechanical transients or disturbances in
the form of short duration resistance spikes. The thermal and vibration fatigue models
used for the analysis are also based on intermittent resistance spikes, i.e., interruption of
electrical discontinuity for small periods of time (more than 1 µs) [45].
For the case studies in the thesis, the functional degradation of the circuit card
assembly was monitored experimentally in terms of resistance change of the solder joints.
An event detector circuit was connected in series with all the components and solder
joints to indicate intermittent resistance increase. The event detector was connected to
51
the data logger to record the time when there was an increase in resistance. The
intermittent increases in resistances were termed as “resistance spikes.” Resistance spikes
for the experiment were defined to be intermittent increase in resistance by 100 ohms for
each solder joint [54], [55]. Failure was defined as occurrence of fifteen such resistance
spikes.
The event detector circuit sends a continuous direct current signal through the
daisy-chained circuit containing the inductors and the solder joints in series. Since the
inductors offer zero resistance to the direct current, the resistance of the daisy-chained
circuit is dependent on the resistance of the solder joints. The resistance offered to the
direct current was compared with the preset value (100 ohms increase for each solder
joint). The comparison results were logged at the end of every second. This time interval
was limited by the capability of the data logger.
To determine change in resistance of the individual components and solder joints,
the resistances were measured during the experiment on a regular basis with the car
engine off.
3.8 Monitored Environment and Results
This section describes the monitored environmental parameters (e.g., temperature,
shock and vibration) for the test board assembly for the case studies. The monitored data
was simplified and used with the physics-of-failure models to estimate the remaining life
of the test board assembly.
3.8.1 Case Study-I
The temperature sensor used for the case study was Kele’s Model STR-91S two-
wire strap-on RTD sensor with a temperature range up to 200
o
C. A single axis
52
piezoelectric (Endevco’s model 2226C) accelerometer was mounted on one of the
clamping points of the test board to measure the out-of-plane acceleration for the board.
Figure 3.1 shows the experimental setup for case study-I.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40
Time (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.10: Monitored temperature during case study-I
The temperature on the test board in the underhood environment was monitored
for a period of 42 days. Figure 3.10 shows the monitored temperature for the experiment
period. Each data point is separated at an interval of one minute. The average temperature
and the maximum temperature seen by the test board assembly during case study-I are 30
o
C and 61
o
C respectively.
53
The temperature vs. time history for 42 days was converted to an equivalent
sequence of peaks and valleys (for thermal fatigue analysis) using the ordered overall
range (OOR) method. For this analysis the reversal elimination index was chosen to be
0%, i.e., all the reversals were chosen as input to the cycle counting algorithm. Figure
3.11 shows the acquired temperature data converted to peaks and valleys. The sequence
was converted to temperature cycles using the 3-parameter rainflow cycle counting
method. 275 temperature cycles were identified for case study-I. The cycle information
obtained from the rain flow cycle counting method was used as input to the calcePWA
thermal module.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40
Time (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.11: Monitored temperature converted to the peaks and valleys for case
study-I
54
Figure 3.12 shows the power spectral density (PSD) vs. frequency for the out-of-
plane vibration of the board. The PSD shown in the figure is the averaged value over the
experiment duration. The frequency analysis range was 1 to 700 Hz. Accordingly the
sampling rate and sample size were selected to be 1800 samples/second and 1024
samples.
1.0E-07
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E+00 1.0E+01 1.0E+02 1.0E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure 3.12: Power spectral density (PSD) vs. frequency plot for case study-I
The damage accumulation in the circuit card assembly was determined using
calcePWA software. The output of the 3-parameter rainflow cycle counting algorithm
and the power spectral density (PSD) vs. frequency data described above were used as
input to the software. The displacements of the board and each component due to
vibration were estimated through finite element analysis. Figure 3.13 shows the estimated
displacement of the board. The reference point for the shown displacements was chosen
to be the clamping points. The curvature along the horizontal axis of the board was found
55
to be constant (1.1 x 10
-3
/ inch), which indicated the equal amount of damage for all
solder joints due to vibration.
Clamping
Points
Clamping
Points
Figure 3.13: Estimated board displacement due to vibration for case study-I
An accident occurred during case study-I, where the car used for experiment was
hit by another car. Vibration events with high g-values were recorded during the crash
and during dis-engagement of the cars. The maximum g-levels for these events were an
order of magnitude higher than normal conditions. Figure 3.14 shows the vibration event
with highest g-level recorded during the crash. In this case the maximum value of
acceleration was from +22g to –23 g (45 g peak-to-peak) as compared to 2g in case of
normal random vibration. The highest g-level recorded during dis-engagement of the cars
was + 9g to –9g (18 g peak-to-peak).
Under high levels of vibration, there is a chance of failure of the circuit card
assembly if the stress under maximum acceleration exceeds the material strength of the
solder joints. This failure due to shock is considered to be due to overstress mechanism.
56
Overstress models are usually used to find out whether the board can sustain the impact.
An overstress analysis was conducted using calcePWA for the maximum acceleration
value (45 g peak to peak), which showed no overstress failure.
-25.0
-20.0
-15.0
-10.0
-5.0
0.0
5.0
10.0
15.0
20.0
25.0
0 10 20 30 40 50 60 70 80 90 100 110
Time (m. sec.)
A
c
c
e
l
e
r
a
t
i
o
n
(
g
)
Figure 3.14: Recorded vibration event during the car accident. The maximum
acceleration values were from +22 g to –23 g.
Hence a random vibration analysis was conducted with the PSD data obtained
from all the events recorded during the crash and dis-engagement of the cars. The random
vibration analysis resulted in maximum of 15% accumulated damage of the solder joints
of the board. A more detailed section on the random vibration analysis is given in
appendix-3.
Accumulated damage
9
was estimated using the physics-of-failure models and
Palmgren-Miner theory on a daily basis. The results obtained are shown in the form of a
bar chart (Figure 3.15).
9
Estimated damage of 100% corresponds to the predicted end-of-life of the board.
57
0
20
40
60
80
100
0 5 10 15 20 25 30 35 40
Time in Use (days)
A
c
c
u
m
u
l
a
t
e
d
D
a
m
a
g
e
(
%
)
Damage due to temperature cycling
Damage due to vibration
Figure 3.15: Accumulated damage estimated using calcePWA and Miner’s rule for
case study-I
Figure 3.16 shows the measured resistances of solder joints as a function of time.
These resistance values were measured on a daily basis with the car engine off. The plot
also shows the occurrences of the resistance spikes. The actual life of the circuit card
assembly found to be 39 days according to the failure criteria.
58
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
0.0
20.0
40.0
60.0
80.0
0 4 8 12 16 20 24 28 32 36 40 44
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Intermittent resistance spikes
Range of resistance readings
Figure 3.16: Resistances of the solder joints along with the intermittent
resistance spikes for case study-I
Figure 3.25 shows the estimated remaining life and the actual life for the
experiment. Initial predictions based on the similarity analysis and SAE environmental
handbook data were 25 days and 34 days respectively. The estimated life based on life
consumption monitoring with out taking into account the accident is 46 days. There was a
drop in estimated life of the circuit card assembly by 6 days because of the accident.
Hence the final estimated life is 40 days. The actual life based on resistance monitoring is
39 days, which is close to the estimated life based on life consumption monitoring.
59
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45 50
Time in Use (days)
E
s
t
i
m
a
t
e
d
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45 50
Time in Use (days)
E
s
t
i
m
a
t
e
d
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
Car Accident
Estimated life with out
accident (LCM) - 46 days
Estimated life after accident
(LCM)- 40 days
Estimated life based on
similarity analysis (earlier
CALCE case study) – 25 days
Estimated life based on SAE
environmental handbook data - 34 days
Actual life from resistance
monitoring - 39 days
Figure 3.17: Remaining life estimation summary for case study-I
All solder joints of the test board assembly were photographed with the help of
optical microscope from time to time during the experiment. Two of the solder joints
showed cracks at a magnification of 50.
0.4 mm. 0.4 mm. 0.4 mm. 0.4 mm.
Figure 3.18: Crack in one of the solder joints
60
3.8.2 Case Study-II
A 2
nd
case study was conducted to demonstrate the life consumption monitoring
methodology. There were some changes in the experimental setup in order to compare
the in-plane and out-of-plane accelerations of the test board during the experiment.
Figure 3.19: Experimental setup for case study-II
The temperature sensor used for the case study was the same as case study-I
(Kele’s Model STR-91S two-wire strap-on RTD sensor). A 3-D piezoelectric (Endevco’s
Model 2228C) accelerometer was mounted on one of the clamping points of the test
board to measure accelerations in all 3 directions (out-of-plane acceleration for the board,
acceleration along the car motion and the transverse direction). Figure 3.19 shows the
experimental setup for case study- II.
61
Figure 3.20 shows the temperature variations on the test board for 66 days. Each
data point is separated at an interval of one minute. The figure shows that the maximum
temperature seen by the circuit card assembly is 89
o
C, which is well below the glass
transition temperature of FR-4 (130
o
C) and the rated temperature of the inductors (125
o
C). The minimum temperature seen by the circuit card assembly is 4
o
C.
0
10
20
30
40
50
60
70
80
90
100
0 5 10 15 20 25 30 35 40 45 50 55 60 65
Time in Use (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.20: Monitored temperature for case study-II
The temperature vs. time history for 66 days was converted to an equivalent
sequence of peaks and valleys using the ordered overall range (OOR) method. For this
analysis the reversal elimination index was chosen to be 0% like case study-I. Figure 3.21
shows the acquired temperature data converted to peaks and valleys. The temperature-
time history was converted to temperature cycles using the 3-parameter rainflow cycle
counting method. 423 temperature cycles were identified for case study-II.
62
0
10
20
30
40
50
60
70
80
90
100
0 5 10 15 20 25 30 35 40 45 50 55 60 65
Time in Use (days)
T
e
m
p
e
r
a
t
u
r
e
(
o
C
)
Figure 3.21: Monitored temperature profile converted to the peaks and
valleys for case study-II
Figure 3.22 shows the power spectral density (PSD) vs. frequency plots for three
different directions (out-of-plane acceleration for the board, acceleration along the car
motion and the transverse direction) averaged over the experiment duration for frequency
range of 1 to 700 Hz. The out-of-plane vibration was found to be at least 2 orders
magnitude higher than the other directions. Further according to studies conducted by
Steinberg, stress in the solder joints can be related to the out-of-plane displacement of the
board [60]. Hence only out-of-plane (z-direction) vibration was used for vibration
analysis.
63
1.0E-08
1.0E-07
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E-01
1.0E+00
1.0E+00 1.0E+01 1.0E+02 1.0E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
X-Acceleration Y-Acceleration Z-Acceleration
Figure 3.22: Power spectral density (PSD) vs. frequency plots for case study-II
The damage accumulation in the circuit card assembly was determined using
calcePWA. The output of the 3-parameter rainflow cycle counting algorithm and the
power spectral density (PSD) vs. frequency data were used estimate the damage. The
displacements and radius of curvatures under each component were estimated through
numerical analysis. The radius of curvature along the horizontal axis of the board is
constant which predicts the same amount of damage accumulation in all solder joints as
in case of case study-I. Figure 3.23 shows the results of the damage analysis in the form
of a bar chart.
64
0
20
40
60
80
100
0 5 10 15 20 25 30 35 40 45 50 55 60
Time in Use (days)
A
c
c
u
m
u
l
a
t
e
d
D
a
m
a
g
e
(
%
)
Damage due to temperature cycling
Damage due to vibration
Figure 3.23: Accumulated damage for case study-II
Figure 3.24 shows the measured resistances of solder joints as a function of time
with the corresponding ranges. These resistance values were measured on a daily basis. It
can be seen from the plot that there is a gradual change in resistance through out the
experiment starting from an average value of 12.8 ohms to a value of 14.6 ohms. The plot
also shows the occurrences of the resistance spikes. The actual life of the circuit card
assembly found to be 66 days according to the defined failure criteria, i.e., fifteen
consecutive resistance spikes were observed on the 66
th
day.
Figure 3.25 shows the estimated remaining life and the actual life for the
experiment. Initial predictions based on SAE environmental handbook data and the 1
st
case study were 33 days and 46 days respectively. The predicted remaining life based on
65
the life consumption monitoring methodology is 61 days, which is a conservative
estimation of the failure (66 days).
10.0
11.0
12.0
13.0
14.0
15.0
16.0
17.0
18.0
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
10.0
11.0
12.0
13.0
14.0
15.0
16.0
17.0
18.0
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
s
i
s
t
a
n
c
e
(
o
h
m
s
)
Figure 3.24: Resistances of the solder joints along with the intermittent resistance
spikes for case study -II.
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Prediction based on
environmental data from
case study-I (46 days)
Prediction based on
handbook data (33 days)
Predicted life based on
environmental
monitoring (61 days)
Actual life from
resistance monitoring
(66 days)
0
10
20
30
40
50
60
70
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Prediction based on
environmental data from
case study-I (46 days)
Prediction based on
handbook data (33 days)
Predicted life based on
environmental
monitoring (61 days)
Actual life from
resistance monitoring
(66 days)
Figure 3.25: Remaining life estimation summary for case study-II
66
All solder joints of the test board assembly were photographed with the help of
optical microscope at the end of the experiment as in case study-I. A visible crack was
observed at the knee of a solder joint at a magnification of 50.
3.8.3 Comparison between case study-I and II
A comparison between the case study-I and case study-II is given in Table 3.4.
Environmental temperature values are higher for case study-II compared to case study-I.
This difference in temperature is because case study-I was conducted in winter and case
study-II was conducted in summer. Power spectral density (PSD) values for case study-II
are lower compared to case study-I. Difference in PSD values arises because of road
conditions
Table 3.4: Comparison between case studies I and II
Case Study I Case Study II
Temperature
Avg. 30
o
C
Max. 61
o
C
Avg. 41
o
C
Max. 89
o
C
PSD value
8.73e-3 g
2
/Hz. (max.) 6.09e-3 g
2
/Hz. (max.)
Predicted life using life
consumption monitoring
40 days 61 days
Actual life based on
resistance monitoring
39 days 66 days
% Damage due to
temperature
5 % 13%
% Damage due to vibration 95% 87%
Type of failure
Cracks in solder joints
Cracks in solder joints
Electrical indication of
failure
• Intermittent resistance
spikes
• Permanent resistance
increase of the solder
joints (16 %)
• Intermittent
resistance spikes
• Permanent
resistance increase
of the solder joints
(14 %)
67
4 SUMMARY AND DISCUSSION
This thesis describes a life consumption monitoring methodology for electronic
products. Steps involved in the methodology have been explained in the thesis. Two
example case studies were conducted to demonstrate the methodology for real life
applications. The case studies were conducted in automobile underhood environment.
Solder joint fatigue was identified as the dominant failure mechanism for the chosen
environment based on failure modes, mechanisms, and effects analysis (FMMEA) and
virtual reliability assessment. Accordingly, the following steps in the life consumption
monitoring were customized. Ordered overall range algorithm and 3-parameter rainflow
cycle counting algorithms were combined to develop a suitable data simplification
scheme for the chosen conditions. Proper stress and damage models were identified and
used along with the data simplification scheme for the case study conditions. An
algorithm was developed to estimate the remaining life of the electronics for the case
studies based on the results from stress and damage models. The steps followed for case
studies can be summarized as follows:
• Two identical test board assemblies were mounted in a cantilever fashion under-the-
hood of an automobile.
• Failure modes, mechanisms, and effects analysis (FMMEA) was conducted along
with virtual reliability assessment to determine the dominant failure mechanism for
the given life cycle environment. Solder joint fatigue was identified as the dominant
failure mechanism for the given environment.
68
• Temperature and vibration were identified as the environmental parameters for
monitoring were identified based on inputs to the solder joint stress and damage
models.
• Vibration and temperature data were monitored and simplified for compatibility with
stress and damage models.
• Stress and damage models were used along with the remaining life estimation
algorithm to determine the remaining life of the circuit card assemblies.
• Electrical performances of the circuit card assemblies were checked through
resistance monitoring to determine their actual life.
• The predicted life of the test board assemblies were found to be in agreement with
the experimental life.
• Remaining life of the test board assemblies were predicted with information from
various sources. All estimation results were compared.
The remaining life values for the case studies mentioned in the paper were found
to be very close to the experimental values, which can be explained by the following
reasons. The circuit card assemblies were designed intentionally with large surface mount
leadless inductors to precipitate failure solder joint fatigue much before other failure
modes. Further the stress and damage models used for analysis using calcePWA are well
calibrated for leadless components on a FR-4 printed circuit board. However, in more
complex circuit card assemblies, the life estimation results are dependent on various other
failure sites and mechanisms, which might result in higher amounts of error in the
estimation.
69
4.1 Using Handbooks and Similarity Analysis to Determine Environmental and
Operational Life Cycle Conditions
Life consumption monitoring process requires information on product geometry,
material properties along with actual environmental and operational parameters in its life
cycle environment. Once the product design is complete, its geometry and material
properties can be obtained from various sources including design layout, manufacturer
data sheets. However, until the beginning of product parameter monitoring, no
information on environmental and operational parameters is available. For life estimation
at this stage, life cycle information is taken either from environmental handbooks or data
monitored in similar applications. Environmental handbooks are good sources of
information for newly designed products. An example is data from SAE environmental
handbook for automobile electronics. For the case studies mentioned in the paper, data
from SAE environmental handbook resulted in 34 days of total life as compared to 66
days from the experiment in case study-II. This difference in the results can be because
the SAE environmental handbook provides a generic set of data not pertaining to a
specific car manufacturer or geographical location. Further the data available in SAE
environmental handbook represents twenty year old information [43].
Another source of information is the data collected from sensors for same or
similar products in a similar environment. This is possible only if the same product or a
similar product is already in production and monitored data from sensors are available.
This is called “using similarity to determine environmental and operational conditions”.
A virtual reliability assessment based on case study-I gives a total life of 46 days, which
70
is different from a total life of 66 days in case of case study-II. This difference can be
because of higher levels of vibration along with the car accident.
The process of using environmental data from handbooks or similarity analysis
with stress and damage models is often very useful because it can provide a quick
estimate of product life before spending a lot of time and money in environmental
monitoring. However, as shown in the case studies, the results may not be very accurate
as the process takes into account an approximate life cycle environment and does not
account for any change in life cycle conditions. Sometimes a sudden change in product
life cycle environment (e.g., accident of the automobile mentioned in case study-II) can
have a catastrophic impact on its life.
In general, the actual life cycle environment that a product encounters is different
from the statistically averaged values from handbooks. For automobile electronics, life
cycle environment depends on road conditions (for vibration), geographical location and
part of the year (for temperature, humidity). Hence after placing the product in the field
conditions and obtaining information about the life cycle environment more precise
information about remaining life can be obtained using the life consumption monitoring
approach. If there is any change in the life cycle environment (e.g., the car accident
mentioned in one of the case studies), there is an abrupt change in remaining life. By
knowing the remaining life of the product after an accident, the product mission can be
rescheduled to get intended life. This concept is known as “extension of life”, which
explains how life consumption monitoring results can be used in real life.
Figure 4.1 shows the idealized remaining life vs. time plot for an electronic
product. The time corresponding to 0% remaining life of the product gives the time-to-
71
failure of the product at a given load condition. This illustrates change of mission profile
to achieve extension of life.
R
e
m
a
i
n
i
n
g
L
i
f
e
(
%
)
Time
0 %
Change in
mission profile
Extension of life
Effect of accident
Expected plot for
remaining life
100%
R
e
m
a
i
n
i
n
g
L
i
f
e
(
%
)
Time
0 %
Change in
mission profile
Extension of life
Effect of accident
Expected plot for
remaining life
100%
Figure 4.1: Extension of life based on life consumption monitoring results
4.2 Determining the Number of Data Points Required for Life Estimation
Total useful life of a product at any point on the remaining life vs. time plot can
be estimated by adding the time in use (or x-coordinate) and the remaining life (or y-
coordinate) at that point. However, this analysis is a one-point estimation and does not
take into account the product usage trend. The product usage trend can be taken in
account by extrapolating a trend line using the available remaining life data points. The
intersection of the trend line with the time axis gives the total useful life of the product
(see Figure 4.2).
72
The number of data points that can be considered for the trend line varies from
two data points to all data points. “Cumulative past analysis for life estimation” is the life
estimation process where all the available data points in the remaining life plot are used.
On the other hand, “near past analysis for life estimation” is the life estimation process
where only some of the data points are used. Results obtained can be different by using
cumulative past and near past data if there is a large variation in the results from the
damage models. This variation can occur from variation in environmental and operational
parameters that are inputs to the damage models. An example of such variation is the
result due to the car accident mentioned in one of the case studies.
Figure 4.2 explains the concept of cumulative past and near past analysis with the
help of 24 data points from Figure 3.17 including the car accident. The plot shows that
life estimation results (at the end of 24
th
day) based on near past (i.e., three data points)
and cumulative past (i.e., all data points). The near past analysis results differ from the
cumulative past analysis and the actual results by 26%. This error reduces by using more
number of data points for near past analysis. This explains the requirements to compare
variations in results obtained using different number of data points.
73
Trend line based on four
points (useful life: 29 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
Trend line based on four
points (useful life: 29 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
0
10
20
30
40
50
0 5 10 15 20 25 30 35 40 45
Time in Use (days)
R
e
m
a
i
n
i
n
g
L
i
f
e
(
d
a
y
s
)
Trend line based on
all points (useful
life: 40 days)
Trend line based on
three points (useful
life: 26 days)
Figure 4.2: Cumulative past vs. near past for life estimation
As a rule of thumb, if the result using “n” data points (“n” starts from 2) and
“n+1” data points differ by less than the acceptable remaining life (see section 2.7), “n”
data points can be used for the analysis. More data points need to be considered if the
variation is more than the acceptable remaining life. This concept takes into if the
changes in damage model results (e.g., due to change in geographical location of the
product or use in a different application) are expected to continue in future.
74
5 CONTRIBUTIONS
This thesis describes the development of a life consumption monitoring
methodology for remaining life estimation of electronic products. The life consumption
monitoring methodology has been described in a general way so that it can be extended
to various electronics products. Failure modes, mechanisms, and effects analysis
(FMMEA) and virtual reliability assessment are included in the methodology to identify
the dominant failure mechanisms in a given life cycle environment. A process has been
explained to identify the environmental and operational parameters based on the failure
models for the dominant failure mechanism. An iterative method was described to
estimate the remaining life based on the accumulated damage information obtained from
stress and damage accumulation analysis.
Two case studies were conducted to demonstrate the life consumption monitoring
methodology in automotive underhood applications. Monitored environmental data was
used on a daily basis to estimate remaining life of two circuit card assemblies. The
electrical performances of the circuit card assemblies were monitored throughout the
experiments to determine field failure. The estimated life results were found to be in
agreement with the actual life results.
75
Appendix I: Damage assessment model for temperature induced fatigue analysis
Failure of solder interconnects due to temperature cycling is a common problem
in electronic hardware. Solder joint failures typically arise from fatigue due to thermal
expansion mismatch
• Between the package and the board (global mismatch)
• Between interconnect, solder, board (local mismatch)
A common model for solder joint fatigue due to thermal fatigue is based on the
work of Werner Engelmaier. The model uses a strain range as the metric for calculating
cycles to failure. The calculation of strain range considers both global mismatch and the
local mismatch in thermal expansion.
Assumptions associated with the model are:
• Fatigue failure of solder joints can be described as a power law similar to the Coffin-
Mansion low cycle fatigue equation.
• Strain in the solder arises from global as well as local thermal expansion mismatch
and the strain arising from local mismatch may be added to strain produced by global
strain to get the global strain (worst case).
• In-plane deformations are large compared to out-of-plane warping.
• Complete stress relaxation occurs during the thermal cycle.
The Engelmaier’s thermal fatigue model can be written in equation form as [45]:
c
f
p
f
N
1
2 2
1
|
|
.
|
\
| ?
=
?
?
76
where N
f
is the mean number of cycles to failure, ?
p
is the inelastic strain range and ?
f
is
a material constant (0.325 for eutectic solder).
The exponent c is known as the fatigue ductility coefficient and given by the
following relation
( ) ( )
|
|
.
|
\
|
+ × + × ? ? =
? ?
d
m
t
T c
360
1 ln 10 74 . 1 10 6 422 . 0
2 4
where T
m
is mean cyclic temperature of the solder in
o
C and t
d
is the dwell time in
minutes at the maximum temperature
The inelastic strain range ??
p
for the fatigue relationship is calculated by
considering the response of the package assembly to the change in temperature. The
stress state (stress and strain) is a function of the package, interconnect, and board
geometry and material.
l g t
? ? ? ? + ? = ?
The strain range due to global mismatch for leadless interconnects with eutectic can
be approximated to be [45], [59].
( )
c b g
LT LT
h
FI
? ? ? ? ? ? = ?
5 . 0
where F is the user defined calibration factor
10
I is the calibration factor
11
, h = height of
solder joint (mils) and T
c
, T
s
= temperatures of component and printed circuit board (
o
C).
10
This empirical correction factor accounts for idealized assumptions (F varies from 0.5 to 1.5, typical
values are around 1.0 and are determined by fitting fatigue life results to predicted life)
11
This factor is calibrated in calcePWA software based on various experimental results
77
( ) ( ) ( )( )
min max 2 2
c c cx y cx x c
T T L L LT ? + = ? ? ? ?
( ) ( ) ( )
( )
min max 2 2
b b by y bx x b
T T L L LT ? + = ? ? ? ?
where ?
cx
, ?
cy
, ?
bx and
?
by
are the coefficients of linear thermal expansion for
component and board, in x and y directions respectively (ppm/
o
C); L
x
and L
y
are the span
of interconnect in x and y directions (inch).The strain range due to local thermal
expansion mismatch can be approximated to be
( )
( )
eff
eff
l
Al bA
Al T
cosh
sinh ?
?
? ?
= ?
where G is the shear modulus of the solder, b is the solder thickness, E
l
is the modulus of
elasticity of lead, E
b
is the modulus of elasticity of the board, t
l
is the lead thickness, and
t
b
is the board thickness. The factor A is given by
|
|
.
|
\
|
+ =
b b l l
t E t E b
G
A
1 1
For leadless packages, l
eff
= 0 and hence ??
l
=0.
78
Appendix II: Damage assessment model for vibration induced fatigue analysis
For harmonic vibrations, maximum acceleration of the printed circuit board
(PCB) can be written in terms of maximum displacement, Z
o
as
o
Z a
2
max
? =
In terms of g’s
( )
g
Z f
g
Z
g
a
G
o o
2 2
max
max
2? ?
= = =
The maximum displacement can be written in terms of natural frequency, f
n
as
2
8 . 9
n
in
o
f
Q G
Z =
where G
in
is the input acceleration to the PCB and Q is the transmissibility of the PCB at
its natural frequency. For natural frequencies between 200-400 Hz., a good
approximation is
n
f = Q [60].
Failures due to high cycle fatigue (vibration induced) in solder joints can be
described by the following relationship:
=
b
f
N ? Constant
where ? is the solder joint maximum stress amplitude, N
f
is the cycles to failure and b is a
material property. Steinberg assumes that the stress in the solder joints can be directly
related to the out of plane displacement of the board, i.e.,
=
b
f
ZN Constant
Hence, Z
b b
N Z N
2 2 1 1
=
79
Solving for N
2
,
b
Z
Z
N N
1
2
1
1 2
|
|
.
|
\
|
=
where N
1
and Z
1
are the life and displacement from Steinberg’s equation, N
1
= 10,000,000
for sinusoidal and 20,000,000 cycles for random vibration [60], and Z
2
is the maximum
board displacement amplitude as given by equation 9. Through extensive testing and
design experience of PCB assemblies, Steinberg developed an empirical equation for
maximum allowable displacement, Z
1
L ct
B
Z
00022 . 0
1
=
The life of the components also depends on where they are placed. This variation
in life is a function of radius of curvature under the components and can be included in
the relationship as:
b
y x Z
Z
N N
1
2
1
1 2
) sin( ) sin(
|
|
.
|
\
|
=
? ?
where x, and y are the non-dimensional board co-ordinates of the component center.
Random vibration response is usually discussed in terms of the root mean square
(RMS) acceleration. When the distribution is normal, the RMS value is the mean of the
distribution. To account for the 3? extremes for random vibration [60]
(
(
¸
(
¸
=
2
8 . 9
3
n
rms
o
f
G
Z
80
RMS acceleration of the board is approximated as
Q f PSD G
n rms
2
?
=
For the case study, the board displacement under each solder joint is estimated by
finite element method using the calcePWA software.
In shock environment, Steinberg [60] uses a simple rule of thumb that in shock
environment of less than few thousand total cycles, the maximum allowable displacement
of the PWB is:
L ct
B
d
00132 . 0
=
where B is the length of the PWB edge parallel to the component, L is the length of the
components (inches), c is a constant depending on the package type.
81
Appendix III: Analysis of Car Accident for Case Study-I
The car accident mentioned in one of the case studies, resulted in high levels of
vibration during and after the accident. The data recording device recorded a number of
high vibration events. The maximum g-levels for these events were an order of
magnitude higher than normal conditions. The event with highest acceleration was
selected for input to calcePWA shock analysis module.
Shock analysis is typically conducted with the help of an overstress model.
Overstress failure of solder joints is defined as failure due to stresses that exceed the
ultimate strength of the solder. Shock analysis of solder joints can be conducted with
calcePWA software for an individual shock pulse. A shock pulse can be specified in
calcePWA by the type of pulse (e.g., half sine, unit impulse, terminal sawtooth),
maximum acceleration value and its time duration. The shock analysis can only
determine whether there is an overstress failure.
The identified event was found to contain a number of acceleration peaks. The
maximum peak-to-peak acceleration (45 g) was modeled in calcePWA as a half sine
pulse for shock analysis. The time duration of the 45 g pulse was determined as 3
milliseconds from the recorded data. calcePWA shock analysis showed no overstress
failure. Since the analysis was conducted at the worst case conditions, it was concluded
that no overstress failure occurred during the car accident. The electrical resistances of
the solder joints also confirmed no failure. No information on the accumulated damage
could be obtained from the shock analysis. However, this does not mean that there is no
accumulated damage in the circuit card assembly because of the accident.
82
Since shock analysis was conducted for a duration of 3 milliseconds with a single
pulse, the possibility of accumulated damage could not be ruled out. Further, high
vibration levels lasted for half an hour
12
, which confirmed the fact that there can be
accumulated damage. Hence an attempt was made to analyze the car accident using the
existing random vibration models.
Random vibration damage analysis is typically conducted using high cycle fatigue
models, which require specification of more than 10
4
cycles. The fundamental frequency
of the circuit card assembly for the given mounting system was estimated to be 25.7 Hz
from calcePWA. It was found that thirteen vibration cycles were possible per event (time
duration of approximately 500 milliseconds) at this fundamental frequency. This
precluded the possibility of random vibration analysis of individual events. To overcome
this limitation, all events recorded over a duration of 30 minutes were used to estimate
the power spectral density (PSD). The estimated PSD vs. frequency is shown in Figure.
1. A random vibration analysis was conducted using calcePWA with this PSD
information for a time duration of 30 minutes. Table. 1 gives the power spectral density
vs. frequency input to calcePWA. Random vibration analysis of the shock showed
maximum 15% damage on the solder joints and reduction of life by 6 days. Analysis of
the car accident was possible because of the availability of all the data points over the
interval of time.
12
High vibration events were observed from the moment of the car accident to the moment when the circuit
card assembly and the sensors were removed to repair the car. The time frame also included the vibration
event during disengagement of the cars.
83
Table. 1: Power spectral density vs. frequency input to calcePWA
Frequency
(Hz.)
1.76 12.3 17.6 26.4 76 141 248 378 527 693
Power
Spectral
Density
(g
2
/Hz.)
0.225 0.39 0.632 0.029 0.034 0.645 0.52 0.024 0.007 0.15
er spectral density (PSD) vs. frequency for the car accident
1.0E-06
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E-01
1.0E+00
1.00E+00 1.00E+01 1.00E+02 1.00E+03
Frequency (Hz.)
P
o
w
e
r
S
p
e
c
t
r
a
l
D
e
n
s
i
t
y
(
g
2
/
H
z
.
)
Figure. 1: Pow
84
Appendix IV: Effect of temperature data reduction on prediction accuracy of life
consumption monitoring
To estimate the effect of temperature data reduction on the prediction accuracy of
life consumption monitoring, the temperature data collected from case study-II was
analyzed. For this analysis, the data was sampled at the rate of one data point per ten
minutes. The collected temperature data was simplified using the ordered overall range
(OOR) method and the 3-parameter rainflow cycle counting algorithm. A program was
developed to combine the OOR method, 3-parameter rainflow cycle counting algorithms
and Engelmaier’s model for solder joint fatigue to estimate the accumulated damage. The
accumulated damage of an individual solder joint was simulated with different values of
reversal elimination indices, ‘s’ (from 0 to 90%). The geometry and the properties of the
inductors and solder joints were used for the analysis.
For each value of reversal elimination indices, error was estimated. The error
values are compared to a situation where all reversals are preserved (i.e., reversal
elimination index, S = 0.0). Figure. 2 shows a plot of error % as a function of reversal
elimination index.
100 *
0.0) (S on Accumulati Damage
on Accumulati Damage
1 % Error
=
? =
It can be seen from Figure. 2, the value of error is very low for a reversal
elimination index (s) values up to 0.2 (i.e., a peak valley sequence is selected only if their
difference is greater than 0.2 * difference between the highest peak and the lowest
valley). For ‘s’ values greater than 0.2, the error value increases rapidly causing almost
85% error at s=0.9.
85
0
10
20
30
40
50
60
70
80
90
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Reversal Elimination Index
E
r
r
o
r
%
Figure. 2: Estimated error as a function of reversal elimination index
The number of remaining data points (reversals) in the load history decreases with
increase in ‘S’. This is because the cut off value (i.e., ‘S’ times the difference between
highest peak and the lowest valley) increases with increase in ‘S’ given the same load
history (see Figure 3.5). The fraction of data reduction is defined as:
Points Data f o no. Total
Points Data Remaining of No.
Reduction Data Fraction =
Figure. 3 shows the plot of error % as a function of fraction data reduction. It can
be seen that less than 5% error is caused by 96 % data reduction in this case. In other
words, only 5 % of the data points constitute the most damaging reversals for thermal
fatigue analysis in case of solder joints in electronics. Hence the data reduction method
developed can be effectively used for eliminating less damaging reversals.
86
0
10
20
30
40
50
60
70
80
90
0.86 0.88 0.9 0.92 0.94 0.96 0.98 1
Fraction Data Reduction
E
r
r
o
r
%
Figure. 3: Estimated error as a function of data reduction
Different sampling rates result in different number of data points for a given
interval of time. However, a lot of data points are not useful for the temperature analysis
as the OOR algorithm takes into account only those data points that are either peaks or
valleys. Hence the number of data points that are neither peaks nor valleys increase with
sampling rate. As a result, even with reversal elimination index of 0.0 (i.e., all peaks and
valleys are counted), there is high amount of data reduction in case of higher sampling
rate. As long as the sampling rate is above certain limit where all the peaks and valleys
can be recorded, the maximum number of useful data points is independent of sampling
rate (i.e., all peaks and valleys). Specification of a reversal elimination index in such a
case will result in equal number of data points remaining after the OOR analysis. This
will result in equal amounts of damage and hence equal amounts of error for all sampling
rates. In other words, estimated error is independent of reversal elimination index beyond
a certain sampling rate. However, the data reduction will be more for higher sampling
rate for a given amount of error because of higher data reduction at reversal elimination
index of 0.0.
87
REFERENCES
[1] IEEE Reliability Society, “ IEEE Std 1413” IEEE Standard Methodology for
Reliability Prediction and Assessment for Electronic Systems and Equipment,”
New York, NY, January 1999.
[2] Ramakrishnan, A., Syrus, T., and M. Pecht, “Electronic Hardware Reliability,” The
Modern Microwave and RF Handbook, pp. 3-102 to 3-121, CRC Press, Boca
Raton, FL, 2000.
[3] Pecht, M., Product Reliability, Maintainability, and Supportability Handbook, CRC
Press, New York, NY, 1995.
[4] Beder, S., “Making Engineering Design Sustainable,” Transactions of Multi-
Disciplinary Engineering Australia, Vol. GE17, No. 1, pp. 31-35, June 1993.
[5] Kelkar, N., Dasgupta, D., Pecht, M., Knowles, I., Hawley, M. and D. Jennings,
“Smart Electronic Systems for Condition-Based Health Management,” Quality and
Reliability Engineering International, Vol. 13, pp. 3-7, 1997.
[6] Rao, B.K.N., Handbook of Condition Monitoring, Elsevier Science Publishers Ltd.,
Oxford 1996.
[7] Mobley, R.K., An Introduction to Preventive Maintenance, Van Nostrand
Reinhold, New York, NY 1990.
[8] NIST, “Condition-based Maintenance,” Advanced Technology Program Position
Paper,http://www.atp.nist.gov/files/cbm_wp1.pdf (08/01/2003)
88
[9] Borinski, J.W., Meller, S.A., Pulliam, W.J., Murphy, K.A.,and J. Schetz, “Aircraft
health monitoring using optical fiber sensors,” Proceedings of the 19
th
Digital
Avionics Systems Conference, Vol. 2, pp. 6D1/1 –6D1/8, 2000.
[10] Cooper, K.R., Elster, J., Jones, M., and R. G. Kelly, “Optical fiber-based corrosion
sensor systems for health monitoring of aging aircraft,” AUTOTESTCON
Proceedings, IEEE Systems Readiness Technology Conference, pp. 847-856, 2001.
[11] Borinski, J.W., Boyd, C.D., Dietz, J.A., Duke, J.C., and M. R. Home, “Fiber optic
sensors for predictive health monitoring,” AUTOTESTCON Proceedings, IEEE
Systems Readiness Technology Conference, pp. 250-262, 2001.
[12] Goldfine, N., Schlicker, D., Sheiretov, Y., Wailiabaugh, A., Zilberstein, V., and D.
Grundy, “Surface mounted and scanning periodic field eddy-current sensors for
structural health monitoring,” IEEE Aerospace Conference Proceedings, Vol. 6,
pp. 3141-3152, 2002.
[13] Kent, R.M, “Fiber ultrasonics for health monitoring of composites,” Proceedings of
the 19
th
Digital Avionics Systems Conference, Vol.2, pp. 6D3/1 -6D3/6, 2000.
[14] Larson, E.C., and B. E. Parker Jr., “A subspace-based approach to structural health
monitoring,” Proceedings of 19
th
Digital Avionics Systems Conference, Vol.2, pp.
6C5/1 -6C5/8, 2000.
[15] Yen, G. and Tuang Bui “Health monitoring of vibration signatures in rotorcraft
wings,” Proceedings of IEEE Aerospace Conference, Vol.1, pp. 279 –288, Feb
1997.
89
[16] Yen, G., “Health monitoring of vibration signatures,” 23rd International
Conference on Industrial Electronics, Control and Instrumentation, Vol. 3, pp.
1124-1129, Nov 1997.
[17] Hailu, B., Gachagan, A., Hayward, G., and A. McNab, “Embedded piezoelectric
transducers for structural health monitoring,” Proceedings of IEEE Ultrasonics
Symposium, Vol.1, pp. 735 –738, 1999.
[18] Hailu, B., Hayward, G., Gachagan, A., McNab, A., and R. Farlow, “Comparison of
different piezoelectric materials for the design of embedded transducers for
structural health monitoring applications,” IEEE Ultrasonics Symposium, Vol. 2,
pp. 1019-1012, 2000.
[19] Condition Monitoring and Diagnostic Engineering Management (COMADEM)
International,http://www.comadem.com/frameset.htm (08/01/2003)
[20] Society of Machinery Failure Prevention Technology (MFPT),http://www.mfpt.org/ (08/01/2003)
[21] National Aeronautics and Space Administration (NASA), “Aviation Safety
Program (AvSP),”http://www.grc.nasa.gov/WWW/avsp/about.htm (08/01/2003)
[22] National Aeronautics and Space Administration (NASA), “Vehicle Health
Management (VHM),”http://www.grc.nasa.gov/WWW/cdtb/projects/vehiclehealth/index.html
(08/01/2003)
90
[23] Office of Naval Research, “Science & Technology – Aircraft Technology,”http://www.onr.navy.mil/sci_tech/special/351_strike/prog_aircraft.htm
(08/01/2003)
[24] Department of Defense, “The Joint Strike Fighter Prognostics and Health
Management,” www.dtic.mil/ndia/2001systems/hess.pdf (08/01/2003)
[25] QinetiQ, “Vehicle Health and Usage Monitoring System,”http://www.qinetiq.com/etc/medialib/docs/news_room/press_packs/defence.Par.00
04.File.dat/DVD-TSSprel-D5(HK)_latest.doc (08/01/2003)
[26] QinetiQ, “Integrated Engine Management,”http://www.qinetiq.com/news_room/newsreleases/2003/2nd_quarter/integrated0.ht
ml (08/01/2003)
[27] UK Department of Defense,http://www.mod.uk/business/excel/projects/rcsc05.htm
(08/01/2003)
[28] US Army Material Systems Analysis Activity (AMSAA), “AMSAA Capabilities
To Support Simulation-Based Acquisition (SBA),”http://www.amsaa.army.mil/sba/Sba_doc2.html (08/01/2003)
[29] CALCE Electronic Products and Systems Center, University of Maryland, College
Park,http://www.calce.umd.edu/ (08/01/2003)
[30] Pennsylvania State University Applied Research Laboratory,http://www.arl.psu.edu/ (08/01/2003)
[31] The Boeing Company, “ETOPS Maintenance,” Aero Magazine, No. 7, July 1999.
91
[32] Pecht, M., Dube, M., Natishan, M., and I. Knowles, “An Evaluation of Built-In
Test,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 37, No. 1, pp.
266-272, January 2001.
[33] The Boeing Company, “New “Smart” Network to Reduce Boeing JSF Life-Cycle
Costs,” Boeing News Release, Washington, D.C., September 13, 1999.http://www.boeing.com/news/releases/1999/news_release_990913o.htm
(08/01/2003)
[34] General Motors (Mcleish, James. G.), "Email Communication," Warren, Michigan,
9
th
April 2002.
[35] Ramakrishnan, A. and M. Pecht, “Implementing a Life Consumption Monitoring
Process for Electronic Product, IEEE Transactions on Components and Packaging
Technologies, Vol. 26, No. 3, pp. 625-634, September, 2003.
[36] Pecht, M., Radojcic, R., and G. Rao, Guidebook for Managing Silicon Chip
Reliability, CRC Press, Boca Raton, FL, 1999.
[37] Young, D. and A Christou, “Failure Mechanism Models for Electromigration,”
IEEE Transactions on Reliability, Vol. 43, No. 2, June 1994.
[38] Li., J., and A. Dasgupta, Failure Mechanism Models for Material Aging Due to
Inter-Diffusion IEEE Transactions on Reliability, Vol. 43, No. 1, March 1994.
[39] Rudra, B., and D. Jennings, “Tutorial: Failure-Mechanism Models for Conductive-
Filament Formation,” IEEE Transactions on Reliability, Vol. 43, No. 3, September
1994.
92
[40] Dasgupta, A., “Failure Mechanism Models For Cyclic Fatigue,” IEEE Transactions
on Reliability., Vol. 42, No. 4, pp. 548-555, December 1993.
[41] Pecht, M., Lall, P., and E. Hakim, “The Influence of Temperature on Integrated
Circuit Failure Mechanisms,” Quality and Reliability Engineering Intl, Vol. 8, pp.
167-175, 1992.
[42] Miner, M.A., “Cumulative Damage in Fatigue,” Journal of Applied Mechanics, pp.
A-159 to A-164, September 1945.
[43] SAE J1211, Recommended Environmental Practices for Electronic Equipment
Design, Rev. November 78.
[44] Monthly Temperature Averages for the Washington, DC Area,http://www.weather.com/weather/climatology/monthly/USDC0001 (08/01/2003)
[45] Engelmaier, W., "Generic Reliability Figures of Merit Design Tools for Surface
Mount Solder Attachments,” IEEE Transactions of CHMT, Vol. 16, No. 1., pp.
103-112, 1993.
[46] Dallas Instruments, SAVER' User’s Manual, Dallas, Texas, February 2000.
[47] Bendat, S.J. and A.G. Piersol, Random Data: Analysis and Measurement
Procedures, Wiley-Interscience, New York, NY, 1971.
[48] Collins, J., Failure of Materials in Mechanical Design, John Wiley & Sons, New
York, NY, 1993.
93
[49] Cluff, K.D., “Characterizing the Commercial Avionics Thermal Environment for
Field Reliability Assessment,” Journal of the Institute of Environmental Sciences,
Vol. 40, No. 4, pp. 22-28, Jul.-Aug. 1997.
[50] Anzai, H., “Algorithm of the Rainflow Method,” pp. 11-20, The Rainflow Method
in Fatigue, Butterworth-Heinemann, Oxford, 1991.
[51] Fuchs, H.O., Nelson, D.V., Burke, M.A., and T.L. Toomay, “Shortcuts in
Cumulative Damage Analysis,” SAE National Automobile Engineering Meeting,
Detroit, 1973.
[52] Ramakrishnan, A., “Health and Life Consumption Monitoring Using Sensor
Technologies,” Masters’ Thesis, University of Maryland, 2001
[53] Mishra, S., Pecht, M., Smith, T., McNee, I., and R. Harris, “Life Consumption
Monitoring Approach for Remaining Life Estimation”, European Microelectronics
Packaging and Interconnection Symposium, IMAPS, pp. 136-142, Cracow, Poland,
June 2002.
[54] IPC, “IPC J-STD-029” Performance and Reliability Test Methods for Flip Chip,
Chip Scale, BGA, and other Surface Mount Array Package Applications,” February
2000.
[55] MEG-Array, “Solder Joint Reliability Testing Results Summary, IPC-SM-785,”http://www.fciconnect.com/pdffiles/highspeed/MEG-Array_IPC-SM-
785_Results.pdf (08/01/2003)
94
95
[56] Collins, J., Failure of Materials in Mechanical Design, John Wiley & Sons, New
York, NY, 1993.
[57] Anzai, H., “Algorithm of the Rainflow Method,” pp. 11-20, The Rainflow Method
in Fatigue, Butterworth-Heinemann, Oxford, 1991.
[58] Constable, J. H., “Electrical Resistance as an Indicator of Fatigue,” IEEE
Transactions on Components, Hybrid and Manufacturing Technology, Vol. 15, No.
6, December 1992.
[59] Institute for Interconnecting and Packaging Electronic Circuits, “IPC SM-785”
Guidelines for Accelerated Reliability Testing of Surface Mount Solder
Attachments," Lincolnwood, IL, July 1992.
[60] D. S. Steinberg, Vibration Analysis for Electronic Equipment, John Willey & Sons,
New York, NY, 1988.
doc_700786555.pdf