Description
The condition-based maintenance (CBM) approach reduces maintenance uncertainty by basing maintenance on the needs indicated by the equipment condition. The monitoring process involves collecting and interpreting the relevant equipment parameters in order to identify deviations and changes of the equipment state from its normal condition. The parameters in this context represent a set of characteristics that indicate the actual equipment condition. Any abnormality in these characteristics indicates the occurrence of some sort of functional failure. A built-in fault diagnosis scheme can be activated by the detection of such an abnormal condition; it recognizes and analyses the symptomatic information, identifies the root causes of a failure and infers the fault development trend.
Condition-based Maintenance Management in Critical Facilities
RR-305 Neelamkavil, J July 2010
The material in this document is covered by the provisions of the Copyright Act, by Canadian laws, policies, regulations and international agreements. Such provisions serve to identify the information source and, in specific instances, to prohibit reproduction of materials without written permission. For more information visit http://laws.justice.gc.ca/en/showtdm/cs/C-42 (English) or http://lois.justice.gc.ca/fr/showtdm/cs/C-42 (French).
Table of Contents
Condition-based Maintenance Management in Critical Facilities
Summary
The Role of Maintenance in Facility Management
    Corrective Maintenance or ‘Run-to-failure’
    Preventive Maintenance
    Predictive vs. Condition-based Maintenance (CBM)
Degradation Methods and Models for Condition-based Maintenance
Other Models for Condition-based Maintenance
P-F Interval and Standards for Condition-based Maintenance
Wireless Technologies Enable Condition-based Maintenance
Intelligent System for Condition-based Maintenance Management
Commercial CBM Systems
Conclusion
References
Condition-based Maintenance Management in Critical Facilities
Summary
A facility management strategy requires that an organization’s major operational concerns are dealt with, such as: avoiding the risk of catastrophic failures and eliminating any forced outage of its equipment; planning the maintenance of equipment that operates in a complex environment; and reducing the quantity of spare parts and the associated inventory costs. To put this in perspective, many systems suffer increasing wear with usage and age and are subject to random failures that are linked to asset deterioration. Examples include cutting tools, hydraulic structures, brake linings, turbine blades, and rotating equipment. In all these cases, various physical deterioration processes can be observed, such as cumulative wear, bearing wear, crack growth, erosion, corrosion, and fatigue. The deterioration and failure of such systems can create safety hazards as well as high operational costs (e.g. due to production losses and delays, or unplanned intervention on the system). As a result, preventive maintenance becomes necessary so as to replace the deteriorated system before it fails.
If the deterioration of the system, or a parameter strongly correlated with the state of that system, can be measured directly (via vibration analysis, wear monitoring, corrosion level, etc.), and if the system ceases to function when it deteriorates beyond a given threshold level, then it is appropriate to base maintenance decisions on the actual deterioration state of the system rather than on its age. This leads to the choice of a condition-based maintenance (CBM) policy. CBM techniques provide an assessment of the system’s condition, based on data collected from the system through continuous monitoring or via inspections. The main purpose is to determine the required maintenance plan prior to any predicted failure.
CBM has been shown to reduce the cost of maintenance, improve operational safety and reduce the severity and number of in-service system failures. This report complements an earlier IRC report, RR-284, titled “A Review of Existing Tools and their Applicability to Facility Maintenance Management”, submitted by the same author.
The Role of Maintenance in Facility Management
As indicated in reference [1], modern maintenance management strategies have evolved over time, as organizations seek high asset reliability and availability with only a limited maintenance investment. Arriving at a maintenance strategy for individual assets that meets the enterprise objectives at minimal cost still remains a challenge. Figure 1, extracted from reference [1], depicts the various types of maintenance strategies practiced today.
Corrective Maintenance or ‘Run-to-failure’
For a long time, organizations have practised a “run to failure” maintenance strategy, in which an asset is operated until it fails or breaks. Maintenance action, which involves repair or replacement, is taken with the intention of correcting the fault. For many non-critical assets, this is still considered a reasonable and logical operating strategy.
Preventive Maintenance
Failures of many assets have expensive and far-reaching consequences. These failures can shut down entire production lines, make buildings unusable, or cause accidents. It is imperative that these types of failures are prevented. As such, a different type of maintenance strategy has evolved over time, widely known as preventive maintenance. This involves looking at the asset failure history, and instigating maintenance to “fix” the asset before there is a high probability of its failing. This strategy ensures high asset availability and minimizes unplanned downtime. For many critical assets, preventive maintenance eliminates the severe consequences of failures; however, the benefits of preventive maintenance come at a price. Generally, the preventive strategy advises that maintenance be performed more often than is absolutely necessary. As the maintenance incurs costs in both labour and parts, this strategy can result in “over-maintenance”. In addition, preventive maintenance usually requires that assets be taken off-line for servicing, which in turn incurs cost due to downtime and lost production.
Figure 1: Maintenance Types; Source: Reference [1]
Elapsed operating time has a major influence on preventive maintenance, as in the case of changing the lubricant in a passenger car. Most people change the engine oil in their vehicles every 5,000 to 8,000 kilometers, with no particular concern for the actual condition and performance capability of the oil. If the owner of the car disregarded the vehicle run time, and instead had the oil analyzed at some interval to determine its actual condition and lubrication properties, he or she might be able to defer the oil change until the vehicle had traveled 10,000 kilometers.
Mitchell T. Rausch [2] has provided a detailed review of preventive (sometimes called time-based) maintenance approaches, listed below along with some of their characteristics:
• Failure rate limit policies initiate maintenance when the system has reached a predetermined failure rate. State variables such as wear, stress or damage are monitored to update the failure rate function. When the failure rate reaches a predetermined value, preventive maintenance activities are commenced.
• Sequential maintenance policies initiate maintenance according to unequal preventive maintenance time intervals; as the age of the component increases, the elapsed time between maintenance activities is reduced.
• Repair limit policies utilize a cost basis to decide on the action taken when a component fails. When it fails, the cost of repair is compared to the cost of replacement. The item is repaired if the cost of repair is less than the cost to replace; otherwise it is replaced.
• Repair number counting policies allow for the component to fail n times before it is replaced. The failures up to the (n-1)th are mitigated with minimal repair.
• Repair number counting and reference policies are an enhancement to repair number counting policies (above) that adds a variable T representing a positive operating time. Under this policy, the component is allowed to fail n times, but is not replaced at the nth failure if the operational time has not reached the predetermined value T; in that case the component is minimally repaired and then replaced at the (n+1)th failure.
• Opportunistic maintenance policies address dependencies that occur in large systems. Failure of a component within a large system of components may require the removal of non-failed components to access the failed component. This creates an opportunity to repair or replace non-failed components according to criteria such as hazard rate or cost.
• Optimization of preventive maintenance policies is conducted by analyzing cost and system reliability measurements. The optimization approach generates preventive maintenance intervals that minimize costs or that ensure the desired system reliability is achieved.
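The repair limit and repair number counting policies in the list above are simple decision rules, so they can be written down directly. The following is a minimal sketch, not Rausch's formulation: the function names, the string return values and the argument names (such as t_ref for the reference time T) are assumptions chosen for illustration.

```python
def repair_limit_action(repair_cost: float, replacement_cost: float) -> str:
    """Repair limit policy: repair only if it is cheaper than replacing."""
    return "repair" if repair_cost < replacement_cost else "replace"


def repair_count_action(failure_number: int, n: int) -> str:
    """Repair number counting policy: minimal repair for failures 1..n-1,
    replacement at the n-th failure."""
    return "minimal repair" if failure_number < n else "replace"


def repair_count_reference_action(failure_number: int, n: int,
                                  operating_time: float, t_ref: float) -> str:
    """Repair number counting and reference policy.

    At the n-th failure the component is replaced only if it has already
    operated for at least t_ref (the reference time T); otherwise it is
    minimally repaired and replaced at failure n+1.
    """
    if failure_number < n:
        return "minimal repair"
    if failure_number == n and operating_time < t_ref:
        return "minimal repair"
    return "replace"
```

For example, with n = 3 and a reference time of 100 hours, a third failure after only 50 operating hours is minimally repaired, while the fourth failure triggers replacement regardless of operating time.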
Predictive vs. Condition-based Maintenance (CBM)
In recent times, requirements for performance have risen, and the downtime allocated for routine maintenance (preventive replacement, inspection, etc.) has been squeezed. Because preventive maintenance has become expensive, organizations have been developing a different type of maintenance strategy. Under this plan, an asset’s condition is monitored frequently (or continuously) until it begins to give evidence of deteriorating performance or an incipient failure. Maintenance is then performed in time to prevent an imminent failure. Compared with preventive maintenance, the new strategy (known as predictive maintenance or condition-based maintenance) reduces the overall cost of maintenance, while providing better asset availability and performance.
Condition monitoring can reduce the uncertainty operators feel about the current state of an asset. For example, knowledge about the vibration levels of a certain critical bearing can give operators confidence about its operation. Condition-based maintenance uses real-time information on the condition of the asset to identify when maintenance is actually necessary, which allows the maintenance to be deferred until it is needed.
Though the terms “predictive” and “condition-based” are often used interchangeably, current thinking is that there is a difference between predictive maintenance and condition-based maintenance (CBM) strategies. Predictive maintenance is activated by the analysis of equipment condition data that is gathered periodically, often manually. This contrasts with the CBM approach, in which equipment condition data is collected continuously and analyzed in real time. It should be noted that continuous data collection requires the installation of sensors on the equipment, as well as a means of collecting and analysing the resulting data.
CBM is more suitable and logical in critical facilities, especially in process industries like refineries and power plants.
Condition-based maintenance can be initiated according to the state of a degrading system that is monitored through various characteristic measures, which essentially describe the state of the system. Once the degradation characteristic crosses a specified threshold, the maintenance actions may be triggered. Degradation measures must be identified that effectively relate the state of the system (or component) to its remaining useful life, along
with a decision on a failure threshold, and with the feasibility of implementing condition monitoring technology. Mitchell T. Rausch [2] has listed the most common monitoring methods practiced today, including:
• Vibration monitoring can detect wear, fatigue, misalignment and loose assemblies in rotating equipment such as bearings, gearboxes, pumps, motors and engines. Vibration readings are collected over time and compared to a baseline and alarm limits. Maintenance personnel are alerted when the readings tend toward the alarm limits.
• Process parameter monitoring involves tracking a variety of operational characteristics, such as process efficiency, system temperature, electrical current, and pressure, that can be linked to the health status of the system.
• Thermography is a method of capturing the infrared emissions of a component to determine if the operating temperature is fluctuating outside the normal range; abnormal temperature changes can be a symptom of an upcoming failure.
• Tribology is the study of the effects of friction between two mating surfaces. Friction generates particulate that can be monitored through wear particulate analysis. Lubrication analysis can be conducted to determine the appropriate time to change lubricating fluids in a system.
• Visual inspection is an easy method to implement; it involves identifying loose components, structural cracks, or any other abnormal characteristics.
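The vibration monitoring item above describes comparing readings against a baseline and alarm limits, and alerting personnel as readings tend toward the limits. A minimal sketch of such a check follows; the three-reading averaging window and the warn_fraction parameter (the fraction of the baseline-to-alarm band at which a warning is raised) are hypothetical choices, not values taken from the source.

```python
from statistics import mean


def vibration_status(readings, baseline, alarm_limit,
                     window=3, warn_fraction=0.8):
    """Classify the recent vibration level as normal, warning or alarm.

    readings      : vibration amplitudes, oldest first
    baseline      : level recorded when the machine was healthy
    alarm_limit   : level at which maintenance must be carried out
    window        : number of most recent readings averaged (assumed)
    warn_fraction : share of the baseline-to-alarm band that triggers
                    a warning (assumed)
    """
    recent = mean(readings[-window:])
    warn_level = baseline + warn_fraction * (alarm_limit - baseline)
    if recent >= alarm_limit:
        return "alarm"
    if recent >= warn_level:
        return "warning"
    return "normal"
```

Averaging the last few readings is one simple way to avoid reacting to a single noisy measurement; a real system would likely use a more careful trend estimate.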
Degradation Methods and Models for Condition-based Maintenance
Selecting a suitable model to be used in a CBM scheme is not an easy task. The selection of the model should be based on the ability of the model to accurately describe the degradation process and make effective extrapolations of the component state into effective
decisions related to the maintenance. Mitchell T. Rausch [2] has included many of the popular models in his research work. In this regard, the degradation model must capture the physical degradation phenomenon in the most realistic and practical manner available for implementation. Degradation measurements move downward (or upward) toward a failure threshold, and the system is considered to have failed at the time when the measured value crosses a predetermined failure threshold.
Failure mechanisms for the specific system must be understood thoroughly so that an appropriate degradation model can be developed for use. Typically, there are continuous time, discrete time, continuous state, and discrete state degradation representations. Many of the discrete state/time methods involve Markov methods that require definition of discrete degradation states and state transition probabilities. The discrete state/time degradation models are less realistic in practical applications when compared with the continuous state/time representation. The continuous state/time model can describe true operational conditions and hence such models are more effective at describing degradation. Some of the discrete models include Markov chains and Markov decision processes, while some of the continuous degradation models include polynomials, cumulative damage, Brownian motion and gamma processes.
Markov chains represent a discrete time and discrete state stochastic process and are utilized to describe state transitions mathematically. To be used in a condition-based maintenance application, the degradation phenomenon may be defined according to discrete states and can be modeled with a Markov chain. To use Markov methods, multiple states must be identified which can be a challenging task in itself. Also, Markov methods require transition probabilities between states that can be difficult to define in practice.
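As an illustration of the Markov chain description above, the sketch below propagates a probability distribution over four degradation states through a one-step transition matrix. The state names and every transition probability are made-up values for illustration only; in practice they would have to be estimated, which is the difficulty noted in the text.

```python
# Hypothetical degradation states and one-step transition probabilities.
STATES = ["new", "minor", "major", "failed"]
P = [
    [0.90, 0.08, 0.02, 0.00],  # from "new"
    [0.00, 0.85, 0.10, 0.05],  # from "minor"
    [0.00, 0.00, 0.80, 0.20],  # from "major"
    [0.00, 0.00, 0.00, 1.00],  # "failed" is absorbing (no repair modelled)
]


def step(dist, matrix):
    """Advance a state distribution by one time step: dist' = dist * P."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(steps, start, matrix):
    """Distribution over states after the given number of time steps."""
    dist = list(start)
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist


if __name__ == "__main__":
    after_10 = distribution_after(10, [1.0, 0.0, 0.0, 0.0], P)
    for name, prob in zip(STATES, after_10):
        print(f"{name:>6}: {prob:.3f}")
```

The last entry of the resulting distribution is the probability that the component has failed by that time step, which is the quantity a maintenance planner would compare against an acceptable risk level.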
Markov decision processes represent a discrete state continuous time stochastic process that is defined by a set of states in which multiple actions are available to the decision maker at each state. State transition probabilities are defined for each state ‘s’ and action ‘a’, and they establish the probability of transition to the next state. For each state traversed, the decision maker receives a reward that depends on the decision made for that time period. Markov decision processes utilize actions and rewards, unlike Markov chains.
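To make the roles of states, actions, transition probabilities and rewards concrete, the sketch below solves a small maintenance problem by value iteration. It is a simplification under stated assumptions: time is treated as discrete, future rewards are discounted, and all states, probabilities and rewards are hypothetical numbers chosen for illustration.

```python
# States: 0 = good, 1 = degraded, 2 = failed (all values hypothetical).
TRANSITIONS = {
    "continue": [[0.8, 0.2, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]],
    "maintain": [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
}
# Reward received in state s when taking action a.
REWARDS = {
    "continue": [10.0, 6.0, -20.0],  # production value, or loss when failed
    "maintain": [2.0, -2.0, -35.0],  # same, minus the maintenance cost
}


def value_iteration(transitions, rewards, discount=0.9, tol=1e-8):
    """Return (state values, best action per state) for a discounted MDP."""
    n = len(next(iter(rewards.values())))
    values = [0.0] * n
    while True:
        new_values, policy = [], []
        for s in range(n):
            best_action, best_value = None, float("-inf")
            for action in transitions:
                expected = sum(p * v for p, v in
                               zip(transitions[action][s], values))
                q = rewards[action][s] + discount * expected
                if q > best_value:
                    best_action, best_value = action, q
            new_values.append(best_value)
            policy.append(best_action)
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            return new_values, policy
        values = new_values
```

With these particular numbers, the computed policy is to continue operating in the good state and to maintain in the degraded and failed states; different costs would shift that boundary.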
Neural networks may also be used to monitor and forecast degradation trends that are linked to maintenance decisions. These represent a set of nodes that perform computations and are arranged in patterns similar to biological neural networks. The processing elements are connected through synapses with associated weights that modify a signal as it propagates through the connections. The network also learns from its environment and adjusts the synaptic weights.
Kalman filters provide a recursive method to collect indirect measurements and to estimate a system state parameter from those measurements.
The gamma process and Brownian motion represent continuous time/state stochastic processes that define an increment between two time periods. For modeling, the degradation increment between two points in time may be defined by a gamma process or by Brownian motion. The degradation increment for the gamma process follows a gamma distribution, while the increment for Brownian motion follows a normal distribution.
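The difference between the two increment distributions can be seen by simulating both processes on a unit time grid. This is a minimal sketch using only the standard library; the parameter names and values are assumptions. Gamma increments are never negative, so the path is monotone (suited to wear), whereas normally distributed increments can be negative, so a Brownian path with drift can temporarily decrease.

```python
import random


def gamma_path(steps, shape_per_step, scale, rng):
    """Degradation path whose increments are gamma distributed (monotone)."""
    level, path = 0.0, [0.0]
    for _ in range(steps):
        level += rng.gammavariate(shape_per_step, scale)
        path.append(level)
    return path


def brownian_path(steps, drift_per_step, sigma_per_step, rng):
    """Degradation path whose increments are normally distributed."""
    level, path = 0.0, [0.0]
    for _ in range(steps):
        level += rng.gauss(drift_per_step, sigma_per_step)
        path.append(level)
    return path


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    print("gamma   :", [round(x, 2) for x in gamma_path(5, 0.5, 2.0, rng)])
    print("brownian:", [round(x, 2) for x in brownian_path(5, 1.0, 3.0, rng)])
```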
Choosing an appropriate model for condition-based maintenance decisions can be made easier by utilizing a structured approach, as suggested by Scarf [3] in the models listed below and in Figure 2. It is logical to place condition-based maintenance in the context of a general maintenance framework for a large complex system, which considers all elements of the system (machines, units, components) having failure characteristics. However, the framework also contends that if it is not possible to define a failure threshold, no condition indicator data are available at failure, and a warning threshold cannot be defined, then condition-based maintenance for the item is not feasible. In such cases, other maintenance policies should be explored: age-based, routine inspection, operate to failure, etc.
Proportional Hazards Models: In this, the age of a component is monitored, and replacement is initiated when the age reaches a critical level. The criticality may be expressed in terms of a hazard function. To choose the critical level optimally, information about the time to failure distribution for the component is required; one has to specify other criteria also, such as the mean time between component failures. If the component age is not monitored, replacements may be scheduled periodically, either according to some reliability-optimal considerations, or even according to maintenance budget constraints.
Figure 2: A binary decision tree for model selection in CBM; Source: Scarf [3]
Failure Threshold Models: This approach depends on being able to specify a failure threshold, c, for the component condition Y. If Y > c then the component is assumed to have failed. For simple cases (e.g. tire wear) Y may be directly observable, and it is sufficient to model this wear using an appropriate stochastic process. The problem is to assess the residual life or remaining time to failure. An estimate of the residual life distribution can be used to optimize the time to replacement. This replacement time can be updated dynamically as more condition information becomes available. Often, the decision comes down to a choice between replacing before the next monitoring check or not. Where the process is measured continuously, the replacement decision becomes trivial: replace when Y becomes greater than c.
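One way to estimate the residual life distribution described above is Monte Carlo simulation of the wear process from the currently observed condition until it exceeds the threshold c. The sketch below assumes gamma-distributed wear increments per time step; that choice of process, and all parameter values, are illustrative assumptions rather than part of Scarf's framework.

```python
import random
from statistics import mean


def simulate_residual_life(current_level, threshold, shape_per_step, scale,
                           rng, max_steps=100_000):
    """Number of time steps until the condition Y first exceeds threshold."""
    level, steps = current_level, 0
    while level <= threshold and steps < max_steps:
        level += rng.gammavariate(shape_per_step, scale)  # assumed wear model
        steps += 1
    return steps


def residual_life_samples(current_level, threshold, shape_per_step, scale,
                          runs=2000, seed=0):
    """Monte Carlo sample of the residual life from the current condition."""
    rng = random.Random(seed)
    return [simulate_residual_life(current_level, threshold,
                                   shape_per_step, scale, rng)
            for _ in range(runs)]


if __name__ == "__main__":
    samples = sorted(residual_life_samples(2.0, 10.0, 1.0, 0.5))
    print("mean residual life :", round(mean(samples), 1), "steps")
    print("5th percentile     :", samples[len(samples) // 20], "steps")
```

Re-running the estimate with each new condition reading gives the dynamic update of the replacement time mentioned in the text; a low percentile of the sample is a natural conservative choice for the latest safe replacement time.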
Two-Phase Failure Models: It is required to specify a warning threshold to proceed with the condition-based maintenance; when the condition is above the threshold then it is time for the component to be replaced. When the condition is monitored continuously, the decision problem becomes simple. If the condition indicator is monitored only periodically, and the condition indicator assumes the values 0 or 1, depending on whether the measured condition is above or below a threshold, then the issue is how often to monitor. Scarf [3] suggests using a two-phase or delay-time model to optimize the monitoring interval.
Grall et al [4] have described a system that undergoes random deterioration while being monitored through “perfect” inspections. When the system condition exceeds its failure level L, it enters a failed state and a corrective replacement is carried out. When the system state upon inspection is found to be greater than a given critical threshold, the still-functioning system is considered ‘worn-out’ and a preventive replacement is performed. The choice of the inspection dates and of the critical threshold value influences the economic performance of the maintenance policy. A low critical threshold leads to frequent preventive maintenance operations and prevents the full exploitation of the residual life of the deteriorated (but still functioning) system. On the other hand, a high critical threshold tends to keep the device working even in an advanced deterioration state, with an increased risk of failure. In practice, the monitoring inspections are performed at regular intervals or done continuously. Inspection dates and critical threshold values are therefore the main decision variables in the problem of optimizing a CBM scenario. A conservative approach can lead to choosing a low threshold and inspecting more often than necessary, which results in non-optimal maintenance policies.
Grall et al [4] proposed two novel developments compared with other research works on condition-based maintenance modeling and optimization. First, they developed a model which allowed them to investigate the joint influence of the critical threshold value and the choice of inspection dates on the total cost of the maintained device; they showed that the long run expected maintenance cost per unit time can be minimized by an appropriate joint choice of these two decision variables.
Second, they did not impose a periodic ‘routine’ inspection scheme for the condition monitoring process (fixed intervals determined off-line) and they allowed irregular inspection dates: the next inspection date is dynamically updated on the basis of the present system condition revealed by the current inspection.
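The trade-off between the critical threshold and the inspection schedule can be explored numerically. The sketch below is a deliberately simplified stand-in and not the model of Grall et al: it uses fixed periodic inspections instead of dynamically scheduled ones, assumes gamma-distributed wear increments, assumes a failure is only discovered at the next inspection, and uses hypothetical cost values throughout.

```python
import random


def cost_rate(threshold, interval, failure_level=10.0, horizon=50_000,
              c_inspect=1.0, c_prevent=20.0, c_correct=100.0, c_down=5.0,
              shape=1.0, scale=0.5, seed=0):
    """Simulated long-run maintenance cost per unit time.

    threshold : preventive replacement level (decision variable)
    interval  : time units between inspections (decision variable)
    All cost and wear parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    level, cost = 0.0, 0.0
    for t in range(1, horizon + 1):
        level += rng.gammavariate(shape, scale)   # wear during this step
        if level > failure_level:
            cost += c_down                        # time unit spent failed
        if t % interval == 0:                     # inspection epoch
            cost += c_inspect
            if level > failure_level:
                cost += c_correct                 # corrective replacement
                level = 0.0
            elif level > threshold:
                cost += c_prevent                 # preventive replacement
                level = 0.0
    return cost / horizon


if __name__ == "__main__":
    # Coarse grid search over the two decision variables.
    grid = [(th, iv) for th in (1.0, 4.0, 7.0, 9.0) for iv in (1, 2, 4)]
    best = min(grid, key=lambda pair: cost_rate(*pair))
    print("lowest simulated cost rate at (threshold, interval) =", best)
```

Even this crude simulation reproduces the qualitative behaviour described in the text: a very low threshold wastes residual life through frequent preventive replacements, while a threshold at the failure level leaves only costly corrective replacements.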
As shown in Figure 3 (Grall et al [4]), a single-unit system is considered subject to a continuous accumulation of wear in time. Its condition at time t is assumed to be completely described by a single scalar random variable X_t. It starts from zero at t = 0 (X_0 = 0). When X_t = 0 (after each replacement), the system is said to be in the ‘new’ state. Its increments in a time interval are non-negative, stationary and statistically independent. When the deterioration process X_t exceeds a failure level L, a system breakdown occurs and the system is said to be in the ‘failed’ state. Grall et al [4] assumed that the deterioration process can be observed only at discrete equidistant times t_k = kΔt, where the unit time length Δt is either arbitrarily chosen or imposed by the considered maintenance problem. The stochastic process describing the condition of the system at times t_k is noted (X_k), k ∈ N, where X_k = X_{t_k}. For each k, the random deterioration X_k − X_{k−1} between two consecutive discrete time units is taken to have the same probability density function f. Since the stochastic process X_t has stationary, statistically independent increments, f belongs to the class of infinitely divisible distributions. The inverse of the mean deterioration rate between t_k and t_{k−1} is noted. In order to better characterize the deterioration process, it is taken that the maintenance decision at time t_k is made only on the basis of the average amount of deterioration reached at t_k, irrespective of how this average amount is obtained.
Figure 3: Continuous accumulation of Deterioration; Source: Grall et al [4]
Goto et al [5] have proposed an on-line method for deterioration prediction and residual life evaluation for the maintenance of rotating equipment. The status of the rotating equipment is inspected by vibration measurement, and a mathematical model for the deterioration of the equipment is derived in order to predict the future condition of the rotating equipment. In building the deterioration model, ‘noise’ or outliers in the vibration data caused by measurement errors are eliminated in order to improve the accuracy of the model. Figure 4a shows the flowchart of the deterioration management procedure; here, the deterioration management values of the rotating equipment are obtained on-line. The on-line deterioration management is divided into three parts. The first part is the outlier judgement on the deterioration management value, the second part is the deterioration prediction using the deterioration model, and the last part is the residual life evaluation using the deterioration prediction. The deterioration management of the rotating equipment is carried out using these three steps.
Figure 4b, taken from Goto et al [5], shows an example of the deterioration management values of rotating equipment in a thermal power plant. Here, the deterioration management values contain some data that deviate from the general trend; such data are often referred to as “outliers”. The outliers cause errors in vibration diagnosis because they disturb the deterioration tendency and pull the mathematical model of the deterioration management value away from the true trend. Hence, the accuracy of the deterioration model can be improved if the outliers are eliminated from the deterioration management values.
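The three steps described above (outlier judgement, deterioration prediction, residual life evaluation) can be sketched as follows. The source does not give the outlier test or the form of the deterioration model, so this sketch substitutes a median absolute deviation test on the residuals of a straight-line fit and a linear trend; both are assumptions, as are all function and parameter names.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def drop_outliers(xs, ys, k=3.0):
    """Step 1: discard points whose residual is far from the median residual.

    Uses the median absolute deviation (MAD), scaled by 1.4826 so that it is
    comparable to a standard deviation; k is an assumed cut-off.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    centre = median(residuals)
    mad = median([abs(r - centre) for r in residuals])
    if mad == 0.0:
        return list(xs), list(ys)
    limit = k * 1.4826 * mad
    kept = [(x, y) for x, y, r in zip(xs, ys, residuals)
            if abs(r - centre) <= limit]
    return [x for x, _ in kept], [y for _, y in kept]


def remaining_time(slope, intercept, current_x, limit):
    """Steps 2 and 3: extrapolate the trend and return the time left until
    it reaches the limit (None if the trend is not increasing)."""
    if slope <= 0:
        return None
    return max(0.0, (limit - intercept) / slope - current_x)
```

A typical use is to call drop_outliers on the raw vibration history, refit the line on the cleaned data with fit_line, and pass the result to remaining_time together with the vibration limit for the machine.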
Figure 4a: Deterioration Procedure; Figure 4b: Outliers in Deterioration; Source: Goto et al [5]
Other Models for Condition-based Maintenance
In most CBM modeling approaches, the deterioration measures are monitored (or inspected) and compared with a predefined threshold for the facility maintenance decisions. Different from these, Lu et al [6] describe what is called the predictive CBM (PCBM) approach to predict the deterioration condition in the future. An advantage of the PCBM model is that the degradation states are modeled as continuous states using a state-space model in which the state vector includes both the degradation level and the degrading rate, both of which influence maintenance decisions. The maintenance decisions are made according to the predicted degradation conditions and associated cost factors to enhance the profit produced by the system. A system’s deterioration condition is generally evaluated by one or more performance measures, which could be quantitative variables or signal features that are highly correlated with deterioration conditions.
There has been a great deal of research activity into monitoring the deterioration process. For instance, in a cutting operation, sensors measure the feed spindle and feed motor currents in order to monitor the wear of cutting tools. It makes sense to model the dynamics of each deterioration process individually and to make maintenance decisions on each correspondingly. It should also be noted that the data coming out of sensors or monitoring devices may be contaminated by background noise, sudden disturbances, and/or seasonal environmental changes. Accordingly, the deterioration measures in applications may not increase (or decrease) as steadily as they are supposed to. Hence, data pre-processing becomes a necessary step to extract the hidden degradation features and, most importantly, the deterioration trend.
Many researchers have provided multiple relevant approaches to optimize maintenance decisions based on different characteristics of the complete system. It is unarguable that the performance of a maintenance strategy is dependent on the level of knowledge available to characterize the considered failure process. Yet many approaches to maintenance optimization do not explicitly describe the relationship between the system
performance and the associated operating environment. The environmental conditions can affect the deterioration rate of a system; for example, an excessive humidity level favours corrosion. Conversely, excessive deterioration of the operating system can change the environment; for example, a hairline crack in a roller can initiate harmful vibrations. One way of capturing the effect of a random environment on an item’s life span is to randomize its failure rate function and treat it as a stochastic process. One of the best-known approaches is the proportional hazard rate approach, which models the effect of the environment by introducing covariates in the hazard function. However, the estimation of the required parameters is a somewhat complex task.
Deloux et al [7] have described a condition-based maintenance decision framework to tackle the potential variations in system deterioration, and especially in the deterioration rate. According to this work, the condition of the system at time t_k (t_k = kΔt, where the unit time length Δt is either arbitrarily chosen or imposed by the considered maintenance system) can be summarized by a scalar variable X_k, which increases as the system deteriorates. X_k can be the measure of a physical parameter linked to the resistance of a structure (for example, the length of a crack). The initial state corresponds to a perfect working state, i.e. X_0 = 0. The system ceases to fulfill its function as soon as the value of X_k is greater than a predetermined threshold level L. In this case, either a failure has occurred or an important deterioration is present that significantly reduces the system performance.
Very little attention has been paid to the CBM modeling of deteriorating systems with multiple different units. Ling Wang et al [8] present a novel CBM approach (see Figure 5) for multi-unit systems (e.g. a generating unit) in which the deterioration processes of several units are modeled using continuous-time Markov chains. Dividing the system deterioration into several discrete states is more practical than describing the deterioration condition by a single scalar continuous variable. One can classify the equipment deterioration into various states, such as initial, minor deterioration, major deterioration, and failure. For most equipment, two kinds of failure can be assumed: random failure and deterioration failure, or hard failure and soft failure. Deterioration failure develops gradually over time, due to deterioration or aging mechanisms. In contrast, random failure results from other causes not associated with typical aging, e.g. a vehicle failure caused by a blown fuse. Ling Wang et al [8] have considered both random failure and failure due to deterioration.
Figure 5: Deteriorating Systems with Multiple Units; Source: Ling Wang et al [8]
It is intuitive that the maintenance of deteriorating devices extends their lives in two ways: first, by reducing the accumulated deterioration level (i.e. reducing the deterioration that has already occurred); second, by reducing the deterioration rate of the device after the maintenance is performed (i.e. reducing the future deterioration). Rao and Naikan [9] propose a condition-based maintenance scheme for Markov deteriorating systems, which they call condition-based preventive maintenance (CBPM). The proposed model considers deterioration and random failures with minimal and major maintenance. Minimal repairs are carried out after every random failure, and the device is replaced after the occurrence of a deterioration failure. The system undergoes random inspections to assess its condition; the time between inspections is exponentially distributed. Based upon the observed condition of the device, the triggered actions are ‘do nothing’, ‘minimal maintenance’, and ‘major maintenance’. Minimal maintenance makes the system one deterioration stage younger, while major maintenance makes the system several deterioration stages (more than one) younger. The proposed models consider increasing intensity for the random failures. An exact recursive algorithm computes the steady-state probabilities of the system. Optimal solutions of the model are derived based on two criteria, namely (a) availability maximization and (b) total cost minimization.
As identified by Shahanaghi et al [10] in Figure 6, early research has classified CBM systems into completely observable systems and partially observable systems. An important assumption that is implicit in many of these works is that after each maintenance action, the state of the system returns to its initial state. Shahanaghi et al [10] extend this assumption in such a way that each time a maintenance action is initiated, the state of the system is multiplied by a certain random coefficient. This essentially means that after each maintenance action, the system state is not fully improved and the amount of improvement which is made on the system state depends on the current state of the system.
Figure 6: CBM systems Classifications; Source: Adapted from Shahanaghi et al [10]
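The imperfect-maintenance assumption of Shahanaghi et al [10] can be sketched as follows. This is a minimal illustration, not their model. It assumes that the state is a non-negative deterioration level and that the random coefficient is drawn uniformly between 0 and 1. The distribution is not specified in the text above, and the function name is hypothetical.

```python
import random


def state_after_maintenance(state, rng):
    """Return the deterioration level after one imperfect maintenance action.

    The level is multiplied by a random coefficient in [0, 1), so the
    improvement (state - new_state) scales with the current state.
    """
    coefficient = rng.random()  # assumed Uniform[0, 1) distribution
    return coefficient * state


rng = random.Random(42)  # fixed seed so the run is repeatable
level = 8.0              # hypothetical deterioration level before maintenance
for action in range(1, 4):
    level += 2.0         # hypothetical deterioration accumulated between actions
    repaired = state_after_maintenance(level, rng)
    print(f"action {action}: {level:.2f} -> {repaired:.2f}")
    level = repaired
```

Because the coefficient multiplies the current level, a badly deteriorated system can gain more from one action than a lightly deteriorated one, and the system is almost never returned exactly to its initial state.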
P-F Interval and Standards for Condition-based Maintenance
Condition-based maintenance schemes use non-destructive testing, visual inspection, etc. to collect performance data for the purpose of assessing equipment condition. The maintenance frequency is decided on the hypothesis that most failures do not occur instantaneously, so that a failure can be detected during the final stages of deterioration. If evidence can be collected that something is in the final stages of a failure, then it is possible to take action to prevent it from failing completely, or at least to avoid the consequences. According to Sethiya [11], as illustrated in Figure 7, maintenance task intervals should be determined based on the expected P-F interval. The P-F interval governs the frequency with which a predictive task must be done. The P-F curve shows how a failure starts and deteriorates to the point at which it can be detected (the potential failure point "P"). Thereafter, if it is not detected and suitable action taken, it continues to deteriorate, usually at an accelerating rate, until it reaches the point of functional failure (point "F"). The amount of time (or the number of stress and fatigue cycles) that elapses between the point where a potential failure becomes detectable and the point where it deteriorates into a functional failure is known as the P-F interval; it is the warning period during which condition monitoring tasks can detect the onset of a failure. The inspection interval must be significantly shorter than the P-F interval if one wishes to detect the potential failure before it becomes a functional failure. The P-F interval can be measured in units relating to exposure to fatigue cycles (running time, units of output, stop-start cycles, etc.), but it is generally measured in terms of elapsed time.
Figure 7: P-F interval; Source: Sethiya [11]
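The requirement that the inspection interval be shorter than the P-F interval can be made concrete with a small calculation. The sketch below assumes evenly spaced inspections and reports how many inspections are guaranteed to fall inside any P-F window. The function name and the example figures are illustrative and are not taken from Sethiya [11].

```python
def guaranteed_inspections(pf_interval_days, inspection_interval_days):
    """Minimum number of evenly spaced inspections inside any P-F window.

    A window of length pf_interval_days always contains at least
    floor(pf / inspection interval) inspection instants, wherever it starts.
    Integer day counts keep the floor division exact.
    """
    if pf_interval_days <= 0 or inspection_interval_days <= 0:
        raise ValueError("intervals must be positive")
    return pf_interval_days // inspection_interval_days


pf = 90  # hypothetical P-F interval, in days
for interval in (120, 90, 45, 30):
    n = guaranteed_inspections(pf, interval)
    verdict = "may miss the warning period" if n == 0 else f"at least {n} chance(s) to detect"
    print(f"inspect every {interval:>3} days: {verdict}")
```

A guaranteed count of one can still leave too little time to act, since the single inspection may fall near point "F"; this is one reason the inspection interval should be significantly shorter than the P-F interval rather than merely equal to it.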
Sethiya [11] has also provided descriptions of some general condition monitoring categories:
• Temperature Measurement. Helps detect potential failures related to a temperature change. Measured temperature changes can indicate problems such as excessive friction, degraded heat transfer, and poor electrical connections.
• Dynamic Monitoring. Measures and analyzes energy emitted from equipment in the form of waves, such as vibration, pulses and acoustic effects. Measured changes in equipment vibration characteristics can indicate problems such as wear, imbalance, etc.
• Oil Analysis. Can be performed on different types of oils, such as lubrication, hydraulic or insulation oils. It can indicate problems such as machine degradation, contamination, and improper consistency.
• Corrosion Monitoring. Helps provide an indication of the extent of corrosion, the corrosion rate and the corrosion state of the material.
• Non-destructive Testing. Involves performing tests that are non-invasive to the test subject. Many of the tests can be performed while the equipment is online.
• Electrical Testing and Monitoring. Involves measuring changes in system properties such as resistance, conductivity, dielectric strength, etc. These measurements help detect problems such as electrical insulation deterioration, broken motor rotor bars and shorted motor stator laminations.
• Observation and Surveillance. Such methods are based on human sensory capabilities and serve as a supplement to other condition-monitoring techniques. They help detect problems such as worn parts, poor electrical connections, and various forms of leaks.
• Performance Monitoring. Predicts problems by monitoring changes in variables such as pressure, temperature, flow rate, power consumption and/or equipment capacity.
In Figure 8, Sethiya [11] illustrates the relationship between failure rates and the change in maintenance philosophy. The figure shows failure rates declining from corrective maintenance to predictive maintenance; it also indicates the strengths and weaknesses of the maintenance types.
Figure 8: Failure rates vs. change in maintenance philosophy; Source: Sethiya [11]
A standardization proposal for condition-based maintenance architecture has been in the works in the form of the Open System Architecture for Condition Based Maintenance (OSA-CBM). The OSA-CBM organization mission statement (www.osacbm.org) covers a wide range of functions of a CBM system, for both hardware and software components. The proposed standard divides a CBM system into seven interconnected layers, listed below and shown in Figure 9.
Figure 9: Interconnected Layers in a CBM System; Source: Sethiya [11]
Layer 1 Sensor Module: The sensor module provides the CBM system with digitized sensor or transducer data.
Layer 2 Signal Processing: The signal processing module receives signals and data from the sensor module or from other signal processing modules. Its output includes digitally filtered sensor data, frequency spectra, virtual sensor signals and other CBM features.
Layer 3 Condition Monitor: The condition monitor receives data from the sensor and/or signal processing modules and from other condition monitors. Its primary focus is to compare data with expected values (e.g. vibration high). It also generates alerts based on preset operational limits.
Layer 4 Health Assessment: The health assessment module receives data from different condition monitors. Its purpose is to determine whether the health of the monitored component or system has degraded. The module should generate diagnostic records and propose fault possibilities. The diagnosis is based on factors such as trends in the health history, the operational environment, etc.
Layer 5 Prognostics: The prognostic module captures data from all the prior layers. Its purpose is to calculate the asset’s future health, taking into account future usage profiles. The module can report the future health status at a specified time, or the remaining useful life of the asset.
Layer 6 Decision Support: The decision support module receives data from the health assessment module and the prognostic module. Its purpose is to generate recommended actions and alternatives. The actions can relate to maintenance, but also to how to run the asset until the current mission is completed without a breakdown.
Layer 7 Presentation: The presentation module can present data from all previous modules. The most important data to present come from the health assessment, prognostic and decision support modules, together with the alerts generated by the condition monitors. The ability to look even further down into the lower layers should also be available.
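The flow of data through the seven layers can be sketched as a chain of small functions. This is only an illustration of the layering idea under stated assumptions: the function names, the sample vibration values, the limits, and the moving-average and linear-extrapolation methods are all hypothetical and are not part of the OSA-CBM proposal.

```python
from statistics import mean


def sensor_module():
    """Layer 1: digitized sensor data (hypothetical vibration samples, mm/s)."""
    return [2.1, 2.3, 2.2, 2.8, 3.1, 3.4, 3.9, 4.4]


def signal_processing(samples, window=4):
    """Layer 2: a simple moving average as the extracted feature."""
    return [mean(samples[i:i + window]) for i in range(len(samples) - window + 1)]


def condition_monitor(features, limit=3.0):
    """Layer 3: compare the latest feature with a preset operational limit."""
    return {"value": features[-1], "alert": features[-1] > limit}


def health_assessment(features, status):
    """Layer 4: judge degradation from the alert and the feature trend."""
    steps = max(len(features) - 1, 1)
    trend = (features[-1] - features[0]) / steps
    return {"degraded": status["alert"] and trend > 0, "trend_per_step": trend}


def prognostics(features, health, failure_level=5.0):
    """Layer 5: linear extrapolation of steps left until an assumed failure level."""
    slope = health["trend_per_step"]
    if slope <= 0:
        return None  # no upward trend, so no estimate is produced
    return max(0.0, (failure_level - features[-1]) / slope)


def decision_support(health, remaining_steps):
    """Layer 6: turn the assessment into a recommended action."""
    if health["degraded"] and remaining_steps is not None and remaining_steps < 10:
        return "schedule maintenance"
    return "continue monitoring"


def presentation(status, health, remaining_steps, action):
    """Layer 7: present results from the layers below."""
    return (f"value={status['value']:.2f} alert={status['alert']} "
            f"degraded={health['degraded']} remaining={remaining_steps} action={action}")


samples = sensor_module()
features = signal_processing(samples)
status = condition_monitor(features)
health = health_assessment(features, status)
remaining = prognostics(features, health)
action = decision_support(health, remaining)
print(presentation(status, health, remaining, action))
```

Keeping each layer as a separate function with plain inputs and outputs mirrors the intent of a layered architecture: any one layer can be replaced (for example, a better prognostic method) without changing the others.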
Wireless Technologies Enable Condition-based Maintenance
Condition-based maintenance has become viable with the use of wireless technologies. Just as people have become connected by the Internet, so too is equipment becoming connected to the network. Much of the equipment being connected has been in operation for several years, so its connectivity must be retrofitted. According to Stargardt [12], wireless technologies can now provide the automatic and continuous connections that wires provide, with reliability approaching that of the wired world.
A number of concerns kept wireless technologies from being used in the past. A major issue was the perception that wireless connections were unreliable. Other concerns were the limited range of wireless connections, as well as lower throughput and higher latency than wired connections. Security was also raised as an issue, since wireless signals could be received across a broad area. In addition, wireless connections were more costly than simple wired connections. Advances in the last decade have overcome most of these concerns. The major advances in wireless for equipment monitoring are in the use of low-cost, low-power, unlicensed wireless technology. Sensor networks, a relatively new development, are yet another connectivity alternative. These networks use some version of a mesh architecture, in which one radio’s data is relayed to the final destination by other radios (motes), usually several of them. These networks incorporate sophisticated intelligence that allows them to configure themselves automatically.
Condition-based maintenance (CBM) now takes advantage of the automated management capabilities of computerized maintenance management systems (CMMS). It makes use of the equipment list that already exists in the CMMS database, along with detailed information on each asset and the maintenance strategy selected for that asset.
Stargardt [12] has discussed how a maintenance strategy can be implemented in such a scenario, as detailed in Figure 10. In the CMMS scenario, readings from sensors and instruments on the monitored equipment are communicated to the CMMS continuously, or frequently enough. This data is analyzed automatically by the CMMS to evaluate the condition of the asset. The analysis is performed using rules or algorithms programmed into the CMMS, based on the known failure modes and their early warning indicators for each piece of equipment or asset.
Figure 10: Maintenance Strategy in a CBM System; Source: Stargardt [12]
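The rule-based analysis described above can be sketched as a lookup of readings against a table of early-warning limits. The sketch is illustrative only: the rule table, parameter names, limits, asset identifier and function name are hypothetical and do not describe any particular CMMS product.

```python
# Hypothetical rule table: each rule links an early-warning indicator
# to a known failure mode for one asset type.
RULES = [
    {"parameter": "bearing_temp_c", "limit": 85.0, "failure_mode": "bearing overheating"},
    {"parameter": "vibration_mm_s", "limit": 7.1, "failure_mode": "imbalance or wear"},
    {"parameter": "oil_particles_ppm", "limit": 200.0, "failure_mode": "lubricant contamination"},
]


def evaluate(asset_id, readings, rules=RULES):
    """Return one finding per rule whose limit is exceeded by a reading."""
    findings = []
    for rule in rules:
        value = readings.get(rule["parameter"])
        if value is not None and value > rule["limit"]:
            findings.append({
                "asset": asset_id,
                "failure_mode": rule["failure_mode"],
                "parameter": rule["parameter"],
                "value": value,
                "limit": rule["limit"],
            })
    return findings


readings = {"bearing_temp_c": 91.5, "vibration_mm_s": 4.2}
for finding in evaluate("PUMP-07", readings):
    print(finding)
```

Parameters with no reading are skipped rather than treated as healthy or faulty, so a missing sensor does not by itself raise or suppress a finding.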
The CMMS can perform the tedious task of monitoring machine health to detect the exception conditions that indicate the need for maintenance, and it performs this task more reliably and economically. When the assessment of an asset’s condition indicates a need for maintenance, the CMMS automatically generates a work order for that maintenance. To complement this, implementations of Enterprise Asset Management (EAM) systems have recently been reported; these can help organizations manage assets over their entire life cycle, from design, procurement and commissioning through to retirement and/or disposal. The EAM system assists organizations in deciding which assets deserve the investment in condition-based maintenance, which require the attention of preventive maintenance, and which should be operated as “run to failure.”
CBM uses real-time data, which requires the installation of monitoring devices on equipment to measure degradation. As Ellis [13] indicates, whenever the monitor detects degradation, a message can be transmitted to a CMMS. It should be noted, however, that a trade-off exists between the costs and benefits of real-time monitoring. For example, performing frequent maintenance inspections results in high labour cost; conversely, infrequent maintenance inspections might lead to asset degradation and the possibility of premature failures. Hence, the costs and benefits of remote monitoring should be compared with the costs and benefits of frequent or infrequent maintenance inspections. Accordingly, an effective CBM strategy involves a good understanding of asset criticality, failure modes and the total cost of failures; knowing what to monitor for a given asset requires reliability and finance related information. In the final analysis, a decision on CBM should be based on asset criticality (safety, environmental and operational impact) and cost (failure rates). A number of techniques, such as failure modes, effects and criticality analysis (i.e. the possible ways that something can fail) and reliability centered maintenance (i.e. the consequences of failure), are useful in choosing cost-effective maintenance strategies.
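The comparison between remote monitoring and inspection regimes can be sketched with a simple additive cost model. Every figure below is hypothetical, and the model (fixed cost plus inspection labour plus expected failure cost) is an assumption made for illustration rather than a method given by Ellis [13].

```python
def annual_cost(fixed_cost, inspections_per_year, cost_per_inspection,
                failures_per_year, cost_per_failure):
    """Expected yearly cost of one maintenance option (simple additive model)."""
    return (fixed_cost
            + inspections_per_year * cost_per_inspection
            + failures_per_year * cost_per_failure)


# All figures below are hypothetical and only show how the options compare.
options = {
    "remote monitoring": annual_cost(6000, 0, 0, 0.05, 40000),
    "frequent inspection": annual_cost(0, 24, 300, 0.10, 40000),
    "infrequent inspection": annual_cost(0, 4, 300, 0.40, 40000),
}

for name, cost in sorted(options.items(), key=lambda item: item[1]):
    print(f"{name:<22} {cost:>10.0f}")
best = min(options, key=options.get)
print("lowest expected cost:", best)
```

With a low cost of failure, the ranking can reverse and infrequent inspection (or run-to-failure) becomes the cheapest option, which is why asset criticality and failure cost have to be known before choosing what to monitor.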
Intelligent System for Condition-based Maintenance Management
Artificial intelligence and expert systems can also play a role in the creation of a CBM system. According to Yam et al [14], intelligent systems that may be used for condition-based fault diagnosis essentially fall into three categories: rule-based diagnostic systems, model-based diagnostic systems and case-based diagnostic systems. Rule-based diagnostic systems detect and identify equipment faults in accordance with rules that represent the relation of each possible fault to the corresponding condition. A model-based diagnostic system uses various mathematical, neural network and logical methods to improve diagnostic reasoning based on the structure and properties of the equipment system; it compares the real monitored condition with the model of the object in order to predict the fault behaviour. Case-based diagnostic systems use historical records of maintenance cases to provide an interpretation for the actual monitored conditions of the equipment. A record of all previous incidents and equipment malfunctions, along with their maintenance solutions, is stored in a computer and can be used to identify a historical case that closely matches the current condition. If a fault similar to a stored case occurs, the case-based diagnostic system will automatically pick up a suitable maintenance solution from the case library.
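The case-based approach can be sketched as a nearest-match search over a case library. This is a minimal illustration under stated assumptions: the case library, the scaling factors, the distance threshold and the function names are hypothetical, and the scaled Euclidean distance is just one possible similarity measure, not the one used by Yam et al [14].

```python
import math

# Hypothetical case library: monitored conditions with the remedy that worked.
CASE_LIBRARY = [
    {"conditions": {"temp_c": 95.0, "vibration_mm_s": 3.0}, "remedy": "replace cooling fan"},
    {"conditions": {"temp_c": 70.0, "vibration_mm_s": 9.5}, "remedy": "rebalance rotor"},
    {"conditions": {"temp_c": 88.0, "vibration_mm_s": 8.0}, "remedy": "replace bearing"},
]

# Assumed scales used to put the parameters on a comparable footing.
SCALES = {"temp_c": 10.0, "vibration_mm_s": 1.0}


def distance(a, b):
    """Scaled Euclidean distance between two sets of monitored conditions."""
    return math.sqrt(sum(((a[k] - b[k]) / SCALES[k]) ** 2 for k in SCALES))


def closest_case(current, library=CASE_LIBRARY, max_distance=2.0):
    """Return the most similar stored case, or None if nothing is close enough."""
    best = min(library, key=lambda case: distance(current, case["conditions"]))
    if distance(current, best["conditions"]) > max_distance:
        return None
    return best


current = {"temp_c": 90.0, "vibration_mm_s": 7.6}
match = closest_case(current)
print(match["remedy"] if match else "no similar case; fall back to other diagnostics")
```

Returning `None` when no stored case is close enough leaves room for a rule-based or model-based diagnosis to take over when the case library has no good match.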
Yam et al [14] have also developed an intelligent predictive decision support system (IPDSS) for condition-based maintenance (CBM). It supplements the conventional CBM approach by adding the capability of condition-based fault diagnosis and the power of predicting the trend in equipment deterioration. Its underlying model is based on a recurrent neural network (RNN), and was developed and tested for the critical equipment in a power plant. The IPDSS model is reported to have provided reliable fault diagnosis and strong predictive power for the trend of equipment deterioration. The results may be used as input to an integrated maintenance management system to pre-plan and pre-schedule maintenance work, to reduce inventory costs for spare parts, to cut down unplanned outages and to minimize the risk of catastrophic failures.
Figure 11: IPDSS model; Source: Yam et al [14]
Figure 11, based on Yam et al [14], shows some features of the IPDSS model. When the monitored equipment operates under normal conditions, only minimal routine maintenance is required. When the equipment parameters reach a base level, the equipment goes into a degrading condition and a fault development trend analysis is carried out; this analysis can indicate the possible problem areas in which an incipient fault has occurred or is likely to occur. Under a degrading condition, no special maintenance action is required, but more monitoring should be carried out to avoid an emergency event. When the monitored equipment parameters exceed a set level, alarms are activated to alert the maintenance staff. The case-based diagnostic system module of the IPDSS is then activated to look for a similar situation in the maintenance case library. If an equipment fault occurs again, the case-based diagnostic system will pick up the maintenance advice, in trouble-cause-remedy form, from the case library. If the warning for this condition does not match any in the maintenance case library, the IPDSS will activate the rule-based and/or model-based fault diagnostic system module. The rule-based diagnostic system can detect and identify incipient faults according to the rules representing the relations between each possible fault and the actual monitored equipment condition. Once a component is diagnosed as the source of incipient failure, the prediction of the equipment deterioration trend can be activated to assess the remaining life of the equipment. The result of the prediction is then used to arrange a planned maintenance action prior to the equipment’s eventual failure. When the monitored equipment parameters reach a predefined protective level, a complete shutdown can be triggered; this is equivalent to a major fault in the equipment. Under such a serious condition, urgent corrective maintenance action would be initiated.
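The level logic described for the IPDSS can be sketched as a classification of one monitored parameter against three ascending levels. The sketch assumes that a higher reading means a worse condition; the function name, the level values and the wording of the returned actions are hypothetical and are not taken from Yam et al [14].

```python
def ipdss_condition(value, base_level, alarm_level, protective_level):
    """Classify one monitored parameter against three ascending levels."""
    if not base_level < alarm_level < protective_level:
        raise ValueError("levels must be strictly ascending")
    if value >= protective_level:
        return "shutdown: urgent corrective maintenance"
    if value > alarm_level:
        return "alarm: run fault diagnosis and plan maintenance"
    if value >= base_level:
        return "degrading: increase monitoring and analyse the trend"
    return "normal: routine maintenance only"


# Hypothetical levels and readings for a single parameter.
for reading in (40, 55, 75, 95):
    print(reading, "->", ipdss_condition(reading, base_level=50, alarm_level=70,
                                          protective_level=90))
```

The comparisons follow the wording of the description: the degrading and protective conditions begin when a level is reached, while the alarm is raised only when the set level is exceeded.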
Commercial CBM Systems
The modern approach to maintenance represents systems thinking, and is a shift from the fragmented technologies of the past. CBM can detect the current state of mechanical systems and predict the systems' ability to perform without failure. It uses the stressor levels established during the equipment design process, measures suitable parameters to quantify the existing stressor levels, and can correct operating environments to make these levels compatible with economic production versus equipment lifetimes. CBM replaces arbitrarily timed maintenance with scheduled maintenance warranted by the equipment condition. It advocates the analysis of equipment condition data to allow the planning and scheduling of maintenance activities or repairs before functional failure. While CBM can be implemented in single steps, its greatest potential is realized when it is applied evenly across an entire asset class, employing the full range of maintenance concepts. Against this background, commercial systems have started appearing in the market; a utility company example, extracted from reference [15], is given in Figure 12.
To give some context for this example, a utility company must maintain several facilities, hundreds of substations and distribution circuits, several hundred thousand distribution poles and transformers, and millions of electric meters. An installed computer system monitors each and every substation, circuit, pole, transformer, and meter. The utility expects the site to notify personnel when systems or assets deviate from acceptable performance, driven by automated analysis, triggers or notifications. Affected users can expect to receive an email notification of an event without delay, to respond to that email, and to have the website acknowledge that the issue has been addressed. Users may also want to fine-tune configuration and processes according to external factors and conditions to achieve greater asset efficiency, and to be able to create reports that summarize conditions and the results of CBM practices as needed.
Figure 12: CBM architecture for a utility company; Source: [15]
Conclusion
The condition-based maintenance (CBM) approach reduces maintenance uncertainty by basing maintenance on the needs indicated by the equipment condition. The monitoring process involves the collection and interpretation of the relevant equipment parameters in order to identify deviations and changes of the equipment state from its normal condition. The parameters in this context represent a set of characteristics that indicate the actual equipment condition. Any abnormality in these characteristics indicates the occurrence of some sort of functional failure. A built-in fault diagnosis scheme can be activated by the detection of such an abnormal condition; it recognizes and analyses the symptomatic information, identifies the root causes of a failure, infers the fault development trend, and predicts the remaining life of the equipment. By monitoring the operating conditions of the defective item, future key symptoms associated with the deterioration of the equipment can be predicted. With this kind of monitoring system, an advanced alarm can be activated when the predicted value falls within an alarm band. This helps the system operators to take adequate action to check the condition of the equipment and repair the defects prior to a total breakdown.
It should be mentioned that the cost of data collection is a big issue in the implementation of condition-based maintenance procedures. Though data from real life experience can boost the confidence level in the model to be used, only a few operators will be prepared to let components run to a total failure, while often the replaced components may not be in a “failed” state yet. It is important to include the cost of data collection in a CBM decision model, and hence it will be wise to weigh whether monitoring will be effective in reducing the total costs. The costs of data collection should be balanced against the expected gains. The increasing use of CBM may also place new demands on maintenance management information systems. These systems need to schedule maintenance activities in a dynamic manner, since the execution times of certain activities will be updated frequently as condition information becomes available. Hence, carrying out a cost-benefit analysis will become an essential part of any condition-based maintenance strategy.
References
1. “Moving to Condition-Based Maintenance (CBM)”, http://www.nextgenpe.com/article/Moving-to-Condition-Based-Maintenance-CBM/
2. Mitchell T Rausch, “Condition based Maintenance of a single system under Spare Part Inventory Constraints”, Master's Thesis, Wichita State University, August 2008.
3. P. A. Scarf, “A Framework for Condition Monitoring and Condition Based Maintenance”, Quality Technology & Quantitative Management, Vol. 4, No. 2, pp. 301-312, 2007.
4. A. Grall, C. Bérenguer and L. Dieulle, “A condition-based maintenance policy for stochastically deteriorating systems”, Reliability Engineering & System Safety, Volume 76, Issue 2, May 2002, Pages 167-180.
5. Satoru GOTO, Yuhki ADACHI, Sinji KATAFUCHI, Toshihiko FURUE, Yoshitaka UCHIDA, Mitsuhoro SUEYOSHI, Hironori HATAZAKI and Masatoshi NAKAMURA, “On Line Deterioration Prediction and Residual Life Evaluation of Rotating Equipment Based on Vibration Measurement”, SICE Annual Conference, August 20-22, 2008, The University Electro-Communications, Japan.
6. Susan Lu, Yu-Chen Tu and Huitian Lu, “Predictive Condition-based Maintenance for Continuously Deteriorating System”, Qual. Reliab. Engng. Int. 2007; 23:71–81.
7. E Deloux, B Castanier, and C Bérenguer, “Maintenance policy for a deteriorating system evolving in a stressful environment”, Proc. IMechE Vol. 222 Part O: J. Risk and Reliability, 2008.
8. Ling Wang, Enhui Zheng, Yuntang Li, Binrui Wang, Jinjin Wu, “Maintenance Optimization of Generating Equipment Based on a Condition-based Maintenance Policy for Multi-unit Systems”, Chinese Control and Decision Conference (CCDC 2009), IEEE 2009.
9. P. Naga Srinivasa Rao and V.N. Achutha Naikan, “An Optimization Methodology for Condition Based Minimal and Major Preventive Maintenance”, Economic Quality Control, Vol 21 (2006), No. 1, 127–141.
10. Kamran Shahanaghi, Hamid Babaei, Arash Bakhsha, Nasser S. Fard, “A new condition based maintenance model with random improvements on the system after maintenance actions: Optimizing by monte carlo simulation”, World Journal of Modelling and Simulation, Vol. 4 (2008) No. 3, pp. 230-236.
11. S.K.Sethiya, “Condition Based Maintenance (CBM)”, Secy.toCME/WCR/JBP
12. Wayne Stargardt, “Condition-Based Maintenance Using Wireless Monitoring: Developments and Examples”, http://reliabilityweb.com/art08/cbm_wireless.htm
13. Byron A. Ellis, “The Challenges of Condition Based Maintenance”, http://www.jethroproject.com/The Challenges of Condition Based%2 0Maintenance.pdf
14. R. C. M. Yam, P. W. Tse, L. Li and P. Tu, “Intelligent Predictive Decision Support System for Condition-Based Maintenance”, Int J Adv Manuf Technol (2001) 17:383–391.
15. Osisoft.com, “Condition-based Maintenance (CBM) Across the Enterprise”, OSIsoft, Inc., 777 Davis Street Suite 250, San Leandro, CA 94577
Condition-based Maintenance Management in Critical Facilities
Summary
A facility management strategy requires that an organization’s major operational concerns are dealt with, such as: avoiding the risk of catastrophic failures and eliminating any forced outage of its equipment; planning maintenance for equipment that operates in a complex environment; and reducing the quantity of spare parts and the associated inventory costs. To put this into perspective, it is well known that many systems suffer increasing wear with usage and age, and are subject to random failures that are linked to asset deterioration. Examples of such items include cutting tools, hydraulic structures, brake linings, turbine blades, and rotating equipment. In all these cases, various physical deterioration processes can be observed, such as cumulative wear, bearing wear, crack growth, erosion, corrosion, fatigue, and so on. The deterioration and failure of such systems might incur safety hazards, as well as high operational costs (e.g. due to production losses and delays, or unplanned intervention on the system). As a result, preventive maintenance becomes necessary in order to replace the deteriorated system before it fails. If the deterioration of the system, or a parameter strongly correlated with the state of that system, can be directly measured (via vibration analysis, wear monitoring, corrosion level, etc.), and if the system ceases to function when it deteriorates beyond a given threshold level, then it is appropriate to base maintenance decisions on the actual deterioration state of the system rather than on its age. This leads to the choice of a condition-based maintenance (CBM) policy. CBM techniques provide an assessment of the system’s condition, based on data collected from the system through continuous monitoring or via inspections. The main purpose is to determine the required maintenance plan prior to any predicted failure.
CBM has been proven to minimize the cost of maintenance, improve operational safety and reduce the severity and number of in-service system failures. This report complements an earlier IRC report # RR 284 titled “A Review of Existing Tools and their Applicability to Facility Maintenance Management” submitted by the same author.
The Role of Maintenance in Facility Management
As indicated in reference [1], modern maintenance management strategies have evolved over time, as organizations seek to ensure high asset reliability and availability with only a limited maintenance investment. Arriving at a maintenance strategy for individual assets that meets the enterprise objectives at minimal cost still remains a challenge. Figure 1, extracted from reference [1], depicts the various types of maintenance strategies practiced today.
Corrective Maintenance or ‘Run-to-failure’
For a long time, organizations have practised a “run to failure” maintenance strategy, in which an asset is operated until it fails or breaks. Maintenance action, which involves repair or replacement, is taken with the intention of correcting the fault. For many non-critical assets, this is still considered a reasonable and logical operating strategy.
Preventive Maintenance
Failures of many assets have expensive and far-reaching consequences. These failures can shut down entire production lines, make buildings unusable, or cause accidents. It is imperative that these types of failures are prevented. As such, a different type of maintenance strategy has evolved over time, widely known as preventive maintenance. This involves looking at the asset failure history, and instigating maintenance to “fix” it before there is a high probability of its failing. This strategy ensures high asset availability and minimizes unplanned downtime. For many critical assets, preventive maintenance eliminates the severe consequences of failures; however, the benefits of preventive maintenance come at a price. Generally, the preventive strategy advises that maintenance be performed more often than is absolutely necessary. As the maintenance incurs costs in both labour and parts, this strategy can result in “over-maintenance”. In addition, preventive maintenance usually requires that assets be taken off-line for servicing, which in turn incurs cost due to downtime and lost production.
Figure 1: Maintenance Types; Source: Reference [1]
Elapsed operating time has a major influence on preventive maintenance, as in the case of changing the lubricant in a passenger car. Typically, most people change the engine oil in their vehicles every 5,000 to 8,000 kilometers, with no particular concern given to the actual condition and performance capability of the oil. If the owner of the car disregarded the vehicle run time, and instead had the oil analyzed at some interval to determine its actual condition and lubrication properties, he/she might be able to postpone the oil change until the vehicle had traveled 10,000 kilometers.
Mitchell T. Rausch [2] has provided a detailed review of preventive (sometimes called time-based) maintenance approaches, listed below along with some of their characteristics:
• Failure rate limit policies initiate maintenance when the system has reached a predetermined failure rate. State variables such as wear, stress or damage are monitored to update the failure rate function. When the failure rate reaches a predetermined value, preventive maintenance activities are commenced.
• Sequential maintenance policies initiate maintenance according to unequal preventive maintenance time intervals; as the age of the component increases, the elapsed time between maintenance activities is reduced.
• Repair limit policies utilize a cost basis to decide on the action taken when a component fails. When it fails, the cost of repair is compared to the cost of replacement. The item is repaired if the cost of repair is less than the cost of replacement; otherwise it is replaced.
• Repair number counting policies allow the component to fail n times before it is replaced. The failures up to n-1 are mitigated with minimal repair.
• Repair number counting and reference policies enhance the repair number counting policies (above) by adding a variable T that represents a positive operating time. Under the policy, the component is allowed to fail n times, but is not replaced at the nth failure if the operating time has not reached the predetermined value T; in that case the component is minimally repaired and then replaced at the (n+1)th failure.
• Opportunistic maintenance policies address dependencies that occur in large systems. Failure of a component within a large system of components may require the removal of non-failed components to access the failed component. So, there is an opportunity to repair or replace non-failed components according to criteria such as hazard rate or cost.
• Optimization of preventive maintenance policies is conducted by analyzing cost and system reliability measurements. The optimization approach generates preventive maintenance intervals that minimize costs or that ensure the desired system reliability is achieved.
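The decision rules behind the repair limit policy and the repair number counting and reference policy can be sketched in a few lines. This is only an illustration of the rules as described above, not code from Rausch [2]; the function names, argument names and returned labels are assumptions made for the example.

```python
def repair_limit_action(repair_cost: float, replacement_cost: float) -> str:
    """Repair limit policy: repair only if it is cheaper than replacing."""
    return "repair" if repair_cost < replacement_cost else "replace"


def repair_count_reference_action(failure_number: int, operating_time: float,
                                  n: int, T: float) -> str:
    """Repair number counting and reference policy.

    failure_number counts failures so far, including the current one
    (a convention assumed for this sketch).
    """
    if failure_number < n:
        # Failures 1 .. n-1 are always handled with minimal repair.
        return "minimal repair"
    if failure_number == n and operating_time < T:
        # The nth failure came before operating time T was reached:
        # repair minimally and replace at the next failure instead.
        return "minimal repair"
    return "replace"
```

For example, with n = 3 and T = 100 hours, a third failure at 50 hours would be minimally repaired, while a third failure at 150 hours would lead to replacement.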
Predictive vs. Condition-based Maintenance (CBM)
In recent times, requirements for performance have risen, and the downtime allocated for routine maintenance (preventive replacement, inspection, etc.) has been squeezed. Because preventive maintenance has become expensive, organizations have been developing a different type of maintenance strategy. Under this strategy, an asset’s condition is monitored frequently (or continuously) until it begins to show evidence of deteriorating performance or an incipient failure. Maintenance is then performed in time to prevent an imminent failure. Compared with preventive maintenance, the new strategy (known as predictive maintenance or condition-based maintenance) reduces overall maintenance costs while providing better asset availability and performance. Condition monitoring can reduce the uncertainty operators feel about the current state of an asset. For example, knowledge of the vibration levels of a certain critical bearing can give operators confidence about its operation. Condition-based maintenance uses real-time information on the condition of the asset to identify when maintenance is actually necessary, and this allows the maintenance to be deferred until it is needed.
Though the terms “predictive” and “condition-based” are often used interchangeably, current thinking is that there is a difference between predictive maintenance and condition-based maintenance (CBM) strategies. Predictive maintenance is activated by the analysis of equipment condition data that is gathered periodically, often manually. This contrasts with the CBM approach, in which equipment condition data is collected continuously and analyzed in real time. At the same time, it should be noted that continuous data collection requires the installation of sensors on the equipment, as well as the means of collecting and analysing the data.
CBM is more suitable and logical in critical facilities, especially in process industries like refineries and power plants.
Condition-based maintenance can be initiated according to the state of a degrading system, which is monitored through characteristic measures that describe that state. Once a degradation characteristic crosses a specified threshold, maintenance actions may be triggered. Degradation measures must be identified that effectively relate the state of the system (or component) to its remaining useful life, along with a decision on a failure threshold and an assessment of the feasibility of implementing condition monitoring technology. Mitchell T. Rausch [2] has listed the most common monitoring methods practised today, which include:
• Vibration monitoring can detect wear, fatigue, misalignment and loose assemblies in rotating equipment such as bearings, gear boxes, pumps, motors and engines. Vibration readings are collected over time and compared to a baseline and alarm limits. Maintenance personnel are alerted when the readings tend toward the alarm limits.
• Process parameter monitoring involves tracking a variety of operational characteristics, such as process efficiency, system temperature, electrical current, and pressure, that can be linked to the health status of the system.
• Thermography is a method of capturing the infrared emissions of a component to determine if the operating temperature is fluctuating outside of the normal range; abnormal temperature changes can be a symptom of an upcoming failure.
• Tribology is the study of the effects of friction between two mating surfaces. Friction generates particulate that can be monitored through wear particulate analysis. Lubrication analysis can be conducted to determine the appropriate time to change lubricating fluids in a system.
• Visual inspection is an easy method to implement; it involves identifying loose components, structural cracks, or any other abnormal characteristics.
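The comparison of vibration readings against a baseline and alarm limits can be illustrated with a small sketch. The function name, the status labels and the warn_fraction parameter (how far between baseline and alarm limit a reading must be before a warning is raised) are assumptions for the example, not values taken from the source.

```python
def vibration_status(readings, baseline, alarm_limit, warn_fraction=0.8):
    """Classify the latest vibration reading.

    warn_fraction is an illustrative assumption: a warning is raised once
    the latest reading has covered that fraction of the distance from the
    baseline to the alarm limit and the readings are trending upward.
    """
    latest = readings[-1]
    warn_level = baseline + warn_fraction * (alarm_limit - baseline)
    rising = len(readings) >= 2 and readings[-1] > readings[0]
    if latest >= alarm_limit:
        return "alarm"
    if latest >= warn_level and rising:
        return "warning"
    return "normal"
```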
Degradation Methods and Models for Condition-based Maintenance
Selecting a suitable model to be used in a CBM scheme is not an easy task. The selection should be based on the ability of the model to accurately describe the degradation process and to extrapolate the component state into effective maintenance decisions. Mitchell T. Rausch [2] has included many of the popular models in his research work. In this regard, the degradation model must ensure that the physical degradation phenomenon is captured in the most realistic and practical way available for implementation. Degradation measurements traverse downward (or upward) toward a failure threshold, and the system is considered to have failed at the time when the measured value crosses the predetermined failure threshold.
Failure mechanisms for the specific system must be understood thoroughly so that an appropriate degradation model can be developed for use. Typically, there are continuous time, discrete time, continuous state, and discrete state degradation representations. Many of the discrete state/time methods involve Markov methods that require definition of discrete degradation states and state transition probabilities. The discrete state/time degradation models are less realistic in practical applications when compared with the continuous state/time representation. The continuous state/time model can describe true operational conditions and hence such models are more effective at describing degradation. Some of the discrete models include Markov chains and Markov decision processes, while some of the continuous degradation models include polynomials, cumulative damage, Brownian motion and gamma processes.
Markov chains represent a discrete time and discrete state stochastic process and are utilized to describe state transitions mathematically. To be used in a condition-based maintenance application, the degradation phenomenon may be defined according to discrete states and can be modeled with a Markov chain. To use Markov methods, multiple states must be identified which can be a challenging task in itself. Also, Markov methods require transition probabilities between states that can be difficult to define in practice.
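As a minimal sketch of a discrete-time, discrete-state Markov chain used as a degradation model, the following propagates a state distribution through a transition matrix. The three states and all transition probabilities are hypothetical values chosen for illustration.

```python
def step(distribution, P):
    """One step of a discrete-time Markov chain: new = distribution x P."""
    n = len(P)
    return [sum(distribution[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(distribution, P, steps):
    """State distribution after the given number of time steps."""
    for _ in range(steps):
        distribution = step(distribution, P)
    return distribution


# Hypothetical states: 0 = good, 1 = degraded, 2 = failed (absorbing).
# Each row holds the transition probabilities out of one state and sums to 1.
P = [
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]
start = [1.0, 0.0, 0.0]          # the component starts in the 'good' state
after_10 = distribution_after(start, P, 10)
print("P(failed within 10 steps) =", round(after_10[2], 3))
```

The difficulty noted above shows up directly here: both the choice of states and every entry of P have to be supplied before the model can be used.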
Markov decision processes represent a discrete state, continuous time stochastic process that is defined by a set of states in which multiple actions are available to the decision maker at each state. State transition probabilities are defined for each state ‘s’ and action ‘a’, and these establish the probability of transition to the next state. For each state traversed, the decision maker receives a reward that depends on the decision made for that time period. Unlike Markov chains, Markov decision processes utilize actions and rewards.
Neural networks may also be used to monitor and forecast degradation trends that are linked to maintenance decisions. A neural network is a set of nodes that perform computations and are arranged in patterns similar to biological neural nets. The processing elements are connected through synapses with associated weights that modify a signal as it propagates through the connections. The network learns from its environment by adjusting the synaptic weights.
Kalman filters provide a recursive method to collect indirect measurements and describe a system state parameter through the measurement. The gamma process and Brownian motion represent continuous time/state stochastic processes that define an increment between two time periods. For modeling, the degradation increment between two time intervals may be defined by gamma process or Brownian motion. The degradation increment for the gamma process is defined by gamma distribution, while the Brownian motion is defined by a normal distribution.
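The difference between the two continuous processes can be seen by simulating their increments. The sketch below draws gamma-distributed increments (always positive, so the path never decreases) and normally distributed increments (so the path can move in either direction). The parameter values and function names are assumptions for illustration.

```python
import random


def gamma_path(n_steps, shape, scale, rng):
    """Cumulative degradation with independent gamma-distributed increments."""
    level, path = 0.0, [0.0]
    for _ in range(n_steps):
        level += rng.gammavariate(shape, scale)   # increment is always > 0
        path.append(level)
    return path


def brownian_path(n_steps, drift, sigma, rng):
    """Cumulative degradation with independent normal increments."""
    level, path = 0.0, [0.0]
    for _ in range(n_steps):
        level += rng.gauss(drift, sigma)          # increment may be negative
        path.append(level)
    return path


rng = random.Random(0)                            # fixed seed for repeatability
wear = gamma_path(50, shape=2.0, scale=0.5, rng=rng)
noisy = brownian_path(50, drift=1.0, sigma=0.5, rng=rng)
```

A gamma process is therefore a natural fit for monotone wear, while Brownian motion with drift suits indicators that fluctuate around a trend.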
Choosing an appropriate model for decisions on condition-based maintenance can be made easier by utilizing a structured approach, as suggested by Scarf [3] in the models listed below and in Figure 2. It is logical to place condition-based maintenance in the context of a general maintenance framework for a large complex system, which considers all elements of the system (machines, units, components) having failure characteristics. However, the framework also contends that if it is not possible to define a failure threshold, no condition indicator data are available at failure, and a warning threshold cannot be defined, then condition-based maintenance for the item is not feasible. In such cases, other maintenance policies should be explored: age-based, routine inspection, operate to failure, etc.
Proportional Hazards Models: In this, the age of a component is monitored, and replacement is initiated when the age reaches a critical level. The criticality may be expressed in terms of a hazard function. To choose the critical level optimally, information about the time to failure distribution for the component is required; one has to specify other criteria also, such as the mean time between component failures. If the component age is not monitored, replacements may be scheduled periodically, either according to some reliability-optimal considerations, or even according to maintenance budget constraints.
Figure 2: A binary decision tree for model selection in CBM; Source: Scarf [3]
Failure Threshold Models: This approach depends on being able to specify a failure threshold, c, for the component condition Y. If Y > c then the component is assumed to have failed. For simple cases (e.g. tire wear) Y may be directly observable, and it is sufficient to model this wear using an appropriate stochastic process. The problem is to assess the residual life or remaining time to failure. An estimate of the residual life distribution can be used to optimize the time to replacement. This replacement time can be updated dynamically as more condition information becomes available. Often, the decision would come down to making a choice between replacing before the next monitoring check, or otherwise. Where the process is measured continuously then the replacement issue becomes trivial, that is, to replace when Y becomes greater than c.
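One way to picture the residual life estimate is a small Monte Carlo sketch. It assumes, purely for illustration, that the condition Y grows by independent gamma-distributed increments between monitoring checks; the function names, the max_risk parameter and the parameter values are not from Scarf [3].

```python
import random


def residual_life_samples(y_now, c, shape, scale, n_samples=2000, seed=0):
    """Sample the number of monitoring periods until Y first exceeds c.

    Assumption: Y grows by gamma(shape, scale) increments each period.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        y, periods = y_now, 0
        while y <= c:
            y += rng.gammavariate(shape, scale)
            periods += 1
        samples.append(periods)
    return samples


def replace_before_next_check(y_now, c, shape, scale, max_risk=0.05):
    """Replace now if the estimated chance of failing before the next
    monitoring check exceeds max_risk (an illustrative tolerance)."""
    samples = residual_life_samples(y_now, c, shape, scale)
    risk = sum(1 for s in samples if s <= 1) / len(samples)
    return risk > max_risk
```

Each new condition reading changes y_now, so rerunning the estimate updates the replacement decision dynamically as described above.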
Two-Phase Failure Models: It is required to specify a warning threshold to proceed with the condition-based maintenance; when the condition is above the threshold then it is time for the component to be replaced. When the condition is monitored continuously, the decision problem becomes simple. If the condition indicator is monitored only periodically, and the condition indicator assumes the values 0 or 1, depending on whether the measured condition is above or below a threshold, then the issue is how often to monitor. Scarf [3] suggests using a two-phase or delay-time model to optimize the monitoring interval.
Grall et al [4] have described a system that undergoes random deterioration while being monitored through “perfect” inspections. When the system condition exceeds its failure level L, it enters a failed state and a corrective replacement is carried out. When the system state upon inspection is found to be greater than a given critical threshold, the still-functioning system is considered ‘worn-out’ and a preventive replacement is performed. The choice of the inspection dates and of the critical threshold value influences the economic performance of the maintenance policy. A low critical threshold leads to frequent preventive maintenance operations and prevents the full exploitation of the residual life of the deteriorated (but still functioning) system. On the other hand, a high critical threshold tends to keep the device working even in an advanced deterioration state, with an increased risk of failure. In practice, the monitoring inspections are performed at regular intervals or done continuously. It is evident that inspection dates and critical threshold values are the main decision variables in the problem of optimizing a CBM scenario. A conservative approach can lead to choosing a low threshold and inspecting more often than necessary, resulting in non-optimal maintenance policies. Grall et al [4] proposed two novel developments compared with other research works on condition-based maintenance modeling and optimization. First, they developed a model which allowed them to investigate the joint influence of the critical threshold value and the choice of inspection dates on the total cost of the maintained device; they showed that the long run expected maintenance cost per unit time can be minimized by an appropriate joint choice of these two decision variables.
Second, they did not impose a periodic ‘routine’ inspection scheme for the condition monitoring process (fixed intervals determined off-line) and they allowed irregular inspection dates: the next inspection date is dynamically updated on the basis of the present system condition revealed by the current inspection.
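The structure of such a policy (a failure level, a critical threshold, and an inspection date that depends on the observed condition) can be sketched as follows. The linear rule used to shorten the interval as deterioration grows is an assumption made for this example and is not the scheduling function of Grall et al [4].

```python
def inspect(x, failure_level, critical_threshold, max_interval, min_interval=1):
    """Decide the action at an inspection that reveals deterioration x.

    Returns (action, periods_until_next_inspection). The interval shrinks
    linearly from max_interval (new system) to min_interval (system at the
    critical threshold); this linear rule is an illustrative assumption.
    """
    if x > failure_level:
        return ("corrective replacement", None)
    if x > critical_threshold:
        return ("preventive replacement", None)
    fraction_used = x / critical_threshold
    interval = max_interval - fraction_used * (max_interval - min_interval)
    return ("continue", max(min_interval, round(interval)))
```

With a failure level of 10, a critical threshold of 7 and a maximum interval of 5 periods, a new system (x = 0) is next inspected after 5 periods, while a system at x = 3.5 is inspected again after 3.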
As shown in Figure 3 (Grall et al [4]), a single-unit system is considered that is subject to a continuous accumulation of wear over time. Its condition at time t is assumed to be completely described by a single scalar random variable X_t, which starts from zero at t = 0 (X_0 = 0). When X_t = 0 (after each replacement), the system is said to be in the ‘new’ state. The increments of X_t over a time interval are non-negative, stationary and statistically independent. When the deterioration process X_t exceeds a failure level L, a system breakdown occurs and the system is said to be in the ‘failed’ state. Grall et al [4] assumed that the deterioration process can be observed only at discrete equidistant times t_k = kΔt, where the unit time length Δt is either arbitrarily chosen or imposed by the considered maintenance problem. The stochastic process describing the condition of the system at times t_k is noted (X_k), k ∈ N, where X_k = X_{t_k}. For each k, the random deterioration X_k − X_{k−1} between two consecutive discrete time units is taken to have the same probability density function f. Since the stochastic process X_t has stationary, statistically independent increments, f belongs to the class of infinitely divisible distributions. The inverse of the mean deterioration rate between t_{k−1} and t_k is noted. In order to better characterize the deterioration process, it is taken that the maintenance decision at time t_k is made only on the basis of the average amount of deterioration reached at t_k, irrespective of how this average amount is obtained.
Figure 3: Continuous accumulation of Deterioration; Source: Grall et al [4]
Goto et al [5] have proposed an on-line method for deterioration prediction and residual life evaluation for the maintenance of rotating equipment. The status of the rotating equipment is inspected by vibration measurement, and a mathematical model of the deterioration is derived in order to predict the future condition of the equipment. When building the deterioration model, ‘noise’ or outliers in the vibration data caused by measurement errors are eliminated in order to improve the accuracy of the model. Figure 4a shows the flowchart of the deterioration management procedure; here, the deterioration management values of the rotating equipment are obtained on-line. The on-line deterioration management is divided into three parts: the first is outlier judgement on the deterioration management values, the second is deterioration prediction using the deterioration model, and the last is residual life evaluation using the deterioration prediction. The deterioration management of the rotating equipment is carried out through these three steps.
Figure 4b, taken from Goto et al [5], shows an example of the deterioration management values of rotating equipment in a thermal power plant. Some of these values deviate from the general trend of the data; such values are often referred to as “outliers”. Outliers cause errors in vibration diagnosis, because they disturb the deterioration tendency and cause the mathematical model of the deterioration management value to drift. Hence, the accuracy of the deterioration model can be improved if the outliers are eliminated from the deterioration management values.
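The three steps (outlier judgement, deterioration prediction, residual life evaluation) can be illustrated with a simple sketch. It does not reproduce the method of Goto et al [5]: outliers are judged here with a local median and median-absolute-deviation rule, and the deterioration model is a least-squares straight line. Both are stand-in techniques, and the window size, multiplier k and data values are assumptions.

```python
from statistics import median


def remove_outliers(values, window=5, k=3.0):
    """Step 1: keep (index, value) pairs close to the local median."""
    kept, half = [], window // 2
    for i, v in enumerate(values):
        local = values[max(0, i - half): i + half + 1]
        med = median(local)
        mad = median(abs(u - med) for u in local)
        if abs(v - med) <= k * max(mad, 1e-9):
            kept.append((i, v))
    return kept


def fit_line(points):
    """Step 2: least-squares line through (time, value) points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def residual_life(points, limit):
    """Step 3: time from the last point until the fitted line reaches limit."""
    slope, intercept = fit_line(points)
    if slope <= 0:
        return None                      # no upward trend, no prediction
    return max(0.0, (limit - intercept) / slope - points[-1][0])


readings = [1.0, 1.1, 1.2, 5.0, 1.4, 1.5, 1.6]   # hypothetical; 5.0 is a spike
clean = remove_outliers(readings)
remaining = residual_life(clean, limit=2.0)
```

Fitting the line to the raw readings instead of the cleaned ones would pull the trend toward the spike, which is the drift described above.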
Figure 4a: Deterioration Procedure; Figure 4b: Outliers in Deterioration; Source: Goto et al [5]
Other Models for Condition-based Maintenance
In most CBM modeling approaches, the deterioration measures are monitored (or inspected) and compared with a predefined threshold to make facility maintenance decisions. In contrast, Lu et al [6] describe what is called the predictive CBM (PCBM) approach, which predicts the future deterioration condition. An advantage of the PCBM model is that the degradation states are modeled as continuous states using a state-space model in which the state vector includes both the degradation level and the degradation rate, both of which influence maintenance decisions. The maintenance decisions are made according to the predicted degradation conditions and associated cost factors to enhance the profit produced by the system. A system’s deterioration condition is generally evaluated by one or more performance measures, which could be quantitative variables or signal features that are highly correlated with deterioration conditions.
There has been a great deal of research activity into monitoring the deterioration process. For instance, in a cutting operation, the feed spindle and feed motor currents are used to monitor the wearing condition of cutting tools with sensors. And it makes sense to model the dynamics of the deterioration process individually and make maintenance decisions on each, correspondingly. Also, it should be noted that the data coming out of sensors or monitoring devices may be contaminated by background noises, sudden disturbances, and/or seasonal environmental changes. Accordingly, the deterioration measures in applications may not be increasing (or decreasing) in a logical manner as they are supposed to be. Hence, data pre-processing becomes a necessary step to extract the hidden degradation features and, most importantly, the deterioration trend.
Many researchers have proposed approaches to optimize maintenance decisions based on different characteristics of the complete system. The performance of a maintenance strategy clearly depends on the level of knowledge available to characterize the failure process under consideration. Yet many approaches to maintenance optimization do not explicitly describe the relationship between the system performance and the associated operating environment. The environmental conditions can affect the deterioration rate of a system; for example, an excessive humidity level favours corrosion. Conversely, excessive deterioration of the operating system can change the environment; for example, a hairline crack in a roller can initiate harmful vibrations. One way of capturing the effect of a random environment on an item’s life span is to randomize its failure rate function and treat it as a stochastic process. One of the best-known approaches is the proportional hazard rate approach, which models the effect of the environment by introducing covariates in the hazard function. However, the estimation of the required parameters is a somewhat complex task.
Deloux et al [7] have described a condition-based maintenance decision framework to tackle potential variations in system deterioration, and especially in the deterioration rate. According to this work, the condition of the system at time t_k (t_k = kΔt, where the unit time length Δt is either arbitrarily chosen or imposed by the considered maintenance system) can be summarized by a scalar variable X_k, which increases as the system deteriorates. X_k can be the measure of a physical parameter linked to the resistance of a structure (for example, the length of a crack). The initial state corresponds to a perfect working state, i.e. X_0 = 0. The system ceases to fulfill its function as soon as the value of X_k is greater than a predetermined threshold level L. In this case, either a failure has occurred or a significant deterioration is present that substantially reduces the system performance.
Very little attention has been paid to the CBM modeling of deteriorating systems with multiple different units. Ling Wang et al [8] present a novel CBM approach (see Figure 5) for multi-unit systems (e.g. a generating unit) in which the deterioration processes of several units are modeled using continuous-time Markov chains. Dividing the system deterioration into several discrete states is more practical than describing the deterioration condition by a single scalar continuous variable. One can classify the equipment deterioration into various states, such as initial, minor deterioration, major deterioration, and failure. For most equipment, two kinds of failure can be assumed: random failure and deterioration failure (or hard failure and soft failure). Deterioration failure grows gradually over time, occurring due to deterioration or aging mechanisms. In contrast, random failure results from other causes not associated with typical aging, e.g. a vehicle failure caused by a blown fuse. Ling Wang et al [8] have considered both random failure and failure due to deterioration.
Figure 5: Deteriorating Systems with Multiple Units; Source: Ling Wang et al [8]
It is intuitive that the maintenance of deteriorating devices extends their lives in two ways: by reducing the accumulated deterioration level (i.e. reducing the deterioration that has already occurred), and by reducing the deterioration rate of the device after the maintenance is performed (i.e. reducing future deterioration). Rao and Naikan [9] propose a condition-based maintenance scheme for Markov deteriorating systems, which they call condition-based preventive maintenance (CBPM). The proposed model considers deterioration and random failures with minimal and major maintenance. Minimal repairs are carried out after every random failure, and the device is replaced after the occurrence of a deterioration failure. The system undergoes random inspections to assess its condition; the time between inspections is exponentially distributed. Based upon the observed condition of the device, the triggered actions are ‘do nothing’, ‘minimal maintenance’, and ‘major maintenance’. Minimal maintenance makes the system one deterioration stage younger, while major maintenance makes the system several deterioration stages (more than one) younger. The proposed models consider increasing intensity for the random failures. An exact recursive algorithm computes the steady-state probabilities of the system. Optimal solutions of the model are derived based on two criteria, namely (a) availability maximization, and (b) total cost minimization.
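The effect of the three triggered actions on the deterioration stage can be written as a small state update. This is only a sketch of the rule described above: the stage numbering (1 = new), the function name and the default of two stages for major maintenance are assumptions, not values from Rao and Naikan [9].

```python
def apply_maintenance(stage, action, major_gain=2):
    """Return the deterioration stage after a maintenance action.

    Stages are numbered from 1 (new) upward. major_gain is the number of
    stages recovered by major maintenance; it must exceed 1, and the
    default of 2 is an illustrative assumption.
    """
    if major_gain <= 1:
        raise ValueError("major maintenance must recover more than one stage")
    if action == "do nothing":
        return stage
    if action == "minimal maintenance":
        return max(1, stage - 1)
    if action == "major maintenance":
        return max(1, stage - major_gain)
    raise ValueError(f"unknown action: {action}")
```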
As identified by Shahanaghi et al [10] in Figure 6, early research has classified CBM systems into completely observable systems and partially observable systems. An important assumption that is implicit in many of these works is that after each maintenance action, the state of the system returns to its initial state. Shahanaghi et al [10] extend this assumption in such a way that each time a maintenance action is initiated, the state of the system is multiplied by a certain random coefficient. This essentially means that after each maintenance action, the system state is not fully improved and the amount of improvement which is made on the system state depends on the current state of the system.
Figure 6: CBM systems Classifications; Source: Adapted from Shahanaghi et al [10]
P-F Interval and Standards for Condition-based Maintenance
Condition-based maintenance schemes use non-destructive testing, visual inspection, etc. to collect performance data for the purpose of assessing equipment condition. The maintenance frequency is decided on the hypothesis that most failures do not occur instantaneously, and that a developing failure can be detected during the final stages of deterioration. If evidence can be collected that something is in the final stages of a failure, then it is possible to take action to prevent it from failing completely and/or at least to avoid the consequences. According to Sethiya [11], as illustrated in Figure 7, maintenance task intervals should be determined based on the expected P-F interval. The P-F interval governs the frequency with which a predictive task must be done. The P-F curve shows how a failure starts and deteriorates to the point at which it can be detected (the potential failure point "P"). Thereafter, if it is not detected and suitable action is not taken, it continues to deteriorate, usually at an accelerating rate, until it reaches the point of functional failure (point "F"). The amount of time (or the number of stress and fatigue cycles) that elapses between the point where a potential failure occurs and the point where it deteriorates into a functional failure is known as the P-F interval; it is the warning period during which condition monitoring tasks are used to detect the onset of a failure. The inspection interval must be significantly less than the P-F interval if one wishes to detect the potential failure before it becomes a functional failure. The P-F interval can be measured in units relating to exposure to fatigue cycles (running time, units of output, stop-start cycles, etc.), but it is generally measured in terms of elapsed time.
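The requirement that the inspection interval be shorter than the P-F interval can be checked with simple arithmetic. For inspections repeated at a fixed interval, the sketch below counts how many inspections are guaranteed to fall inside the P-F window regardless of when point "P" occurs; the function name is an assumption, and both arguments must use the same unit.

```python
import math


def guaranteed_inspections(pf_interval, inspection_interval):
    """Minimum number of fixed-interval inspections inside any P-F window."""
    return math.floor(pf_interval / inspection_interval)
```

For a P-F interval of 30 days, inspecting every 45 days guarantees no inspection within the warning period, whereas inspecting every 15 days guarantees two and every 10 days guarantees three.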
Figure 7: P-F interval; Source: Sethiya [11]
Sethiya [11] has also provided descriptions of some general condition monitoring categories:
• Temperature Measurement. Temperature measurement helps detect potential failures related to a temperature change. Measured temperature changes can indicate problems like excessive friction, degraded heat transfer, poor electrical connections, and so on.
• Dynamic Monitoring. This is the process of measuring and analyzing energy emitted from equipment in the form of waves, such as vibration, pulses and acoustic effects. Measured changes in equipment vibration characteristics can indicate problems such as wear, imbalance, etc.
• Oil Analysis. Oil analysis can be performed on different types of oils, such as lubrication, hydraulic or insulation oils. It can indicate problems such as machine degradation, contamination, improper consistency, and so on.
• Corrosion Monitoring. This helps provide an indication of the extent of corrosion, the corrosion rate and the corrosion state of the material.
• Non-destructive Testing. This involves performing tests that are non-invasive to the test subject. Many of the tests can be performed while the equipment is online.
• Electrical Testing and Monitoring. This involves measuring changes in system properties such as resistance, conductivity, dielectric strength, etc. These measurements help detect electrical insulation deterioration, broken motor rotor bars and shorted motor stator laminations.
• Observation and Surveillance. Such methods are based on human sensory capabilities and serve as a supplement to other condition-monitoring techniques. They help detect problems such as worn parts, poor electrical connections, and various forms of leaks.
• Performance Monitoring. This predicts problems by monitoring changes in variables such as pressure, temperature, flow rate, power consumption and/or equipment capacity.
In Figure 8, Sethiya [11] has illustrated a relationship between failure rates and a change in maintenance philosophy. It shows a declining trend of the failure rates from corrective maintenance to predictive maintenance; it also represents the strengths and weaknesses of the many maintenance types.
Figure 8: Failure rates vs. change in maintenance philosophy; Source: Sethiya [11]
A standardization proposal for condition-based maintenance architecture has been in the works in the form of the Open System Architecture for Condition Based Maintenance (OSA-CBM). The OSA-CBM organization mission statement (www.osacbm.org) covers a wide range of functions of a CBM system, for both hardware and software components. The proposed standard divides a CBM system into seven interconnected layers, described below and shown in Figure 9.
Figure 9: Interconnected Layers in a CBM System; Source: Sethiya [11]
Layer 1 Sensor Module: The sensor module provides the CBM system with digitized sensor or transducer data.
Layer 2 Signal Processing: The signal processing module receives signals and data from the sensor module or from other signal processing modules. The output from the signal processing module includes digitally filtered sensor data, frequency spectra, virtual sensor signals and other CBM features.
Layer 3 Condition Monitor: The condition monitor receives data from the sensor and/or signal processing modules and other condition monitors. Its primary focus is to compare data with expected values (e.g. vibration high). It also generates alerts based on preset operational limits.
Layer 4 Health Assessment: The health assessment module receives data from different condition monitors. Its purpose is to determine whether the health of the monitored component or system has degraded. The module should generate diagnostic records and propose fault possibilities. The diagnosis is based on factors like trends in the health history, the operational environment, etc.
Layer 5 Prognostics: The prognostic module captures data from all the prior layers. The purpose of this module is to calculate the asset’s future health, taking into account future usage profiles. The module can report the future health status at a specified time or on the remaining useful life of the asset.
Layer 6 Decision Support: The decision support module receives data from the health assessment module and the prognostic module. Its purpose is to generate recommended actions and alternatives. The actions can be maintenance actions, but can also concern how to run the asset until the current mission is completed without a breakdown.
Layer 7 Presentation: The presentation module can present data from all previous modules. The most important data to present would come from the health assessment, prognostic and decision support modules, as well as the alerts generated by the condition monitors. However, the ability to look even further down into the lower layers should also be available.
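The flow of data through the seven layers can be illustrated with a minimal Python sketch. It is not part of the OSA-CBM proposal: the function names, the sample data, the thresholds and the linear extrapolation used for prognostics are all assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical limits for illustration only; not taken from the OSA-CBM proposal.
VIBRATION_LIMIT = 5.0       # alert threshold used by the condition monitor
FAILURE_LEVEL = 10.0        # level treated here as functional failure
RUL_ACTION_HORIZON = 20.0   # act if remaining useful life is below this value


def sensor_module():
    """Layer 1: return digitized sensor data (made-up vibration samples)."""
    return [4.0, 4.4, 4.8, 5.2, 5.6, 6.0]


def signal_processing(samples, window=3):
    """Layer 2: smooth the raw samples with a simple moving average."""
    return [mean(samples[i:i + window]) for i in range(len(samples) - window + 1)]


def condition_monitor(features):
    """Layer 3: compare the latest feature with the preset limit."""
    return {"value": features[-1], "alert": features[-1] > VIBRATION_LIMIT}


def health_assessment(features, monitor):
    """Layer 4: flag degradation when an alert coincides with a rising trend."""
    rising = features[-1] > features[0]
    return {"degraded": monitor["alert"] and rising}


def prognostics(features):
    """Layer 5: estimate remaining useful life by linear extrapolation."""
    slope = (features[-1] - features[0]) / (len(features) - 1)
    if slope <= 0:
        return None  # no upward trend, so no failure time is projected
    return (FAILURE_LEVEL - features[-1]) / slope


def decision_support(health, rul):
    """Layer 6: turn health and prognosis into a recommended action."""
    if health["degraded"] and rul is not None and rul < RUL_ACTION_HORIZON:
        return "schedule maintenance"
    return "continue monitoring"


def presentation(monitor, health, rul, action):
    """Layer 7: summarize the outputs of the earlier layers for the user."""
    rul_text = "n/a" if rul is None else f"{rul:.1f}"
    return (f"alert={monitor['alert']} degraded={health['degraded']} "
            f"rul={rul_text} action={action}")


if __name__ == "__main__":
    features = signal_processing(sensor_module())
    monitor = condition_monitor(features)
    health = health_assessment(features, monitor)
    rul = prognostics(features)
    print(presentation(monitor, health, rul, decision_support(health, rul)))
```

Each function consumes only the outputs of lower layers, which mirrors the layered structure described above. A real system would replace every stub with sensor drivers, signal analysis and diagnostic models.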
Wireless Technologies Enable Condition-based Maintenance
Condition-based maintenance has become viable with the use of wireless technologies. Just as people have become connected by the Internet, so too is equipment becoming connected to the network. Much of the equipment being connected has been in operation for several years, so its connectivity must be retrofitted. According to Stargardt [12], wireless technologies can now provide the automatic and continuous connections provided by wires, with reliability approaching that of the wired world.
A number of concerns kept wireless technologies from being used in the past. A major issue was the perception that wireless connections were unreliable. Other concerns were the limited range of wireless connections, as well as lower throughput and higher latency than wired connections. Security was also raised as an issue, since wireless signals could be received across a broad area. In addition, wireless connections used to be more costly than simple wired connections. Advances in the last decade have overcome most of these concerns. The major advances in wireless for equipment monitoring are in the use of low-cost, low-power, unlicensed wireless technology.
Sensor networks are a relatively new connectivity alternative. These networks use some version of a mesh architecture, in which one radio’s data is relayed to the final destination by other radios (motes), usually several of them. These networks incorporate sophisticated intelligence that allows them to configure themselves automatically.
Condition-based maintenance (CBM) now takes advantage of the automated management capabilities of computerized maintenance management systems (CMMS). It makes use of the equipment list that already exists in the CMMS database, along with detailed information on each asset and the maintenance strategy selected for that asset.
Stargardt [12] has discussed how a maintenance strategy can be implemented in such a scenario, as detailed in Figure 10. In the CMMS scenario, readings from sensors and instruments on the monitored equipment are communicated to the CMMS continuously, or sufficiently frequently. These data are analyzed automatically by the CMMS to evaluate the condition of the asset. The analysis is performed using rules or algorithms programmed into the CMMS, based on the known failure modes and their early warning indicators for each piece of equipment or asset.
Figure 10: Maintenance Strategy in a CBM System; Source: Stargardt [12]
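As a rough illustration of such programmed rules, the following Python sketch checks incoming readings against a small rule table that links early warning indicators to failure modes. The asset name, indicator names, limits and failure modes are hypothetical and do not come from Stargardt [12] or from any particular CMMS product.

```python
# Hypothetical rule table: each rule links an early warning indicator to a
# known failure mode. Names and limits are illustrative assumptions only.
RULES = [
    {"asset": "pump-01", "indicator": "vibration_mm_s", "limit": 7.1,
     "failure_mode": "bearing wear"},
    {"asset": "pump-01", "indicator": "bearing_temp_c", "limit": 85.0,
     "failure_mode": "lubrication loss"},
]


def evaluate(asset, readings, rules=RULES):
    """Return the failure modes whose indicator exceeds its limit."""
    findings = []
    for rule in rules:
        if rule["asset"] != asset:
            continue  # rule belongs to a different asset
        value = readings.get(rule["indicator"])
        if value is not None and value > rule["limit"]:
            findings.append(rule["failure_mode"])
    return findings


if __name__ == "__main__":
    latest = {"vibration_mm_s": 8.0, "bearing_temp_c": 70.0}
    print(evaluate("pump-01", latest))
```

Keeping the rules in a table rather than in code means that new failure modes can be added per asset without changing the evaluation logic.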
The CMMS can perform the tedious task of monitoring machine health to detect the exception conditions that indicate the need for maintenance, and it performs this task more reliably and economically. When the assessment of an asset condition indicates a need for maintenance, the CMMS automatically generates a work order for that maintenance. To complement this, implementations of Enterprise Asset Management (EAM) systems have recently been reported; these can help organizations manage assets over their entire life cycle, from design, procurement and commissioning all the way through retirement and/or disposal. The EAM system assists organizations in deciding which assets deserve the investment in condition-based maintenance, which require preventive maintenance, and which should be operated as “run to failure.”
CBM uses real-time data, which requires the installation of monitoring devices on equipment to measure degradation. As Ellis [13] indicates, whenever the monitor detects degradation, a message can be transmitted to a CMMS. It should be noted, however, that a trade-off exists between the costs and benefits of real-time monitoring. For example, frequent maintenance inspections result in high labour costs; conversely, infrequent inspections might lead to asset degradation and the possibility of premature failures. Hence, the costs and benefits of remote monitoring should be compared with the costs and benefits of frequent or infrequent maintenance inspections. Accordingly, an effective CBM strategy requires a good understanding of asset criticality, failure modes and the total cost of failures; knowing what to monitor for a given asset requires reliability and finance-related information. In the final analysis, a decision on CBM should be based on asset criticality (safety, environmental and operational impact) and cost (failure rates). A number of techniques, such as failure modes, effects and criticality analysis (i.e., the possible ways that something can fail) and reliability centered maintenance (i.e., the consequences of failure), are useful in choosing cost-effective maintenance strategies.
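The comparison between remote monitoring and inspection regimes can be made concrete with a simple expected-cost calculation. The Python sketch below assumes that the yearly cost of a strategy is its own running cost plus the failure rate multiplied by the cost of one failure. This cost model and every figure in it are hypothetical and are not taken from the cited sources.

```python
def expected_annual_cost(strategy_cost, failures_per_year, cost_per_failure):
    """Yearly cost of the strategy itself plus the expected cost of failures."""
    return strategy_cost + failures_per_year * cost_per_failure


# All figures below are hypothetical and serve only to show the comparison.
COST_PER_FAILURE = 50_000
OPTIONS = {
    "remote monitoring": expected_annual_cost(12_000, 0.1, COST_PER_FAILURE),
    "frequent inspection": expected_annual_cost(20_000, 0.2, COST_PER_FAILURE),
    "infrequent inspection": expected_annual_cost(4_000, 0.6, COST_PER_FAILURE),
}


def cheapest(options):
    """Return the name of the option with the lowest expected annual cost."""
    return min(options, key=options.get)


if __name__ == "__main__":
    for name, cost in OPTIONS.items():
        print(f"{name}: {cost:,.0f}")
    print("lowest expected cost:", cheapest(OPTIONS))
```

With different failure rates or failure costs the ranking can reverse, which is why asset criticality and failure data are needed before choosing a strategy.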
Intelligent System for Condition-based Maintenance Management
Artificial intelligence and expert systems can also play a role in the creation of a CBM system. According to Yam et al. [14], intelligent systems that may be used for condition-based fault diagnosis essentially fall into three categories: rule-based diagnostic systems, model-based diagnostic systems and case-based diagnostic systems. Rule-based diagnostic systems detect and identify equipment faults in accordance with rules representing the relation of each possible fault with the corresponding condition. A model-based diagnostic system uses various mathematical, neural network and logical methods to improve diagnostic reasoning based on the structure and properties of the equipment system; it compares the real monitored condition with the model of the object in order to predict the fault behaviour. Case-based diagnostic systems use historical records of maintenance cases to provide an interpretation for the actual monitored conditions of the equipment. A record of all previous incidents and equipment malfunctions, along with their maintenance solutions, is stored in a computer and can be used to identify a historical case that closely matches the current condition. If a fault similar to a stored case occurs, the case-based diagnostic system will automatically pick up a suitable maintenance solution from the case library.
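The case-based approach can be sketched in a few lines of Python as a nearest-match lookup in a case library. The library contents, the feature names, the use of Euclidean distance and the matching threshold are all assumptions made for illustration; Yam et al. [14] do not prescribe this particular similarity measure.

```python
import math

# Hypothetical case library: a monitored condition with its fault and remedy.
CASE_LIBRARY = [
    {"condition": {"vibration": 8.0, "temperature": 70.0},
     "fault": "misalignment", "remedy": "realign coupling"},
    {"condition": {"vibration": 3.0, "temperature": 95.0},
     "fault": "cooling failure", "remedy": "clean heat exchanger"},
]


def distance(a, b):
    """Euclidean distance between two conditions with the same keys.

    Features are left unscaled here; a real system would normalize them.
    """
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def closest_case(current, library=CASE_LIBRARY, max_distance=5.0):
    """Return the most similar stored case, or None if nothing is close enough."""
    best = min(library, key=lambda case: distance(current, case["condition"]))
    if distance(current, best["condition"]) > max_distance:
        return None  # no stored case resembles the current condition
    return best


if __name__ == "__main__":
    match = closest_case({"vibration": 7.5, "temperature": 72.0})
    print(match["fault"], "->", match["remedy"])
```

Returning `None` when no case is close enough leaves room for a fallback to rule-based or model-based diagnosis.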
Yam et al. [14] have also developed an intelligent predictive decision support system (IPDSS) for condition-based maintenance (CBM). It supplements the conventional CBM approach by adding the capability of condition-based fault diagnosis and the power of predicting the trend in equipment deterioration. Its underlying model is based on a recurrent neural network (RNN), and it was developed and tested for critical equipment in a power plant. The IPDSS model is reported to have provided reliable fault diagnosis and strong predictive power for the trend of equipment deterioration. The results may be used as input to an integrated maintenance management system to pre-plan and pre-schedule maintenance work, to reduce inventory costs for spare parts, to cut down unplanned outages and to minimize the risk of catastrophic failures.
Figure 11: IPDSS model; Source: Yam et al [14]
Figure 11, based on Yam et al. [14], shows some features of the IPDSS model. When the monitored equipment operates under normal conditions, only minimal routine maintenance will be required. When the equipment parameters reach the base level, the equipment goes into a degrading condition, and a fault-development trend analysis is carried out; this analysis can indicate the possible problem areas in which an incipient fault has occurred or is likely to occur. Under a degrading condition, no special maintenance action is required, but more monitoring should be carried out to avoid an emergency event. When the monitored equipment parameters exceed a set level, alarms are activated to alert the maintenance staff. The case-based diagnostic system module of the IPDSS is then activated to look for a similar situation in the maintenance case library. If an equipment fault occurs again, the case-based diagnostic system will pick up the maintenance advice with trouble–cause–remedy from the case library. If the warning for this condition does not match any case in the maintenance case library, IPDSS will activate the rule-based and/or model-based fault diagnostic system module. The rule-based diagnostic system can detect and identify incipient faults according to the rules representing the relations of each possible fault and the actual monitored equipment condition. Once a component is diagnosed as the source of incipient failure, the prediction of the trend of the equipment deterioration can be activated to assess the remaining life of the equipment. The result of the prediction will then be used to arrange a planned maintenance action prior to the equipment’s eventual failure. When the monitored equipment parameters reach a predefined protective level, this can trigger a complete shutdown, which is equivalent to a major fault in the equipment. Under such a serious condition, urgent corrective maintenance action would be initiated.
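The escalation through the base, alarm and protective levels described above can be summarized as a small threshold classifier. The Python sketch below is only an illustration of that logic: the state names, the numeric levels and the wording of the actions are assumptions, and the actual IPDSS relies on a recurrent neural network and diagnostic modules that are not reproduced here.

```python
# Responses paraphrased from the description above; wording is illustrative.
ACTIONS = {
    "normal": "minimal routine maintenance",
    "degrading": "trend analysis and increased monitoring",
    "alarm": "alert maintenance staff and run fault diagnosis",
    "shutdown": "shut down and start urgent corrective maintenance",
}


def classify(value, base_level, alarm_level, protective_level):
    """Map a monitored parameter value to one of four equipment states."""
    if value >= protective_level:
        return "shutdown"
    if value > alarm_level:
        return "alarm"
    if value >= base_level:
        return "degrading"
    return "normal"


if __name__ == "__main__":
    # Hypothetical levels for a single monitored parameter.
    for reading in (2.0, 5.0, 9.0, 10.0):
        state = classify(reading, base_level=5.0, alarm_level=8.0,
                         protective_level=10.0)
        print(reading, state, "->", ACTIONS[state])
```

The checks run from the most severe level downward so that a reading at the protective level is never reported as a mere alarm.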
Commercial CBM Systems
The modern approach to maintenance represents system thinking, and is a shift from the fragmented technologies of the past. CBM can detect the current state of mechanical systems and predict the systems' ability to perform without failure. It uses the stressor levels created during the equipment design process, measures suitable parameters to quantify the existing stressor levels, and can correct operating environments to make these levels compatible with economic production versus equipment lifetimes. CBM replaces arbitrarily timed maintenance with scheduled maintenance warranted by the equipment condition. It advocates the analysis of equipment condition data to allow the planning and scheduling of maintenance activities or repairs before functional failure. While CBM can be implemented in single steps, its greatest potential is realized when it is applied evenly across an entire asset class, employing the full range of maintenance concepts. Against this background, commercial systems have started appearing in the market; a utility company example, extracted from reference [15], is given in Figure 12.
To give some context on the example noted, a utility company must maintain several facilities, hundreds of substations and distribution circuits, several hundred thousand distribution poles and transformers, and millions of electric meters. An installed computer system monitors each and every substation, circuit, pole, transformer, and meter. The utility expects the site to notify personnel when systems or assets deviate from acceptable performance, driven by automated analysis, triggers or notifications. Without delay, affected users can expect to receive an email notification of an event, respond to that email, and have the website acknowledge that the issue has been addressed. Users may also want to fine-tune configuration and processes according to external factors and conditions to achieve greater asset efficiency, and to be able to create reports that summarize conditions and the results of CBM practices as needed.
Figure 12: CBM architecture for a utility company; Source: [15]
Conclusion
The condition-based maintenance (CBM) approach ensures a reduction in maintenance uncertainty, based on the needs indicated by the equipment condition. The monitoring process involves collection and interpretation of the relevant equipment parameters for identifying the state of equipment deviations and changes from its normal conditions. The parameters in this context represent a set of characteristics that indicate the actual equipment condition. Any abnormality in these characteristics indicates the occurrence of some sort of functional failure. A built-in fault diagnosis scheme can be activated by the detection of such an abnormal condition; it recognizes and analyses the symptomatic information, identifies the root causes of a failure, infers the fault development trend and predicts the remaining life of the equipment. By monitoring the operating conditions of the defective item, future key symptoms associated with the deterioration of the equipment can be predicted. With this kind of monitoring system, an advance alarm can be activated when the predicted value falls within an alarm band. This will help the system operators to take adequate actions to check the condition of the equipment and repair the defects prior to a total breakdown.
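The idea of an advance alarm triggered by a predicted value can be sketched as follows. This Python example substitutes a plain least-squares straight-line fit for the predictive models discussed in the report, and the readings and alarm band are made up, so it should be read as an illustration of the principle only.

```python
def predict(history, steps_ahead):
    """Extrapolate a least-squares straight line fitted to past readings.

    Readings are assumed to be equally spaced in time; at least two are needed.
    """
    n = len(history)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


def advance_alarm(history, steps_ahead, band_low, band_high):
    """Return True when the predicted value falls within the alarm band."""
    return band_low <= predict(history, steps_ahead) <= band_high


if __name__ == "__main__":
    readings = [1.0, 2.0, 3.0, 4.0]  # hypothetical monitored parameter
    print(predict(readings, 3))
    print(advance_alarm(readings, 3, band_low=6.0, band_high=9.0))
```

The alarm depends on the forecast rather than on the latest reading, which gives operators time to inspect the equipment before the parameter actually enters the band.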
It should be mentioned that the cost of data collection is a big issue in the implementation of condition-based maintenance procedures. Though data from real life experience can boost the confidence level in the model to be used, only a few operators will be prepared to let components run to a total failure, while often the replaced components may not be in a “failed” state yet. It is important to include the cost of data collection in a CBM decision model, and hence it will be wise to weigh whether monitoring will be effective in reducing the total costs. The costs of data collection should be balanced against the expected gains. The increasing use of CBM may also place new demands on maintenance management information systems. These systems need to schedule maintenance activities in a dynamic manner, since the execution times of certain activities will be updated frequently as condition information becomes available. Hence, carrying out a cost-benefit analysis will become an essential part of any condition-based maintenance strategy.
References
1. “Moving to Condition-Based Maintenance (CBM)”, http://www.nextgenpe.com/article/Moving-to-Condition-Based-Maintenance-CBM/
2. Mitchell T. Rausch, “Condition based Maintenance of a single system under Spare Part Inventory Constraints”, Master’s Thesis, Wichita State University, August 2008.
3. P. A. Scarf, “A Framework for Condition Monitoring and Condition Based Maintenance”, Quality Technology & Quantitative Management, Vol. 4, No. 2, pp. 301-312, 2007.
4. A. Grall, C. Bérenguer and L. Dieulle, “A condition-based maintenance policy for stochastically deteriorating systems”, Reliability Engineering & System Safety, Volume 76, Issue 2, May 2002, pp. 167-180.
5. Satoru GOTO, Yuhki ADACHI, Sinji KATAFUCHI, Toshihiko FURUE, Yoshitaka UCHIDA, Mitsuhoro SUEYOSHI, Hironori HATAZAKI and Masatoshi NAKAMURA, “On Line Deterioration Prediction and Residual Life Evaluation of Rotating Equipment Based on Vibration Measurement”, SICE Annual Conference, August 20-22, 2008, The University of Electro-Communications, Japan.
6. Susan Lu, Yu-Chen Tu and Huitian Lu, “Predictive Condition-based Maintenance for Continuously Deteriorating System”, Qual. Reliab. Engng. Int. 2007; 23:71–81.
7. E. Deloux, B. Castanier and C. Bérenguer, “Maintenance policy for a deteriorating system evolving in a stressful environment”, Proc. IMechE Vol. 222 Part O: J. Risk and Reliability, 2008.
8. Ling Wang, Enhui Zheng, Yuntang Li, Binrui Wang and Jinjin Wu, “Maintenance Optimization of Generating Equipment Based on a Condition-based Maintenance Policy for Multi-unit Systems”, Chinese Control and Decision Conference (CCDC 2009), IEEE, 2009.
9. P. Naga Srinivasa Rao and V.N. Achutha Naikan, “An Optimization Methodology for Condition Based Minimal and Major Preventive Maintenance”, Economic Quality Control, Vol. 21 (2006), No. 1, pp. 127-141.
10. Kamran Shahanaghi, Hamid Babaei, Arash Bakhsha and Nasser S. Fard, “A new condition based maintenance model with random improvements on the system after maintenance actions: Optimizing by monte carlo simulation”, World Journal of Modelling and Simulation, Vol. 4 (2008), No. 3, pp. 230-236.
11. S.K. Sethiya, “Condition Based Maintenance (CBM)”, Secy.toCME/WCR/JBP
12. Wayne Stargardt, “Condition-Based Maintenance Using Wireless Monitoring: Developments and Examples”, http://reliabilityweb.com/art08/cbm_wireless.htm
13. Byron A. Ellis, “The Challenges of Condition Based Maintenance”, http://www.jethroproject.com/The Challenges of Condition Based%20Maintenance.pdf
14. R. C. M. Yam, P. W. Tse, L. Li and P. Tu, “Intelligent Predictive Decision Support System for Condition-Based Maintenance”, Int J Adv Manuf Technol (2001) 17:383–391.
15. Osisoft.com, “Condition-based Maintenance (CBM) Across the Enterprise”, OSIsoft, Inc., 777 Davis Street, Suite 250, San Leandro, CA 94577.