A Process For Data Driven Prognostics

Eric Bechhoefer
NRG Systems
110 Riggs Road
Hinesburg, VT 05461
Telephone: (802) 482-2255
erb@nrgsystems.com

David He
Professor, Dept of Mechanical and Industrial Engineering
University of Illinois at Chicago
842 W. Taylor Street, Room 3027 ERF
Chicago, IL 60607-7022
davidhe@uic.edu
Abstract: A prognostic is an estimate of the remaining useful life of a monitored part. While diagnostics alone can support condition based maintenance practices, prognostics facilitates changes in logistics which can greatly reduce cost or increase the readiness and availability of the monitored system. A successful prognostic requires four processes. First, feature extraction from measured data to estimate damage. Second, a threshold for the feature which, when exceeded, indicates that it is appropriate to perform maintenance. Third, given a future load profile, a model that can estimate the life of the component based on the current damage state. Finally, an estimate of the confidence in the prognostic. This paper outlines a process for data driven prognostics by: describing appropriate condition indicators (CIs) for gear faults, setting thresholds for those CIs through fusion into a component health indicator, using a state space process to estimate the remaining useful life given the current component health, and using a state estimate to quantify the confidence in the estimate of the remaining useful life. Finally, a gear fault run to failure test is used as an example.

Key words: Confidence; Condition indicators; Health indicator; Paris Law; Remaining useful life

Introduction: Condition based maintenance (CBM) systems have been shown to produce cost savings by reducing scheduled maintenance cost. However, CBM systems can be leveraged into far greater cost savings by developing a prognostics capability. The ability to estimate the remaining useful life (RUL) of a component can greatly improve availability and reduce logistics cost. Prognostics are the maturation of the CBM system.

Consider the effect on wind farm operations if a prognostics capability were available. Major maintenance events require a heavy lift crane. The availability of a crane can be limited, and the cost of rental is large. If a crane is needed to replace a gearbox when there
are other wind turbines with small RULs, there are cost savings and improvements to readiness in also conducting maintenance on those other marginal turbines. Alternatively, if the operator of a fleet of helicopters knows the RUL of their assets, the operator can deploy those aircraft that have the highest RUL and be assured that the aircraft will not need major maintenance while deployed. Knowledge of the RUL allows the logistician to reduce the inventory of spares. This affects the manpower needed for maintenance and facilitates more efficient operations.

That said, there are currently few deployed prognostic health management (PHM) systems. While CBM is a maturing technology, PHM is relatively immature and difficult to implement. The ability to estimate the RUL requires four pieces of information: an estimate of the current equipment health; a limit or threshold where it is appropriate to do maintenance; an estimate of the future equipment load; and a model to estimate the time from the current state to the limit/threshold based on the projected load.

The current health of a system can be determined by a CBM system. Such systems measure some feature representing damage. For example, for a pump or generator, shaft order one acceleration is a measure of health. The limit for this vibration for some equipment can be found in such standards as [1]. Most monitored components, such as gears and bearings, are not covered by an ISO standard; there are no formal or standardized limits. Additionally, while the load for stationary equipment may be well known, for many systems (helicopters or wind turbines, for example) the load is variable.

Damage models to predict future equipment health fall into two categories: physics of failure and data driven. Physics of failure models are by their nature appealing. However, there is a cost associated with building the models, which then require validation and testing.
Further, the robustness of these models in application may not be satisfactory: how are material/manufacturing variance, unknown usage and maintenance accounted for in a real application? Data driven methods, while not capable of giving an absolute level of damage, can give a relative limit which may give acceptable performance.

Presented here is an end to end process for data driven prognostics using a vibration sensor. Descriptions of how condition indicators (CIs) are generated for a gear run to failure test are given. The CIs are fused into a health indicator (HI) through a statistical process (to control the probability of false alarm). Given the current HI, the time until the HI reaches a predetermined value, estimated using Paris Law, is the remaining useful life (RUL). Once the RUL is calculated, a bound can then be calculated and a confidence in the RUL is given. Finally, this process is demonstrated on a spiral bevel gear.

Condition Indicators - Feature Extraction to Improve Signal to Noise: Vibration signatures for machinery faults tend to be small relative to other vibration signatures. For example, in the typical gearbox, the energy associated with gear mesh and shaft vibrations will be orders of magnitude larger than a fault feature. Spectral analysis or the root mean square (RMS) of vibration are not powerful enough CIs to find an early fault, let
alone provide information useful for prognostics. Techniques to improve the signal to noise ratio are needed to remove tones associated with nominal components, while preserving the fault signatures.

Gear analysis is based on operations on the time synchronous average [3]. Time synchronous averaging (TSA) is a signal processing technique that extracts periodic waveforms from noisy data. The TSA is well suited for gearbox analysis, where it allows the vibration signature of the gear under analysis to be separated from other gears and noise sources in the gearbox that are not synchronous with that gear. Additionally, variations in shaft speed can be corrected, which would otherwise result in spreading of spectral energy into adjacent gear mesh bins. In order to do this, the signal is phase-locked with the angular position of the shaft under analysis. This phase information can be provided through an n per revolution tachometer signal (such as a Hall sensor or optical encoder, where the time at which the tachometer signal crosses from low to high is called the zero crossing) or through demodulation of gear mesh signatures [3].

The model for vibration of a shaft in a gearbox was given in [2] as:

$x(t) = \sum_{i=1}^{K} X_i \left(1 + a_i(t)\right) \cos\left(2\pi i f_m(t) + \Phi_i\right) + b(t)$

where:
$X_i$ is the amplitude of the ith mesh harmonic,
$f_m(t)$ is the average mesh frequency,
$a_i(t)$ is the amplitude modulation function of the ith mesh harmonic,
$\phi_i(t)$ is the phase modulation function of the ith mesh harmonic,
$\Phi_i$ is the initial phase of harmonic i, and
$b(t)$ is additive background noise.

The mesh frequency is a function of the shaft rotational speed: $f_m = Nf$, where N is the number of teeth on the gear and f is the shaft speed. This vibration model assumes that f is constant. In most systems, there is some wander in the shaft speed due to changes in load or feedback delay in the control system. This change in speed will result in smearing of amplitude energy in the frequency domain.
The smearing effect, and non-synchronous noise, are reduced by resampling the time domain signal into the angular domain: $m_x(\theta) = E[x(\theta)] = m_x(\theta + \Theta)$. The variable $\Theta$ is the period of the cycle over which the gearbox operation is periodic, and $E[\cdot]$ is the expectation (i.e. the ensemble mean). This makes the assumption that $m_x(\theta)$ is stationary and ergodic. If this assumption is true, then non-synchronous noise is reduced by $1/\sqrt{rev}$, where rev is the number of cycles measured for the TSA.

TSA Techniques for Condition Indicators: The TSA is an example of angular resampling [2], [4], where the number of data points in one shaft revolution ($r_n$) are interpolated into m data points, such that: for all shaft revolutions n, m is larger than r,
and $m = 2^{\lceil \log_2(r) \rceil}$ (typical for a radix-2 Fast Fourier Transform). Linear, bandwidth limited, and spline interpolation techniques have been used [5]. In this study, linear interpolation was used, as it is considerably faster than spline or bandwidth limited interpolation, with no reduction in the analysis performance of the TSA.

The TSA itself can be used for CIs. Typically, a CI is a statistic of a waveform (in this case, the TSA). Common statistics are RMS, Peak to Peak, Crest Factor, Kurtosis and Skewness. For shafts, shaft order 1, 2 and 3 (the first, second and third shaft rate harmonics) can be used to determine shaft out of balance, bent shaft, and/or shaft coupling damage, respectively. Figure 1 outlines the process of generating the TSA and the shaft CIs.
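The angular resampling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`interp`, `tsa`), the synthetic 3 per rev tone, the alternating 100/90 sample revolutions, and the non-synchronous tone at 0.1234 cycles per sample are all assumptions chosen for the demonstration.

```python
import math

def interp(signal, pos):
    """Linearly interpolate signal at a fractional sample index pos."""
    i = min(int(math.floor(pos)), len(signal) - 2)
    frac = pos - i
    return signal[i] * (1.0 - frac) + signal[i + 1] * frac

def tsa(signal, zero_cross):
    """Time synchronous average from tachometer zero-crossing sample indices.

    Each revolution is resampled to m points, where m is the next power
    of two at or above the longest revolution, then averaged point by point.
    """
    revs = len(zero_cross) - 1
    longest = max(math.ceil(zero_cross[k + 1] - zero_cross[k]) for k in range(revs))
    m = 1 << (longest - 1).bit_length()
    avg = [0.0] * m
    for k in range(revs):
        start = zero_cross[k]
        step = (zero_cross[k + 1] - start) / m
        for j in range(m):
            avg[j] += interp(signal, start + j * step)
    return [v / revs for v in avg]

# Synthetic data (hypothetical): shaft speed wanders between 100 and 90
# samples per revolution; a 3 per rev tone is synchronous, a second tone is not.
zero_cross = [0]
for k in range(40):
    zero_cross.append(zero_cross[-1] + (100 if k % 2 == 0 else 90))

signal = []
for k in range(40):
    n0, n1 = zero_cross[k], zero_cross[k + 1]
    for n in range(n0, n1):
        angle = (n - n0) / (n1 - n0)          # fraction of a revolution
        signal.append(math.cos(2 * math.pi * 3 * angle)
                      + 0.5 * math.cos(2 * math.pi * 0.1234 * n))
# One extra sample so the final zero crossing can be interpolated.
signal.append(1.0 + 0.5 * math.cos(2 * math.pi * 0.1234 * zero_cross[-1]))

avg = tsa(signal, zero_cross)
```

In this synthetic case the averaged waveform stays close to the 3 per rev tone, while the non-synchronous tone is largely averaged out despite the speed wander.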
Figure 1 Generation of the TSA and selected CIs

Gear Fault Indicators: There are at least six failure modes for gears [6]: surface disturbances, scuffing, deformations, surface fatigue, fissures/cracks and tooth breakage. Each failure mode can, potentially, generate a different fault signature. Additionally, relative to the energy associated with gear mesh tones and other noise sources, the fault signatures are typically small. A number of researchers have proposed analysis techniques to identify these different faults [7], [8], [9]. Typically, these analyses are based on operations on the TSA. Examples of analysis are:

Residual, where the shaft order 1, 2, and 3 frequencies, and the gear mesh harmonics, of the TSA are removed. Faults such as a soft/broken tooth generate 1 per rev impacts in the TSA. In the frequency domain of the TSA, these impacts are expressed as multiple harmonics of the 1 per rev. The shaft order 1, 2 and 3 frequencies and gear mesh harmonics are zeroed in the frequency domain, and then the inverse FFT is performed. This allows the impact signature to become prominent in the time domain. CIs are statistics of this waveform (RMS, Peak to Peak, Crest Factor, Kurtosis).

Energy operator (EO), which is a type of residual of the autocorrelation function. For a nominal gear, the predominant vibration is gear mesh. Surface disturbances,
scuffing, etc., generate small higher frequency values which are not removed by autocorrelation. Formally, the EO is: $\mathbf{TSA}_{2:n-1} \times \mathbf{TSA}_{2:n-1} - \mathbf{TSA}_{1:n-2} \times \mathbf{TSA}_{3:n}$, where the products are element by element and the bold indicates a vector of TSA values. The CIs of the EO are the standard statistics of the EO vector.

Narrowband Analysis operates on the TSA by filtering out all tones except that of the gear mesh, within a given bandwidth. It is calculated by zeroing all bins of the Fourier transform of the TSA except those around the gear mesh. The bandwidth is typically 10% of the number of teeth on the gear under analysis. For example, a 23 tooth gear analysis would retain bins 21, 22, 23, 24, and 25, and their conjugates in the Fourier domain. Then the inverse FFT is taken, and statistics of the waveform are taken. Narrowband analysis can capture sideband modulation of the gear mesh tone due to misalignment, or a cracked/broken tooth.

Amplitude Modulation (AM) analysis is the absolute value of the Hilbert transform of the Narrowband signal. For a gear with minimum transmission error, the AM analysis feature should be a constant value. Faults will greatly increase the kurtosis of the signal.

Frequency Modulation (FM) analysis is the derivative of the angle of the Hilbert transform of the Narrowband signal. It is a powerful tool capable of detecting changes of phase due to uneven tooth loading, characteristic of a number of fault types.

For a more complete description of these analyses, see [7] or [8]. Figure 2 is an example of the processing used to generate the gear CIs for a spiral bevel gear with surface pitting and scuffing. This gear fault will be used throughout the paper.
Figure 2 Process for Generating Gear CIs
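As an illustration of the energy operator and the CI statistics taken from it, the sketch below applies the EO to a synthetic TSA. The 128 point waveform, the 23 per rev mesh tone, and the injected 0.5 amplitude impact are hypothetical values chosen for the example; the helper names are not from the paper.

```python
import math

def energy_operator(tsa):
    """EO[i] = tsa[i]^2 - tsa[i-1] * tsa[i+1] over the interior points."""
    return [tsa[i] ** 2 - tsa[i - 1] * tsa[i + 1] for i in range(1, len(tsa) - 1)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def crest_factor(x):
    return max(abs(v) for v in x) / rms(x)

def kurtosis(x):
    """Fourth central moment divided by the squared variance."""
    mean = sum(x) / len(x)
    m2 = sum((v - mean) ** 2 for v in x) / len(x)
    m4 = sum((v - mean) ** 4 for v in x) / len(x)
    return m4 / (m2 * m2)

# Hypothetical healthy TSA: a pure gear mesh tone, 23 cycles in 128 points.
omega = 2 * math.pi * 23 / 128
healthy = [math.cos(omega * n) for n in range(128)]

# Hypothetical faulted TSA: the same tone with a single impact added.
faulted = list(healthy)
faulted[64] += 0.5

eo_healthy = energy_operator(healthy)
eo_faulted = energy_operator(faulted)
eo_kurtosis = kurtosis(eo_faulted)
```

For a pure tone the EO is constant (equal to the squared sine of the per-sample phase step), so the mesh tone contributes no variation; the impact shows up as a few outliers, which is why the kurtosis of the EO responds strongly to it.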
Threshold Setting and Component Health: In a physics of failure prognostics method, modeling would estimate the CI generated for some level of damage. When the measured CI exceeds the modeled threshold value, maintenance is performed. In a data driven process, maintenance is performed when a statistically set threshold is exceeded. Thus, the performance of a data driven method is completely determined by the quality of the threshold setting process.

The concept of thresholding was explored in [10], where for a given, single CI, a probability density function (PDF) for the Rician/Rice statistical distribution was used to set a threshold based on a probability of false alarm (PFA). This is contrasted with [11], who explored the relationship between CI threshold and PFA to describe the receiver operating characteristics (ROC) of the CI for a given fault. Additionally, Dempsey used the ROC to evaluate the performance of the CI for a fault type. These methods support a data driven approach for prognostics by formalizing a method for threshold setting.

Estimation of the RUL given a threshold is complicated in that there are numerous failure modes for a gear. Further, no single CI has been identified that works with all fault modes. This suggests one of two possible architectures for a prognostics system: estimate the RUL for each gear CI used, where the reported RUL is the minimum remaining useful life over the CIs, or fuse n CIs into a gear health indicator (HI) and calculate the RUL based on the HI.

Computationally, the use of HIs is attractive. Health indicators provide decision making tools for the end user on the status of system health. A health indicator integrates several condition indicators into one value that provides the health status of the component to the end user [11]. Highlighted in [12] are a number of advantages of the HI over CIs, such as: controlling the false alarm rate, improved detection, and simplification of the user display.
Further, [13] describes a threshold setting process for gear health, where the HI is a function of the CI distributions. It gives a generalized process for threshold setting, where the HI is a function of the distribution of the CIs, regardless of the correlation between the CIs.

Gear Health as a Function of Distribution: Prior to detailing the mathematical methods used to develop the HI, a nomenclature for component health is needed. To simplify presentation and knowledge creation for a user, a uniform meaning across all components in the monitored machine should be developed. The measured CI statistics (e.g. PDFs) will be unique for each component type (due to different rates, materials, loads, etc.). This means that the critical values (thresholds) will be different for each monitored component. By using the HI paradigm, one can normalize the CIs, such that the HI is independent of the component. Further, using guidance from [14], the HI will be designed such that there are two alert levels: warning and alarm. A common nomenclature for the HI can then be developed, such that: The HI ranges from 0 to 1, where the probability of exceeding an HI of 0.5 is the PFA,
A warning alert is generated when the HI is greater than or equal to 0.75. Maintenance should be planned by estimating the RUL until the HI is 1.0. An alarm alert is generated when the HI is greater than or equal to 1.0. Continued operations could cause collateral damage.

Note that this nomenclature does not define a probability of failure for the component, nor does it imply that the component fails when the HI is 1.0. Rather, it suggests a change in operator behavior to a proactive maintenance policy: perform maintenance prior to the generation of cascading faults. For example, by performing maintenance on a bearing prior to the bearing shedding extensive material, costly gearbox replacement can be avoided.

Controlling for the Correlation Between CIs: All CIs have a probability density function (PDF). Any operation on the CIs to form a health indicator (HI) is then a function of distributions [15]. Functions such as the maximum of n CIs (the order statistics), the sum of n CIs, or the norm of n CIs (energy) are valid if and only if the distributions (i.e. the CIs) are independent and identical [15]. For Gaussian distributions, subtracting the mean and dividing by the standard deviation will give identical Z distributions. The issue of ensuring independence is much more difficult. In general, the correlation between CIs is non-zero. As an example, many of the correlation coefficients between the CIs used in this study were near 1 (see Table 1).

Table 1 Correlation Coefficients for the Six CIs Used in the Study
ρij     CI 1    CI 2    CI 3    CI 4    CI 5    CI 6
CI 1    1       0.84    0.79    0.66    -0.47   0.74
CI 2            1       0.46    0.27    -0.59   0.36
CI 3                    1       0.96    -0.03   0.97
CI 4                            1       0.11    0.98
CI 5                                    1       0.05
CI 6                                            1
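A correlation table of this kind can be computed directly from sampled CI values. The sketch below is illustrative only: the three short synthetic series stand in for CI histories and are not the study data, and the function names are assumptions.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(cis):
    """Full symmetric matrix of pairwise correlation coefficients."""
    return [[correlation(a, b) for b in cis] for a in cis]

# Hypothetical CI histories: ci2 is a scaled and shifted copy of ci1
# (fully correlated), ci3 is in quadrature with ci1 (uncorrelated).
ci1 = [math.sin(2 * math.pi * k / 50) for k in range(50)]
ci2 = [2.0 * v + 1.0 for v in ci1]
ci3 = [math.cos(2 * math.pi * k / 50) for k in range(50)]

rho = correlation_matrix([ci1, ci2, ci3])
```

Off-diagonal values near 1, as in several entries of Table 1, indicate CIs that carry largely the same information.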
This correlation between CIs implies that, for a given function of distributions to have a threshold that operationally meets the design PFA, the CIs must be whitened (i.e. decorrelated). In [16], Fukunaga presents a whitening transform using the eigenvector matrix multiplied by the inverse square root of the eigenvalues (a diagonal matrix) of the covariance of the CIs: $A = \Lambda^{-1/2}\Phi^T$, where $\Phi^T$ is the transpose of the eigenvector matrix and $\Lambda$ is the eigenvalue matrix. The transform is not orthonormal: the Euclidean distances are not preserved in the transform. While ideal for maximizing the distance
(separation) between classes (such as in a Bayesian classifier), the distribution of the original CI is not preserved. This property of the transform makes it inappropriate for threshold setting.

If the CIs represent a metric such as shaft order acceleration, then one can construct an HI which is the square of the normalized power (e.g. the square root of the acceleration squared). This can be defined as normalized energy, as per [17], who were able to whiten the CIs and establish a threshold for a given PFA.

A more general whitening solution can be found using the Cholesky decomposition (see [13]). The Cholesky decomposition of a Hermitian, positive definite matrix results in $A = LL^*$, where L is lower triangular and $L^*$ is its conjugate transpose. By definition, the inverse covariance is positive definite Hermitian. It then follows that if $LL^* = \Sigma^{-1}$, then $Y = L \times CI^T$. The vector CI is the correlated CIs used for the HI calculation, and Y is 1 to n independent CIs with unit variance (one CI representing the trivial case). The Cholesky decomposition, in effect, creates the square root of the inverse covariance. This in turn is analogous to dividing the CI by its standard deviation (the trivial case of one CI). In turn, $Y = L \times CI^T$ creates the independent and identical distributions required to calculate the critical values for a function of distributions.

As an example of the importance of correlation, consider a simple HI function: $HI = CI_1 + CI_2$. The CIs are normally distributed with mean 0 and standard deviation 1. The standard deviation of this HI is: $\sigma_{HI} = \sqrt{\sigma_{CI_1}^2 + \sigma_{CI_2}^2 + 2\rho_{CI_1,CI_2}\,\sigma_{CI_1}\sigma_{CI_2}}$, where $\rho_{CI_1,CI_2}$ is the correlation between $CI_1$ and $CI_2$. If one assumes $\rho_{CI_1,CI_2}$ is 0.0, then $\sigma_{HI} = 1.414$ (i.e. $\sqrt{2}$). For a PFA of $10^{-6}$, the threshold is then 6.722. Consider the case in which the observed correlation is closer to 1 (e.g. $\rho_{CI_1,CI_2}$ is 1.0); then the observed $\sigma_{HI} = 2$. For a threshold of 6.722, the operational PFA is $4 \times 10^{-4}$.
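The arithmetic in the two-CI example can be checked with a short script. This is a sketch under the stated assumptions (zero mean, unit variance Gaussian CIs); the function names are illustrative and only the Python standard library is used.

```python
import math
from statistics import NormalDist

def hi_sigma(s1, s2, rho):
    """Standard deviation of HI = CI1 + CI2 for correlation rho."""
    return math.sqrt(s1 ** 2 + s2 ** 2 + 2.0 * rho * s1 * s2)

def gaussian_pfa(threshold, sigma):
    """Probability that a zero mean Gaussian with std sigma exceeds threshold."""
    return 0.5 * math.erfc(threshold / (sigma * math.sqrt(2.0)))

design_pfa = 1e-6

# Threshold designed under the assumption that the CIs are uncorrelated.
z = NormalDist().inv_cdf(1.0 - design_pfa)
threshold = z * hi_sigma(1.0, 1.0, 0.0)

# PFA actually achieved if the CIs turn out to be fully correlated.
operational_pfa = gaussian_pfa(threshold, hi_sigma(1.0, 1.0, 1.0))
ratio = operational_pfa / design_pfa
```

Under these assumptions the threshold evaluates to about 6.72 and the operational PFA to about 3.9e-4, so the achieved false alarm rate is several hundred times the design value.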
This is 390 times greater than the designed PFA. This illustrates the effect of correlation on threshold setting.

HI Based on Rayleigh PDFs: The CIs used for this example have Rayleigh like PDFs (i.e. heavily tailed). Consequently, the HI function was designed using the Rayleigh distribution. The PDF for the Rayleigh distribution uses a single parameter, $\beta$, resulting in the mean ($\mu = \beta(\pi/2)^{0.5}$) and variance ($\sigma^2 = (2 - \pi/2)\beta^2$). The PDF of the Rayleigh is: $f(x) = \frac{x}{\beta^2}\exp\left(-\frac{x^2}{2\beta^2}\right)$. Note that when applying these equations to the whitening process, the values for each CI will then be: $\sigma^2 = 1$, and $\beta = \sigma/(2 - \pi/2)^{0.5} = 1.5264$. For a more complete analysis, see [17].

A number of HI functions could be used, but experience has shown [13] that the greatest signal to noise is achieved where the HI function is the norm of n CIs. This represents the normalized energy of the CIs. If the CIs are IID, it can be shown that the function defines a Nakagami PDF [17]. The statistics for the Nakagami are: $\eta = n$, and $\omega = \frac{1}{2 - \pi/2} \times 2 \times n$.

For this study, data was collected from experiments performed in the Spiral Bevel Gear Test facility at NASA Glenn. A description of the test rig and test procedure is given in [13]. Six CIs were used, so that: $\eta = 6$, and $\omega = 27.96$. For a PFA of $10^{-6}$, the threshold is 10.882, with the HI function calculated as: $HI = \frac{0.5}{10.882}\left(\sum_{i=1}^{6} Y_i^2\right)^{1/2}$.
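The threshold for the norm of n whitened CIs can be reproduced numerically. The sketch below relies on a standard result rather than on the derivation in [17]: the square of a Rayleigh variable with parameter beta is beta squared times a chi-square variable with 2 degrees of freedom, so the sum of n such squares is beta squared times a chi-square with 2n degrees of freedom. The function names and the bisection search are assumptions made for the illustration.

```python
import math

def chi2_sf_even(x, dof):
    """Survival function of a chi-square variable with an even dof.

    Closed form: exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def norm_threshold(n_ci, pfa):
    """Threshold on sqrt(sum Y_i^2) for n_ci unit variance Rayleigh CIs."""
    beta_sq = 1.0 / (2.0 - math.pi / 2.0)
    lo, hi = 0.0, 1000.0
    for _ in range(200):                      # bisection on the chi-square quantile
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, 2 * n_ci) > pfa:
            lo = mid
        else:
            hi = mid
    return math.sqrt(beta_sq * 0.5 * (lo + hi))

def health_indicator(y, threshold):
    """HI scaled so that the PFA threshold maps to an HI of 0.5."""
    return 0.5 / threshold * math.sqrt(sum(v * v for v in y))

n = 6
omega = 2.0 * n / (2.0 - math.pi / 2.0)       # Nakagami spread parameter
threshold = norm_threshold(n, 1e-6)
hi_at_threshold = health_indicator([threshold] + [0.0] * 5, threshold)
```

With six CIs and a PFA of 1e-6 this evaluates to a spread of about 27.96 and a threshold of about 10.88, in agreement with the values quoted in the text.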
The six CIs used for the HI calculation were: Residual RMS, Energy Operator RMS, FM0, NB KT, AM KT and FM RMS. These CIs were chosen because they exhibited good sensitivity to the fault. Residual Kurtosis and Energy Ratio were also good indicators, but were not chosen because: it has been the researchers' experience that these CIs become ineffective when used in complex gearboxes, and, as the fault progresses, these CIs lose effectiveness. The residual kurtosis can in fact decrease, while the energy ratio will approach 1.

Covariance and mean values for the six CIs were calculated by sampling healthy data from four gears prior to the fault propagating. This was done by randomly selecting 100 data points from each gear, and calculating the covariance and means over the resulting 400 data points. The PDFs of the selected CIs were not Gaussian, but exhibited a high degree of skewness. Because of this, the PDFs were left shifted by subtracting an offset such that the PDFs exhibited Rayleigh like distributions. The estimated gear health is plotted in Figure 3, where the damage on the gear at the end of the test is seen in the upper left corner.
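The whitening step applied to sampled healthy data can be sketched as follows. Note the substitution: instead of factoring the inverse covariance as in the text, this sketch factors the sample covariance itself and solves a triangular system by forward substitution, which also yields uncorrelated, unit variance outputs while avoiding an explicit matrix inverse. The two synthetic Gaussian CIs, the seed, and the use of the sample mean as the offset are assumptions for the demonstration only.

```python
import math
import random

def mean_and_covariance(samples):
    """Sample mean vector and unbiased covariance matrix."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def cholesky(a):
    """Lower triangular c such that c * c^T equals a (a must be positive definite)."""
    d = len(a)
    c = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(a[i][i] - s)
            else:
                c[i][j] = (a[i][j] - s) / c[j][j]
    return c

def whiten(x, offset, c):
    """Solve c * y = (x - offset) by forward substitution."""
    y = []
    for i in range(len(x)):
        s = sum(c[i][k] * y[k] for k in range(i))
        y.append((x[i] - offset[i] - s) / c[i][i])
    return y

# Hypothetical healthy data: 400 samples of two CIs with correlation near 0.9.
rng = random.Random(0)
samples = []
for _ in range(400):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    samples.append([g1, 0.9 * g1 + 0.436 * g2 + 3.0])

mean, cov = mean_and_covariance(samples)
raw_rho = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])

c = cholesky(cov)
whitened = [whiten(s, mean, c) for s in samples]
_, white_cov = mean_and_covariance(whitened)
```

After whitening, the sample covariance of the transformed CIs is the identity matrix to within rounding, which is the condition needed before applying a function of distributions such as the norm.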
Figure 3 Gear Health, Torque, and Image of Gear Damage at HI 1.5

The key issue with a data driven prognostic is the appropriateness of the threshold. When the HI is 1.0, is the damage such that it is appropriate to do maintenance? From the example (Figure 3), it is apparent that an HI of 1 displays
damage warranting maintenance. Because it is appropriate to perform maintenance when the HI is 1.0 or greater, one can state that the RUL is the time from the current state until the estimated HI is 1.0.

State Space Models for Prognostics: State-space representation of data provides a versatile and robust way to model systems. Starting with the definition of the states, and the basic principles underlying the characterization of the phenomena under study, one can propagate the states as a data driven stochastic process. The choice of which type of state space model to use is driven by the nature of the system dynamics and noise source. If the phenomenology of the system has linear dynamics with Gaussian noise, a Kalman filter (KF) is used. If it is a non-linear process with Gaussian noise, a sigma-point Bayesian process (e.g. unscented Kalman filter - UKF) or extended Kalman filter (EKF) is appropriate. For non-linear dynamics with non-Gaussian noise, a sequential Monte Carlo method is used, employing sequential estimation of the probability distribution using importance sampling techniques. This method is generally referred to as particle filtering (PF) [18].

A state space model estimates the state variable on the basis of measurements of the output and input control variables [19]. In general, a system plant can be defined by: $\dot{x} = Ax + Bu$, and $y = Cx$, where x is the state variable, $\dot{x}$ is the rate of change of the state variable, and y is the output of the system. An observer is a subsystem used to reconstruct the state space of the plant. The model of the observer is the same as that of the plant, except that one adds an additional term which includes the estimated error to account for inaccuracies in the A and B matrices. This means that any hidden state (such as the RUL) can be reconstructed if we can model the plant (i.e. failure propagation) successfully.
The observer is defined as: $E[\dot{x}] = E[Ax] + Bu + K(y - E[Cx])$, where $E[\dot{x}]$ is the estimated state derivative, and $E[Cx]$ is the expectation of the system output. The matrix K is called the Kalman gain matrix (linear, Gaussian case). It is a weighting matrix that maps the difference between the measured output y and the estimated output $E[Cx]$ into a correction of the state estimate. A KF is used to optimally set the Kalman gain matrix.

A KF is a recursive algorithm that optimally filters the measured state based on a priori information such as the measurement noise, the unknown behavior of the state, the relationship between the input and output states (i.e. the plant), and the time between measurements. Computationally, it is attractive because it can be designed with no matrix inversion and it is a one step, iterative process. The filtering process is given as:

Prediction:
$X_{t|t-1} = A X_{t-1|t-1}$ (State)
$P_{t|t-1} = A P_{t-1|t-1} A^T + Q$ (Covariance)
Gain:
$K = P_{t|t-1} C^T \left[C P_{t|t-1} C^T + R\right]^{-1}$
Update:
$P_{t|t} = (I - KC) P_{t|t-1}$ (State Covariance)
$X_{t|t} = X_{t|t-1} + K(Y - C X_{t|t-1})$ (State Update)

where:
t|t-1 is the condition statement (i.e. t given the information at t-1),
X is the state information ($x$, $dx/dt$, $d^2x/dt^2$),
A is the state transition matrix,
Y is the measured data,
K is the Kalman gain,
P is the state covariance matrix,
Q is the process noise model,
C is the measurement matrix, and
R is the measurement variance.
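The prediction, gain and update steps listed above can be illustrated with the simplest possible case: a single state with scalar A and C. This is a sketch only, not the extended Kalman filter used in the study; the noise values q and r, the initial state, and the synthetic measurements of a constant health value of 0.4 are assumptions for the demonstration.

```python
def kalman_step(x, p, y, a=1.0, c=1.0, q=1e-4, r=0.04):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : prior state estimate and its variance
    y    : new measurement
    a, c : state transition and measurement coefficients
    q, r : process noise and measurement noise variances
    """
    # Prediction
    x_pred = a * x
    p_pred = a * p * a + q
    # Gain
    k = p_pred * c / (c * p_pred * c + r)
    # Update
    x_new = x_pred + k * (y - c * x_pred)
    p_new = (1.0 - k * c) * p_pred
    return x_new, p_new, k

# Hypothetical measurements: a constant health value of 0.4 observed with
# an alternating +/- 0.1 disturbance standing in for measurement noise.
x_est, p_est = 0.0, 1.0
gains = []
for t in range(50):
    y = 0.4 + (0.1 if t % 2 == 0 else -0.1)
    x_est, p_est, k = kalman_step(x_est, p_est, y)
    gains.append(k)
```

The gain starts near 1 because the initial state variance is large relative to the measurement variance, then falls as the state covariance shrinks, so later measurements move the estimate less.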
For nonlinear systems with Gaussian noise (UKF or EKF), the state prediction is a function of $X_{t-1|t-1}$ and the state transition matrix A, and C is the derivative of the measurement with respect to the state. For non-linear, non-Gaussian noise problems, particle filters (PF) are attractive. PF is based on representing the filtering distribution as a set of particles. The particles are generated using sequential importance re-sampling (a Monte Carlo technique), where a proposed distribution is used to approximate a posterior distribution by appropriate weighting. In this example, the state update is nonlinear and the measurement noise is Gaussian. As such, an extended Kalman filter was used.

System Dynamics for Estimating the RUL: The state space model can be constructed as a parallel system to the plant (i.e. the system under study). This requires an appropriate model to simulate the system dynamics. In general, failure modes propagating in mechanical systems are difficult to model at a level of fidelity that would generate any meaningful results (e.g. health and RUL based on physics of failure). One needs a generalized, data driven process that can model the plant adequately enough to generate the RUL with small error.

Since 1953, a number of fault growth theories have been proposed, such as: net area stress theories, accumulated strain hypothesis, dislocation theories, and others [20]. Through substitution of variables, most of these theories can be generalized by the Paris Law: $da/dN = D(\Delta K)^n$. Paris Law governs the rate of crack growth in a homogeneous material, where: $da/dN$ is the rate of change of the half crack length, D is a material constant of the crack growth equation, $\Delta K$ is the range of strain during a fatigue cycle, and n is the exponent of the crack growth equation. The range of strain, $\Delta K$, is given as: $\Delta K = 2\sigma\alpha(\pi a)^{1/2}$, where
$\sigma$ is gross strain, $\alpha$ is a geometric correction factor, and a is the half crack length. These variables are specific to a given material and test article. In practice, the variables are unknown. This requires some simplifying assumptions to be made to facilitate analysis. For many materials, the crack growth exponent is 2 (see [20]). The geometric correction factor, $\alpha$, is set to 1 (a constant which will be accounted for in the calculation of D), which allows Paris law to be reduced to: $da/dN = D(4\sigma^2\pi a)$. Taking the inverse of $da/dN$ gives the rate of change in cycles per change in crack length, or: $dN/da = 1/[D(4\sigma^2\pi a)]$. Integrating over crack length gives the number of cycles (for near synchronous systems, N and the RUL differ only by a constant such as the shaft rpm): $N = \frac{1}{D(4\sigma^2\pi)}\left(\ln(a_f) - \ln(a_o)\right)$, where the
current measured crack length is $a_o$ and the final crack length is $a_f$. Since the crack length is unknown, the current state, HI, will be used as a surrogate for $a_o$, while $a_f$ will be 1.0 (the RUL is the time from the current HI state until the HI is 1.0). N is the RUL times some constant (RPM for example). The material crack constant, D, can be estimated as: $D = (da/dN)/(4\sigma^2\pi a)$. Gross strain cannot generally be measured; thus, an appropriate surrogate value (e.g. torque, or yaw misalignment) will be used. The use of Paris' law for the calculation of RUL was given by [21] and [22], but lacked a measure of confidence (i.e. how good the prognostic was). Confidence is an important requirement for a PHM system [23].

A Prognostic and Confidence in the Prognostic: In practice, a prognostic or PHM capability would be used to schedule maintenance or assist in asset management and logistic support. The asset owner/operator will make decisions which affect the operational availability and future revenues based on the PHM system. They will need an intuitive, simple display that conveys information on: current health, RUL, and confidence in the RUL prediction.

Model confidence is essential in any RUL (see [23]). For any RUL calculation, given 1 hour of nominal usage, the RUL should decrease by 1 hour (i.e. $dN/dt$ is approximately -1: one hour of life is consumed for each hour of operation). Further, a measure of model drift or convergence is the second derivative $d^2N/dt^2$: a value close to zero indicates convergence. When these conditions are met, the model used for calculation of the RUL is consistent, which is indicative of a good estimate of the RUL of the component. Visual cues for the prognostic can then be based on model convergence.
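The reduced Paris law relations for the RUL can be exercised with a small numeric example. This is a sketch under the simplifications stated in the text (exponent of 2, geometric correction factor of 1); the numbers (an HI of 0.4, an HI growth rate of 0.1 per hour, a normalized strain surrogate of 1.0) are hypothetical, and time in hours is used in place of cycles.

```python
import math

def crack_constant(da_dn, strain, a):
    """D = (da/dN) / (4 * strain^2 * pi * a), estimated at the current state."""
    return da_dn / (4.0 * strain ** 2 * math.pi * a)

def life_to_threshold(d, strain, a0, af=1.0):
    """N = (ln(af) - ln(a0)) / (D * 4 * strain^2 * pi)."""
    return (math.log(af) - math.log(a0)) / (d * 4.0 * strain ** 2 * math.pi)

def future_health(n, d, strain, a0):
    """af = exp(N * D * 4 * strain^2 * pi + ln(a0)): health after N more units of life."""
    return math.exp(n * d * 4.0 * strain ** 2 * math.pi + math.log(a0))

# Hypothetical current state: HI used as the surrogate for crack length.
hi_now = 0.4          # current health indicator
hi_rate = 0.1         # estimated dHI/dt, per hour
strain = 1.0          # normalized surrogate for gross strain (e.g. torque)

d = crack_constant(hi_rate, strain, hi_now)
rul_hours = life_to_threshold(d, strain, hi_now)
hi_at_rul = future_health(rul_hours, d, strain, hi_now)
hi_in_one_hour = future_health(1.0, d, strain, hi_now)
```

Because the growth rate is proportional to the current health, the projected HI grows exponentially, so the estimated life of about 3.7 hours is shorter than the 6 hours a straight line extrapolation at 0.1 per hour would give.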
Visual cues, such as color, can indicate the confidence in the RUL:
Low Confidence: Yellow, $|dN/dt + 1| > 3$ and $|d^2N/dt^2| > 0.5$
Medium Confidence: Blue, $|dN/dt + 1| > 2$ and $|d^2N/dt^2| > 0.5$
High Confidence: Green, $|dN/dt + 1| < 2$ and $|d^2N/dt^2| < 0.5$

A key requirement of the prognostic model is the ability to predict what the health of the component will be some time in the future. For a given state space model, the RUL or any predicted health is an expectation based on the current state and future usage (i.e. damage or strain). The Paris law is driven by delta strain: changes in strain will affect the RUL. Future health is then based on the mean strain and a bound on that strain to give a range on the RUL (one benefit of a PF model is a direct distribution of the RUL). This strain information could be based on forecast weather or usage for a wind turbine, or type of mission for a helicopter. The health at any time in the future is then: $a_f = \exp\left(N D(4\sigma^2\pi) + \ln(a_o)\right)$.

Test Article and a Prognostics Example: Data used for this example was provided by the Spiral Bevel Gear Test facility at NASA. A description of the test rig and test procedure is given in [13], [24]. The tests consisted of running the gears under load in a back to back configuration, with acquisitions made at 1 minute intervals, generating time synchronous averages (TSA) on the gear shaft (36 teeth). The pinion, on which the damage occurred, has 12 teeth. This is highly accelerated life testing, and as such, the RUL estimates are compressed. The calculated HI values (see Figure 3) were used to sequentially update the state estimator. At each update, the HI, dHI/dt, RUL, dRUL/dt and d2RUL/dt2 were calculated. The confidence of the RUL was then evaluated.

The fault starts to propagate at approximately 25 hours into the test. Figure 4 displays the HI state at 26.85 hours. The state estimate of health has increased from a nominal value of 0.2 to 0.4, with an RUL of 2.5 hours.
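The confidence evaluation performed at each update can be sketched as a simple classifier. Several assumptions are made here and are not from the paper: the derivatives are taken by finite differences over the last three RUL estimates (the study obtained them from the state estimator), the deviation of the slope is measured from the expected value of -1, and any case not matching the high or low rules is reported as medium.

```python
def rul_derivatives(rul, dt):
    """First and second finite-difference derivatives of the latest RUL estimates."""
    d1 = (rul[-1] - rul[-2]) / dt
    d2 = (rul[-1] - 2.0 * rul[-2] + rul[-3]) / dt ** 2
    return d1, d2

def confidence(d1, d2):
    """Map RUL slope error and drift to a display color and label."""
    slope_error = abs(d1 + 1.0)     # ideal: one hour of life used per hour run
    drift = abs(d2)
    if slope_error < 2.0 and drift < 0.5:
        return "green", "high"
    if slope_error > 3.0 and drift > 0.5:
        return "yellow", "low"
    return "blue", "medium"         # assumption: all remaining cases

# Hypothetical RUL histories in hours, sampled every 0.1 hours.
converged = [2.5, 2.4, 2.3]         # decreasing one hour per hour
unsettled = [2.5, 3.5, 3.0]         # estimate still jumping around

state_converged = confidence(*rul_derivatives(converged, 0.1))
state_unsettled = confidence(*rul_derivatives(unsettled, 0.1))
```

A sequence that loses one hour of life per hour of operation with no curvature is reported as high confidence, while an estimate that is still jumping between updates is reported as low.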
Figure 4 Initial Low Confidence Prognostic
The confidence is low: note that the prognostic is lagging the actual RUL by approximately 0.5 hours. However, the actual RUL is still within the estimated confidence bound of the RUL. As the fault continues to propagate (Figure 5), the confidence in the prognostic improves and the estimated RUL is concurrent with the actual RUL.
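The color-coded confidence rule stated earlier can be expressed as a small classifier. The sketch below takes the published inequalities literally, including the expression abs(dN/dt-1), whose sign convention is not restated here and is therefore exposed as an assumed parameter. The published conditions are not exhaustive, so the evaluation order and the "unclassified" fallback are choices made for this illustration, as are the function and variable names.

```python
CONFIDENCE_COLORS = {"low": "yellow", "medium": "blue", "high": "green"}


def rul_confidence(dn_dt, d2n_dt2, nominal_rate=1.0):
    """Classify RUL confidence from the first and second derivatives of the RUL.

    nominal_rate is the value subtracted from dN/dt in the published rule
    (abs(dN/dt-1)); it is kept as a parameter because the sign convention
    is an assumption here.
    """
    deviation = abs(dn_dt - nominal_rate)
    curvature = abs(d2n_dt2)
    if deviation < 2.0 and curvature < 0.5:
        return "high"
    if deviation > 3.0 and curvature > 0.5:
        return "low"
    if deviation > 2.0 and curvature > 0.5:
        return "medium"
    # Combinations the published rule does not cover.
    return "unclassified"


# Hypothetical derivative values, chosen only to exercise each branch.
for dn_dt, d2n_dt2 in [(0.5, 0.1), (3.5, 1.0), (5.0, 1.0), (0.5, 1.0)]:
    level = rul_confidence(dn_dt, d2n_dt2)
    print(dn_dt, d2n_dt2, level, CONFIDENCE_COLORS.get(level, "none"))
```

Testing the low-confidence condition before the medium one matters because, as published, any state that satisfies the low-confidence inequalities also satisfies the medium-confidence ones.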
Figure 5 High Confidence Prognostic with Small Error Bounds
In practice, it is anticipated that the time period of the RUL will be thousands of hours for equipment such as wind turbines and hundreds of hours for devices such as helicopter transmissions (see [21], where a prognostic of 100 to 150 hours of flight time was observed).
Conclusion: Data driven prognostics requires four conditions: the ability to extract a feature related to damage; a process to set thresholds; a fault model to propagate the current state to the desired threshold; and a measure of confidence in the prognostic. Critical to a successful estimate of remaining useful life is an appropriate threshold. The process described is based on hypothesis testing and sets a threshold relative to a probability of false alarm. Refinement of the RUL estimate will require feedback from depot level repair services to validate the appropriateness of the threshold. Physics of failure models may ultimately give an absolute level of damage for a given CI value. That said, the cost associated with model development and validation may be great. The advantages of a data driven approach are the generality of the model and the ability to set thresholds using nominal components. This leads to a relatively low application cost and faster deployment of systems.
References:
[1] ISO 10816-3:2009 (2009). Mechanical Vibration - Evaluation of Machine Vibration by Measurement on Non-rotating Parts.
[2] McFadden, P. (1987). A revised model for the extraction of periodic waveforms by time domain averaging. Mechanical Systems and Signal Processing, 1 (1), 83-95.
[3] Combet, F., Gelman, L. (2007). An automated methodology for performing time synchronous averaging of a gearbox signal without speed sensor. Mechanical Systems and Signal Processing, 21 (6), 2590-2606.
[4] Randall, Robert B. (2011). Vibration-based Condition Monitoring. West Sussex, United Kingdom: John Wiley & Sons.
[5] Bechhoefer, E., Kingsley, M. (2009). A Review of Time Synchronous Average Algorithms. Annual Conference of the Prognostics and Health Management Society.
[6] ISO 10825 (2007). Gears -- Wear and damage to gear teeth -- Terminology.
[7] Vecer, P., Kreidl, M., Smid, R. (2005). Condition Indicators for Gearbox Condition Monitoring Systems. Acta Polytechnica, 45 (6).
[8] McFadden, P., Smith, J. (1985). A Signal Processing Technique for detecting local defects in a gear from a signal average of the vibration. Proc Instn Mech Engrs, 199 (4).
[9] Zakrajsek, J., Townsend, D., Decker, H. (1993). An Analysis of Gear Fault Detection Method as Applied to Pitting Fatigue Failure Damage. NASA Technical Memorandum 105950.
[10] Byington, C., Safa-Bakhsh, R., Watson, M., Kalgren, P. (2003). Metrics Evaluation and Tool Development for Health and Usage Monitoring System Technology. HUMS 2003 Conference, DSTO-GD-0348.
[11] Dempsey, P., Keller, J. (2008). Signal Detection Theory Applied to Helicopter Transmissions Diagnostics Thresholds. NASA Technical Memorandum 2008-215262.
[12] Bechhoefer, E., Duke, A., Mayhew, E. (2007). A Case for Health Indicators vs. Condition Indicators in Mechanical Diagnostics. American Helicopter Society Forum 63, Virginia Beach.
[13] Bechhoefer, E., He, D., Dempsey, P. (2011). Gear Threshold Setting Based On a Probability of False Alarm. Annual Conference of the Prognostics and Health Management Society.
[14] GL Renewables (2007). Guidelines for the Certification of Condition Monitoring Systems for Wind Turbines. http://www.glgroup.com/en/certification/renewables/CertificationGuidelines.php
[15] Wackerly, D., Mendenhall, W., Scheaffer, R. (1996). Mathematical Statistics with Applications. Duxbury Press, Belmont.
[16] Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition. Academic Press, London, page 75.
[17] Bechhoefer, E., Bernhard, A. (2007). A Generalized Process for Optimal Threshold Setting in HUMS. IEEE Aerospace Conference, Big Sky.
[18] Candy, J. (2009). Bayesian Signal Processing: Classical, Modern, and Particle Filtering Methods. John Wiley & Sons, Hoboken.
[19] Brogan, W. (1991). Modern Control Theory. Prentice Hall, Upper Saddle River, NJ.
[20] Frost, N., Marsh, K., Pook, L. (1999). Metal Fatigue. Dover Publications, Mineola, NY, pages 228-244.
[21] Bechhoefer, E., Bernhard, A., He, D. (2008). Use of Paris Law for Prediction of Component Remaining Life. IEEE Aerospace Conference, Big Sky.
[22] Orchard, M., Vachtsevanos, G. (2007). A Particle Filtering Approach for On-Line Failure Prognosis in a Planetary Carrier Plate. International Journal of Fuzzy Logic and Intelligent Systems, 7 (4), 221-227.
[23] Vachtsevanos, G., Lewis, F. L., Roemer, M., Hess, A., Wu, A. (2006). Intelligent Fault Diagnosis and Prognosis for Engineering Systems, 1st ed. John Wiley & Sons, Hoboken, New Jersey.
[24] Dempsey, P., Afjeh, A. (2002). Integrating Oil Debris and Vibration Gear Damage Detection Technologies Using Fuzzy Logic. NASA Technical Memorandum 2002-211126.