IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 12, NO. 5, OCTOBER 2004
A Neuro-Fuzzy Approach to Gear System Monitoring
Wilson Wang, Fathy Ismail, and Farid Golnaraghi
Abstract—The detection of the onset of damage in gear systems is of great importance to industry. In this paper, a new neuro-fuzzy diagnostic system is developed, whereby the strengths of three robust signal processing techniques are integrated. The adopted techniques are: the continuous wavelet transform (amplitude) and beta kurtosis based on the overall residual signal, and the phase modulation by employing the signal average. Three reference functions are proposed as post-processing techniques to enhance the feature characteristics in a way that increases the accuracy of fault detection. Monitoring indexes are derived to facilitate the automatic diagnoses. A constrained-gradient-reliability algorithm is developed to train the fuzzy membership function parameters and rule weights, while the required fuzzy completeness is retained. The system output is set to different monitoring levels by using an optimization procedure to facilitate the decision-making process. The test results demonstrate that the novel neuro-fuzzy system, because of its adaptability and robustness, significantly improves the diagnostic accuracy. It outperforms other related classifiers, such as those based on fuzzy logic and neuro-fuzzy schemes, which adopt different types of rule weights and employ different training algorithms. Index Terms—Beta kurtosis, constrained-gradient-reliability algorithm, neuro-fuzzy diagnostic system, phase modulation, reference functions, wavelet transform.
Manuscript received January 2, 2002; revised December 6, 2002 and November 15, 2003. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), and by Mechworks Systems, Inc. W. Wang is with Lakehead University, Thunder Bay, ON P7B 5E1, Canada. F. Ismail and F. Golnaraghi are with the University of Waterloo, Waterloo, ON N2L 3G1, Canada. Digital Object Identifier 10.1109/TFUZZ.2004.834807
I. INTRODUCTION
Gear systems, for example, gearboxes, are widely used in industry. Typical applications include airplanes, automobiles, power turbines, and steel mills, where the early detection of gear faults is critical in avoiding performance degradation and catastrophic failures. The accurate diagnosis of gear systems can also facilitate making decisions for repairs and maintenance. Gear damage can be classified as distributed defects (e.g., wear and misalignment) and localized faults (e.g., fatigue cracks and spalling). The former reduces transmission accuracy, and increases the noise and vibration levels in rotating machinery. However, the latter not only increases the transmission errors, but also causes catastrophic failures in machines such as airplanes and helicopters. Furthermore, distributed faults are usually initiated from localized defects. Accordingly, the diagnosis in this paper is focused on localized gear faults. Fault detection and diagnosis is a sequential process involving two steps: feature (symptom) extraction and decision-making (diagnosis). Feature extraction is a mapping
process from the signal space to the feature space, whereas decision-making is the process of classifying the features into different categories. The features used in this study are extracted from vibration signals collected from sensors mounted on the gearbox. Many vibration data processing techniques have been proposed in the literature for gear fault detection, but each has its advantages and limitations [1], [2]. According to the authors’ prior investigation [2], the current and well-accepted techniques include the amplitude modulation (AM), the phase modulation (PM), the beta kurtosis (BK), and the continuous wavelet transform (amplitude) (WA). The tables in Appendix A are reproduced from [2], and summarize the visual diagnostic results for three gears with different fault types. Three filtering conditions are utilized: the signal average obtained by time synchronization averaging; the overall residual signal, obtained by band-stop filtering out the gear meshing frequency and its harmonics from the signal average; and the dominant meshing frequency residual signal, obtained by band-pass filtering the signal average centered around the dominant gear meshing frequency. The details about these signal processing techniques can be found in [2]. In these tables, the symbol CI denotes there is a clear indication of a gear fault in the signature. SI signifies there is an indefinite indication of a gear defect in the feature, but more information is needed to confirm this diagnosis. NI shows that there is no clear fault indication in the signature, or that a false indication is observed. Even though the diagnosis in these tables is based on visual inspection and the subjective assessment of the processed signal, it is concluded that the most consistent techniques are the WA and BK based on the overall residual signal, and the PM using the signal average. 
If these techniques can be utilized simultaneously by some scheme, perhaps a more accurate assessment of the gear conditions may be achieved; that is the subject of this paper. Decision-making is the process of classifying the features into different categories. The traditional approach relies on human expertise to relate the vibration features to the faults [3]. However, this method is tedious and not always successful in identifying the abnormalities if the signatures are contaminated by noise. In addition, when multiple features are applied, a diagnostician can hardly pay equal attention to all the features and deal with the contradicting symptoms. The alternative is to use an automatic diagnosis. The currently available automatic decision-making schemes can be classified as mathematical model-based methods and flexible model-based techniques [4]–[6]. The latter is employed in this study, because an accurate numerical model is difficult to derive in the uncertain and noisy environment of rotating machinery. Flexible model-based classifiers consist of the classical pattern recognition approaches [7], and the inference-based techniques such as neural
1063-6706/04$20.00 © 2004 IEEE
WANG et al.: A NEURO-FUZZY APPROACH TO GEAR SYSTEM MONITORING
Fig. 1. Experimental setup.
networks (NNs) [8]–[11] and fuzzy logic [12]–[17]. In order to overcome the limitations and reap the benefits of both NNs and fuzzy systems, more attention has recently been paid to synergetic schemes, which have been demonstrated at three hybrid levels.
1) Neural fuzzy (neuro-fuzzy) systems: NNs are used as tools in fuzzy models (mainly for control applications) [18], [19].
2) Fuzzy neural networks: NNs retain their basic properties with some elements being fuzzified (mostly suitable for pattern classification applications) [20], [21].
3) Fuzzy-neural hybrid systems: Both technologies are used separately in hybrid systems (suitable for both control and pattern classification) [22], [23].
The goal of this paper is to develop an automatic diagnostic system for gear system monitoring. A critical aspect of a diagnostic system in real-time industrial application is its reliability. Missed alarms (i.e., the monitoring system cannot pick up the existing faults) and false alarms (i.e., the monitoring system triggers alarms due to noise or other signals instead of real faults) may undermine the validity of the diagnostic system. In order to effectively integrate the chosen features to achieve a more reliable diagnosis, a neuro-fuzzy scheme is adopted in this study. The reasons for using a neuro-fuzzy scheme are that the diagnostic knowledge from expertise and online/offline learning can be incorporated into the fuzzy classification processes, whereas the fuzzy membership functions (MFs) can be optimized by using NNs [24], [25]. In this paper, reference function approaches are also proposed to enhance the feature characteristics. Monitoring indexes are derived to quantify the different measures for the automatic diagnosis. A constrained-gradient-reliability learning algorithm is suggested to optimize the fuzzy MF parameters and the rule weights, while the required fuzzy completeness is retained.
The system output is divided into different monitoring levels to further facilitate the decision-making process. In this paper, the experimental setup is described. Then the derivation of the reference functions and the monitoring indexes is given. After the development of the adaptive neuro-fuzzy diagnostic system and the training algorithms, the reliability of
the new system is verified by tests and comparisons with other types of classifiers.
II. EXPERIMENTAL SETUP AND THE DIAGNOSTIC SYSTEM PARADIGM
The online experimental setup used in this study is schematically represented in Fig. 1. It consists of two 1 HP dc motors and a single-stage gearbox. A pair of spur gears in which the driving gear has 16 teeth and the driven gear 14 teeth is tested. The motor speed controller allows tested gear operation in the range of 200–1400 rpm. The load is provided by a power resistor network. The speed of the drive motor and the load of the resistor network are adjusted automatically to accommodate the range of speed/torque operation conditions. The vibration is measured with an accelerometer mounted on the gearbox housing. An optical sensor is mounted in proximity to a slotted disk attached to the driving shaft; it provides a one-pulse-per-revolution signal that is used for the time synchronous averaging. The signals from both sensors are properly filtered, digitized, and then fed to a computer through a data acquisition card for further processing.
Fig. 2 illustrates the feature extraction and decision-making procedures. The fault diagnosis of the gear system is conducted gear-by-gear. The first step is to differentiate the signature, specific to the gear of interest, from the collected vibration signals by using interpolation, resampling, and time synchronous averaging [26]. In this way, the gear signal average is obtained and represented in one full revolution. The features of the WA, BK, and PM are extracted by employing the corresponding signal processing techniques and filtering processes. After the extracted features are postprocessed by using the reference functions, the monitoring indexes are derived for the automatic decision-making process. The gear health condition is diagnosed by employing a properly trained neuro-fuzzy scheme. All the employed signal processing techniques, the decision-making scheme, and the training algorithms are coded in MATLAB and are presented in the subsequent sections.
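The resampling and averaging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the vibration record is a plain list of samples, the one-pulse-per-revolution instants are given as sample positions, and linear interpolation is used. The function names `resample_revolution` and `time_synchronous_average` are hypothetical.

```python
import math

def resample_revolution(signal, start, end, points_per_rev):
    """Linearly interpolate one revolution onto a fixed angular grid."""
    samples = []
    for k in range(points_per_rev):
        pos = start + (end - start) * k / points_per_rev
        i = int(math.floor(pos))
        frac = pos - i
        nxt = signal[min(i + 1, len(signal) - 1)]
        samples.append(signal[i] * (1.0 - frac) + nxt * frac)
    return samples

def time_synchronous_average(signal, pulse_positions, points_per_rev):
    """Average every complete revolution delimited by consecutive pulses."""
    revs = [
        resample_revolution(signal, a, b, points_per_rev)
        for a, b in zip(pulse_positions[:-1], pulse_positions[1:])
    ]
    return [sum(rev[k] for rev in revs) / len(revs) for k in range(points_per_rev)]

# Synthetic record (assumed data): a 4-per-revolution tone whose
# revolution length drifts slightly, as it would with speed fluctuation.
signal, pulses = [], [0.0]
for rev_len in (100, 104, 96, 100):
    signal.extend(math.sin(2 * math.pi * 4 * n / rev_len) for n in range(rev_len))
    pulses.append(pulses[-1] + rev_len)
average = time_synchronous_average(signal, pulses, 64)
```

Because every revolution is mapped onto the same angular grid before averaging, components locked to the shaft rotation add coherently while unrelated content tends to average out.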
Fig. 2. Feature extraction and decision-making scheme.
III. REFERENCE FUNCTIONS AND MONITORING INDEXES
In order to improve the diagnostic reliability and facilitate the automatic decision-making process, reference functions and monitoring indexes are proposed in this section. The reference function approach is derived as a postprocessing technique to enhance the feature characteristics, to reduce the feature dimensions, and to render the different features compatible.
A. WA Reference Function
The continuous wavelet transform (amplitude) of a signal $v(t)$ is defined as
$$W(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} v(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt \qquad (1)$$
where $s$ and $\tau$ are the scale (frequency) and time (space) parameters, respectively [27]. $\psi$ is the mother wavelet, which is a Morlet function in the present study [2], [28]. The wavelet reference function is proposed as the energy concentration over a specific bandwidth, so that
(2)
Generally, the frequency components generated from the localized gear faults are between the gear meshing frequency and the fifth harmonic of the meshing frequency [2], and the processing bandwidth is set accordingly from the gear rotation speed (in rpm) and the number of teeth. Fig. 3(a) and (c) map the measured wavelet amplitude from two gears with tooth cracks that are approximately 20% and 40% of the tooth thickness, respectively. To facilitate the tracking of the fault, the induced fault is initiated in the tooth positioned around 180°, relative to the slot of the disc. The corresponding wavelet reference functions are obtained by using (2) and depicted in Fig. 3(b) and (d), respectively. If the graphs in Fig. 3(a) and (c) are compared with Fig. 3(b) and (d), respectively, it can be seen that the proposed reference function represents the WA features accurately, and accentuates the existence of the gear fault more clearly than the WA maps.
Fig. 3. Examples of wavelet reference function.
B. Beta Kurtosis Reference Function
The BK is the fourth moment of the beta function [29]. It is a tooth-based statistical technique. Each data block contains the vibration signal measured over one tooth period. From the mean and variance of the data block of each tooth, the BK is given by
(3)
The reference function is defined as the inverse of the BK but in angular space, obtained by linearly interpolating the tooth-based values,
(4)
For a gear with a 20% tooth crack initiated in tooth no. 9, Fig. 4(a) and (b) represent the signatures of the tooth-based BK and the corresponding reference function, respectively. Other than the change in the horizontal scale, the features in both graphs are the same. However, the change in scale is necessary to be compatible with the WA reference function and with the third reference function, representing the PM, that is proposed in the next section.
Fig. 4. Example of beta kurtosis reference function.
C. PM Reference Function
Gear damage produces a transient signal that usually causes a phase modulation [30]. The PM reference function at a given position is proposed as the maximum of the phase difference within a specific time window, such that
(5)
where the window width is chosen according to the particular application. In the present study, it is selected to be half of a tooth period. Fig. 5(a) and (c) are two PM signatures from two gears with a small chipping fault and a large chipping fault, respectively, at approximately 180°. Fig. 5(b) and (d) are the corresponding graphs obtained from the reference function. It is evident that the proposed PM reference function greatly enhances the phase feature characteristics.
D. Monitoring Indexes
In order to apply an extracted feature for automatic diagnosis, a quantitative measure, that is, a monitoring index, should be derived to quantify the feature characteristics. At an instant $\theta$, expressed here in terms of the rotation position, the monitoring index is proposed as the normalized amplitude of the reference function,
$$x(\theta) = \frac{r(\theta)}{\bar{r}} \qquad (6)$$
where $r$ may be the WA, BK, or PM reference function, $r(\theta)$ is the amplitude of the reference function at position $\theta$, and $\bar{r}$ is the mean value of $r$ outside a window centered around $\theta$. Consequently, the localized gear faults are assumed to fall within the selected window. Through a series of tests, the window width is selected to be four tooth periods around $\theta$, that is, $4 \times 360^\circ / Z$ in total, $Z$ being the number of teeth in the gear of interest.
E. Input to the Diagnostic System
Typically, if a healthy gear works properly, its reference functions have small fluctuations and the corresponding monitoring indexes are close to 1. An irregular signature with a large magnitude in a reference function usually indicates the occurrence of a gear defect. Let the maximum amplitudes of the WA, BK, and PM reference functions occur at three locations, one per reference function.
In practice, these three locations may not be completely identical [2]. Consequently, when the monitoring indexes are input to the diagnostic system, care should be taken with the fault locations. In this study, according to the tests on the experimental setup, four tooth periods are selected to be the influence range; that is, if only one single tooth is damaged in the gear of interest, the three maximum-amplitude locations will occur within four tooth periods. The inputs to the diagnostic system are one or more vectors of monitoring indexes, given as follows:
(7)
(8)
(9)
1) If all three locations are within one influence range, one tooth may be damaged, and a single input vector is given to the diagnostic system.
2) If only two of the locations are within an influence range and the other one is not, then two teeth in this gear may have faults. Two input vectors are given to the diagnostic system accordingly, one for each suspected tooth.
3) If the three locations are in three distinct influence ranges, three teeth may be damaged in this gear, and three input vectors are given to the diagnostic system.
Fig. 5. Examples of the phase modulation reference function.
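The normalized index and the influence-range cases above can be illustrated with a short sketch. It assumes that each reference function is sampled on a uniform angular grid over one revolution and that "within one influence range" means a circular distance of at most four tooth periods. The function names and the greedy grouping are illustrative choices, not the authors' procedure.

```python
def monitoring_index(ref, center, num_teeth, window_teeth=4):
    """Amplitude at `center` divided by the mean of `ref` outside the window."""
    n = len(ref)
    half = int(round(window_teeth * n / num_teeth / 2.0))
    inside = {(center + d) % n for d in range(-half, half + 1)}
    outside = [ref[i] for i in range(n) if i not in inside]
    return ref[center] / (sum(outside) / len(outside))

def circular_distance(a, b, n):
    """Shortest distance between two grid positions on one revolution."""
    d = abs(a - b) % n
    return min(d, n - d)

def group_locations(locations, n, num_teeth, range_teeth=4):
    """Greedily group peak locations that share one influence range."""
    limit = range_teeth * n / num_teeth
    groups = []
    for name, loc in locations.items():
        for group in groups:
            if all(circular_distance(loc, other, n) <= limit for _, other in group):
                group.append((name, loc))
                break
        else:
            groups.append([(name, loc)])
    return groups

# Assumed example: 360 points per revolution, 16 teeth, one sharp peak.
ref = [1.0] * 360
ref[180] = 5.0
index = monitoring_index(ref, 180, num_teeth=16)
groups = group_locations({"WA": 180, "BK": 185, "PM": 60}, 360, 16)
```

In this example the WA and BK peaks share one influence range and the PM peak does not, so two groups result, which corresponds to case 2) above.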
IV. NEURO-FUZZY DIAGNOSTIC SYSTEM
A neuro-fuzzy scheme is developed in this section to integrate the chosen features for a comprehensive assessment of the gear's condition. The diagnostic classification is conducted by fuzzy logic. The fuzzy IF–THEN rules are determined by experience, whereas the NNs are utilized to fine-tune the fuzzy MF parameters.
The diagnosis of a gear system is conducted gear-by-gear. In each case, the main concern is whether the gear of interest is damaged or not. Thus, the gear condition is simply classified into two categories: normal and damaged. Two MFs, small and large, are assigned to each input variable, that is, to each of the three monitoring indexes (the position argument is omitted from here on for simplicity). With an output indicator designated for the diagnosis, the fuzzy IF–THEN rules are formulated and listed in Table I, based on the following considerations.
1) If and only if all three features have no significant irregularities, that is, all the monitoring indexes are small, the gear is believed to be in its normal condition.
2) If some irregularity exists in one or more of the features, that is, if some monitoring index is large, the gear is possibly damaged. This formulation covers the remaining rules.
3) The diagnostic reliability of each rule is represented by a weight factor.
Fig. 6 presents the paradigm of this neuro-fuzzy classification scheme. It is a six-layer feedforward network. The input nodes in layer 1 transmit the monitoring indexes to the next layer directly. Each node in layer 2 acts as an MF, which can either be a single node that performs a simple activation function or multilayer nodes that perform a complex function. (Sigmoid functions are selected as MFs for all the input variables in this case.) The nodes in layer 3 perform the fuzzy T-norm operations. For notational simplicity, each rule in Table I is represented in the following form: IF the first index is (small or large) AND the second index is (small or large) AND the third index is (small or large) THEN the output is (normal or damaged), with the rule weight attached.
If the max-product operator is used, the firing strength of a rule is
(10)
All the nodes in layer 3 form the rule (knowledge) base. The nodes in layer 4 perform the defuzzification operations. Since the normal class is related to only one rule, the output is
(11)
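The rule base and the product-based firing strength can be sketched as below. The sigmoid form $1/(1+e^{-a(x-b)})$, the parameter values in `MF_PARAMS`, and the choice to multiply each firing strength by its rule weight are all assumptions made for illustration, because (10) is not reproduced in this text.

```python
import math
from itertools import product

def sigmoid(x, a, b):
    """Assumed MF form: a > 0 gives a 'large' MF, a < 0 a 'small' MF."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

# Hypothetical (a, b) parameters for each monitoring index.
MF_PARAMS = {
    "WA": {"small": (-4.0, 2.0), "large": (4.0, 2.0)},
    "BK": {"small": (-4.0, 2.0), "large": (4.0, 2.0)},
    "PM": {"small": (-4.0, 2.0), "large": (4.0, 2.0)},
}

def firing_strengths(indexes, weights=None):
    """Weighted product of memberships for all 2**3 = 8 rules."""
    names = list(MF_PARAMS)
    rules = list(product(("small", "large"), repeat=len(names)))
    weights = weights or [1.0] * len(rules)
    strengths = {}
    for rule, w in zip(rules, weights):
        s = w
        for name, label in zip(names, rule):
            a, b = MF_PARAMS[name][label]
            s *= sigmoid(indexes[name], a, b)
        strengths[rule] = s
    return strengths

# A healthy-looking input: all indexes close to 1.
strengths = firing_strengths({"WA": 1.0, "BK": 1.0, "PM": 1.0})
dominant = max(strengths, key=strengths.get)
```

With these symmetric MFs the all-small rule dominates for indexes near 1, which matches the first consideration listed above.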
For the remaining rules, the output of each rule is its belongingness to the damaged class. By using centroid defuzzification, the general belongingness to the damaged class is
(12)
At the end, one node in layer 5 outputs the normalized classification indicator. Another node feeds the training data into the network to update the fuzzy MF parameters and the rule weights, which will be discussed next.
TABLE I FUZZY IF-THEN RULES FOR GEAR FAULT DIAGNOSIS
Fig. 6. Architecture of the neuro-fuzzy diagnostic system.
V. CONSTRAINED-GRADIENT-RELIABILITY TRAINING ALGORITHM
The developed neuro-fuzzy classifier must be optimized in order to achieve a desired input–output mapping. In this section, a constrained-gradient-reliability algorithm is proposed to update the fuzzy MF parameters and the rule weights.
A. Constraint Function
In order to guarantee that a dominant rule always exists and that the associated degree of belief is greater than 0.5, the fuzzy $\varepsilon$-completeness is set to 0.5 [31]. To achieve the 0.5-completeness, a constraint function is proposed next.
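The completeness requirement can be checked numerically for a pair of MFs. The sketch below assumes sigmoid MFs of the form $1/(1+e^{-a(x-b)})$ and scans a grid over the input range; the helper name `completeness_level` and the parameter values are hypothetical.

```python
import math

def sigmoid(x, a, b):
    """Assumed MF form: a > 0 gives a 'large' MF, a < 0 a 'small' MF."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

def completeness_level(small, large, lo, hi, steps=1000):
    """Smallest value, over a grid on [lo, hi], of the larger membership."""
    worst = 1.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        worst = min(worst, max(sigmoid(x, *small), sigmoid(x, *large)))
    return worst

# Overlapping pair: every input belongs to some MF with degree above 0.5.
overlapping = completeness_level((-4.0, 2.5), (4.0, 1.5), 0.0, 4.0)
# Separated pair: inputs near x = 2 are covered poorly by both MFs.
separated = completeness_level((-4.0, 1.5), (4.0, 2.5), 0.0, 4.0)
```

A pair of MFs satisfies 0.5-completeness on the scanned range when the returned level is at least 0.5.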
For a sigmoid MF, the sign of the slope parameter determines whether it corresponds to the large or the small function. For an input variable with a pair of sigmoid MFs, large and small, each with its own pair of parameters, as shown in Fig. 7, the crossover point is
(13)
The 0.5-completeness is held if
(14)
or
(15)
From (14) and (15), the constraint function is derived as
(16)
B. Membership Function Parameter Updating
In the proposed constrained-gradient-reliability approach, the MF parameters are first optimized by adopting the unconstrained gradient descent method. If the new parameters are in the feasible region as defined in (16), then they are deemed acceptable. Otherwise, these parameters have to be reoptimized by using the gradient of the constraint function (16), instead of the gradient of the objective function [32].
For each training data pair, the input is the vector of monitoring indexes, and the desired output takes one of two values according to whether the gear is normal or damaged. The objective function for all the training data sets is defined as
(17)
where the two class (belongingness) indicators are obtained from (11) and (12), respectively. For a sigmoid function, the training process is to fine-tune its parameters by minimizing (17). With the gradient descent algorithm, the parameters are then updated by
(18)
(19)
where the iteration index denotes the training epoch, and each parameter has its own step size. By the manipulations shown in Appendix B, the previous equations are expressed as
(20)
(21)
Next, the 0.5-completeness must be verified. At each training epoch, (16) becomes
(22)
If (22) holds, the updated parameters at the current epoch are feasible. Otherwise, one parameter remains the same, whereas the other must be recalculated as follows:
(23)
Thus, the recalculated parameters, corresponding to the large and small functions, are
(24)
(25)
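The feasibility check that follows each gradient step can be sketched for one pair of MFs. The sketch assumes the sigmoid form $1/(1+e^{-a(x-b)})$ with slope $a$ and centre $b$, for which the crossover point has a closed form. The repair step keeps both slopes and moves the two centres to their midpoint; this is a simple projection used here as a stand-in, not the recalculation given by (23).

```python
import math

def sigmoid(x, a, b):
    """Assumed MF form: a > 0 gives a 'large' MF, a < 0 a 'small' MF."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

def crossover(small, large):
    """Point where the two sigmoids are equal, and the membership there."""
    (a_s, b_s), (a_l, b_l) = small, large
    x_c = (a_l * b_l - a_s * b_s) / (a_l - a_s)
    return x_c, sigmoid(x_c, a_l, b_l)

def enforce_half_completeness(small, large):
    """Keep the slopes; if the crossover membership is below 0.5
    (equivalent to b_l > b_s here), move both centres to their midpoint."""
    (a_s, b_s), (a_l, b_l) = small, large
    if b_l > b_s:
        b_s = b_l = 0.5 * (b_l + b_s)
    return (a_s, b_s), (a_l, b_l)

# An infeasible pair, as might result from an unconstrained update.
small, large = (-4.0, 1.5), (4.0, 2.5)
_, before = crossover(small, large)
small, large = enforce_half_completeness(small, large)
_, after = crossover(small, large)
```

Under the assumed sigmoid form the membership at the crossover is at least 0.5 exactly when the centre of the large MF does not lie to the right of the centre of the small MF, which is why the check reduces to comparing the two centres.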
C. Rule Weight Factors
When multiple indexes are used for classification, the contribution of each index to the final decision varies, to a large degree, according to the situation under which the diagnostic decision is made. Such contributions can be represented by a weight factor associated with the active indexes in each rule. The determination of the fuzzy rule weights is critical to a diagnostic classification scheme [33]. Rule weights can be determined by several approaches, such as choosing them empirically, using information measures [34], or adopting general training algorithms [25], [35], [36]. When rule weight factors are updated by generally used training methods, the results are sometimes unreasonable, for example, larger than 1 or smaller than 0. If that happens, it is difficult to interpret the modified fuzzy sets. For example, if a rule weight is larger than 1, the fuzzy reasoning is no longer fulfilled by that rule. On the contrary, if a negative rule weight is attained, then this rule will never contribute to the overall output value when a maximum inference operator is employed. An objective scheme, based on the diagnostic reliability, is proposed next for determining the weight factors.
For a general monitoring index, the crossover point between the small and large sigmoid functions is as shown in Fig. 7. A sample belongs to the normal or the damaged class if its index lies below or above the crossover point, respectively. The rule weight is proposed from the probability of the correct classification
(26)
which is expressed in terms of the probabilities of the missed and false alarms. As an example, for a rule in Table I whose diagnosis is based on two active monitoring indexes with their corresponding crossover points, the rule weight is calculated by
(27)
The rule weight acts as a penalty factor in the learning process. For example, if a pattern is misclassified by a particular fuzzy rule, its rule weight is decreased, and so is its contribution to the final decision-making process.
Fig. 7. Two sigmoid functions corresponding to small and large membership functions.
D. Training the Neuro-Fuzzy Diagnostic Scheme
The training procedure consists of the following steps.
1) Initialize the parameters. In order to compare the system performance before and after training, the initial MF parameters are carefully chosen by experience; they are signified by the dashed curves in Fig. 9. The rule weights are initialized to unity. The initial step sizes are set to 0.01.
2) Input a training data set.
3) Compute the outputs with (11) and (12).
4) Repeat steps 2) and 3) until all the training data sets are input to the system.
5) Tune the MF parameters according to (20) and (21).
6) Check if the new parameters are feasible with (22). If not, they are re-optimized by using (23).
7) Determine the new rule weights according to (26).
8) Update the step sizes. If the error undergoes four consecutive reductions, increase the step size by 10%; if the error undergoes two consecutive combinations of one increase and one decrease, reduce the step size by 10%.
9) Repeat steps 2)–8) until the designated epoch number is reached or the training error goal is achieved.
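The step-size rule in step 8) can be written as a small helper. This sketch interprets "two consecutive combinations of one increase and one decrease" as the pattern increase, decrease, increase, decrease over the last four error changes; that reading, and the name `adapt_step_size`, are assumptions.

```python
def adapt_step_size(step, errors):
    """Adjust the step size from the history of per-epoch training errors."""
    if len(errors) < 5:
        return step  # not enough history for four consecutive changes
    last = errors[-5:]
    changes = [b - a for a, b in zip(last[:-1], last[1:])]
    if all(c < 0 for c in changes):
        return step * 1.1  # four consecutive reductions
    if changes[0] > 0 > changes[1] and changes[2] > 0 > changes[3]:
        return step * 0.9  # increase, decrease, increase, decrease
    return step

steady = adapt_step_size(0.01, [5.0, 4.0, 3.0, 2.0, 1.0])
oscillating = adapt_step_size(0.01, [1.0, 2.0, 1.0, 2.0, 1.0])
unchanged = adapt_step_size(0.01, [1.0, 2.0])
```

A steadily falling error suggests the step can be lengthened, whereas an oscillating error suggests the search is overshooting and the step should be shortened.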
E. Monitoring Thresholds
After the neuro-fuzzy scheme is properly trained, it can be utilized for gear system monitoring. For a given input vector, the classification indicator is computed from the two outputs of (11) and (12). If the normal class dominates, the gear of interest is in its normal condition; otherwise, the analyzed gear is possibly damaged. In order to facilitate the decision-making in the damaged case, monitoring thresholds should be set between the different fault levels. Usually, these thresholds are given empirically. In this paper, an optimization process is proposed for the determination of these thresholds. For example, if the damaged class is further classified into two subcategories, corresponding to an initial damaged state (or uncertain state) and an advanced state, respectively, the threshold between them is determined by maximizing the correct classification reliability, such that
(28)
which is expressed in terms of the probabilities of the false alarms and missed alarms. Therefore, in real-time gear system monitoring, if the indicator falls in the normal range, the gear of interest is believed to be in its normal condition, with a corresponding degree of certainty. If it falls in the initial damaged range, the gear is possibly damaged, and an alert signal is triggered for the possible onset of a gear fault; care should be given to this gear during the upcoming online monitoring. On the other hand, if it falls in the advanced damaged range, an alarm signal is given to the operator for the existence of a gear tooth fault; maintenance operations should be scheduled for this gear.
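The threshold selection can be sketched as a search over candidate values. The reliability is assumed here to be the product of the complements of the false-alarm and missed-alarm probabilities; (28) is not reproduced in this text, so that form, the sample scores, and the function names are illustrative only.

```python
def reliability(threshold, initial_scores, advanced_scores):
    """Assumed form: (1 - P_false) * (1 - P_missed) for one threshold."""
    false_alarms = sum(1 for s in initial_scores if s >= threshold)
    missed = sum(1 for s in advanced_scores if s < threshold)
    p_false = false_alarms / len(initial_scores)
    p_missed = missed / len(advanced_scores)
    return (1.0 - p_false) * (1.0 - p_missed)

def best_threshold(initial_scores, advanced_scores, candidates):
    """Candidate threshold with the highest reliability."""
    return max(candidates,
               key=lambda t: reliability(t, initial_scores, advanced_scores))

# Hypothetical classifier outputs for the two damage subcategories.
initial = [0.55, 0.60, 0.62, 0.66]
advanced = [0.72, 0.80, 0.85, 0.90]
candidates = [k / 100 for k in range(55, 100, 5)]
threshold = best_threshold(initial, advanced, candidates)
```

Here a false alarm means an initial-damage sample placed in the advanced state, and a missed alarm means an advanced-damage sample placed in the initial state.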
Fig. 8. Gear health conditions tested. (a) Healthy gear, (b) cracked gear, (c) gear with a tooth filed in the middle, and (d) gear with a tooth chipped at the top.
Fig. 9. Comparison of the MFs before (dashed curves) and after (solid curves) training. (a)–(c) MFs for the three input variables.
TABLE II TEST CONDITIONS AND COLLECTED DATA SETS
VI. PERFORMANCE AND ANALYSIS
In this section, the developed neuro-fuzzy diagnostic system is evaluated by experimental tests and compared with three related pattern classification schemes.
A. Training and Testing the Neuro-Fuzzy Diagnostic System
In order to train and test the new neuro-fuzzy diagnostic system, a series of tests are conducted by using the experimental setup in Fig. 1. A total of 16 pairs of spur gears are tested with the different conditions illustrated in Fig. 8: (a) healthy gears, (b) gears with one tooth cracked with 20% or 40% of the tooth width, (c) gears with one tooth filed in the middle with 20% or 50% of the tooth's surface area removed, and (d) gears with one tooth chipped at the top with 20% or 50% of the tooth's surface area removed. All the 20% fault tests are subjectively classified as belonging to the initial damaged state, whereas the 40% or 50% fault tests are classified as being in the advanced damaged state. The tests are conducted under load levels from 3 to 6 Nm, and gear rotation speeds from 200 to 1200 rpm. The load and motor speed are randomly changed to simulate real machinery working conditions. A total of 356 sets of data are collected under different test conditions, which are listed in Table II. In order to properly train a neuro-fuzzy scheme, sufficient representative training data sets, usually more than five times the number of the parameters to be updated [18], must be provided. The present system has 20 unknown parameters: 12 fuzzy MF parameters and eight rule weights. Approximately 30% of the data sets (106 pairs) are randomly chosen from each test condition to be the training data. Approximately 20% of the data sets (69 pairs) are randomly selected from each test condition to check the validity of the updated models to prevent overfitting. The remaining data sets (180 pairs) are employed to test
TABLE III RULE WEIGHTS BEFORE AND AFTER TRAINING
the system. The updated MFs in Fig. 9 and the new rule weights in Table III show the results after the system is trained for 300 epochs. By examining the updated rule weights in Table III and the corresponding active features of each rule in Table I, it can be seen that the more active features are applied to the diagnosis, the more reliable the classification performance is; that is, the developed diagnostic system can effectively integrate different features for gear condition monitoring. In addition, for the applied features in the present system, both the WA and BK are amplitude-related measures, whereas the PM is a phase measure. If only one (active) feature is employed for the diagnosis, that is, the PM, BK, or WA alone, the amplitude measures are more reliable than the phase feature, as the corresponding rule weights in Table III indicate. This occurs because the phase measure is very sensitive, not only to signature modulation due to gear imperfections, but also to noise. If two features are used for the diagnosis, such as the PM and WA, the PM and BK, or the BK and WA, the classification based on the combination of a phase and an amplitude measure performs better than that with both measures of the same type, again as the rule weights indicate. This happens because the diagnosis based on the analysis in two domains (amplitude and phase) can provide more information about the system behavior than the information of only one domain.
After the system is trained, the threshold between the initial and the advanced damage states is obtained by using (28). Next, the system is tested by using the testing data sets. During testing, one missed alarm is recorded in the case of the gear with a 20% tooth crack at a shaft speed of approximately 400 rpm. The reason for this occurrence is that at very low speed, the signature modulation caused by a small tooth crack is very small.
Two false alarms are recorded: 1) A gear with an initial chipped damage (20% of the area) is misclassified as an advanced damage level; and 2) a gear in the normal condition is misclassified in the initial fault state. Both the misclassifications are due to the significant signature disturbances at high rotation speeds and under large load levels. B. Comparison With Other Classifiers The developed neuro-fuzzy system is also compared with three other related diagnostic schemes to verify its viability. In order to simplify the description, the neuro-fuzzy diagnostic system developed in the preceding sections is designated as System-1. The other three schemes in this study are as follows. System-2: A Pure Fuzzy System with Unity Weights. The fuzzy IF–THEN rules are the same as those listed in Table I, except that all the rules have unity weights. The fuzzy classification process is
the same as in System-1. The MF parameters are chosen empirically, as shown by the dashed curves in Fig. 9. The performance of this system is the same as that of System-1 without training.

System-3: A Neuro-Fuzzy System with Unity Weights, Trained by the Gradient Method. This system is the same as System-2, except that the MF parameters are updated by using the gradient descent algorithm, which is a classical training scheme. The same initial MF parameters are chosen as those in System-1.

System-4: A Neuro-Fuzzy System with Constant Weights, Trained by the Gradient Method. This system is the same as System-3, except that constant rule weights are used. The rule weight factors are chosen empirically and are close to those obtained for System-1 in Table III.

For the purpose of comparison, only the data sets in the normal category and in the damaged category are considered; thus, the distinction between the initial faulty state and the advanced faulty state is not made. Accordingly, System-1 has only one missed alarm and one false alarm. The overall reliability is evaluated from these counts together with the numbers of data sets of the healthy and faulty gears, which are 66 and 114, respectively. Based on the same training, checking, and testing data sets that are listed in Table II, Table IV summarizes the classification performance of the different classifiers. System-2 records seven missed alarms and eight false alarms, with an overall reliability of 82.5%. System-2's poor classification performance is primarily due to its lack of learning capability, which leaves the fuzzy MFs unoptimized. Another reason for this performance is related to the fuzzy partition. Each input variable in System-2 utilizes only two MFs, large and small, to remain comparable with the other neuro-fuzzy classifiers, whose fuzzy partition is limited by the availability of the training data. The coarser the fuzzy partition is, the more patterns may be misclassified. Thus, the classification performance of System-2 may be improved if a finer fuzzy partition is applied. Through further testing, it is found that the performance of System-2, with a finer fuzzy partition, does improve to some extent; however, it is still lower than that of System-3, in which the fuzzy MF parameters are optimized by using the general gradient method. The only difference between System-3 and System-4 is the rule weight factors. System-4, with empirical rule weights, performs better than System-3 with unity weights.
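The reliability figures quoted above can be checked with a few lines of arithmetic. The sketch below is an assumption rather than the paper's stated formula: it takes the overall reliability to be the product of the detection rate on the 114 faulty data sets and the correct-pass rate on the 66 healthy data sets. This form is chosen only because it reproduces the 82.5% reported for System-2; the function name `overall_reliability` is hypothetical.

```python
def overall_reliability(missed, false_alarms, n_faulty=114, n_healthy=66):
    """Assumed form: (1 - missed/n_faulty) * (1 - false_alarms/n_healthy)."""
    detection_rate = 1.0 - missed / n_faulty            # faulty gears correctly flagged
    correct_pass_rate = 1.0 - false_alarms / n_healthy  # healthy gears correctly passed
    return detection_rate * correct_pass_rate


# System-2: seven missed alarms and eight false alarms
print(f"System-2: {overall_reliability(7, 8):.1%}")  # prints "System-2: 82.5%"

# System-1: one missed alarm and one false alarm
print(f"System-1: {overall_reliability(1, 1):.1%}")  # prints "System-1: 97.6%"
```

Under this assumed form, a single false alarm costs more than a single missed alarm, because there are fewer healthy data sets (66) than faulty ones (114).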
TABLE IV SUMMARY OF THE DIAGNOSTIC TESTING RESULTS USING DIFFERENT PATTERN CLASSIFICATION SCHEMES
Thus, rule weights play an important role in the classification performance in fault diagnostic applications. The first reason is associated with the robustness of the applied features. For the fuzzy IF–THEN rules in Table I, even if the firing strengths of two rules are identical, their diagnostic reliabilities may differ under different operating conditions and fault types. Therefore, a rule weight is necessary to represent the rule strength in the classification process. The second reason is related to the decision area. Each rule has its own decision area in the pattern space. In each rule, the MFs and the rule weight are directly associated with the decision area and its boundary characteristics. The optimization of rule weights can adjust the properties of the decision area boundary to reduce misclassification, especially when the fuzzy partition is coarse (i.e., the total decision area is divided into fewer subspaces). This is one of the reasons why the neuro-fuzzy scheme with the optimal rule weights (System-1) has a higher reliability than the system with empirical weight factors (System-4).

Another advantage of System-1 over System-4 is the use of the constrained-gradient-reliability training algorithm. In the training process, modifying MF parameters is likely to cause gaps between the MFs and to reduce the fuzzy completeness, which deteriorates the classification performance. The proposed constrained-gradient-reliability training algorithm uses constraint functions to prevent such excessive MF gaps, and thereby further improves the classification accuracy.

From the preceding analysis, it can be seen that the real contribution of the neuro-fuzzy diagnostic scheme is associated with its adaptive capability. Expert knowledge helps to properly set up a diagnostic scheme. However, human-determined system parameters are subject to variations between one person and another, and from time to time.
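The two mechanisms discussed above, rule weights and the preservation of fuzzy completeness during training, can be illustrated with a minimal sketch. It is not the constrained-gradient-reliability algorithm itself: the paper's constraint functions are replaced here by a simple accept/reject guard, and the sigmoid parameterization (slope `a`, crossover `c`), the weights, the threshold `eps`, and all function names are assumptions made for illustration.

```python
import math


def sigmoid_mf(x, a, c):
    """Sigmoid membership grade; `a` is the slope (negative for a 'small' MF), `c` the crossover."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def class_scores(rules):
    """Sum weight * firing strength per class; `rules` holds (label, firing, weight) tuples."""
    scores = {}
    for label, firing, weight in rules:
        scores[label] = scores.get(label, 0.0) + weight * firing
    return scores


def completeness(mfs, grid):
    """Worst case over the grid of the best membership grade (a low value means a gap)."""
    return min(max(sigmoid_mf(x, a, c) for a, c in mfs) for x in grid)


def guarded_step(mfs, grads, lr, grid, eps=0.5):
    """Plain gradient step, rejected when it would push completeness below `eps`."""
    trial = [(a - lr * ga, c - lr * gc) for (a, c), (ga, gc) in zip(mfs, grads)]
    return trial if completeness(trial, grid) >= eps else mfs


# Two rules fire equally; only the (hypothetical) rule weights separate the classes.
print(class_scores([("faulty", 0.6, 0.9), ("healthy", 0.6, 0.5)]))

grid = [i / 100 for i in range(101)]
mfs = [(-8.0, 0.5), (8.0, 0.5)]      # 'small' and 'large' MFs on a normalized input
apart = [(0.0, 0.2), (0.0, -0.2)]    # gradients that pull the two crossovers apart
print(guarded_step(mfs, apart, 1.0, grid) == mfs)  # prints True: the step is rejected
```

With unity weights the two rules in the first example would tie. In the second example, the step would move the crossovers to 0.3 and 0.7 and leave inputs near 0.5 poorly covered by both MFs, so the guard keeps the old parameters.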
Therefore, these parameters are rarely optimal in terms of reproducing desired classification outputs. It follows that adaptively fine-tuning the fuzzy parameters is necessary to enhance the approximation of the mapping from the observed symptoms to the underlying faults. In addition, the developed neuro-fuzzy scheme provides a robust problem solving framework. Machinery conditions vary significantly during real-world applications, and new system conditions may occur under different circumstances. With the help of adequate learning algorithms, new information can be extracted from online learning, and the diagnostic knowledge base can be
automatically expanded. Moreover, the aforementioned capabilities also facilitate classifier generalization. The neuro-fuzzy scheme can be initialized intuitively, whereas its system parameters are optimized during the adaptive learning process.

VII. CONCLUSION

In this paper, a neuro-fuzzy classification system is developed for gear system monitoring. Through experimental tests and comparisons with related diagnostic schemes, it is found that the developed neuro-fuzzy classifier provides an appealing diagnostic framework due to its capabilities in adaptivity and robustness. Specifically, the proposed reference function approach is a powerful post-processing tool which can enhance feature characteristics and increase the accuracy of fault detection. The proposed constrained-gradient-reliability learning algorithm can effectively update the fuzzy system, while the required fuzzy completeness is retained. In addition, the fuzzy rule weights are found to be necessary to improve the diagnostic performance, especially when a coarse fuzzy partition is applied and the multiple features from different signal processing techniques are utilized. Further testing is being pursued by applying the proposed system to actual industrial machines with many gears, shafts, and bearings. Moreover, research is underway to develop a neuro-fuzzy prognostic system to verify the diagnostic results, and to adaptively update its knowledge (rule) base to further improve the reliability of this diagnostic system.

APPENDIX A
VISUAL DIAGNOSTIC RESULTS FROM REFERENCE [2]

See Tables V–VII.

APPENDIX B
DERIVATION OF THE EQUATIONS

For
th training data pair, , for a sigmoid membership function
,
(B1)
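For orientation, the gradient terms in a derivation of this kind follow from the chain rule applied to the sigmoid. The block below is a generic sketch that assumes the common two-parameter form with slope $a$ and crossover $c$; these symbols are illustrative and are not taken from (B1).

```latex
\mu(x) = \frac{1}{1 + e^{-a(x - c)}}, \qquad
\frac{\partial \mu}{\partial a} = (x - c)\,\mu(x)\,\bigl[1 - \mu(x)\bigr], \qquad
\frac{\partial \mu}{\partial c} = -a\,\mu(x)\,\bigl[1 - \mu(x)\bigr].
```

Both partial derivatives vanish as $\mu(x)$ approaches 0 or 1, so under this assumed form a training sample changes the MF parameters most when it lies near the crossover.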
TABLE V DIAGNOSTIC RESULTS CORRESPONDING TO A GEAR WITH CRACKED TOOTH FAULTS
BK: beta kurtosis; AM: amplitude modulation; PM: phase modulation; WA: wavelet amplitude; MFR: meshing frequency residual; CI: clear indication; SI: some indication; NI: no indication.
TABLE VI DIAGNOSTIC RESULTS CORRESPONDING TO A GEAR WITH FILED TOOTH FAULTS
BK: beta kurtosis; AM: amplitude modulation; PM: phase modulation; WA: wavelet amplitude; MFR: meshing frequency residual; CI: clear indication; SI: some indication; NI: no indication.
TABLE VII DIAGNOSTIC RESULTS CORRESPONDING TO A GEAR WITH CHIPPED TOOTH FAULTS
BK: beta kurtosis; AM: amplitude modulation; PM: phase modulation; WA: wavelet amplitude; MFR: meshing frequency residual; CI: clear indication; SI: some indication; NI: no indication.
From (10)
(B2)
(B3)
(B4)
(B5)
If
(B6)
(B7)
For
(B8)
Similarly
(B9)
Therefore
(B10)
(B11)

ACKNOWLEDGMENT

The authors would like to thank the reviewers for their valuable comments on the research in this paper.

REFERENCES
[1] G. Dalpiaz, A. Rivola, and R. Rubini, “Effectiveness and sensitivity of vibration processing techniques for local fault detection in gears,” Mech. Syst. Signal Process., vol. 14, no. 3, pp. 387–412, 2000.
[2] W. Wang, F. Ismail, and F. Golnaraghi, “Assessment of gear damage monitoring techniques using vibration measurements,” Mech. Syst. Signal Process., vol. 15, no. 5, pp. 905–922, 2001.
[3] J. Gertler, Fault Detection and Diagnosis in Engineering Systems. New York: Marcel Dekker, 1998.
[4] R. Isermann, “Supervision, fault-detection and fault-diagnosis methods – An introduction,” Control Eng. Pract., vol. 5, no. 5, pp. 639–652, 1997.
[5] R. Patton, P. Frank, and R. Clark, Issues of Fault Diagnosis for Dynamic Systems. New York: Springer-Verlag, 2000.
[6] A. Pouliezos and G. Stavrakakis, Real Time Fault Monitoring of Industrial Processes. Norwell, MA: Kluwer, 1994.
[7] R. Duda, P. Hart, and D. Stork, Pattern Classification. New York: Wiley, 2001.
[8] B. Paya and I. Esat, “Artificial neural networks based fault diagnostics of rotating machinery using wavelet transforms as a preprocessor,” Mech. Syst. Signal Process., vol. 11, no. 5, pp. 751–765, 1997.
[9] M. Tsujitani and T. Koshimizu, “Neural discriminant analysis,” IEEE Trans. Neural Networks, vol. 11, pp. 1394–1401, Oct. 2000.
[10] H. Ney, “On the probabilistic interpretation of neural network classifiers and discriminative training criteria,” IEEE Trans. Pattern Anal. Machine Intell., vol. 17, pp. 107–119, Feb. 1995.
[11] V. B. Jammu, K. Danai, and D. G. Lewicki, “Structure-based connectionist network for fault diagnosis of helicopter gearboxes,” Trans. ASME J. Mech. Design, vol. 120, pp. 100–105, 1998.
[12] L. Zadeh, “The role of fuzzy logic in the management of uncertainty in expert systems,” Fuzzy Sets Syst., vol. 11, no. 3, pp. 199–228, 1983.
[13] C. K. Mechefske, “Objective machinery fault diagnosis using fuzzy logic,” Mech. Syst. Signal Process., vol. 12, no. 6, pp. 855–862, 1998.
[14] L. Zeng and H. Wang, “Machine-fault classification: A fuzzy set approach,” Int. J. Adv. Manufact. Technol., vol. 6, pp. 83–94, 1991.
[15] R. Isermann, “On fuzzy logic applications for automatic control, supervision, and fault diagnosis,” IEEE Trans. Syst., Man, Cybern. A, vol. 28, pp. 221–235, Apr. 1998.
[16] M. Setnes and H. Roubos, “GA-fuzzy modeling and classification: Complexity and performance,” IEEE Trans. Fuzzy Syst., vol. 8, pp. 509–522, June 2000.
[17] S. Li and M. Elbestawi, “Fuzzy clustering for automated tool condition monitoring in machining,” Mech. Syst. Signal Process., vol. 10, no. 5, pp. 533–550, 1996.
[18] J. Jang, C. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing. Upper Saddle River, NJ: Prentice-Hall, 1997.
[19] J. Buckley and Y. Hayashi, “Neural nets for fuzzy systems,” Fuzzy Sets Syst., vol. 71, pp. 265–276, 1995.
[20] C. Lin and Y. Lu, “A neural fuzzy system with linguistic teaching signals,” IEEE Trans. Fuzzy Syst., vol. 3, pp. 169–189, Apr. 1995.
[21] S. Pal and S. Mitra, “Multilayer perceptron, fuzzy sets and classification,” IEEE Trans. Neural Networks, vol. 3, pp. 683–697, Aug. 1992.
[22] P. Simpson, “Fuzzy min-max neural networks – Part II: Clustering,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 32–45, Feb. 1993.
[23] A. Ghosh, N. Pal, and S. Pal, “Self-organization for object extraction using a multilayer neural network and fuzziness measures,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 54–68, Feb. 1993.
[24] S. Abe, Pattern Classification: Neuro-Fuzzy Methods and Their Comparison. New York: Springer-Verlag, 2001.
[25] C. Lin and C. Lee, Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems. Upper Saddle River, NJ: Prentice-Hall, 1996.
[26] P. McFadden, “A technique for calculating the time domain averages of the vibration of the individual planet gears and the sun gear in an epicyclic gearbox,” J. Sound Vibrat., vol. 144, no. 1, pp. 163–172, 1991.
[27] G. Strang and T. Nguyen, Wavelets and Filter Banks. Cambridge, MA: Wellesley-Cambridge Press, 1996.
[28] D. Boulahbal, F. Golnaraghi, and F. Ismail, “Amplitude and phase wavelet maps for the detection of cracks in geared systems,” Mech. Syst. Signal Process., vol. 13, no. 3, pp. 423–436, 1999.
[29] F. Ismail, H. Martin, and F. Omar, “A statistical index for monitoring tooth cracks in a gearbox,” in Proc. ASME Biennial Conf. Vibration and Noise, vol. DE-84–1, Boston, MA, 1995, pp. 1413–1418.
[30] P. D. McFadden, “Detecting fatigue cracks in gears by amplitude and phase demodulation of the meshing vibration,” J. Vibrat., Acoust., Stress, Reliability Design, vol. 108, pp. 165–170, 1986.
[31] C. Lee, “Fuzzy logic in control systems: Fuzzy logic controller – Part I,” IEEE Trans. Syst., Man, Cybern., vol. 20, pp. 404–418, Mar. 1990.
[32] D. Wismer and R. Chattergy, Introduction to Nonlinear Optimization. New York: Elsevier, 1978.
[33] H. Ishibuchi and T. Nakashima, “Effect of rule weights in fuzzy rule-based classification systems,” IEEE Trans. Fuzzy Syst., vol. 9, pp. 506–515, June 2001.
[34] Y. Chen, “A fuzzy decision system for fault classification using high levels of uncertainty,” Trans. ASME J. Dyna. Syst., Measure., Control, vol. 117, pp. 108–115, 1995.
[35] H. Ishibuchi, K. Kwon, and H. Tanaka, “A learning algorithm for fuzzy neural networks with triangular fuzzy weights,” Fuzzy Sets Syst., vol. 71, pp. 277–293, 1995.
[36] D. Nauck, “Adaptive rule weights in neuro-fuzzy systems,” Neural Comput. Applicat., vol. 9, pp. 60–70, 2000.
Wilson Wang received the Ph.D. degree in mechanical engineering from the University of Waterloo, Waterloo, ON, Canada, in 2002. He joined Lakehead University, Thunder Bay, ON, Canada, in 2004 as an Assistant Professor after working for Mechworks Systems, Inc., Waterloo, ON, Canada, as a Senior Scientist for two years. He was also the recipient of an NSERC Postdoctoral Fellowship after his graduation. His research interests include intelligent mechatronic systems design, artificial intelligence, signal processing, machinery condition monitoring, and time series forecasting.
Fathy Ismail received the B.S. and M.S. degrees, both in mechanical and production engineering, from the University of Alexandria, Alexandria, Egypt, in 1970 and 1974, respectively, and the Ph.D. degree from McMaster University, Hamilton, ON, Canada, in 1983. He joined the University of Waterloo, Waterloo, ON, Canada, in 1983, and is currently the Associate Dean of Graduate Studies. His research interests include machining dynamics, high-speed machining, modeling structures from modal analysis testing, and machinery health condition monitoring.
Farid Golnaraghi received the B.S. and M.S. degrees in mechanical engineering from Worcester Polytechnic Institute, Worcester, MA, in 1982, and the Ph.D. degree from the Department of Theoretical and Applied Mechanics, Cornell University, Ithaca, NY, in 1988. He is a Mechanical Engineering Professor and a Canada Research Chair in Intelligent Mechatronic and Material Systems at the University of Waterloo, Waterloo, ON, Canada. He has several patents and publications for pioneering discoveries, and coauthored a textbook on automatic control systems with B. Kuo in 2003. He is the founder of Mechworks Systems, Inc., which provides machinery health condition monitoring services and products. He is also a Director of the Canadian Society for Mechanical Engineering and holds several key positions in his field.