A Minimum Description Length Principle Based Method for Signal Change Detection in Machine Condition Monitoring

Jenni Hulkkonen and Jukka Heikkonen
Department of Biomedical Engineering and Computational Science
Helsinki University of Technology, P.O.Box 9203, FI-02015 TKK, Finland
[email protected]

Abstract

This paper proposes a minimum description length (MDL) based method for signal change detection in machine condition monitoring. Our method is grounded on a recently proposed MDL-based sequentially normalized maximum likelihood (SNML) approach to the complexity analysis of time series, and especially signals, with an autoregressive (AR) model. Experiments on signal change detection are performed using two data sets, one of which is based on measurements of ball bearing damage. The results show that the method is able to distinguish different ball bearing failures.
1. Introduction

Machine condition monitoring is a crucial task in the early detection and analysis of possible machine failures. It is the key to the safety, reliability and productivity of the machinery. Traditional machine condition monitoring methods rely on rule bases and predefined failure types [6]. However, there are many applications in which rules are hard to derive and the resulting models become too rigid. Statistical and information-theoretic methods are alternatives that alleviate the problems encountered with the traditional approaches.

In machine condition monitoring, one of the crucial processing steps is the detection of signal changes corresponding to incipient machine failures or other changes in conditions. For signal change detection, the minimum description length (MDL) principle [7] provides an efficient framework. In this paper we propose a method for machine condition monitoring that is based on MDL, and especially on sequentially normalized maximum likelihood (SNML) [9], combined with an autoregressive (AR) model. The performance and behaviour of the approach are studied experimentally using two real data sets.
978-1-4244-2175-6/08/$25.00 ©2008 IEEE
2. MDL in signal change detection

Rissanen's MDL principle [7] is a statistical and information-theoretic method that aims to overcome typical model selection problems, such as overfitting. MDL combines the process of finding regular features in the data with coding principles: the best model is the one that allows the shortest total code length for both the data and the model. One of the key notions in MDL is stochastic complexity (SC) [8]. SC is interpreted as the shortest achievable code length for encoding, and hence it provides a measure for comparing different models. MDL is well suited for signal change detection, as the change in signal complexity induced by a change in a machine's condition can be measured by SC.

Sequentially normalized maximum likelihood (SNML) [9] is a recently proposed definition of SC. SNML provides some advantages over traditional MDL formulations, and especially over the so-called normalized maximum likelihood (NML) [8] universal model: with SNML there is no need for hyperparameters, and the SC for time series data is computable.

The AR process is widely used for system malfunction detection and diagnosis. Consider a data vector $y^n = [y_1, \ldots, y_n]'$ modeled by an AR($k$) model as

$$y_t = b'\bar{x}_t + \epsilon_t = \sum_{i=1}^{k} b(i)\,x_{it} + \epsilon_t, \qquad t = 1, \ldots, n,$$

where $b(i)$, $i = 1, \ldots, k$, are the model parameters, $x_{it}$ are the components of the columns $\bar{x}_t = [y_{t-1}, \ldots, y_{t-k}]'$ defining the regressor matrices $X_t$, and $\epsilon_t$ is iid Gaussian noise of zero mean and variance $\sigma^2$. The idea of SNML rests on sequentially maximized conditionals. Consider the maximization problem

$$\max_{\sigma^2} \prod_{t=1}^{n} f(y_t \mid y^{t-1}, X_t; \sigma^2, b_t), \qquad (1)$$
where $b_t$ are maximum likelihood (ML) estimates calculated from the data available up to $t$. The solution is $\hat{\sigma}_n^2 = \frac{1}{n}\hat{s}_n$, where the sequentially minimized sums of squared deviations are calculated recursively as

$$\hat{s}_t = \sum_{j=m}^{t} (y_j - \bar{x}_j' b_j)^2 = \sum_{j=m}^{t-1} (y_j - \bar{x}_j' b_j)^2 + (y_t - \bar{x}_t' b_t)^2 = \hat{s}_{t-1} + \hat{e}_t^2,$$

and $m$ is the smallest fixed number for which the ML estimate can be computed. The value of the density function is then $f(y^n \mid X_n) = (2\pi e\,\hat{\sigma}_n^2)^{-n/2}$, where the parameter estimates $\hat{\sigma}^2$ and $b_t$ have been dropped to keep the notation uncluttered; the same expression with $n$ replaced by $t$ gives $f(y^t \mid X_t)$. Now, we define

$$f(y_t \mid y^{t-1}, X_t) = \frac{f(y^t \mid X_t)}{f(y^{t-1} \mid X_{t-1})} \qquad (2)$$

and, with $\hat{\sigma}_t^2 = \hat{s}_t / t$, we can calculate the non-normalized conditional density function as

$$f(y_t \mid y^{t-1}, X_t) = \frac{(2\pi e\,\hat{s}_t / t)^{-t/2}}{(2\pi e\,\hat{s}_{t-1} / (t-1))^{-(t-1)/2}}. \qquad (3)$$
The normalized conditional density distribution is

$$\hat{f}(y_t \mid y^{t-1}, X_t) = \frac{f(y_t \mid y^{t-1}, X_t)}{K(y^{t-1})}, \qquad (4)$$

$$K(y^{t-1}) = \int f(y_t \mid y^{t-1}, X_t)\, dy_t.$$
By multiplying the normalized conditional density distributions we get the desired parameter-free density function, called the SNML model,

$$\hat{f}_{\mathrm{SNML}}(y^n \mid X_n) = q(y^m \mid X_m) \prod_{t=m+1}^{n} \hat{f}(y_t \mid y^{t-1}, X_t), \qquad (5)$$

where $q(y^m \mid X_m)$ is the initial density function. The negative logarithm of the SNML model in Eq. 5 gives the stochastic complexity (SC) criterion, which is minimized in model order selection. The criterion is

$$-\ln \hat{f}_{\mathrm{SNML}}(y^n \mid X_n) = \frac{n}{2}\ln(2\pi e\,\hat{s}_n / n) - \sum_{t=m+1}^{n} \ln(1 - d_t) + \frac{1}{2}\ln n + O(1), \qquad (6)$$

where $d_t = \bar{x}_t'(X_t X_t')^{-1}\bar{x}_t$.

The proposed machine condition monitoring algorithm is as follows. First, the condition monitoring signal is windowed with varying window lengths, i.e. the time series is processed in smaller segments. This enables us to recognize signal complexity changes on different time scales, which is crucial in recognizing both short-term and long-term machine condition changes. For each window we get an optimal AR model order and an SC value by minimizing the SC criterion (Eq. 6), and hence a description (features) for each signal sample (i.e. time step) is obtained for further analysis.
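The SC criterion of Eq. 6 and the windowed processing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`snml_sc`, `best_order`, `windowed_features`), the choice m = k + 1, the use of the same k_max initial samples for every candidate order, and the `step` parameter are all hypothetical, and the O(1) term of Eq. 6 is dropped.

```python
import numpy as np

def snml_sc(y, k, k_max=None):
    """Eq. 6 without the O(1) term for an AR(k) model on the window y.

    Assumptions of this sketch: the first k_max samples serve only as
    initial conditions (so all orders share the same n), and m = k + 1.
    """
    y = np.asarray(y, dtype=float)
    k_max = k if k_max is None else k_max
    targets = y[k_max:]
    # Row t holds the regressor [y_{t-1}, ..., y_{t-k}] of target y_t.
    X = np.column_stack([y[k_max - i:len(y) - i] for i in range(1, k + 1)])
    n = len(targets)
    m = k + 1
    if n <= m:
        raise ValueError("window too short for this model order")
    A = np.zeros((k, k))   # running X_t X_t'
    c = np.zeros(k)        # running X_t y^t
    s_hat = 0.0            # sequentially minimized squared deviations
    log_term = 0.0         # sum of ln(1 - d_t) for t = m+1, ..., n
    for t in range(1, n + 1):
        x = X[t - 1]
        A += np.outer(x, x)
        c += x * targets[t - 1]
        if t < m:
            continue
        b_t = np.linalg.solve(A, c)        # ML estimate from data up to t
        e_t = targets[t - 1] - x @ b_t
        s_hat += e_t ** 2
        if t > m:
            d_t = x @ np.linalg.solve(A, x)
            log_term += np.log1p(-d_t)
    return (0.5 * n * np.log(2 * np.pi * np.e * s_hat / n)
            - log_term + 0.5 * np.log(n))

def best_order(y, k_max):
    """Return (order, SC) minimizing the criterion over k = 1, ..., k_max."""
    scores = {k: snml_sc(y, k, k_max) for k in range(1, k_max + 1)}
    k_best = min(scores, key=scores.get)
    return k_best, scores[k_best]

def windowed_features(signal, window, k_max, step=1):
    """Slide a window over the signal and collect (order, SC) per position."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - window + 1, step)
    return [best_order(signal[s:s + window], k_max) for s in starts]
```

Calling `windowed_features` once per window length and stacking the SC values gives one feature vector per time step. The sketch solves a linear system at every step for clarity; a recursive least squares update would be the usual way to make it faster, and degenerate windows (for example, constant signals) would need separate handling.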
In our case we were interested in recognizing machine condition changes and visualizing them for machine operators. As a compromise of these two goals, we selected the Self-Organizing Map (SOM) [4] for the post-processing task due to its clustering and visualization abilities. The SOM is a clustering method for mapping high dimensional inputs to a lower dimensional space (typically 2D discrete lattice of units) such that the topological relations of the inputs tend to be preserved. Thus the operator can easily evaluate the state of the machine visually on a computer screen.
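To make the post-processing step concrete, the following is a minimal NumPy sketch of an online SOM on a rectangular lattice. It is only an illustration: the lattice size, the linearly decaying learning rate and neighbourhood width, the Gaussian neighbourhood, and the names `train_som` and `best_matching_unit` are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def train_som(data, rows=6, cols=6, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    sigma0 = max(rows, cols) / 2 if sigma0 is None else sigma0
    # Lattice coordinates of the units, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)],
                    dtype=float)
    # Initialise the unit weights from randomly chosen data samples.
    weights = data[rng.integers(0, len(data), size=rows * cols)].copy()
    for it in range(n_iter):
        frac = it / n_iter
        lr = lr0 * (1 - frac)                      # decaying learning rate
        sigma = sigma0 * (1 - frac) + 0.5 * frac   # shrinking neighbourhood
        x = data[rng.integers(0, len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-dist2 / (2 * sigma ** 2))
        # Move the BMU and its lattice neighbours towards the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights.reshape(rows, cols, -1)

def best_matching_unit(weights, x):
    """Lattice index (row, col) of the unit closest to the input x."""
    d = ((weights - np.asarray(x, dtype=float)) ** 2).sum(axis=2)
    return np.unravel_index(np.argmin(d), d.shape)
```

In the setting of this paper, each input vector would hold the SC values obtained with the different window sizes for one time step, and the best matching unit of each input gives its position on the 2D lattice.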
3. Experimental results

The aim of the experiments was to study how well the stochastic complexity (SC) and the AR model order chosen by SNML represent time series data and the changes within it, and thereby separate different machine failure types, each of which causes its own characteristic variations in the signal. The experiments were performed using two data sets. The first data set was based on a laboratory test in which the measured signal was corrupted by an external source. This test was performed to study the basic behaviour of SNML in signal change detection and to learn the basic properties of the proposed approach. The second data set consisted of real measurements, allowing a more realistic validation of the method; the SOM method was also applied to demonstrate its role for the machine operator.

In the first experiment a movable drawer unit on castors was employed to generate data. To the drawer we attached a mobile phone with a vibration alarm for calls. Our goal was to test whether we could recognize seven mobile phone vibrations masked by the oscillations caused by the movement of the drawer. The vibration signal was gathered using a 3-axis accelerometer (SCA3000E04, VTI Technologies Oy, [1]), and for test purposes only vertical movements were analysed. The faster the drawer unit moved, the larger its vertical movements were and the harder it was to find the mobile phone vibrations in the signal visually. The generated signal was 32 seconds long and consisted of 6390 samples at a sampling rate of 200 Hz.

In the signal analysis, the following processing window sizes were employed: 50, 100, 250 and 750 samples. Fig. 1 shows the original raw signal and the SCs with the different window sizes. These results can be compared to those shown in Fig. 2, which presents the AR model orders obtained with the SNML criterion. From Fig. 1 one can observe that the windows of 100 and 250 samples reveal the vibrations well. With
[Figure 1: panel (a) with y-axis "Acceleration"; panels (b)-(e) with y-axis "Stochastic Complexity"; x-axis ticks from 1000 to 6000.]

Figure 1. Results of the drawer unit case. (a) The original signal, (b)-(e) stochastic complexity with different window sizes: (b) 50, (c) 100, (d) 250 and (e) 750 samples.

[Figure 2: panel (a) with y-axis "Acceleration"; panels (b)-(e) with y-axis "AR model order"; x-axis ticks from 1000 to 6000.]

Figure 2. Results of the drawer unit case. (a) The original signal, (b)-(e) AR model order chosen by SNML with different window sizes: (b) 50, (c) 100, (d) 250 and (e) 750 samples.
the smallest window size of 50, the resulting SC feature is relatively "noisy" and the vibrations are not as clearly visible as with the next two window sizes (100 and 250). The width of the peaks representing vibrations in the SC results depends on the window size, as expected. One can also see that when the window is too long (750), the resulting SCs are only able to detect changes more general than the mobile phone vibrations. In this application, as in others, the window size significantly affects the SC measures, and hence in machine condition monitoring it is preferable to use a collection of different window sizes to observe different phenomena.

The AR model order does not seem to be as good a criterion for characterizing the mobile phone vibrations. Especially with the small window sizes, the model orders vary considerably. When the window size increases, the variation of the AR model order decreases, but the order does not provide any clear indication of when the mobile phone is vibrating. Although one would expect a change in signal complexity to affect the optimal AR model order accordingly, this does not hold in general, as seen in our case.

The ball bearing fault data set was provided by Neurovision Oy [2]. The data set consists of three typical types of ball bearing faults: inner ring, outer ring and ball failures. Each failure was measured with a piezoelectric accelerometer along three axes: vertical (V), axial (A) and horizontal (H). As a result, we obtained 4097 data samples for each failure signal.
Signal changes corresponding to failures are not observable in the time domain. Thus, spectrograms were computed to visualize the frequency content of the signal and to show changes over time [3]. A spectrogram is based on a short-time Fourier transform [5] and produces a 2D representation of the signal: the x-axis corresponds to time, the y-axis to frequency, and the colour shows the strength of each frequency at the corresponding time point. The spectrogram of the V-axis signal is presented in Fig. 3. In the spectrogram the faults are more visible, and one could attempt to derive frequency rules for the faults.

In our method we applied five different window sizes: 64, 128, 256, 512 and 1024 samples, resulting in a total of 3072 data points for each failure with all SC values. The AR model order was not used due to its poor performance in the first experiment. Finally, self-organizing maps (SOMs) were used to visualize the results of failure separation.

In Fig. 4, the SOM results from the V-axis data are visualized by a U-matrix and best matching unit plots. The U-matrix represents the distances between the map units and their neighbouring units by colours. We can see that the SC feature vectors form two rather clear clusters of units. The best matching unit plots are presented for the three different ball bearing failures. The number of samples with known failure labels in each SOM unit determines the size of the coloured circle of the unit: the more mapped samples, the bigger the circle. A unit with no mapped samples is white. Each unit is also labelled according to these mapped samples. It can be observed that the cluster border in the U-matrix separates inner failures from the ball and outer ring ones. The ball and outer ring failures overlap slightly. This is also supported by the confusion matrix in Table 1, which is based on classification of the samples according to the labelled SOM units.

We also analysed measurements from the two other axes, axial (A) and horizontal (H); see the results in Table 1. The outer ring failures can be separated from the other failures by the A-axis measurements, and thus we were able to recognize three different ball bearing failure types based on the V and A-axis measurements. The results of the third axis (H) were rather similar, although the failure types overlapped slightly. The order of the failures was similar in all of the V, A and H-axis results: the inner and outer failures are clustered on the SOM lattice so that they are the most distant, and the ball failure stays in the middle, overlapping the inner or outer failure, or both.

Table 1. Confusion matrices, in percent, of the V, A and H-axis results.
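| Axis   | Actual | Predicted balls | Predicted inner | Predicted outer |
| V axis | balls  | 96.0            | 0               | 4.0             |
| V axis | inner  | 0               | 100             | 0               |
| V axis | outer  | 3.6             | 0               | 96.4            |
| A axis | balls  | 87.3            | 12.7            | 0               |
| A axis | inner  | 9.3             | 90.7            | 0               |
| A axis | outer  | 0               | 0               | 100             |
| H axis | balls  | 87.6            | 8.4             | 4.0             |
| H axis | inner  | 3.8             | 96.2            | 0               |
| H axis | outer  | 4.1             | 0               | 95.9            |

[Figure 3: spectrogram with x-axis "Time" and y-axis "Normalized Frequency (×π rad/sample)".]

Figure 3. Spectrogram of V-axis data. The first third on the left is from the ball failure, the second from the inner ring failure and the last from the outer ring failure of the ball bearing.

[Figure 4: panels (a) U-matrix, (b) balls, (c) inner ring, (d) outer ring.]

Figure 4. U-matrix and the best matching units for different failure types from V-axis results.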
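The way a confusion matrix such as Table 1 can be obtained from labelled SOM units is sketched below. The paper states only that units are labelled according to their mapped samples; labelling each unit by majority vote is an assumption of this sketch, and the function names `label_units` and `confusion_percent` are hypothetical.

```python
from collections import Counter, defaultdict

import numpy as np

def label_units(bmus, labels):
    """Label each SOM unit with the most common label among its samples.

    bmus:   best matching unit (e.g. a (row, col) tuple) of each sample
    labels: known failure label of each sample
    """
    votes = defaultdict(Counter)
    for unit, lab in zip(bmus, labels):
        votes[unit][lab] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in votes.items()}

def confusion_percent(actual, predicted, classes):
    """Row-normalized confusion matrix in percent (rows: actual class)."""
    index = {lab: i for i, lab in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)))
    for a, p in zip(actual, predicted):
        counts[index[a], index[p]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for classes that have no samples.
    return 100.0 * counts / np.where(row_sums == 0, 1, row_sums)
```

Each sample is then classified with the label of its best matching unit, and every row of the resulting matrix sums to 100, as the rows of Table 1 do.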
4. Conclusions

We have proposed a new MDL-based, and especially SNML-based, method for machine condition monitoring. The SNML-based stochastic complexity measure combined with an AR model was shown to have potential in signal change detection tasks, as observed in real examples. The results of this paper should be of interest to all those working with machine condition monitoring applications.
References

[1] VTI Technologies Oy, referenced 18th February, 2008. URL: http://www.vti.fi/en.
[2] Neurovision Oy, referenced 9th January, 2008. URL: http://www.neurovision.fi.
[3] L. Cohen. Time-frequency distributions - a review. Proceedings of the IEEE, 77(7):941–981, July 1989.
[4] T. Kohonen. Self-Organizing Maps. Springer, 2001.
[5] S. H. Nawab and T. F. Quatieri. Short-time Fourier transform. Prentice-Hall Signal Processing Series, pages 289–337, 1987.
[6] B. K. N. Rao, editor. Handbook of Condition Monitoring. Elsevier Advanced Technology, 1996.
[7] J. Rissanen. Modeling by shortest data description. Automatica, 1978.
[8] J. Rissanen. Fisher information and stochastic complexity. IEEE Transactions on Information Theory, 42:40–47, January 1996.
[9] J. Rissanen and T. Roos. Conditional NML universal models. Information Theory and Applications Workshop, pages 337–341, 2007.