ACOUSTIC TARGET CLASSIFICATION USING DISTRIBUTED SENSOR ARRAYS

Xiaoling Wang, Hairong Qi
The University of Tennessee
Department of Electrical and Computer Engineering
Knoxville, TN 37996
Email: {xwang1,hqi}@utk.edu

ABSTRACT

Target classification using distributed sensor arrays remains a challenging problem due to the non-stationarity of target signatures, the large geographical area covered by sensor arrays, and the requirements of time-critical and reliable information delivery. In this paper, we develop an algorithm to derive effective and stable features from both the frequency and the time-frequency domains of the acoustic signals. A modified data fusion algorithm for distributed sensor arrays is also developed to integrate the classification results from different sensors and provide fault tolerance. By using data fusion, the accuracy of the classification can be increased by as much as 50%.

1. INTRODUCTION

Ground vehicle classification is of great interest to many military applications. Because ground vehicles have distinctive acoustic signatures produced by the rotation of their engines and their propulsion mechanisms, acoustic signals remotely sensed by sensor arrays are appropriate sources for target identification [1]. To date, the available algorithms for feature extraction from acoustic signals are mainly based on three domains: the time, the frequency, and the time-frequency domains. Acoustic signal processing in the time domain, such as beamforming [2], is a natural approach, but not an optimal one due to the complexity of the environment: the time-domain signatures of acoustic signals can be corrupted by noise from other moving parts, Doppler effects, wind effects, and so on. The processing algorithms in the frequency domain are based on the spectrum in a frequency range from 20 to 200 Hz, in which vehicles generate signals from two main periodically moving sources: the engine and the propulsion gear. The harmonic line association (HLA) algorithm was developed in [3, 4] based on this feature. However, vehicle acoustic signals are non-stationary and wide-band, which makes it difficult to pick peaks in the frequency spectrum, so approaches in the time-frequency domain have been developed as well [5]. In this paper, we first develop algorithms to extract features from the spectrum and the time-frequency representation of the signals. We then use principal component analysis (PCA) to project the features onto a low-dimensional subspace while retaining their significant characteristics. We also develop a data fusion algorithm to combine the classification results from distributed sensors in an array in order to further improve the accuracy of the result and achieve fault tolerance.

This research was supported in part by DARPA under grant N66001001-8946.
2. ALGORITHMS FOR FEATURE EXTRACTION

Feature extraction is the most important and also the most difficult part of target classification, since the acoustic signals emitted by ground vehicles are non-stationary and can be corrupted by many other factors.

2.1. Spectral Analysis

Since sources emit acoustic sound at specific frequencies and the propagation of a signal wave through a medium generally depends on its frequency, the spectrum of a signal obtained with the Fourier transform is a powerful tool. The spectrum of a ground vehicle concentrates at the lower end of its frequency range, and the periodic signal appears as families of harmonically related spectral lines [4]. After windowing the signal with a Hanning window to avoid spectral leakage in the Fourier transform, we apply the fast Fourier transform (FFT) to each 1-second block of the acoustic signal:

X(n) = \sum_{k=0}^{N-1} x(k)\, e^{-2\pi j \frac{nk}{N}}   (1)
Assuming that the fundamental frequency of the acoustic signal lies in the range

F_{fund} \in [8, 20]\,\mathrm{Hz}   (2)

we find the highest peak in the spectrum, at frequency F, assume it is the k-th harmonic, and obtain

k = \frac{F}{F_{fund}}   (3)
where k must be an integer. Based on the determined fundamental frequency, all the harmonic lines in the spectrum can be found. After normalizing the signal strength on each harmonic line by the total signal strength calculated over the harmonic line set, the feature vector is generated. This technique has two advantages: (1) the feature vector is not a function of frequency but consists of normalized numbers between 0 and 1 that depend only on the harmonic line numbers; (2) since we use 1-second frames to calculate the spectrum, the peak energy is tracked frame by frame, producing a predominant pattern for the acoustic energy source. We choose the three dominant frequency locations as feature vector components to help distinguish sources.
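To make this procedure concrete, the following is a minimal sketch of the harmonic-line feature computation for a single 1-second frame. The sampling rate, the number of harmonic lines, and the heuristic used to search for the harmonic number k are illustrative assumptions, not parameters reported in this paper.

import numpy as np

def harmonic_line_features(frame, fs=1024, f_fund_range=(8.0, 20.0), n_harmonics=20):
    # Window the frame with a Hanning window to reduce spectral leakage (Eq. 1).
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)

    # Find the highest spectral peak and treat it as the k-th harmonic (Eqs. 2-3).
    f_peak = freqs[np.argmax(spectrum)]

    # Try integer harmonic numbers k that keep the implied fundamental in [8, 20] Hz;
    # taking the first plausible k is a simple heuristic chosen for this sketch.
    lo, hi = f_fund_range
    candidates = [k for k in range(1, 50) if lo <= f_peak / k <= hi]
    if not candidates:
        return None  # no plausible fundamental; the caller may skip this frame
    f_fund = f_peak / candidates[0]

    # Sample the spectrum at the harmonic lines and normalize by their total strength.
    harmonic_freqs = f_fund * np.arange(1, n_harmonics + 1)
    bins = np.clip(np.round(harmonic_freqs / (fs / len(frame))).astype(int),
                   0, len(spectrum) - 1)
    strengths = spectrum[bins]
    return strengths / strengths.sum()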
2.2. Time-Frequency Representation Analysis

The spectrum of a signal tells us which frequencies are contained in the signal, along with their corresponding amplitudes and phases, but it does not tell us anything about the time distribution of these frequencies. Since acoustic signals emitted by ground vehicles are non-stationary and often corrupted by strong noise, the spectrum alone does not provide enough information for extracting classification features. From this point of view, only a time-frequency representation can give an accurate measure of the distribution of energy or intensity of a signal in time and frequency simultaneously. The spectrogram is a simple but commonly used time-frequency representation, defined as [6]:
S_x(t, \nu) = \left| \int_{-\infty}^{+\infty} x(u)\, h^*(u - t)\, e^{-j 2\pi \nu u}\, du \right|^2   (4)
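As an illustration, the spectrogram of Eq. (4) can be computed with a short-time Fourier transform; the sketch below uses scipy, and the window length and overlap are illustrative choices rather than settings used in our experiments.

import numpy as np
from scipy.signal import spectrogram

def acoustic_spectrogram(signal, fs=1024, nperseg=256, noverlap=192):
    # The squared magnitude of the windowed STFT corresponds to the |.|^2
    # definition of Eq. (4), up to window normalization.
    f, t, S = spectrogram(signal, fs=fs, window='hann',
                          nperseg=nperseg, noverlap=noverlap, mode='magnitude')
    return f, t, S ** 2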
Even though the spectrogram of the sum of two signals contains an interference term, if the two signals are far enough apart that their spectrograms do not overlap significantly, the interference term is close to zero, which means that the spectrogram of the sum of the two signals is approximately the sum of their individual spectrograms. This property of the spectrogram is useful for recognizing signals emitted by two different targets. Another commonly used time-frequency representation of acoustic signals is the set of wavelet coefficients; the wavelet transform provides a hierarchical way to represent both the time and the frequency content of a signal. Moments of time-frequency representations provide important information about the signal, such as its amplitude modulation or its instantaneous frequency, so they are appropriate features. The first- and second-order moments, in time and in frequency, of a time-frequency energy distribution tfr(t, f) are defined as follows:

f_m(t) = \frac{\int_{-\infty}^{+\infty} f \cdot tfr(t, f)\, df}{\int_{-\infty}^{+\infty} tfr(t, f)\, df}   (5)

B^2(t) = \frac{\int_{-\infty}^{+\infty} f^2\, tfr(t, f)\, df}{\int_{-\infty}^{+\infty} tfr(t, f)\, df} - f_m(t)^2   (6)

t_m(f) = \frac{\int_{-\infty}^{+\infty} t \cdot tfr(t, f)\, dt}{\int_{-\infty}^{+\infty} tfr(t, f)\, dt}   (7)

T^2(f) = \frac{\int_{-\infty}^{+\infty} t^2\, tfr(t, f)\, dt}{\int_{-\infty}^{+\infty} tfr(t, f)\, dt} - t_m(f)^2   (8)
These features describe the average positions and spreads in time and in frequency of the signal.

2.3. Principal Component Analysis (PCA) Algorithm

In mathematical terms, the objective of PCA is to find the principal components of the distribution of signals, i.e., the eigenvectors of the covariance matrix of the set of signals. These eigenvectors can be thought of as a set of features that characterize the variations between signals, so each original signal can be represented exactly as a linear combination of the set of eigenvectors. On the other hand, each signal can also be approximated using only the subset of the "best" eigenvectors, those that have the largest eigenvalues and therefore account for the most variance within the set of signals. In our experiment, we choose the 15 largest eigenvalues to form the eigenspace and then project the training data vectors and the test data vectors onto this eigenspace. By doing so, the dimension of the data sets is reduced and, at the same time, some pseudo-features can be eliminated.
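The projection step can be sketched as follows; the function names and the use of a direct covariance eigendecomposition are assumptions made for illustration, not code taken from our implementation.

import numpy as np

def build_eigenspace(train_features, n_components=15):
    # Eigenvectors of the covariance matrix; keep the n_components largest eigenvalues.
    X = np.asarray(train_features, dtype=float)      # shape: (n_samples, n_features)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def project(features, mean, basis):
    # Project (centered) feature vectors onto the reduced eigenspace.
    return (np.asarray(features, dtype=float) - mean) @ basis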
3. ALGORITHMS FOR TARGET CLASSIFICATION

After extracting representative and stable features that describe the characteristics of the different targets, we need to feed them into a classifier. In this experiment, we use supervised learning methods, which require us to specify (1) which class each sample in the training set belongs to, and (2) how many classes (clusters) are appropriate for the problem in general.

3.1. Minimum Distance Approach

The minimum distance approach (MPP) is the simplest parametric classification method and is based on the assumption that the feature vectors are Gaussian distributed. The basic idea is to project the training set and the test set into the feature space and to calculate the distance between the test data and the mean of the training samples of each class. The test data is assigned the class of the training data corresponding to the smallest distance.

3.2. k-Nearest-Neighbor Method

The k-nearest-neighbor (kNN) method uses nonparametric density estimation, i.e., it estimates the density function without any distributional assumption. The basic idea of kNN is to center a cell at the test data x and let it grow until it contains k_n training samples, where k_n can be some function of the size of the training set, n. The decision rule of k-nearest-neighbor is given as

p(\omega_n \mid x) = \frac{k_n}{k}   (9)

The decision rule tells us to look into a neighborhood of the test feature vector containing k samples. If, within that neighborhood, more samples lie in class i than in any other class, we assign the unknown test sample to class i.
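For illustration, the two classifiers can be sketched as below; the Euclidean metric and the function names are assumptions of the example.

import numpy as np
from collections import Counter

def mpp_classify(x, class_means):
    # Minimum distance rule: assign x to the class with the nearest training mean.
    dists = {c: np.linalg.norm(x - m) for c, m in class_means.items()}
    return min(dists, key=dists.get)

def knn_classify(x, train_X, train_y, k=5):
    # kNN rule of Eq. (9): the class holding the majority of the k nearest samples.
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]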
4. DATA FUSION USING DISTRIBUTED SENSOR ARRAYS

In practical ground vehicle classification, the target moves over a large area, so the output of each sensor depends heavily on the relative position of the target, and there is no way to identify the target using only a single sensor. As larger numbers of sensors are deployed in distributed sensor arrays, it becomes important to develop a robust and fault-tolerant data fusion technique to handle the uncertainty of the sensor outputs. In this paper, we implement a modified multi-resolution integration (MRI) algorithm to improve the accuracy of the classification system. The original MRI algorithm was proposed by Prasad, Iyengar, and Rao in [7]. The idea consists of constructing a simple function (the overlap function) from the outputs of the sensors in a sensor array and resolving this function at successively finer scales of resolution to isolate the region over which the correct sensors lie [8]. Figure 1 illustrates the overlap function Ω(x) for a set of seven sensor outputs. The actual value of the parameter being measured lies within the regions of the maximal peaks of Ω(x).
Fig. 1. The overlap function for a set of 7 sensors.

This algorithm provides a hierarchical framework for interpreting the overlap function. At each resolution, MRI picks the crest of the overlap function and resolves only the crest at the next finer resolution level; a crest is a region of the overlap function with the highest peak and the widest spread. To make the algorithm more efficient, we implement a modified MRI that returns only the interval over which the overlap function takes values in the range [n − f, n], where n is the number of sensors and f is the number of faulty sensors. This reduces the width of the output interval in most cases and is especially useful when the number of sensors involved is large. In this experiment, each sensor generates a confidence level interval for each possible class by letting the parameter k take different values in kNN; these intervals are used to construct the overlap function in the modified MRI algorithm. After picking the crests from the overlap functions, we choose the class with the maximal average confidence level as the desired target class.
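A minimal sketch of this fusion step, under the assumption that each sensor reports its confidence for a class as an interval in [0, 1], is given below; the grid resolution and the fault count f are illustrative parameters, not values used in our experiments.

import numpy as np

def overlap_function(intervals, grid):
    # Omega(x): the number of sensor confidence intervals covering each grid point.
    omega = np.zeros(grid.shape, dtype=int)
    for lo, hi in intervals:
        omega += ((grid >= lo) & (grid <= hi)).astype(int)
    return omega

def fused_confidence(intervals, n_faulty=1, resolution=1000):
    # Modified MRI sketch: keep the region covered by at least n - f sensors
    # and take its mean as the fused confidence value for this class.
    grid = np.linspace(0.0, 1.0, resolution)
    omega = overlap_function(intervals, grid)
    region = grid[omega >= len(intervals) - n_faulty]
    return region.mean() if region.size else grid[np.argmax(omega)]

def fuse_and_decide(per_class_intervals, n_faulty=1):
    # per_class_intervals: {class: [(lo, hi) for each sensor]} -> chosen class.
    scores = {c: fused_confidence(iv, n_faulty) for c, iv in per_class_intervals.items()}
    return max(scores, key=scores.get)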
5. EXPERIMENTS AND RESULTS

5.1. Experiment Design

In this experiment, we construct the training and the test data sets from the database provided by the DARPA SensIT (Sensor Information Technology) program. The distribution of the sensor array is shown in Figure 2. First, for each test sample, the acoustic signals captured by the microphones are input into the feature extraction module, which extracts features from the spectrum and the time-frequency representation separately. The feature vectors are then projected onto a lower-dimensional feature space derived from PCA. We use the kNN and MPP algorithms to identify the target from each sensor separately. In this procedure, we only use the data captured by the sensors close to the target. After computing the confidence range of each target class using kNN for different values of k, the results from separate sensors are combined using the modified MRI developed in Section 4. The final decision is the target class with the maximal average confidence level.

Fig. 2. The position of sensor array A.

5.2. Results and Comparison

In this experiment, the classification is among four ground vehicle classes: POV, dragon wagon (DW), LAV, and AAV. The confidence matrix obtained with the kNN algorithm for sensors close to the target is shown in Table 1. Each row gives the number of actual outputs for each class, and each column gives the number for each expected class, so the diagonal elements are the correctly classified samples. We can see that the accuracy can be as high as 85% for between-class identification.
          POV   DW   LAV   AAV
POV         8    1     1     4
DW          1   48     2     3
LAV         0    4    48     3
AAV         3   16     1    57

Table 1. The confidence matrix using kNN for sensors close to the target.

The confidence matrix obtained with MPP for the same sensors used in the kNN algorithm is shown in Table 2. Since the MPP algorithm assumes a Gaussian distribution of the test data, which is not always the case in practice, its accuracy drops dramatically compared to kNN.
          POV   DW   LAV   AAV
POV         0    0     0    14
DW          0   18     1    35
LAV         0    1    40    14
AAV         1    3     0    73

Table 2. The confidence matrix using MPP for sensors close to the target.
Since the ground vehicles are moving in the field in practice, it is impossible for a single sensor to always be close to the target, which severely impacts the real-time classification results. In order to improve the reliability and the accuracy of the system, we use the modified MRI to integrate the confidence level intervals for each class from each sensor in the field. The confidence level interval for each class is generated by letting k take several different values in kNN. The confidence matrix obtained with the modified MRI for a sensor array that contains both the sensor close to the target and sensors away from the target is shown in Table 3.
          POV   DW   LAV   AAV
POV         7    2     1     2
DW          0   30     0     0
LAV         0    1     4     1
AAV         1    3     2    17

Table 3. The confidence matrix using the modified MRI on a sensor array.
Figure 3 shows a comparison of the classification accuracies obtained using different sensors in a sensor array. The target in this experiment is an AAV, which is a medium-size vehicle, and we choose A11 as our base sensor. We observe from the solid line that the classification accuracy using A11 alone is about 23% when the target is near A01. When we use a sensor array that includes sensors A11 and A03, the accuracy rises to about 35%. If we choose A11, A03, and A01 to construct our sensor array, the accuracy is increased to about 80%. On the other hand, if the target is near A11, we can obtain an accuracy of about 75% using A11 alone, as illustrated by the dash-dot line. The dashed line shows the situation where the target is near sensor A25. We can see that by using the modified MRI on a multiple-sensor array, we can generally obtain an accuracy as high as 84% no matter where the target is in the field, whereas the result from a single sensor depends strongly on the relative position of the target. The MRI data fusion algorithm also provides fault tolerance, since a fault in one sensor does not greatly affect the global result.
Fig. 3. The comparison of classification accuracy for a single sensor and different sensor arrays.

6. CONCLUSION

The feature extraction algorithm developed based on both the spectrum and the time-frequency representations of the acoustic signals can generate stable features, making it possible to attain an accuracy as high as 85% for between-class identification from sensors close to the target. As for the practical application, where the target is moving in the field, the modified MRI data fusion algorithm can avoid the relative-position problem encountered when using a single sensor and further improve the accuracy and the reliability of the classification.

7. REFERENCES

[1] Howard C. Choe, Robert E. Karlsen, Grant R. Gerhert, and Thomas Meitzler, "Wavelet-based ground vehicle recognition using acoustic signal," in Wavelet Applications III, 1996, vol. 2762, pp. 434-445.

[2] Russell Braunling, Randy M. Jensen, and Michael A. Gallo, "Acoustic target detection, tracking, classification, and location in a multiple target environment," in Peace and Wartime Applications and Technical Issues for Unattended Ground Sensors, 1997, vol. 3081, pp. 57-66.

[3] George Succi and Torstein K. Pedersen, "Acoustic target tracking and target identification - recent results," in Unattended Ground Sensor Technologies and Applications, Orlando, Florida, April 1999, SPIE, vol. 3713, pp. 10-21.

[4] Mark C. Wellman, Nassy Srour, and David B. Hillis, "Feature extraction and fusion of acoustic and seismic sensors for target identification," in Peace and Wartime Applications and Technical Issues for Unattended Ground Sensors, 1997, vol. 3081, pp. 139-145.

[5] Amir Averbuch, Eyal Hulata, Valery Zheludev, and Inna Kozlov, "A wavelet packet algorithm for classification and detection of moving vehicles," Department of Computer Science, Tel Aviv University, Tel Aviv, Israel.

[6] Francois Auger, Patrick Flandrin, Paulo Goncalves, and Olivier Lemoine, Time-Frequency Toolbox Tutorial for Use with MATLAB, CNRS and Rice University, 1995-1996.

[7] L. Prasad, S. S. Iyengar, and R. L. Rao, "Fault-tolerant sensor integration using multiresolution decomposition," Physical Review E, vol. 49, no. 4, pp. 3452-3461, April 1994.

[8] H. Qi, S. S. Iyengar, and K. Chakrabarty, "Multi-resolution data integration using mobile agents in distributed sensor networks," accepted by IEEE Transactions on SMC:C, 2000.