Nonlinear Adaptive Filtering of Stimulus Artifact
R. Grieve, P. Parker, B. Hudgins and K. Englehart Department of Electrical and Computer Engineering and Institute of Biomedical Engineering University of New Brunswick, Fredericton, New Brunswick, Canada
Please address all correspondence to: Philip A. Parker Dept. of Electrical and Computer Engineering, University of New Brunswick, PO Box 4400, Fredericton, New Brunswick, Canada, E3B5A3
Abstract
Noninvasive measurements of somatosensory evoked potentials have both clinical and research applications. The electrical artifact which results from the stimulus is an interference which can distort the evoked signal, and introduce errors in response onset timing estimation. Given that this interference is synchronous with the evoked signal, it cannot be reduced by the conventional technique of ensemble averaging. The technique of adaptive noise cancelling has potential in this regard however, and has been used effectively in other similar problems. An adaptive noise cancelling filter which uses a neural network as the adaptive element is investigated in this application. The filter is implemented and performance determined in the cancelling of artifact for in vivo measurements on the median nerve. A technique of segmented neural network training is proposed in which the network is trained on that segment of the record time window which does not contain the evoked signal. The neural network is found to generalize well from this training to include the segment of the window containing the evoked signal. Both quantitative and qualitative measures show that significant stimulus artifact reduction is achieved.
Keywords: evoked, response, stimulus, artifact, adaptive, filter, cancelling.
2
I Introduction
Somatosensory evoked potentials (SEPs) are used widely in clinical and research applications, and include peripheral, spinal, and cortical potentials. For reasons of convenience, comfort, and safety, the preferred measurement techniques are noninvasive using surface electrical stimulation and recording electrodes. The recorded SEP is thus small in amplitude-typically on the order of a microvolt peak, [1]. While an electrical stimulus evokes the desired response from the nervous process of interest, it also results in an interfering potential referred to as the stimulus artifact (SA). Due to a number of factors, the stimulus artifact is a particularly troublesome form of interference. First, the amplitude of the SA is typically orders of magnitude larger than the SEP, [2]. Second, the SA is synchronous or coherent with the SEP and thus cannot be reduced by ensemble averaging--the common method for reducing other forms of interference, [3]. Third, in most cases the SA overlaps the SEP in both the time and frequency domains, such that conventional time windowing and frequency filtering are not capable of removing the SA without distortion of the SEP, [4]. Fourth, the mechanisms by which the electrical stimulus is coupled into the recording are varied, complex and distributed, making minimization of the SA at the source problematic, [5]. Given that the conventional methods are not effective for SA reduction in SEP measurements, alternative approaches have to be considered. A technique which has potential in this regard is adaptive noise cancelling (ANC). In this approach an estimate of the SA is
3
obtained via an adaptive filter and the estimate subtracted from the composite SA plus SEP signal. ANC filters have been used in other biological signal acquisition applications. Ye and Choy, [6], used such a filter for the reduction of respiratory artifact in rheopneumography measurements. Sadasivan and Narayana, [7], developed an ANC filter for the reduction of electrooculogram in electroencephalography measurements. Thakor and Zhu, [8], have used an ANC filter for the removal of the QRS complex in ECG in order to improve detection of the P-wave. Suzuki et al., [9], developed a real-time ANC filter for suppression of ambient noise in lung sound measurements. In the application of noise cancelling to SA reduction, McGill et al., [4], estimated the artifact waveform from subthreshold stimulation and subtracted the estimate from the suprathreshold stimulus response. Nonlinear adaptive ANC filters were not considered in this study. Work by one of the present authors investigated the application and performance of linear and nonlinear finite impulse response (FIR) ANC filters for SA reduction in SEP measurements, [10,11]. The performance of linear FIR ANC filters was found to be rather poor, [10], suggesting that the input-output relationship of tissue to electrical stimulus is nonlinear. Indeed the source of this nonlinearity is, at least in part, the nonlinear voltage-current characteristic of the skin, [12]. A second-order truncated Volterra series nonlinear FIR ANC filter was shown to have substantially improved performance, [10]. While this filter achieves good SA cancellation, it has two serious limitations: 1) it cannot model nonlinearities of order higher than two, and 2) the number of filter coefficients grows as the square of the FIR filter length. An alternative ANC filter that does not suffer from these
4
limitations is the neural network (NN) configuration. The objectives of this paper are to investigate the application and performance of the NN ANC filter for the reduction of SA in SEP measurement
II Neural Network ANC Filter
The NN configuration has been used in ANC filters for other applications requiring cancellation of different types of interference. Yuwaraj et al., [13], applied the NN ANC filter for cancellation of interference in otoacoustic emission measurements. In the first application to SA reduction, Grieve et al., [14], investigated the pi-sigma NN configuration with some success but concluded that a feedforward NN configuration would be more appropriate. The feedforward multi layer perceptron (MLP) NN will be considered in this paper for the ANC filter and SA cancellation.
Figure 1 about here
The general configuration for the ANC filter is given in Fig.1. The primary channel input Xp is the addition of evoked response S, artifact Ap, and uncorrelated noise np. The reference channel input Xr is the addition of artifact Ar, correlated with Ap through the system H, and uncorrelated noise nr. The adaptive filter ha(t) estimates Ap from Xr and is adapted by the error e through an appropriate training algorithm. Following convergence of ha(t), the ANC filter output
5
is the minimum mean square error estimate of S. Different structures can be used for the adaptive filter, ha(t), and perhaps the most common is the linear FIR filter. However the structure of interest here is the NN adaptive filter because of its ability to generalise from training to test data, and to model nonlinear transfer functions of arbitrary order. The general structure of a feedforward MLP neural network is given in Fig. 2. For the implementation in this paper, samples, X1i, i=1,2,..,N, from X1 form the N-dimensional input feature vector for the input layer. The input layer is fully connected to the hidden layer through the weights Wji, j=1,2,..n, where n is the number of hidden nodes. The hidden layer is connected to the output layer through the weights Wj, j=1,2,..,n, and hyperbolic tangents are used for activation functions. The bias inputs and associated weights are included in practise in order to accommodate dc offsets. During the training phase, the NN weights are updated, from small randomly selected initial values, using the back-propagation algorithm, [15]. The update equations are given by Wj(k+1)= Wj(k) + µ1aje(k), and Wji(k+1)= Wji(k) + µ2Wj(k)f'(sj(k))X1i(k), where k is the time index, µ1 and µ2 are the learning rates for layers 1 and 2 respectively, f'(z) is the derivative of the activation function evaluated at z, and e(k) is the cancellation error at time index k. Appropriate values for N, n, µ1, and µ2 are determined by performance and stability criteria to be considered in Section IV as part of two in vivo experiments. These experiments require the acquisition of SEP and SA data, and the following section describes the acquisition methodology.
6
Figure 2 about here
III Methodology
A block diagram of the measurement system for acquiring SEP from the median nerve is shown in Fig. 3. Median nerve sensory fibres are stimulated at the finger and SEP acquired at the wrist using surface Ag/AgCl electrodes. The resulting SEP and SA are amplified over the bandwidth 10 Hz to 10 kHz, sampled at a rate of 25.6 kHz, and the digital data stored for subsequent off-line processing. Following each stimulus, a data window of 10 ms or 256 samples is recorded. The Bruel and Kjaer 2032 signal analyser is used in an on-line ensemble averaging mode in order to confirm the presence of SEP during the recording session. In order to ensure no SEP crosstalk between the primary and reference inputs to the ANC filter, a necessary condition for an ANC filter, a double-stimulus single-channel technique is used. In this technique a series of alternating sub and supra threshold stimuli are applied to the finger. The responses to the subthreshold stimuli are used as the reference data, and those to the suprathreshold as the primary data. Thus no channel crosstalk is possible. This technique also has the advantage of requiring only a single data acquisition channel and associated hardware. Two Grass S11 stimulators with isolators are used to generate the series of stimuli. Figure 4 shows a typical series of stimuli and acquired data for this technique.
Figure 3 about here
7
Figure 4 about here
IV Experiment 1 In order to optimize the NN parameters and evaluate the NN ANC performance, the first experiment acquired data without the presence of SEP. In this way the cancellation performance can be examined over the full SA duration, including the time interval in which the SA would otherwise overlap the SEP. If the SEP were present and unknown it would be impossible to precisely evaluate the filter's performance in cancelling the SA. At the same time the network parameters can be studied and optimised --in particular network size and training data window will be investigated. The methodology of Section III was used to collect 300 primary and reference records. However, for this experiment both reference and primary stimuli were subthreshold, thus ensuring no SEP in either channel. Because the SA and SEP are correlated, it is essential to ensure that during NN training with SEP present, the primary channel training data not contain SEP. Otherwise the NN filter will adapt to cancel not only SA but also the SEP. Thus for training, the record time windows were divided into two segments, P and P', such that P segment contains no SEP and P', which is the
8
remainder, contains the SEP. This training method is referred to as segmented-training. Fig.5 shows an example of the segments P and P'. Due to the finite conduction velocity of the nerve
Figure 5 about here
fibre, there is always a time delay between the stimulus application and the onset of the SEP. This interval determines the magnitude of the time segment P. The value of P for practical application can thus be based on the minimum expected nerve conduction delay between stimulus and SEP arrival at the recording electrodes. For the median nerve and this measurement setup, a value of 40 samples or 1.56 ms is appropriate. In other applications an appropriate P value can always be estimated. It is important that this estimation and segmentation be carefully done as the algorithm training is sensitive to the partitioning– if the P segment is too short poor generalization of the filter will result and if too long some cancellation of the SEP can occur. The performance measures for the AN filter relate to residual SA in the filter output as a ratio to the SA in the primary channel. Specifically the measures are ρ1, ρ2, and ρ3 as defined by ρ1 = |SAp(k)|max / |SAo(k)|max , k∈P, ρ2 = σSAp (k) / σSAo (k), k∈P, and
ρ3 = σSAp (k) / σSAo (k), k∈P', where SAp(k) and SAo(k) are the SA components at time-index k in the primary and output respectively, σX is the standard deviation of the random variable X, P is a time segment of the record window, and P' is a time segment containing the remainder of the record window. ρ1 is a 9
measure of the reduction of the SA peak, and ρ2 and ρ3 are measures of the power reduction of the SA in segments P and P' respectively. Since the NN was trained on the segment P, which contains only a portion of the SA, the ability of the NN to generalise and accurately generate Ap from Ar in both P and P' is critical to the AN filter performance. Thus the performance measure ρ3 is of particular significance. The 300 acquired response records were divided randomly into a 150-record training set and a 150-record test set. The primary and reference inputs used to train the NN were generated by averaging the training set data. These inputs were made up of the first 40 data points, preceded by a 20 point zero pad. The training was continued until the incremental change in the mean squared error with each iteration was not significant, indicating convergence of the network. The NN size was optimized by varying the number of input and hidden nodes, training the network for each combination, and determining the values of ρ1, ρ2, and ρ3 for the test set. The number of input and hidden nodes were varied over 4 through 20, and 2 through 14 respectively. Data were collected and processed for 6 subjects, and the results for one, which is typical of the 6, are shown in Fig.6. After considering the results for all subjects, and keeping in mind that ρ3 is of particular importance here, the combination of 12 input nodes and 6 hidden nodes was selected as optimum for further implementations. The low ρ3 value is due to the fact that the SA power in P' of the primary channel is comparable to the uncorrelated channel noise power, hence limiting the maximum value that ρ3 can attain.
Figure 6 about here
10
To determine the generalisation ability of the filter, the length of the training section was varied over 20 through 100 data points from a record length of 256 points, the NN trained, and the performance measure ρ3 calculated (P= 40). The results are shown in Fig.7. Again ρ3 is the measure of most significance since it measures the performance in P', and hence the ability of the NN to generalise. The value of ρ3 is fairly constant over the training range tested indicating that the NN is generalising well (this observation will be considered in more detail in Section VI). Also shown in Fig.7 is the performance of a nonlinear FIR AN filter with 10 taps, [10], trained with the Recursive Least Squares algorithm, (NRLS), and operating on the same data. For this
Figure 7 about here
filter, ρ3 generally continues to increase with P indicating that its generalising ability is not as strong as the NN for P in the range necessary for practical application, ie P≤40 in this case.
V Experiment 2
In this experiment the NN ANC filter of Section IV was used to cancel SA in SA plus
11
SEP signal from 6 subjects. The methodology of Section III, using both sub and supra threshold stimuli as per Fig. 4, was used to acquire the data. As in Section IV, 300 primary and reference records were acquired, and these were randomly split into a 150-record training set and a 150record test set. In order to test the practical effectiveness of the filters, the final outputs were obtained by filtering each of the 150 records in the test set, and averaging the 150 results. The primary and reference inputs used to train the NN were generated by averaging the training set data. These inputs were made up of the first 40 data points, preceded by a 20 point zero pad. The training was continued until the incremental change in the mean squared error with each iteration was not significant, indicating convergence of the network. The resulting filter was then used to filter the 150 records in the test set, and the 150-record filter output was averaged to obtain the final output. The training of the NRLS filter was slightly different. It was found that when the NRLS filter is trained on averaged data, the kernel values upon which it converges are large numbers. If a signal is filtered that is slightly different than that used to train the filter, the output contains a very high residual interference. To prevent the NRLS filter from converging on large kernel values, it was trained on raw data, rather than averaged data. All 150 primary and reference pairs in the training set were used to train the filter. Each record was composed of a 20 point zero pad followed by 40 points of data. The resulting filter was then used to filter the 150 records in the test set, and the 150-record filter output was averaged to obtain the final output. It should be noted that the performance measure ρ3 cannot be used in this experiment. Since the SEP is present and unknown, the residual SA following cancellation cannot be evaluated quantitatively. Therefore performance evaluation in the segment P_ must be a qualitative observation of the SA reduction in the overlap region, relative to the SEP.
12
Table 1 gives ρ1 and ρ2 values for the two adaptive filters, NN and NRLS FIR, used in the AN filter over the 6 subjects. It is evident that both give substantial reductions in the SA over the segment P. It is also evident that the NN gives a somewhat larger reduction which is to be expected given that the NRLS is able to model 2nd order nonlinearities only while the NN can model nonlinearities of arbitrary order. A paired t-test shows a significant difference in their performances at the 5 % significance level.
NRLS
NN
NRLS
NN
Subject 1
73
88
72
70
Subject 2
11
53
12
72
Subject 3
27
50
20
46
Subject 4
30
67
24
54
Subject 5
31
44
39
52
Subject 6
24
40
31
37
Table 1: Performance measures ρ1 and ρ2 for Experiment 2 over 6 subjects.
Figure 8 shows the results of the SA cancellation for the NN and NRLS FIR adaptive filters for 3 subjects which are typical of the 6 subjects. These figures allow a qualitative assessment of the cancellation in the P_ segment (k>40) where the SA and SEP overlap, as well as in the P segment (k≤40). The relative abilities of the NN and NRLS adaptive filters to generalise from P to P_ are particularly evident in Fig. 8b. 13
Figure 8 about here
VI Conclusions and Discussion
The overlap of SA with SEP can cause significant signal distortion and onset timing errors. An ANC filter with a NN as the adaptive element was shown to provide substantial SA reduction. An MLP with 12 input and 6 hidden nodes, and segmented training achieved a SA reduction in the P segment of 35 dB averaged over 6 subjects. Since the NN can model a nonlinearity of arbitrary order it was able to achieve a larger reduction than an NRLS FIR AN filter based on the truncated 2nd order Volterra model. It was also demonstrated that the NN is able to provide better generalization from the P to the P_ segments. The SA cancellation in the P_ interval was shown qualitatively to be excellent.
Figure 7 demonstrates the power of the NN in generalizing, and identifying/modelling the nonlinear system between the primary and reference channels. The fact that ρ3 is relatively constant over the length of P investigated indicates that it is able to capture and model the nonlinear system well with as few as 20 samples per evoked response. For this reason, the NN AN filter trained in the P segment is able to provide excellent SA cancellation in the P_ segment. In contrast, the NRLS AN filter requires a P of approximately 70 samples to achieve the same
14
performance, by which time the SA is typically overlapping the SEP. The difference is no doubt due to the 2nd order nonlinearity modelling limitation of the NRLS, as well as the structural differences in the NN and FIR adaptive filters. The dip in ρ3 seen for both filters as P increases is likely due to the fact that in the early part of P the SA is very large compared to the uncorrelated noise, and hence the adaption converges well. However in the later part of P the SA falls rapidly to values comparable to the uncorrelated noise and the adaption and convergence are disturbed. A corresponding drop in performance occurs until P is sufficiently large to allow the adaption to recover. From the results of Table 1 and the small residual SA of Figure 8, it is clear that the NN adaptive filter has converged well and has identified/modelled the nonlinear system ha(t) well. When considering training and convergence times, it is noted that in the case where the SA is stationary, the NN can be trained, the coefficients fixed, and SEP measurements made on-line or off-line. In the case of nonstationary SA, the NN adaptive filter can be allowed to track SA changes over time by using appropriate values for µ1 and µ2 in the update algorithm, [11]. Tracking and convergence properties of the filter in this mode need to be investigated further in a continuation of this work.
VII References
[1] C. D. McGillem, J. I. Aunon, and D. G. Childers, "Signal processing in evoked potential research: Applications of filtering and pattern recognition", CRC Crit. Rev. Bioeng., Vol. 9, pp.
15
225-265, Oct. 1981.
[2] S. B. Harrison and D. F. Lovely, “Identification of noise sources in surface recordings of spinal somatosensory evoked potentials”, Med. & Biol. Eng. & Comp., Vol. 33, pp. 299-305, 1995. [3] D. Regan, “Human Brain Electrophysiology”, Elsevier Science Publishing Co., New York, 1989.
[4] K. C. McGill, K. L. Cummins, B. B. Dorfman, L. J. Berlizot, K. Luetkemeyer, D. G. Nishimura, and B. Widrow, " On the nature and elimination of stimulus artifact in nerve signals evoked and recorded using surface electrodes", IEEE Trans. Biomed. Eng., Vol. BME-29, pp. 129-137, 1982.
[5] R. N. Scott, L. McLean, and P. A. Parker, "Stimulus artifact in somatosensory evoked potential measurement”, Med. & Biol. Eng. & Comp., Vol. 35, pp. 1-5, 1997.
[6] J. Ye and T. T. Choy, " On-line respiratory artifact removal via adaptive FIR filters in rheopneumographic measurements", Med. & Biol. Eng. & Comp., Vol. 32, pp. 620-624, 1994.
[7] P. K. Sadasivan and D. Narayana Dutt, " A non-linear estimation model for adaptive minimization of EOG artefacts from EEG signals", Intl. Journal of Biomed. Comp., Vol. 36, pp. 199-207, 1994
16
[8] N. V. Thakor and Y. S. Shu, " Applications of adaptive filtering to ECG analysis: noise cancellation and arrhythmia detection", IEEE Trans Biomed. Eng., Vol. BME-38, pp. 785-793, 1991.
[9] A. Suzuki, C. Sumi, K. Nakayama, and M. Mori, "Real-time adaptive cancelling of ambient noise in lung sound measurements", Med. & Biol. Eng. & Comp., Vol. 33, pp. 704-708, 1995.
[10] V. Parsa, P. A. Parker and R. N. Scott, "Adaptive stimulus artifact reduction in noncortical somatosensory evoked potential studies", IEEE Trans. Biomed. Eng., Vol. BME-45, pp. 165179, 1998.
[11] V. Parsa, P. A. Parker and R. N. Scott, " Convergence characteristics of two algorithms in non-linear stimulus artifact cancellation for electrically evoked potential enhancement”, Med. & Biol. Eng. & Comp., Vol. 36, pp. 1-13, 1998.
[12] W. G. Stevens, "The current-voltage relationship in human skin", Med. Electron. Biol. Eng., Vol.1, pp. 389-399, 1963.
[13] M. Yuwaraj, A. Gluzmann, P. Madsen and H. Kunov, " Adaptive stimulus artifact cancellation in otoacoustic emission testing", Proceedings of the 20th Canadian Medical and Biological Engineering Conference, pp. 106-107, 1994.
17
[14] R. Grieve, P. A. Parker and B. Hudgins, "Adaptive stimulus artifact cancellation in biological signals using neural networks", Proceedings of the 17th IEEE EMBES conference, Montreal, Sept. 1995.
[15] S. Haykin, " Neural Networks: A Comprehensive Foundation", Macmillan College Publishing Company, New York, 1994.
Acknowledgements
This work was supported in part by an NSERC Postgraduate Scholarship and NSERC grant # OGP0004445.
18
Figure and Table Captions Figure 1: Adaptive noise cancelling (AN) filter structure. Figure 2: Feedforward neural network structure. Figure 3: Stimulation and acquisition system. Figure 4: Alternating sub and supra electrical stimuli (4a) and resulting SA and SEP(4b}. Figure 5: The P and P_ segments for segmented training. The filter is trained on the P segment where no SEP is present. Figure 6: Network size test on subthreshold stimulus data showing performance as a function of the number of input and hidden nodes. Figure 7: Performance ρ3 versus length of training segment. Figure 8: SA cancellation results from Experiment 2 for subjects a, b, and c. Table 1: Performance measures ρ1 and ρ2 for Experiment 2 over 6 subjects.
19
Table 1
ρ1
ρ2
NRLS
NN
NRLS
NN
Subject 1
73
88
72
70
Subject 2
11
53
12
7
Subject 3
27
50
20
46
Subject 4
30
67
24
54
Subject 5
31
44
39
52
Subject 6
24
40
31
37
20
Xp=s+Ap+np
Âp Reference input
Xr=Ar+nr
Figure 1
21
Bias i/p W b1
W b2
W bn
Figure 2.
22
Figure 3
23
S am p le N u m b er
S am p le N u m b er
Figure 4 :
24
Figure 5:
25
ρ1
ρ1
# of Hidden Nodes
# of Input Nodes
ρ2
ρ2
# of Input Nodes
ρ3
# of Hidden Nodes
ρ3
# of Input Nodes
# of Hidden Nodes
Figure 6:
26
Neural Network
ρ3 NRLS
Length of Training Section, P
Figure 7
27
Figure 8: 28