Accepted for presentation at the 2014 IEEE Workshop on Statistical Signal Processing, Gold Coast, Australia
ESTIMATING DYNAMIC CORTICAL CONNECTIVITY FROM MOTOR IMAGERY EEG USING KALMAN SMOOTHER & EM ALGORITHM S. Balqis Samdin, Chee-Ming Ting*, Sh-Hussain Salleh, Mahyar Hamedi and A. B. Mohd Noor Center for Biomedical Engineering, Universiti Teknologi Malaysia, Malaysia
[email protected],
[email protected],
[email protected],
[email protected] ABSTRACT This paper considers identifying effective cortical connectivity from scalp EEG. Recent studies use time-varying multivariate autoregressive (TV-MAR) models to better describe the changing connectivity between cortical regions where the TV coefficients are estimated by Kalman filter (KF) within a state-space framework. We extend this approach by incorporating Kalman smoothing (KS) to improve the KF estimates, and the expectation-maximization (EM) algorithm to infer the unknown model parameters from EEG. We also consider solving the volume conduction problem by modeling the induced instantaneous correlations using a full noise covariate. Simulation results show the superiority of KS in tracking the coefficient changes. We apply two derived frequency domain measures i.e. TV partial directed coherence (TV-PDC) and TV directed transfer function (TV-DTF), to investigate dynamic causal interactions between motor areas in discriminating motor imagery (MI) of left and right hand. Event-related changes of information flows around beta-band, in a unidirectional way between left and right hemispheres are observed during MI. A difference in interhemispheric connectivity patterns is found between left and righthand movements, implying potential usage for BCI. Index Terms—Multivariate autoregressive model, state-space model, EM algorithm, dynamic cortical connectivity, EEG.
1. INTRODUCTION The analysis of connectivity between different brain regions can provide deeper understanding of the brain’s sensory and cognitive functions. Identification of brain connectivity network comprises identifying a set of functional network nodes that represents brain regions, modeling and then estimating the connections or edges between them, often in term of statistical interdependencies [1]. The network nodes can be simply defined by the recorded channels of electrophysiological data such as electroencephalogram (EEG), Use of EEG signals as nodes for connectivity analysis [2], especially at the cortical level, is justified by that EEG recordings are projections of underlying cortical activity measured at the scalp. Besides, it has other advantages of non-invasive nature, lowcost and high temporal resolution. Multivariate autoregressive (MAR) modeling of EEG has been used to infer effective connectivity [2], the causal influence of one neuronal region exerting on the others, which involves the directionality of information flow. Various directional connectivity measures can be derived from the MAR model parameters, including the frequency-domain directed coherence (DC), partial directed coherence (PDC), directed transfer function (DTF) and time-domain Granger causality. Some of them have been used as features for classification of motor-imagery EEG for braincomputer interface (BCI) [3]. Conventionally, these measures are fixed with time and computed from MAR models with constant coefficients fitted over the entire time-course, assuming static brain
connectivity. However, this stationarity assumption, although convenient to result interpretations, is incompatible with the wellknown dynamical, condition-dependent nature of brain activity. Recent studies demonstrate that functional connectivity exhibits changes over time, due to task demands, learning and large state transitions [4]. This suggests that the MAR parameters and hence the connectivity metrics can be allowed to vary with time to better quantify the dynamics of connectivity networks. The time-varying AR (TV-AR) models, in which the AR coefficients vary instantaneously with time, have been used to better model the non-stationarity EEG signals for spectral estimation [5]. The estimators of the TV-AR coefficients were obtained sequentially in time within the state-space framework using Kalman filter (KF), which is optimal in mean-square sense. The use of univariate TV-AR models in these studies can be generalized to multivariate case to generate dynamic connectivity measures. Time-varying MAR (TV-MAR) models have been proposed for analysis of dynamic cortical connectivity using EEG. Hesse et al. [6] introduced time-varying Granger causality based on the TV-MAR models computed adaptively using recursive least square (RLS) algorithm, to estimate causal interactions between EEG. Results in [7] show that time-varying PDC (TV-PDC) and DTF (TV-DTF) derived from the same approach are capable of detecting an evolving cortical network from EEG during the footlips movement. Omidvarnia et al. [8] estimate these two measures using KF and state-space approach instead, and applied them for time-frequency analysis of time-varying connectivity in newborn EEG. This study was extended in [9] by modifying the TV-PDC based on orthogonalization of the MAR coefficients. This method can minimize the volume conduction (VC) effect, a critical problem in EEG-based connectivity analysis, where a single brain source is observed simultaneously at several channels, which can lead to false connectivity detection. In this paper, we follow the KF approach for estimating the TV-AR models of dynamic cortical connectivity directly from scalp EEG, as in [8]. Both TV-PDC and TV-DTF measures are applied to analyze the time-frequency dynamics of directed interactions between cortical areas, during motor-imagery (MI). The TV-MAR model parameters, such as the system noise covariance matrices, are typically unknown and need to be learned from data. However, the adaptation of the covariates in [8] depends on a heuristically chosen update coefficient, which is impractical and only suboptimal. We propose to use the expectation-maximization (EM) algorithm in conjunction with Kalman smoother (KS) for maximum likelihood (ML) estimation of the TV-MAR models. The KS can yield more accurate estimators of the coefficients changes and hence the connectivity pattern, providing smoother trends for the noisy, spurious KF estimates. We further propose a TV-AR model with instantaneous effect, allowing non-zero off-diagonal elements in the model noise covariance matrix, to accommodate the
Copyright (c) 2014 IEEE. Personal use is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE
VC-induced instantaneous connectivity. Incorporating an instantaneous term for the static MAR model has been studied in [10]. Instead of diagonal matrix in [9], a full spatial covariance matrix is used here and is estimated by the EM algorithm. Non-directed coherence between EEG channels computed over short-segments has been used to interpret event-related dynamic interactions between left and right motor cortex areas in real finger movement [11]. Here, we apply the directed, instantaneously timevarying measures, for the first time, to analyze changes in the interhemispheric information flows between motor areas, for discrimination of the imagined left and right-hand movements.
Let y n = ( yn1 ,....., ynd )T , n = 1,…, N be EEG signals of length N measured from d channels (defining nodes of a brain network). y n is modeled as a TV-MAR process of order p, TV-MAR(p), that describe the linear dependency of y n on its p past observations p
∑A
k ny n− k
+ vn
(1)
k =1
where {A k n }kp=1 are d × d matrices of TV-MAR coefficients at time n and v n is a d × 1 i.i.d. Gaussian observational noise with mean zero and covariance matrix R , v n ~ N (0, R ) . Instead of diagonal matrix, we allow R to be a full matrix, with possibly positive offdiagonal elements rij > 0, i ≠ j to accommodate the instantaneous or zero-lag correlation between channels, caused by the artifact of VC. Model (1) can be formulated into state-space form [8], [12] y n = C na n + v n (2)
Pn|n = Cov[a n | y1:n ] , can be computed analytically by KF [8], [12]. a n|n is the minimum mean squared error (MMSE) estimator for a n . We propose to include the fixed-lag smoothing which can correct the KF estimates based on the future measurements y n +1: N . The smoothing densities pθ (a n | y1:N ) , with their mean a n| N and
n = N − 1, N − 2,…,1 based on the KF estimates [14]. Besides, the model parameters θ are typically unknown. We use EM algorithm [14] to obtain the ML estimates, θˆ = arg max log p ( y ) where θ
1: N
θ
log pθ (y1: N ) can be computed analytically by the KF. We derive the two-step EM iterations for the TV-MAR state-space model: E-Step: Based on the smoothed estimates, compute the Q = E[log pθk (a1: N y1: N | y1: N )] given expected log-likelihood model estimated at k-th iteration, involving expectations
S n| N = E (a naTn | y1:N ) = Pn| N + aˆ n| N aˆ Tn| N S n , n −1| N =
E (a naTn −1
| y1:N ) =
(7)
Pn , n −1| N + aˆ n| N aˆ Tn −1| N
(8)
where J n = Pn|n Pn−+11|n and Pn, n−1| N = Pn|nJTn−1 + J n (Pn+1,n| N − Pn|n )JTn−1 M-Step: Update the model parameters by maximizing the Q function over θ k . The re-estimated covariance R is given as
R k +1 =
1 N
∑
N n =1
( y n y Tn − 2Cnaˆ n| N y Tn +CnS n| N CTn )
(9)
(3)
a n = vec (( A1n ,…, A pn )T )
Q is assumed known and pre-specified, since the dynamics of the estimators for a n essentially depend only on the ratio of Q and R .
(4)
( y Tn −1,…, y Tn − p )T
The iteration stops when pθk +1 (y1: N ) − pθk ( y1:N ) < ε where ε is a
(5)
small threshold, and θˆ ML is obtained. The EM steps increase the likelihood monotonically with local optimum convergence.
Yn =
Cn = I d ⊗ YnT
(6)
where I d is d × d identity matrix and ⊗ denotes the Kronecker Defining a n a
the
a n = an −1 + w n where
product.
y1:n {y1,…, y n } and
observations
covariance Pn| N , are estimated by backward recursions for
2. METHODS 2.1. TV-MAR Model
yn =
pθ (a n | y1:n ) given
model θ . As (2)-(3) are of linear-Gaussian form, the mean and covariance of pθ (a n | y1:n ) , denoted as a n|n = E[a n | y1:n ] and
pd 2 × 1 state
vector
of
TV-MAR
coefficients at time n, re-arranged from the matrices {A kn }kp=1 , the process (1) is re-written in a compact form (2) with a linear mapping Cn of the past observations, as the observation equation.
2.3. TV-PDC and TV-DTF for Dynamic Connectivity The estimated time-varying matrices of MAR coefficients {A k n }kp=1 and the Fourier transform of them p
A(n, f ) = I −
In the state equation, the hidden state a n is assumed to follow a first-order Gauss-Markov process as in (3). w n is a Gaussian state noise with mean zero and pd 2 × pd 2 covariance matrix Q , here we assume, for simplicity, as diagonal with identical elements q i.e. Q ~ N (0, qI ) . The covariates R and Q are assumed constant with time. Different structures of these covariance matrices have been studied in our earlier work [13] for evoked potentials. We denote θ = ( R, Q) the model parameters of TV-MAR state-space model.
2.2. Kalman Smoother & EM Algorithm Estimation of the latent state vectors of TV-MAR coefficients, involves estimating sequentially in time the filtering density of a n ,
∑A
kne
−2π if / f s
, 0≤ f ≤
k =1
fs 2
(10)
contain information that can characterize the dynamical network of causal influences between different brain regions, in time and frequency domain respectively. We consider two frequency domain measures of directed connectivity in time-varying form, derived from the time-varying spectrum A( n, f ) . First is the TV-PDC, defined as the normalized spectrum [8]
π ij ( n, f ) = where a j ( n, f ) is
the
j-th
| Aij ( n, f ) | a Hj
(11)
( n, f )a j ( n, f )
column
of
matrix A( n, f ) and
Aij ( n, f ) is the (i, j) element of A( n, f ) . H denotes the Hermitian
transpose. π ij ( n, f ) ∈ [0, 1] measures the strength of directed influence of channel j on channel i at frequency f and time n. Value of 1 indicates that all the influences originating from j are directed toward i, and 0 indicates no directed influence. Second is the TV-DTF between channel i and j, defined as [8]
δ ij ( n, f ) =
| H ij ( n, f ) | hiH (n, f )hi ( n, f )
(12)
2.5 a1
a2
2
1 1.5 0.5
1 0.5
0 0 -0.5
-0.5
0.5
0.5 a3
where H ( n, f ) = A −1 (n, f ) . Again, δ ij ( n, f ) is normalized between 0 and 1. For DTF, the inflow of information to channel i is normalized by the sum of all influences to i. It differs from the PDC where the outflow from channel j to i is normalized by all outflows from j. [8] reported that the TV-PDC can better reflect the connectivity pattern than the TV-DTF. Besides, the PDC is capable of distinguishing between direct and indirect influences.
3. EXPERIMENTAL RESULTS We compare the KF and KS estimators in term of tracking dynamic changes in the TV-MAR coefficients. A 2-dimensional TV-MAR model of order one is simulated according to (1). The evolution of the four MAR coefficients is assumed to follow a random-walk and an autoregression at two different time-intervals, to simulate highly non-stationary and relatively stable changes respectively.
n =1,…,1000 a + w n , a n = n −1 a w n =1001,…,2000 0.7 + , n −1 n
1.5
a4
0
0
-0.5
-0.5
-1
-1
-1.5
0
500
1000
1500
2000
-1.5
0
500
1000
1500
Fig. 1. Tracking of the TV-MAR coefficients by KF ( ( ). ( ) indicates the true values.
) and KS
0.5
0.08
1
0.07
1.5 0.06 2 0.05 2.5 0.04
3
(13)
Abrupt changes are created at the transitions between intervals. For this simulation, we assume both state and observation noises with diagonal covariance matrices wn ~ N (0,1×10−4 I) and vn ~ N (0, I) , which are known and not estimated. The results are given in Fig. 1. The KF reasonably detect both smooth and abrupt changes. The KS estimates are shown to provide relatively stable trends for the noisy KF estimates. This is also reflected by the mean-squared error (MSE) performance of 0.0220 and 0.0119 for KF and KS respectively. TV-MAR with non-Gaussian noise [13] may better detect the abrupt changes, with the multi-modal posterior densities possibly obtained by particle or Gaussian-sum smoother [15]. We evaluate the performance of TV-PDC and TV-DTF estimated by the KS and EM algorithm, on estimating the interhemispheric dynamic interactions between motor cortex areas, during the MI of left and right-hand. A subset of the BCI Competition III dataset of cued MI EEG is used, selecting only the left and right hand movement from one of the subjects. The subject was asked to perform imagery movement, started according to a cue presented at t=3s until t=7s, while EEG data was being recorded with sampling frequency 250Hz. Only three unipolar channels across the left and right hemispheres of motor cortex areas are studied i.e. at positions C3, Cz and C4. This aims to study its potential application to the challenging, practical BCI using very few channels. q = 1×10−6 is set. The EM-fitted observational noise covariance matrices on MI EEG for both hands in Fig. 2 show that there exist instantaneous correlations between channels. Fig. 3(a) and (b) show the EM-KS estimated time-frequency representations of TV-PDC between the three channels for left- and right-hand MI. The interactions between each pair of channels are presented using two maps, for the opposite directions of information flows. It can be seen that most of the directed information flows happen within frequencies12-20 Hz (beta band) and exhibit changes over time. The activities within 0-10Hz are
2000
Sample Number
3.5 0.5
0.03 1
1.5
2
2.5
3
3.5
0.5
1
1.5
2
2.5
3
3.5
(a) (b) Fig. 2. Estimated covariance matrices of observation noise, R on EEG data from left-hand (a) and right-hand (b) MI. less than that of the beta-band. From Fig. 3(a) for the left-hand MI, a time-varying change of coherence is observed during imagery movement from direction C3→C4, where coherence levels around beta band increase, started from the cue onset at 3s until 5s, as the imagery movement persists. This behavior is consistent with the well-known findings of increased bilateral coherence in the beta band rhythm, between left and right hemispheres during finger movements [11]. However, the standard coherence analyses in these studies provide only the information about functional correlations, but are unable to discriminate the directionality of the connections. The use of directed coherence here, however, shows different results for the reverse direction C4→C3, suggesting that no inter-hemispheric coherence is found during the MI. This implies an existence of asymmetric pattern of inter-hemispheric information flows between left and right motor cortex during MI, which cannot be detected by the standard analysis. Interestingly, respective to the left-hand MI, a reversed connectivity pattern is found for the right-hand (shown in Fig 3(b)), where the beta-band rhythm exhibits coherence increase from C4→C3, an opposite direction to the left-hand MI; while the absence of hemispheric coherence is from C3→C4 instead. This discrimination suggests the potential usage for BCI.The increases of coherence from C3→C4 for the left-hand MI and from C4→C3 for the right-hand indicate that information flows from the ipsilateral hemisphere to the contralateral hemisphere, with respect to the movement. Besides, for both left- and right-hand MI, strong and relatively static directed influences are observed from the center electrode Cz to both hemispheres (i.e. Cz→C3, Cz→C4). However, weak interactions are found in the opposite directions (i.e. C3→Cz, C4→Cz). The estimates by the TV-DTF in Fig. 3(c)-(d) demonstrate
Left-Hand MI C3