Online Time Series Prediction with Missing Data

Oren Anava, Elad Hazan, Assaf Zeevi

July 9, 2015

Oren Anava | Time Series with Missing Data | July 9, 2015 | 1 / 22
Basic Concepts

Time Series: a time series is a sequence of samples, measured at successive times, which are assumed to be spaced at uniform time intervals.

Examples:
- Finance: daily returns (or closing rates)
- Weather: monthly average temperature

Time series modeling: AR(p) (autoregressive with lag p):

    X_t = Σ_{i=1}^p α_i · X_{t−i} + ε_t
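The AR(p) model above is easy to simulate and to use for one-step prediction. A minimal sketch (the function names, coefficient values, and Gaussian noise choice are illustrative, not from the talk):

```python
import numpy as np

def simulate_ar(alpha, T, sigma=0.1, seed=0):
    """Simulate X_t = sum_{i=1}^p alpha[i-1] * X_{t-i} + eps_t with Gaussian noise."""
    rng = np.random.default_rng(seed)
    p = len(alpha)
    X = np.zeros(T + p)  # p zeros as warm-up history
    for t in range(p, T + p):
        # X[t-p:t][::-1] is (X_{t-1}, X_{t-2}, ..., X_{t-p})
        X[t] = np.dot(alpha, X[t - p:t][::-1]) + sigma * rng.standard_normal()
    return X[p:]

def ar_predict(alpha, history):
    """One-step AR(p) prediction from the p most recent samples."""
    p = len(alpha)
    return float(np.dot(alpha, history[-p:][::-1]))

X = simulate_ar([0.6, 0.3], T=500)
x_hat = ar_predict([0.6, 0.3], X)  # prediction of the next sample
```

Note that this fully observed, generative setting is exactly what the rest of the talk moves away from.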
Motivation

In real life, time series often contain missing data.

Statistical approaches suffer from strict assumptions:
1. Generative model, usually Gaussian (many works)
2. Stochastic omission of the data (many works)
3. Structural missing data (several works)

Main problem: misspecification...

Our goal: relax (or even remove) statistical assumptions on the data.
Problem Definition

Our setting:
1. At time t, the adversary generates X_t (arbitrarily) and chooses whether to reveal it.
2. Then, the online player predicts X̃_t and suffers ℓ_t(X_t, X̃_t) · 1{X_t}.

The performance is measured using the regret:

    R_T = Σ_{t=1}^T ℓ_t(X_t, X̃_t) · 1{X_t} − min_{α∈K} Σ_{t=1}^T ℓ_t(X_t, X̃_t^AR(α)) · 1{X_t}

where naturally we would like X̃_t^AR(α) to denote the best AR predictor, that is,

    X̃_t^AR(α) = Σ_{k=1}^p α_k · X_{t−k}

The problem: AR prediction is not defined for missing data!
Problem Definition

The problem: AR prediction is not defined for missing data!

Solution: define a new family of predictors:

    X̃_t^AR(α) = Σ_{k=1}^p α_k · X_{t−k} · 1{X_{t−k}} + Σ_{k=1}^p α_k · X̃_{t−k}^AR(α) · (1 − 1{X_{t−k}})

and compete against the best predictor of finite order (d < ∞) in it.

For example, AR(1) with d = 2 generates the predictions:

    X̃_t^AR(α) = α_1 X_{t−1}                                      if X_{t−1} exists
    X̃_t^AR(α) = α_1 X̃_{t−1}^AR(α) = α_1 (α_1 X_{t−2}) = α_1² X_{t−2}   if only X_{t−2} exists
    X̃_t^AR(α) = 0                                                if both do not exist

The problem now is well-defined!
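The recursion above can be sketched directly. A minimal illustration (function and variable names are mine, not the paper's): `observed[t]` flags whether X_t was revealed, and the depth parameter d truncates the recursion, so the prediction is 0 when no observation is reachable within d steps:

```python
def ar_predict_missing(alpha, X, observed, t, d):
    """Recursive AR prediction with missing data: observed lags contribute
    alpha_k * X_{t-k}; missing lags contribute alpha_k times their own
    recursive prediction; beyond depth d the prediction is 0."""
    if d == 0:
        return 0.0
    pred = 0.0
    for k in range(1, len(alpha) + 1):
        if t - k < 0:
            continue
        if observed[t - k]:
            pred += alpha[k - 1] * X[t - k]
        else:
            pred += alpha[k - 1] * ar_predict_missing(alpha, X, observed, t - k, d - 1)
    return pred

# AR(1) with d = 2: X_{t-2} observed, X_{t-1} missing (its stored value is irrelevant)
X = [2.0, 0.0]
observed = [True, False]
pred = ar_predict_missing([0.5], X, observed, t=2, d=2)  # alpha_1^2 * X_{t-2} = 0.25 * 2.0
```

This reproduces all three cases of the AR(1), d = 2 example: the observed lag, the imputed lag, and the all-missing case (which returns 0).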
Main Challenge

The prediction is no longer linear in α, and thus the loss is non-convex! Proper learning is impossible here. So, what can we use instead?

Idea: use non-proper learning! Generate the predictions from a different model than the one we try to compete with. In our context, we define

    X̃_t(w) = w^⊤ Φ(X_{t−d}, …, X_{t−1})

for w ∈ ℝ^{2^d}, where Φ(X_{t−d}, …, X_{t−1}) is the vector

    (0, X_{t−1} · 1{good path}, …, X_{t−d} · 1{good path}, …, X_{t−d} · 1{good path})

with 2^{k−1} entries for lag k (one per good path of length k): 2^{1−1} = 1 entry for X_{t−1}, up to 2^{d−1} entries for X_{t−d}.
Main Idea

Let's get back to the previous example (AR(1) with d = 2)...

    X̃_t^AR(α) = α_1 X_{t−1}     if X_{t−1} exists
    X̃_t^AR(α) = α_1² X_{t−2}    if only X_{t−2} exists
    X̃_t^AR(α) = 0               if both do not exist

        ⇓

    X̃_t(w) = w^⊤ Φ(X_{t−2}, X_{t−1})

Now w ∈ ℝ^{2²} replaces (0, α_1, α_1², α_2), and Φ makes sure that only the corresponding observations are non-zero. How? Indicators of good paths.
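One concrete way to read the feature map, sketched below under my own interpretation (not code from the paper): a path of length k is a chain of AR recursions reaching X_{t−k}, i.e. an ordered composition of k, and it is "good" when every intermediate lag it visits is missing while the endpoint X_{t−k} is observed. There are 2^{k−1} compositions of k, so the total dimension is 1 + Σ_k 2^{k−1} = 2^d. The coordinate ordering here is arbitrary:

```python
import numpy as np

def compositions(k):
    """All ordered compositions of k into positive parts; there are 2^(k-1)."""
    if k == 0:
        yield ()
        return
    for first in range(1, k + 1):
        for rest in compositions(k - first):
            yield (first,) + rest

def phi(X, observed, t, d):
    """Feature map with one coordinate per path of length k <= d, plus a
    leading 0: a coordinate equals X_{t-k} iff its path is 'good' (all
    intermediate lags missing, endpoint observed), else 0."""
    feats = [0.0]
    for k in range(1, d + 1):
        for comp in compositions(k):
            inner = np.cumsum(comp)[:-1]  # lags visited before the endpoint t-k
            good = observed[t - k] and all(not observed[t - j] for j in inner)
            feats.append(X[t - k] if good else 0.0)
    return np.array(feats)

# AR(1)/d=2 example: X_{t-1} missing, X_{t-2} observed (here t = 2)
v = phi([2.0, 0.0], [True, False], t=2, d=2)  # dimension 2^2 = 4
```

With w holding the path coefficients (products of α along each path), w^⊤Φ recovers the recursive AR prediction; e.g. for α_1 = 0.5 the path (1,1) has weight α_1² = 0.25.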
Main Idea

What are the advantages of w?
1. The prediction is linear in w, and thus learning is possible.
2. Prediction with w is richer than prediction with α (no constraints), and thus we can get

    Σ_{t=1}^T ℓ_t(X_t, X̃_t(w)) · 1{X_t} − min_{α∈K} Σ_{t=1}^T ℓ_t(X_t, X̃_t^AR(α)) · 1{X_t} ≤ c·√T

using the standard Follow The Regularized Leader.

On the other hand...
1. w ∈ ℝ^{2^d}, and thus c ≈ 2^d for standard losses.
2. Time and space complexities are also exponential in d.

Can we solve these issues?
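For concreteness, a generic Follow The Regularized Leader sketch with an L2 regularizer and linearized squared losses, where the iterate has the closed form w_t = −η · (sum of past gradients). This is a standard textbook version with an illustrative step size, not the paper's exact algorithm:

```python
import numpy as np

def ftrl_l2(features, targets, eta=0.05):
    """Linearized FTRL with L2 regularizer:
    w_t = argmin_w sum_{s<t} <g_s, w> + ||w||^2 / (2*eta) = -eta * sum_{s<t} g_s."""
    g_sum = np.zeros(len(features[0]))
    preds = []
    for phi_t, x_t in zip(features, targets):
        w_t = -eta * g_sum
        pred = float(w_t @ np.asarray(phi_t))
        preds.append(pred)
        g_sum = g_sum + 2.0 * (pred - x_t) * np.asarray(phi_t)  # grad of squared loss
    return preds

# Toy check: constant feature, constant target -> predictions approach the target
preds = ftrl_l2([[1.0]] * 200, [1.0] * 200)
```

Run over the 2^d-dimensional features Φ, this update is exactly what becomes too expensive to do naively, which motivates the next slide.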
Central Observation

At time point t, the update step of FTRL requires the inner products

    ⟨Φ(X_{s−d}, …, X_{s−1}), Φ(X_{t−d}, …, X_{t−1})⟩    for all s < t.

So basically we need to compute the inner product of two vectors of the form (0, X_{s−1} · 1{good path}, …, X_{s−d} · 1{good path}) and (0, X_{t−1} · 1{good path}, …, X_{t−d} · 1{good path}), which takes the form

    Σ_{k=1}^d n(k) · X_{s−k} · X_{t−k}

where n(k) is the number of common good paths of length k.
Central Observation

But how many good paths of length k are common to X̃_t and X̃_s? Assume k = 4:

[Figure: the good paths of length k = 4 shared by X̃_t and X̃_s.]

So, in total there are 2 common good paths, which is 2^{#common missing observations}. In our example, the number of common missing observations is 1.
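This observation yields an O(d)-time evaluation of the 2^d-dimensional inner products, kernel-style. A sketch under my reading of the counting argument: a length-k path is good for both times iff X_{s−k} and X_{t−k} are both observed and every intermediate lag it visits is missing at both times, so n(k) = 2^{#lags j < k missing at both s−j and t−j}:

```python
def phi_inner(X, observed, s, t, d):
    """<Phi_s, Phi_t> = sum_{k=1}^d n(k) * X_{s-k} * X_{t-k}, with
    n(k) = 2^(# lags j < k missing at both s-j and t-j), provided the
    endpoints X_{s-k} and X_{t-k} are both observed (else n(k) = 0)."""
    total = 0.0
    for k in range(1, d + 1):
        if not (observed[s - k] and observed[t - k]):
            continue
        m = sum(1 for j in range(1, k)
                if not observed[s - j] and not observed[t - j])
        total += (2 ** m) * X[s - k] * X[t - k]
    return total

# Fully observed case, d = 2: n(1) = n(2) = 1
X = [1.0, 2.0, 3.0, 4.0, 5.0]
obs = [True] * 5
val = phi_inner(X, obs, s=3, t=4, d=2)  # 3*4 + 2*3 = 18
```

The exponent m counts the intermediate positions a common good path may or may not visit; each subset of those positions determines exactly one path, hence the power of 2.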
Experimental Results

[Figures: MSE vs. time for 10% missing data, and MSE@2000 vs. percentage of missing data (0%, 10%, 20%), each under three settings: sanity check, AR mixture, and heteroscedasticity. Methods compared: Yule-Walker, ARLSimpute, EM (Kalman Filter), OGDimpute, Our Algorithm.]

- All methods perform roughly the same for well-specified data
- The online approach is robust to non-standard data
- The online algorithms are faster and easier to implement
Conclusion

What did we see?
- New approach for time series prediction with missing data
- Theoretical guarantees with respect to the best AR predictor
- Good performance in practice

What next?
- Can we improve the 2^d factor in the regret bound?
- Can we generalize to multivariate time series?
Questions???