Online Time Series Prediction with Missing Data

Oren Anava, Elad Hazan, Assaf Zeevi

July 9, 2015

Basic Concepts

Time Series: a time series is a sequence of samples, measured at successive times, which are assumed to be spaced at uniform time intervals.

Examples: finance (daily returns or closing rates); weather (monthly average temperature).

Time series modeling: AR(p) (autoregressive with lag p):

X_t = \sum_{i=1}^{p} \alpha_i X_{t-i} + \varepsilon_t
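The AR(p) model above can be sketched numerically. A minimal example, assuming a hand-picked AR(2) process and a plain least-squares fit (this is generic illustration, not the algorithm of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(2) process: X_t = a1*X_{t-1} + a2*X_{t-2} + eps_t
a_true = np.array([0.6, 0.3])   # assumed coefficients, chosen for stationarity
T, p = 2000, 2
X = np.zeros(T)
for t in range(p, T):
    X[t] = a_true @ X[t - p:t][::-1] + 0.1 * rng.standard_normal()

# Recover the coefficients by least squares on the lagged design matrix:
# column k holds X_{t-k} for t = p, ..., T-1.
A = np.column_stack([X[p - k:T - k] for k in range(1, p + 1)])
alpha, *_ = np.linalg.lstsq(A, X[p:], rcond=None)
print(alpha)  # should be close to a_true
```

With fully observed data this is routine; the rest of the talk is about what breaks when some X_t are missing.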

Motivation

In real life, time series often contain missing data.

Statistical approaches suffer from strict assumptions:
1. A generative model, usually Gaussian (many works)
2. Stochastic omission of the data (many works)
3. Structural missing data (several works)

Main problem: misspecification...

Our goal: relax (or even remove) the statistical assumptions on the data.

Problem Definition

Our setting:
1. At time t, the adversary generates X_t (arbitrarily) and chooses whether to reveal it.
2. Then, the online player predicts \tilde{X}_t and suffers \ell_t(X_t, \tilde{X}_t) \, 1_{\{X_t\}}.

The performance is measured using the regret:

R_T = \sum_{t=1}^{T} \ell_t(X_t, \tilde{X}_t) \, 1_{\{X_t\}} - \min_{\alpha \in \mathcal{K}} \sum_{t=1}^{T} \ell_t(X_t, \tilde{X}_t^{AR}(\alpha)) \, 1_{\{X_t\}}

where naturally we would like \tilde{X}_t^{AR}(\alpha) to denote the best AR predictor, that is,

\tilde{X}_t^{AR}(\alpha) = \sum_{k=1}^{p} \alpha_k X_{t-k}

The problem: AR prediction is not defined for missing data!

Problem Definition

The problem: AR prediction is not defined for missing data!

Solution: define a new family of predictors,

\tilde{X}_t^{AR}(\alpha) = \sum_{k=1}^{p} \alpha_k X_{t-k} \, 1_{\{X_{t-k}\}} + \sum_{k=1}^{p} \alpha_k \tilde{X}_{t-k}^{AR}(\alpha) \, (1 - 1_{\{X_{t-k}\}})

and compete against the best predictor of finite order (d < \infty) in it.

For example, AR(1) with d = 2 generates the predictions:

\tilde{X}_t^{AR}(\alpha) = \alpha_1 X_{t-1}                                                      if X_{t-1} exists
\tilde{X}_t^{AR}(\alpha) = \alpha_1 \tilde{X}_{t-1}^{AR}(\alpha) = \alpha_1 (\alpha_1 X_{t-2}) = \alpha_1^2 X_{t-2}   if only X_{t-2} exists
\tilde{X}_t^{AR}(\alpha) = 0                                                                     if both do not exist

The problem is now well-defined!
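The recursive family of predictors above (missing lags imputed by the model's own prediction, truncated at depth d) can be sketched in a few lines. This is an illustrative implementation under my reading of the definition; the function name and the depth handling are my own:

```python
def ar_predict(alpha, history, depth):
    """Imputed AR prediction: history[-k] is X_{t-k}, with None marking a
    missing value. Missing lags are replaced recursively by the model's own
    prediction; recursion is truncated at `depth` (the finite order d)."""
    if depth == 0:
        return 0.0
    pred = 0.0
    for k in range(1, len(alpha) + 1):
        x = history[-k] if k <= len(history) else None
        if x is not None:
            pred += alpha[k - 1] * x                  # observed lag: use it directly
        else:
            # missing lag: plug in the predictor's own (shifted) prediction
            pred += alpha[k - 1] * ar_predict(alpha, history[:-k], depth - 1)
    return pred

# AR(1) with d = 2, matching the slide's three cases (alpha_1 = 0.5 assumed):
alpha = [0.5]
print(ar_predict(alpha, [2.0], 2))         # X_{t-1} = 2 observed: 0.5 * 2
print(ar_predict(alpha, [3.0, None], 2))   # only X_{t-2} = 3: 0.5^2 * 3
print(ar_predict(alpha, [None, None], 2))  # both missing: 0
```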

Main Challenge

The prediction is no longer linear in \alpha, and thus the loss is non-convex! Proper learning is impossible here. So, what can we use instead?

Idea: use non-proper learning! Generate the predictions from a different model than the one we try to compete with. In our context, we define

\tilde{X}_t(w) = w^{\top} \Phi(X_{t-d}, \ldots, X_{t-1})

for w \in \mathbb{R}^{2^d}, where \Phi(X_{t-d}, \ldots, X_{t-1}) is

(0, \underbrace{X_{t-1} 1_{\{\text{good path}\}}}_{2^{1-1} \text{ times}}, \ldots, \underbrace{X_{t-d} 1_{\{\text{good path}\}}, \ldots, X_{t-d} 1_{\{\text{good path}\}}}_{2^{d-1} \text{ times}})
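One concrete way to realize the feature map \Phi is to index coordinates by "paths": for each lag k there are 2^{k-1} compositions of k (ordered sums of positive steps), and a path is good when every intermediate point it visits is missing while the endpoint X_{t-k} is observed. A sketch of that construction, under my reading of the slides:

```python
def compositions(k):
    """All ordered ways to write k as a sum of positive integers
    (there are 2^(k-1) of them)."""
    if k == 0:
        return [[]]
    result = []
    for first in range(1, k + 1):
        for rest in compositions(k - first):
            result.append([first] + rest)
    return result

def phi(window):
    """Feature map Phi over the last d values (None = missing); window[-k]
    is X_{t-k}. One coordinate per (lag k, path); the coordinate equals
    X_{t-k} if the path is good (all intermediate points missing and
    X_{t-k} observed), else 0."""
    d = len(window)
    features = [0.0]  # leading zero coordinate, as on the slide
    for k in range(1, d + 1):
        for comp in compositions(k):
            offsets, pos = [], 0
            for step in comp[:-1]:      # intermediate points the path visits
                pos += step
                offsets.append(pos)
            x = window[-k]
            good = x is not None and all(window[-o] is None for o in offsets)
            features.append(x if good else 0.0)
    return features

# d = 2: coordinates (0, lag-1 path [1], lag-2 paths [1,1] and [2]).
# With X_{t-1} missing and X_{t-2} = 3, both lag-2 paths are good:
print(phi([3.0, None]))  # [0.0, 0.0, 3.0, 3.0]
```

Note the dimension is 1 + \sum_{k=1}^{d} 2^{k-1} = 2^d, matching w \in \mathbb{R}^{2^d}.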

Main Idea

Let's get back to the previous example (AR(1) with d = 2):

\tilde{X}_t^{AR}(\alpha) = \alpha_1 X_{t-1}      if X_{t-1} exists
\tilde{X}_t^{AR}(\alpha) = \alpha_1^2 X_{t-2}    if only X_{t-2} exists
\tilde{X}_t^{AR}(\alpha) = 0                     if both do not exist

⇓

\tilde{X}_t(w) = w^{\top} \Phi(X_{t-d}, \ldots, X_{t-1})

Now w \in \mathbb{R}^{2^2} replaces (0, \alpha_1, \alpha_1^2, \alpha_2), and \Phi makes sure that only the corresponding observations are non-zero. How? Indicators of good paths.

Main Idea

What are the advantages of w?
1. The prediction is linear in w, and thus learning is possible.
2. Prediction with w is richer than prediction with \alpha (no constraints), and thus we can get

\sum_{t=1}^{T} \ell_t(X_t, \tilde{X}_t(w)) \, 1_{\{X_t\}} - \min_{\alpha \in \mathcal{K}} \sum_{t=1}^{T} \ell_t(X_t, \tilde{X}_t^{AR}(\alpha)) \, 1_{\{X_t\}} \le c \sqrt{T}

using standard Follow The Regularized Leader.

On the other hand...
1. w \in \mathbb{R}^{2^d}, and thus c \approx 2^d for standard losses.
2. Time and space complexities are also exponential in d.

Can we solve these issues?
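The online update on w is standard once the prediction is linear. As an illustrative stand-in for FTRL, here is plain online gradient descent with squared loss on random placeholder features (the features, comparator, and step size are assumptions for the sketch, not the actual \Phi or the talk's tuning):

```python
import numpy as np

rng = np.random.default_rng(1)

d = 3
dim = 2 ** d                      # dimension of w, exponential in d
w = np.zeros(dim)                 # online player's weights
eta = 0.01                        # step size (assumed)
w_star = 0.1 * np.ones(dim)       # hypothetical best comparator

losses = []
for t in range(1000):
    feats = rng.standard_normal(dim)   # stand-in for Phi(X_{t-d}, ..., X_{t-1})
    x_t = w_star @ feats               # stand-in for the revealed X_t
    pred = w @ feats                   # linear prediction, as in X_t(w)
    losses.append((x_t - pred) ** 2)   # squared loss this round
    w -= eta * 2 * (pred - x_t) * feats  # gradient step on the round's loss

print(sum(losses[:100]) / 100, sum(losses[-100:]) / 100)
```

The loss per round shrinks as w approaches the comparator; the catch, as the slide says, is that both the dimension and the constant c scale like 2^d.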

Central Observation

At time t, the update step of FTRL requires the inner products \langle \Phi(X_{s-d}, \ldots, X_{s-1}), \Phi(X_{t-d}, \ldots, X_{t-1}) \rangle for all s < t.

So basically we need to compute

\langle (0, \underbrace{X_{s-1} 1_{\{\text{good path}\}}}_{2^{1-1} \text{ times}}, \ldots, \underbrace{X_{s-d} 1_{\{\text{good path}\}}, \ldots, X_{s-d} 1_{\{\text{good path}\}}}_{2^{d-1} \text{ times}}),
  (0, \underbrace{X_{t-1} 1_{\{\text{good path}\}}}_{2^{1-1} \text{ times}}, \ldots, \underbrace{X_{t-d} 1_{\{\text{good path}\}}, \ldots, X_{t-d} 1_{\{\text{good path}\}}}_{2^{d-1} \text{ times}}) \rangle

which takes the form

\sum_{k=1}^{d} n(k) X_{s-k} X_{t-k}

where n(k) is the number of common good paths of length k.

Central Observation

But how many good paths of length k are common to \tilde{X}_t and \tilde{X}_s? Assume k = 4:

[Figure: the two length-4 windows ending at times t and s, with observed and missing entries marked; exactly one position is missing in both windows.]

So, in total, there are 2 common good paths, which is 2^{(number of common missing observations)}. In our example, the number of common missing observations is 1.
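This counting rule turns each inner product into a kernel evaluation over the raw windows in O(d) time, with no 2^d-dimensional vectors. A sketch assuming the rule above (n(k) = 2 to the number of positions missing in both windows among lags 1, ..., k-1):

```python
def kernel(win_s, win_t):
    """<Phi(window at s), Phi(window at t)> computed as
    sum_k n(k) * X_{s-k} * X_{t-k}. win[-k] is the value at lag k,
    with None marking a missing observation."""
    d = len(win_s)
    total, common_missing = 0.0, 0
    for k in range(1, d + 1):
        xs, xt = win_s[-k], win_t[-k]
        if xs is not None and xt is not None:
            # n(k) = 2^(common missing positions strictly inside the paths)
            total += (2 ** common_missing) * xs * xt
        if xs is None and xt is None:
            common_missing += 1   # doubles n(k') for every longer lag k'
    return total

# d = 2: lag 1 missing in both windows, lag 2 observed in both,
# so n(2) = 2 and the inner product is 2 * 3 * 5 = 30.
print(kernel([3.0, None], [5.0, None]))
```

This matches the explicit feature map: with X_{t-1} missing, both lag-2 coordinates of \Phi are active, so each product X_{s-2} X_{t-2} is counted twice.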

Experimental Results

[Figure: six panels comparing Yule-Walker, ARLSimpute, EM (Kalman Filter), OGDimpute, and Our Algorithm. Top row: MSE vs. time (up to T = 2000) with 10% missing data, under three regimes: sanity check, AR mixture, and heteroscedasticity. Bottom row: MSE at T = 2000 vs. percentage of missing data (0%, 10%, 20%) for the same three regimes.]

All methods perform roughly the same for well-specified data.
The online approach is robust to non-standard data.
The online algorithms are faster and easier to implement.

Conclusion

What did we see?
1. A new approach for time series prediction with missing data
2. Theoretical guarantees with respect to the best AR predictor
3. Good performance in practice

What next?
1. Can we improve the 2^d factor in the regret bound?
2. Can we generalize to multivariate time series?

Questions?