Anticipatory DTW for Efficient Similarity Search in Time Series Databases

Ira Assent (Aalborg University, Denmark)
Marc Wichterich, Ralph Krieger, Hardy Kremer, Thomas Seidl (RWTH Aachen University, Germany)

VLDB 2009, Lyon, France



Overview

1 Introduction
2 Dynamic Time Warping
3 Anticipatory pruning
4 Experiments
5 Conclusion


Time series similarity search

Time series
  Sequence of time-related values
  Stock data, sensor data, EEG measurements, climate data, ...

Figure: Time series from an EEG data set (x-axis: point in time i, 0 to 512; y-axis: value t_i [mV])

Similarity search
  Find time series with similar patterns over time


Euclidean Distance and Dynamic Time Warping (DTW)

Most widely used distance functions: Euclidean distance and Dynamic Time Warping
Dynamic Time Warping allows scaling and stretching for better alignment, but is computationally costly

Figure: Euclidean distance (left) and Dynamic Time Warping (right)


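DTW definition

k-band DTW

\[
DTW([s_1,\dots,s_n],[t_1,\dots,t_m]) = dist_{band}(s_n,t_m) + \min
\begin{cases}
DTW([s_1,\dots,s_{n-1}],[t_1,\dots,t_{m-1}]) \\
DTW([s_1,\dots,s_n],[t_1,\dots,t_{m-1}]) \\
DTW([s_1,\dots,s_{n-1}],[t_1,\dots,t_m])
\end{cases}
\]

with

\[
dist_{band}(s_i,t_j) =
\begin{cases}
dist(s_i,t_j) & \text{if } \left| i - \left\lceil \frac{j \cdot n}{m} \right\rceil \right| \le k \\
\infty & \text{else}
\end{cases}
\]

\[
DTW(\emptyset,\emptyset) = 0, \quad DTW(x,\emptyset) = \infty, \quad DTW(\emptyset,y) = \infty
\]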
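The recursive definition can be evaluated bottom-up in a table. The following is a minimal Python sketch, not the authors' implementation: the function name `dtw_band` and the default point distance (absolute difference) are assumptions made for illustration, since the slide leaves `dist` open.

```python
import math

def dtw_band(s, t, k, dist=lambda a, b: abs(a - b)):
    """k-band DTW following the recursive definition, filled bottom-up."""
    n, m = len(s), len(t)
    # c[i][j] = DTW([s_1..s_i], [t_1..t_j]); index 0 stands for the empty series.
    c = [[math.inf] * (m + 1) for _ in range(n + 1)]
    c[0][0] = 0  # DTW(empty, empty) = 0; all other border cells stay infinite
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            center = -(-(j * n) // m)  # integer form of ceil(j * n / m)
            if abs(i - center) > k:
                continue  # dist_band is infinite outside the band
            c[i][j] = dist(s[i - 1], t[j - 1]) + min(
                c[i - 1][j - 1], c[i][j - 1], c[i - 1][j]
            )
    return c[n][m]
```

For equal-length series the band reduces to |i - j| <= k. The table costs O(n·m) memory here for readability; keeping only two columns would suffice for the distance alone.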


DTW Computation

\[ c_{i,j} = dist(s_i,t_j) + \min\{c_{i-1,j-1},\, c_{i,j-1},\, c_{i-1,j}\} \]

Figure: Cumulative warping matrix (rows s_i, columns t_j)


Existing DTW algorithms

DTW is computationally expensive
Many approaches use a multistep filter-and-refine architecture
If the filter lower bounds DTW ⇒ lossless
Different filters have been proposed and achieve substantial speed-ups

Figure: Multistep filter-and-refine architecture (query → index (filter) → candidates → refinement against the database (exact) → result)
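A range query in this architecture can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `range_query`, `first_last_bound` and the unconstrained `dtw` helper are hypothetical names, and the first/last-point bound is only a simple stand-in for the filters cited in this deck. It is a valid lower bound because every warping path contains the cells (1,1) and (n,m).

```python
import math

def dtw(s, t):
    """Unconstrained DTW with absolute difference as point distance."""
    n, m = len(s), len(t)
    c = [[math.inf] * (m + 1) for _ in range(n + 1)]
    c[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c[i][j] = abs(s[i - 1] - t[j - 1]) + min(
                c[i - 1][j - 1], c[i][j - 1], c[i - 1][j]
            )
    return c[n][m]

def first_last_bound(s, t):
    """Stand-in filter: cost of the two cells every warping path must contain."""
    if len(s) == 1 and len(t) == 1:
        return abs(s[0] - t[0])  # first and last cell coincide
    return abs(s[0] - t[0]) + abs(s[-1] - t[-1])

def range_query(query, database, eps, lower_bound=first_last_bound, exact=dtw):
    """Return the indices of all series x with exact(query, x) <= eps."""
    # Filter step: a lower bound above eps proves the exact distance exceeds eps.
    candidates = [i for i, x in enumerate(database) if lower_bound(query, x) <= eps]
    # Refinement step: only surviving candidates pay for the exact distance.
    return [i for i in candidates if exact(query, database[i]) <= eps]
```

Because the filter never overestimates the exact distance, no true result is discarded, which is the sense in which the slide calls the approach lossless.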


DTW properties for speed-up

DTW is incremental. For any cumulative DTW matrix C = [c_{i,j}], the column minima are monotonically non-decreasing:

\[ \min_{i=1,\dots,n}\{c_{i,x}\} \le \min_{i=1,\dots,n}\{c_{i,y}\} \quad \text{for } x < y \]

→ Existing approach: early stopping / early abandoning
  Compute the cumulative DTW matrix column by column; after each filled column (band), check the threshold for pruning
  Not as tight as it could be ⇒ new anticipatory pruning

Figure: Cumulative warping matrix (rows s_i, columns t_j) with c_{i,j} = dist(s_i,t_j) + min{c_{i-1,j-1}, c_{i,j-1}, c_{i-1,j}}
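Early stopping as described above can be sketched in a few lines. The sketch assumes equal-length series (so the band is |i - j| <= k) and absolute difference as point distance; `dtw_early_stop` is a hypothetical name, not taken from the paper.

```python
import math

def dtw_early_stop(s, t, k, eps):
    """Banded DTW; returns None as soon as a column minimum exceeds eps."""
    n = len(s)
    assert n == len(t), "sketch assumes equal lengths"
    prev = [math.inf] * (n + 1)  # column j-1 of the cumulative matrix
    prev[0] = 0                  # c[0][0] = 0
    for j in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        for i in range(max(1, j - k), min(n, j + k) + 1):
            cur[i] = abs(s[i - 1] - t[j - 1]) + min(prev[i - 1], prev[i], cur[i - 1])
        # Column minima never decrease, so exceeding eps rules out DTW <= eps.
        if min(cur) > eps:
            return None
        prev = cur
    return prev[n] if prev[n] <= eps else None
```

Only two columns are kept in memory, which matches the column-by-column check on the slide.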


Anticipatory pruning

Figure: Anticipatory pruning in multistep filter-and-refine (query q and data pass through the filter; the candidates are refined by DTW with anticipatory pruning steps AP_1, AP_2, ..., AP_n; result)

Multistep with anticipatory pruning
  Benefit from work already done in the filter step
  Low overhead, substantial speed-up


General idea

Anticipatory pruning:
  As in early stopping, compute DTW incrementally
  Additionally: re-use filter information to anticipate full DTW (no additional computation necessary!)
  Requires: filter for remainder of time series

Figure: DTW on the prefixes 1..5, 1..6, 1..7, ... combined with the filter estimate on the remainders 6..15, 7..15, 8..15, ...

We characterize a class of filters to construct such anticipation:
  Filter property: piecewise
  DTW property: reversible


Piecewise filter

Piecewise DTW lower bound. A piecewise lower bounding filter for the DTW distance is a set f = {f_0, ..., f_m} with the following property:

\[ j = 0: \quad f_j(s,t) = 0 \]

\[ \forall j > 0: \quad f_j(s,t) \le \min_{(i,j) \in band_j} DTW([s_1,\dots,s_i],[t_1,\dots,t_j]) \]

Piecewise is the property of the filter that complements the incremental nature of DTW.


DTW is reversible

DTW is reversible. For any two time series [s_1, ..., s_n] and [t_1, ..., t_m], their DTW distance is the same as for the reversed time series:

\[ DTW([s_1,\dots,s_n],[t_1,\dots,t_m]) = DTW([s_n,\dots,s_1],[t_m,\dots,t_1]) \]

Reversible is the DTW property that allows alteration of the time series direction between filter and DTW.
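Reversibility is easy to check empirically. The sketch below is an illustration under assumptions, not part of the original slides: it uses unconstrained DTW (no band) with absolute difference as point distance and small random integer series, so the comparison is exact. The names `dtw` and `reversal_holds` are hypothetical.

```python
import math
import random

def dtw(s, t):
    """Unconstrained DTW with absolute difference as point distance."""
    n, m = len(s), len(t)
    c = [[math.inf] * (m + 1) for _ in range(n + 1)]
    c[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c[i][j] = abs(s[i - 1] - t[j - 1]) + min(
                c[i - 1][j - 1], c[i][j - 1], c[i - 1][j]
            )
    return c[n][m]

def reversal_holds(trials=200, seed=0):
    """Compare DTW(s, t) with DTW on both series reversed, for random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = [rng.randint(0, 9) for _ in range(rng.randint(1, 8))]
        t = [rng.randint(0, 9) for _ in range(rng.randint(1, 8))]
        if dtw(s, t) != dtw(s[::-1], t[::-1]):
            return False
    return True
```

The equality holds because reading a warping path backwards gives a warping path of the reversed series with the same cost.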


Anticipatory pruning

Anticipatory Pruning Distance. Given two time series s and t of length n and m, a cumulative distance matrix C = [c_{i,j}], and a piecewise lower bounding filter f for reversed time series, the j-th step of anticipatory pruning is

\[ AP_j(s,t) := \min_{i=1,\dots,n}\{c_{i,j}\} + f_{m-j}(s^{\leftarrow}, t^{\leftarrow}) \]

min{}: early stopping part
f: anticipatory part on inverted time series

Figure: Distance over step j (0 to 10) for DTW and AP = min{} + f, with the pruning threshold
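The anticipatory pruning step can be sketched as follows. Several choices here are assumptions made for illustration and not taken from the slides: equal-length univariate series, absolute difference as point distance, and an LB_Keogh-style filter whose per-column terms are summed over the remaining columns to play the role of f_{m-j} on the reversed series. The names `keogh_terms` and `anticipatory_dtw` are hypothetical.

```python
import math

def keogh_terms(s, t, k):
    """Per-column distance of t[x] to the band envelope of s (LB_Keogh-style)."""
    terms = []
    for x in range(len(t)):
        window = s[max(0, x - k): x + k + 1]
        lo, hi = min(window), max(window)
        terms.append(max(t[x] - hi, lo - t[x], 0))
    return terms

def anticipatory_dtw(s, t, k, eps):
    """Return (distance or None if pruned, [AP_1, ..., AP_j] computed so far)."""
    n = len(s)
    assert n == len(t), "sketch assumes equal lengths"
    terms = keogh_terms(s, t, k)
    # rest[j] lower-bounds the cost of columns j+1..n (anticipatory part).
    rest = [0] * (n + 1)
    for x in range(n - 1, -1, -1):
        rest[x] = rest[x + 1] + terms[x]
    prev = [math.inf] * (n + 1)
    prev[0] = 0
    ap = []
    for j in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        for i in range(max(1, j - k), min(n, j + k) + 1):
            cur[i] = abs(s[i - 1] - t[j - 1]) + min(prev[i - 1], prev[i], cur[i - 1])
        ap.append(min(cur) + rest[j])  # AP_j = early stopping part + anticipatory part
        if ap[-1] > eps:
            return None, ap
        prev = cur
    return (prev[n] if prev[n] <= eps else None), ap
```

With k = 1 and the series q and t from the example slide of this deck, the sketch yields the AP values 20, 20, 20, 21, 23, 25, 27, 28, 28, 28 for steps 1 to 10, matching the AP row shown there. The filter terms would normally come from the preceding filter step instead of being recomputed.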


Example

q = [5, 3, 4, 5, 3, 4, 8, 3, 9, 10]
t = [3, 8, 2, 0, 2, 6, 6, 4, 0, 2]

Cumulative matrix entries within the band (rows i = 10 down to 1):
  i = 10:  33 35
  i = 9:   23 27 28
  i = 8:   19 18 21
  i = 7:   16 17 21
  i = 6:   14 15 17
  i = 5:   12 13 16
  i = 4:    9 13 13
  i = 3:    6  8 10
  i = 2:    2  7  6
  i = 1:    2  5

step j:   0   1   2   3   4   5   6   7   8   9  10
min{}:    0   2   5   6  10  13  15  17  18  21  28
f:       18  18  15  14  11  10  10  10  10   7   0
AP:      18  20  20  20  21  23  25  27  28  28  28

Figure: Anticipatory pruning example (DTW and AP = min{} + f)
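The rows of the example can be checked with a few lines of Python, which also show how much earlier anticipatory pruning fires than early stopping. The `min{}` and `f` rows are copied from the slide; the threshold `eps = 19` and the helper name `first_step_above` are assumptions chosen for illustration.

```python
col_min = [0, 2, 5, 6, 10, 13, 15, 17, 18, 21, 28]   # min{} row, steps 0..10
f_rest = [18, 18, 15, 14, 11, 10, 10, 10, 10, 7, 0]  # f row, steps 0..10

# AP = min{} + f, step by step
ap = [a + b for a, b in zip(col_min, f_rest)]

def first_step_above(values, eps):
    """First step j whose value exceeds eps, or None if none does."""
    return next((j for j, v in enumerate(values) if v > eps), None)

eps = 19  # hypothetical pruning threshold
es_step = first_step_above(col_min, eps)  # early stopping uses min{} only
ap_step = first_step_above(ap, eps)       # anticipatory pruning uses min{} + f
```

Under this threshold, early stopping has to fill nine columns before it can discard the candidate, whereas anticipatory pruning discards it after the first column.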


Anticipatory pruning is lossless

Theorem: Anticipatory Pruning lower bounds DTW. Anticipatory pruning between two time series s, t with respect to a bandwidth k and a lower bounding filter f lower bounds the DTW:

\[ AP_j(s,t) \le DTW(s,t) \quad \forall j \in \{1,\dots,n\} \]

Proof sketch
  Anticipatory pruning: series of partial DTW paths plus lower bound estimate of the remainder from the previous filter step
  (1) For each possible step r, 1 ≤ r ≤ n, column minima lower bound the true path
  (2) DTW is reversible
  (3) Combine (1), (2): lower bound of DTW

Lower bounding means: lossless pruning, i.e. speed-up + no loss of accuracy
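The proof sketch can be spelled out as a chain of inequalities. The following is a hedged reconstruction and not the paper's proof: it assumes equal-length series (so the band is symmetric under reversal), and the symbols P (an optimal warping path), (i^*, j) (the last cell of P in column j), i' (the row in which P enters column j+1) and C^{\leftarrow} (the cumulative matrix of the reversed series) are introduced here for illustration only.

```latex
% Assumption: n = m, so reversing both series maps the band onto itself.
% For j < m, the remainder of P after (i^*, j) starts in cell (i', j+1) with
% i' \in \{i^*, i^*+1\}; read backwards, it is a warping path from (1,1) to
% (n-i'+1, m-j) in the cumulative matrix C^{\leftarrow} of (s^{\leftarrow}, t^{\leftarrow}).
\begin{align*}
DTW(s,t)
  &= \mathrm{cost}\bigl(P \text{ up to } (i^*, j)\bigr)
   + \mathrm{cost}\bigl(P \text{ after } (i^*, j)\bigr) \\
  &\ge c_{i^*,j} + c^{\leftarrow}_{\,n-i'+1,\; m-j}
     && \text{(cumulative entries are minimal path costs; reversibility)} \\
  &\ge \min_{i=1,\dots,n}\{c_{i,j}\} + f_{m-j}(s^{\leftarrow}, t^{\leftarrow})
     && \text{(column minimum; piecewise lower bound)} \\
  &= AP_j(s,t)
\end{align*}
% For j = m the remainder is empty and f_0 = 0, so AP_m is the last column minimum.
```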


Existing piecewise lower bounds

Linearization
  LB_Keogh: Euclidean distance to envelopes (upper, lower bound of segments) [Keogh, VLDB 2002; Zhu/Shasha, SIGMOD 2003]

Corner boundaries
  LB_Hybrid: piecewise corner-like shapes in the warping matrix through which every warping path has to pass [Zhou/Wong, ICDE 2007]

Path approximation
  FTW (Fast search method for Dynamic Time Warping): go from coarser DTW (less costly) to finer as needed [Sakurai/Yoshikawa/Faloutsos, PODS 2005]

→ All of these are piecewise lower bounds as required for anticipatory pruning (details in paper)
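To illustrate what "piecewise" means for an envelope-based bound, the following Python sketch returns the running sums f_0, ..., f_m of an LB_Keogh-style filter. It is a simplified illustration under assumptions, not the formulation from the cited papers: univariate series of equal length, an envelope over a window of k positions on either side, and absolute differences in place of the Euclidean distance. The name `lb_keogh_piecewise` is hypothetical.

```python
def lb_keogh_piecewise(s, t, k):
    """Return [f_0, ..., f_m]: running sums of the distance of t[x] to the k-envelope of s."""
    assert len(s) == len(t), "sketch assumes equal lengths"
    f = [0]  # f_0 = 0 by definition of a piecewise filter
    for x in range(len(t)):
        window = s[max(0, x - k): x + k + 1]
        lower, upper = min(window), max(window)
        # Any band cell in column x pairs t[x] with a value inside [lower, upper].
        f.append(f[-1] + max(t[x] - upper, lower - t[x], 0))
    return f
```

Each partial sum f_j bounds the cost of any warping path through the first j columns, because such a path visits every one of those columns at least once; the last entry is the bound for the full series.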


Experimental setup

Synthetic and real world data sets
  SignLanguage: 1,400 multivariate (11 attributes) time series of sign language finger tracking data [1], length 64 to 512
  TRECVid: 650 to 2,000 benchmark data [2]
  NEWSVid: 2,000 to 8,000 TV news recorded at 30 fps (20 attributes), length 64 to 2048
  Random walk RW1/RW2: zero normalized time series of length 512 and of cardinality 10,000 (1 to 50 attributes)

\[ \text{RW1:} \quad t_{(i+1)j} = t_{ij} + N(0, 1) \]

\[ \text{RW2:} \quad t_{(i+1)j} = t_{ij} + N(t_{ij} - t_{(i-1)j},\, 1) \]

Figure: (a) time series generated via RW1, (b) time series generated via RW2 (x-axis: point in time i, 0 to 512; y-axis: value t_i)

[1] www.cse.unsw.edu.au/~waleed/tml/data/
[2] Smeaton/Over/Kraaij, Evaluation campaigns and TRECVid, MIR 2006
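The two random walk models can be generated as in the following Python sketch for a single attribute. Details the slide does not specify are assumptions: the start value 0, an initial step of 0 for RW2, and zero normalization implemented as subtracting the mean. The function name `random_walk` is hypothetical.

```python
import random

def random_walk(length, kind="RW1", seed=None):
    """Generate one zero-normalized random walk of the given length."""
    if kind not in ("RW1", "RW2"):
        raise ValueError("kind must be 'RW1' or 'RW2'")
    rng = random.Random(seed)
    values = [0.0]  # assumed start value
    step = 0.0      # previous increment t_i - t_(i-1), used by RW2 only
    while len(values) < length:
        # RW1 draws steps from N(0, 1); RW2 centers the step on the previous one.
        mean = step if kind == "RW2" else 0.0
        step = rng.gauss(mean, 1.0)
        values.append(values[-1] + step)
    avg = sum(values) / length
    return [v - avg for v in values]  # zero normalization (assumed: subtract mean)
```

A multivariate series with several attributes could be built by calling the function once per attribute with different seeds.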


Experiments

Figure: Relative improvement (#calc.) for varying number of attributes on RW2 (x-axis: number of attributes, 1 to 50; y-axis: relative improvement [percent]; series: AP_FTW, ES_FTW, AP_LBKeogh, ES_LBKeogh, AP_LBHybrid, ES_LBHybrid)


Experiments 2

Figure: Efficiency improvement (#calc.) for varying DTW bandwidths on NEWSVid (x-axis: k, 10 to 150; y-axis: relative improvement [percent]; series: AP_LBKeogh, ES_LBKeogh, AP_FTW, ES_FTW, AP_LBHybrid, ES_LBHybrid)


Experiments 3

Figure: (log. scale) Absolute improvement (average query time), reduction, RW2. Three panels (x-axis: reduced length, 256 down to 8; y-axis: average query time [s], log. scale): LBKeogh / ES_LBKeogh / AP_LBKeogh, FTW / ES_FTW / AP_FTW, LBHybrid / ES_LBHybrid / AP_LBHybrid


Conclusion

Anticipatory pruning
  Speed up DTW (Dynamic Time Warping)
  Widely used distance function for time series similarity search

Our novel anticipatory pruning makes best use of
  A family of multistep filter-and-refine approaches
  Compute an estimated overall DTW distance from already available filter information: series of lower bounds of the DTW

Figure: Anticipatory pruning in multistep filter-and-refine

Experiments demonstrate substantially reduced runtime
AP can be flexibly combined with existing and future DTW lower bounds
AP is orthogonal to speed-up via indexing etc.

Thank you for your attention. Questions?