fastdtw function is slower than in DynamicTimeWarping.jl

Question

fastdtw function is slower than in DynamicTimeWarping.jl

maetshju opened this issue 7 years ago · 8 comments

Hi there!

It makes me really happy that this package exists because I've been getting frustrated with the speed of using dynamic time warping on large datasets in Python. While I was comparing this package to the older DynamicTimeWarp.jl package, I found that the fastdtw function is slower here than it is in the older package. I'm a little confused about why this might be, because my understanding is that this package is based on DynamicTimeWarp.jl.

When I time the fastdtw functions using the @btime macro from BenchmarkTools.jl, I get approximately 14 ms for the function from DynamicTimeWarp.jl, and approximately 76 ms for the function from TimeWarp.jl.

This is the script, assuming you have two sounds "sound1.wav" and "sound2.wav":

using TimeWarp, DynamicTimeWarp, BenchmarkTools, Compat, WAV

s1, sf1 = wavread("sound1.wav", format="double")
s2, sf2 = wavread("sound2.wav", format="double")

s1 = reshape(s1, size(s1)[1])
s2 = reshape(s2, size(s2)[1])

@btime DynamicTimeWarp.fastdtw($s1, $s2, 1) # 14 ms
@btime TimeWarp.fastdtw($s1, $s2, 1) # 76 ms

I was just wondering if this is expected behavior, or if there might be a simple fix for this.

Answer 1 · 2017-12-23T16:50:47.000Z

Does this performance difference get much worse/larger for longer time series?

Answer 2 · 2017-12-23T18:15:10.000Z

I ran it on some longer audio files (around 1000 times longer), and it looks like the ratio of the difference is about constant.

Package	`@btime` output
DynamicTimeWarp.jl	17.744 s (288721458 allocations: 22.40 GiB)
TimeWarp.jl	82.779 s (2828198599 allocations: 52.71 GiB)

Answer 3 · 2018-11-08T23:28:34.000Z

FYI, if you still need a fast implementation in Python, check out **pydtw which uses the UCR Suite optimisations.

Answer 4 · 2020-04-23T07:46:33.000Z

The struct definition

struct Sequence{N,T} <: AbstractArray{T,1}
    val::AbstractArray{T,N}
end

has an abstractly typed field, meaning that all access to the vectors will incur the cost of dynamic dispatch. Parameterizing the struct on the vector type would solve this and likely speed up all algorithms that take a Sequence as input.

Answer 5 · 2020-04-23T21:07:52.000Z

Changing the struct definition to

struct Sequence{N,T} <: AbstractArray{T,1}
    val::Array{T,N}
end

gave me a significant speedup, which I believe is a way to implement @baggepinnen's suggestion. With the following script, I got 7.518 ms runtime for the modified Sequence definition and a 152.005 ms runtime for the original definition.

using TimeWarp, BenchmarkTools

# formula for pure tone: A*sin(2π * f * t)
# where A is amplitude,
# f is the frequency of the tone measured in Hz
# t is the time of the tone measured in s

const sf = 44100 # sampling frequency of 44.1 kHz

# synthesize a 440 Hz pure tone sampled at 44.1 kHz
# duration of 0.4 s (400 ms)
# 1 * sin(2π * 440 Hz * t)
s1_t = range(0.4, length=ceil(Int64, (.4 * sf)))
s1 = sin.(2π * 440 .* s1_t)

# synthesize a 600 Hz pure tone sampled at 44.1 kHz
# duratoin 0f 0.8 s (800 ms)
# 1 * sin(2π * 600 Hz * t)
s2_t = range(0.8, length=ceil(Int64, (.8 * sf)))
s2 = sin.(2π * 600 .* s2_t)

@btime fastdtw($s1, $s2, 1)

Answer 6 · 2020-04-23T21:52:59.000Z

I am experimenting with some additional performance improvements over at https://github.com/baggepinnen/DynamicAxisWarping.jl
I got rid of the Sequence completely. It was only used to get a custom indexing behaviour, so I replaced it with a custom indexing method instead. Having to wrap and unwrap vectors is a bit of a hassle and this got rid of all wrapper methods

Answer 7 · 2020-04-23T21:54:07.000Z

On a somewhat unrelated note, have you seen the recent paper
https://arxiv.org/pdf/2003.11246.pdf
"FastDTW is approximate and Generally Slower than the Algorithm it Approximates"

Answer 8 · 2020-04-23T22:17:47.000Z

I will have to start watching your fork. Interesting paper there too; matches some behavior I had been casually observing with speech recognition data.