FFT performance issue
Lightup1 opened this issue · 1 comments
Lightup1 commented
I noticed this while running a performance test:
using QuantumOptics, BenchmarkTools, LinearAlgebra, MKL
using FFTW
# FFTW.set_provider!("mkl")
# FFTW.set_provider!("fftw")
FFTW.set_num_threads(6)
##
b1 = PositionBasis(-1, 1, 2^14)
b2 = MomentumBasis(b1)
##
Tpx_test = QuantumOptics.transform(b2, b1)
ppsi = Ket(b2,rand(ComplexF64,length(b2)))
psi = Ket(b1,rand(ComplexF64,length(b1)))
@benchmark QuantumOpticsBase.mul!($ppsi, $Tpx_test, $psi)
##
p1=plan_fft(rand(ComplexF64,2^14))
data1=rand(ComplexF64,2^14)
data2=rand(ComplexF64,2^14)
@benchmark mul!($data2,$p1,$data1)
1 thread:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 81.200 μs … 1.018 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 87.300 μs ┊ GC (median): 0.00%
Time (mean ± σ): 92.110 μs ± 28.820 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▇█▅█▆▆▅▅▄▃▃▂▂▁▁ ▁ ▂
█████████████████████▇▇█▇██▇█▇██▇▇█▇▇▇▇▇▇▇▆▅▆▆▃▆▆▅▄▄▂▅▂▃▃▃▄ █
81.2 μs Histogram: log(frequency) by time 159 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
6 threads:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 57.700 μs … 769.900 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 75.000 μs ┊ GC (median): 0.00%
Time (mean ± σ): 74.873 μs ± 10.677 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▁ ▂▄ ▃▆█▁
▁▁▂▂▁▁▁▁▂▁▁▁▁▂▇█████▆▄▄▄▆█████▆▅▄▅▅▇▆▅▃▃▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
57.7 μs Histogram: frequency by time 96.6 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
The odd thing is that the FFT on a plain vector (result below) is much faster than the transform applied to a Ket (results above):
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 23.600 μs … 800.300 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 35.600 μs ┊ GC (median): 0.00%
Time (mean ± σ): 36.250 μs ± 8.544 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█▂ █▂▆▄
▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▆██▇████▇▇▄▃▃▃▄▃▅▅▄▄▃▃▂▂▂▁▂▁▁▁▁▁▁ ▂
23.6 μs Histogram: frequency by time 45.2 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
Lightup1 commented
I checked the code, and I think this may be caused by the scaling operation.
I'll close the issue.
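For anyone who wants to check this hypothesis, here is a minimal sketch. It assumes the Ket transform multiplies the data element-wise by fixed factors before and after the FFT. The names `scaled_fft!`, `mul_before`, `mul_after` and `tmp`, and the phase values used, are hypothetical placeholders, not taken from the QuantumOptics source.

```julia
using FFTW, LinearAlgebra

N = 2^14
x   = rand(ComplexF64, N)
y   = similar(x)
tmp = similar(x)          # scratch buffer so the input is not overwritten
p   = plan_fft(x)

# Hypothetical per-element factors standing in for whatever scaling
# the basis transform applies (assumption, not the library's values).
mul_before = cis.(range(0, stop=2π, length=N))
mul_after  = cis.(range(0, stop=2π, length=N)) ./ sqrt(N)

# FFT wrapped in two extra element-wise passes over the data.
function scaled_fft!(y, p, x, mul_before, mul_after, tmp)
    tmp .= mul_before .* x   # pre-scaling pass
    mul!(y, p, tmp)          # the FFT itself
    y .*= mul_after          # post-scaling pass
    return y
end

scaled_fft!(y, p, x, mul_before, mul_after, tmp)
```

Benchmarking `scaled_fft!($y, $p, $x, $mul_before, $mul_after, $tmp)` against the bare `mul!($y, $p, $x)` with `@benchmark` should show how much of the gap the two extra passes account for. The scaling passes run in plain Julia broadcasts, so `FFTW.set_num_threads` would not be expected to speed them up.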