Am I using LazyArrays correctly? It appears slower and allocates more memory than regular Arrays.
ericqu opened this issue · 1 comments
I am trying to reduce memory allocation for matrix multiplication in my project.
I have a simple (non-lazy) implementation and naively thought I could reduce the allocation impact using the LazyArrays magic wand.
It does not seem to work out of the box. Most probably I did not use it properly; if so, please let me know how to fix my code. Alternatively, I may have been overly optimistic and this is not a good use case for LazyArrays, in which case I would appreciate your feedback. It may also be because I run the code on an M1, where it somehow cannot benefit from BLAS/LAPACK.
The version is:
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.2.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, cyclone)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS =
The code to reproduce the issue is:
using LinearAlgebra, BenchmarkTools
using LazyArrays
MXs = [1.0 1.0 1.0
       1.0 2.0 1.0
       1.0 3.0 1.0
       1.0 1.0 -1.0
       1.0 2.0 -1.0
       1.0 3.0 -1.0]
MY1 = [1.0
       3.0
       3.0
       2.0
       2.0
       1.0]
function normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1
    xy = [x y]
    xytxy = xy' * xy
    return xytxy[1:p, end-back:end]
end
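function l_normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1
    xy = ApplyArray(hcat, x, y)             # lazy horizontal concatenation
    xytxy = xy' * xy                        # result is discarded: the name is rebound on the next line
    xytxy = ApplyArray(*, xy', xy)          # lazy product
    m_xytxy = Matrix{Float64}(undef, 4, 4)  # size hard-coded for this example (p + ypp = 4)
    copyto!(m_xytxy, xytxy)                 # materialize the lazy product into the buffer
    return m_xytxy[1:p, end-back:end]
end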
normeq(MXs, MY1)
l_normeq(MXs, MY1)
@btime normeq(MXs, MY1)
@btime l_normeq(MXs, MY1)
I obtain the following results:
184.997 ns (3 allocations: 528 bytes)
11.875 μs (705 allocations: 15.97 KiB)
I should add that I tried with larger arrays (the small arrays are only for simplicity), but the results go in the same direction.
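For comparison, here is a minimal sketch of one way to cut allocations without LazyArrays. It relies on the block structure of the product: the rows `1:p` and last `ypp` columns of `[x y]' * [x y]` are exactly `x' * y`, so the concatenation and the full product can be skipped. The helper name `normeq!` and the preallocated buffer `out` are assumptions for illustration, not part of the original code. Note also that for a vector `y` this returns a length-`p` vector rather than the `p×1` matrix returned by `normeq`.

```julia
using LinearAlgebra

# Hypothetical helper (not in the original post): writes x' * y into a
# caller-supplied buffer using the in-place product from LinearAlgebra.
function normeq!(out::AbstractVecOrMat, x::AbstractMatrix, y::AbstractVecOrMat)
    mul!(out, x', y)   # out = x' * y, no temporary for the result
    return out
end

x = [1.0 1.0  1.0
     1.0 2.0  1.0
     1.0 3.0  1.0
     1.0 1.0 -1.0
     1.0 2.0 -1.0
     1.0 3.0 -1.0]
y = [1.0, 3.0, 3.0, 2.0, 2.0, 1.0]

out = Vector{Float64}(undef, size(x, 2))  # allocated once, reusable across calls
normeq!(out, x, y)                        # out is now [12.0, 25.0, 2.0]
```

When timing such a function with BenchmarkTools, interpolating global variables (for example `@btime normeq!($out, $x, $y)`) keeps the overhead of accessing non-constant globals out of the measurement.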
Sorry, it is a mistake.