JuliaArrays/LazyArrays.jl

Am I using LazyArrays correctly? It appears slower and allocates more memory than regular Arrays.

ericqu opened this issue · 1 comments

I am trying to reduce memory allocation for matrix multiplication in my project.
I have a simple (non-lazy) implementation and naively thought I could reduce the allocations by waving the LazyArrays magic wand.
It does not work out of the box. Most probably I did not use it properly; if so, please let me know how to fix my code. Alternatively, I may have been overly optimistic and this is not a good use case for LazyArrays; I would appreciate your feedback. It might also be because I run the code on an M1, which somehow cannot benefit from BLAS/LAPACK.

The version is:

Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.2.0)
  CPU: Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, cyclone)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 

and the code to reproduce the issue is:

using LinearAlgebra, BenchmarkTools
using LazyArrays

MXs = [1.0 1.0 1.0
    1.0 2.0 1.0
    1.0 3.0 1.0
    1.0 1.0 -1.0
    1.0 2.0 -1.0
    1.0 3.0 -1.0]

MY1 = [1.0
    3.0
    3.0
    2.0
    2.0
    1.0]

function normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1

    xy = [x y]
    xytxy = xy' * xy

    return xytxy[1:p, end-back:end]
end

function l_normeq(x, y)
    nn, p = size(x)
    if ndims(y) == 2
        n, ypp = size(y)
    else
        ypp = 1
    end
    back = ypp - 1

    xy = ApplyArray(hcat, x, y)

    xytxy = xy' * xy
    xytxy = ApplyArray(*, xy', xy)
    m_xytxy = Matrix{Float64}(undef, 4, 4)
    copyto!(m_xytxy, xytxy)

    return m_xytxy[1:p, end-back:end]
end

normeq(MXs, MY1)
l_normeq(MXs, MY1)

@btime normeq(MXs, MY1)
@btime l_normeq(MXs, MY1)

I obtain the following results:

  184.997 ns (3 allocations: 528 bytes)
  11.875 μs (705 allocations: 15.97 KiB)

I should add that I tried with larger arrays (the small arrays are only for simplicity), but the results go in the same direction.
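
For comparison, here is a minimal non-lazy sketch that avoids building `[x y]` at all. It relies on the block that `normeq` returns, the top-right `p × ypp` block of `[x y]' * [x y]`, being equal to `x' * y`. The names `normeq_direct` and `normeq_direct!` are hypothetical helpers introduced only for this illustration; `mul!` is the in-place product from the `LinearAlgebra` standard library. This is a sketch of one possible approach, not a statement about how LazyArrays is meant to be used.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the original code): returns the same block
# as `normeq`, computed directly as x' * y instead of slicing [x y]' * [x y].
function normeq_direct(x::AbstractMatrix, y::AbstractVecOrMat)
    # Treat a vector y as an n×1 matrix so the result is always p×ypp.
    ymat = reshape(y, size(y, 1), :)
    return x' * ymat
end

# Hypothetical in-place variant: writes x' * y into a caller-supplied buffer,
# so the p×ypp result matrix is not allocated on every call.
function normeq_direct!(out::AbstractMatrix, x::AbstractMatrix, y::AbstractVecOrMat)
    ymat = reshape(y, size(y, 1), :)
    mul!(out, x', ymat)
    return out
end

MXs = [1.0 1.0 1.0
    1.0 2.0 1.0
    1.0 3.0 1.0
    1.0 1.0 -1.0
    1.0 2.0 -1.0
    1.0 3.0 -1.0]

MY1 = [1.0, 3.0, 3.0, 2.0, 2.0, 1.0]

# Preallocate the output once and reuse it across calls.
out = Matrix{Float64}(undef, size(MXs, 2), 1)
normeq_direct!(out, MXs, MY1)

# Cross-check against the original formulation.
xy = [MXs MY1]
reference = (xy' * xy)[1:size(MXs, 2), end:end]
```

If this is timed with BenchmarkTools, interpolating the globals (for example `@btime normeq_direct!($out, $MXs, $MY1)`) keeps global-variable access out of the measurement. The same applies to the two `@btime` calls above, so the reported allocation counts may include some of that overhead.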

Sorry, it is a mistake.