Implement a lazy identity operator

Question

Implement a lazy identity operator

david-pl opened this issue 2 years ago · 11 comments

The identityoperator function returns a sparse array. As the action of the identity matrix on states and operators is known, however, we could make this lazy. This may not appear to be a large change, but it could potentially have a large impact on the implementation of LazyTensor and its mul! methods.

I'm not set on actually doing is, it's just an idea that I thought worth discussing.

We could, for example, use FillArrays to implement this. They export an Eye function which lazily represents a matrix that is zero except on the diagonal with corresponding getindex methods and such. We could just use that as .data field when constructing identityoperator.

using FillArrays
id = Eye{ComplexF64}(4,3)
id isa AbstractMatrix  # true

Alternatively, if we don't want to add this dependency, we could also implement it ourselves (it's not super complicated), which would also give us more control over e.g. the mul! implementations and such. I'd prefer not duplicating work, however.

The big advantage I see in implementing this is that it might greatly simplify the implementation of LazyTensor. If identity operators are efficient and lazy, then we could remove the entire concept of indices from it. This will also simplify the underlying mul! methods.

The disadvantage I is that this may have a negative impact on LazyTensor performance. Without further dispatches on the Eye matrix above, for example, we'd be doing a ton of unnecessary multiplication operations. Ultimately the question whether this impacts performance negatively can only be answered if we do the implementation first, so I'm not sure if we want to invest in this.

One option I see for now is to just change the implementation of identityoperator to return a lazy Eye (or add a new method such as lazy_identityoperator) and leave the LazyTensor implementation untouched.

Answer 1 · 2023-04-15T12:43:07.000Z

Actually, it's not super hard to get a first comparison for performance. If you just throw "lazy" identities in a LazyTensor you can see an approximate impact without any further optimizations. Consider the following:

using QuantumOpticsBase
using FillArrays
using BenchmarkTools

dims = [2,3,5,4]
bs = GenericBasis.(dims)
bcomp = ⊗(bs...)

idx = 3
a = randoperator(bs[idx])

function lazy_identity(b::Basis)
    n = length(b)
    return Operator(b,b,Eye{ComplexF64}(n))
end

A = LazyTensor(bcomp, (idx,), (a,))
op_list = [i == idx ? a : lazy_identity(b) for (i,b) in enumerate(bs)]
B = LazyTensor(bcomp, Tuple(1:length(dims)), (op_list...,))


ψ = randstate(bcomp)
dψ1 = copy(ψ)
dψ2 = copy(ψ)

# test
QuantumOpticsBase.mul!(dψ1, A, ψ)
QuantumOpticsBase.mul!(dψ2, B, ψ)
dψ1 == dψ2  # true


bench1 = @benchmark QuantumOpticsBase.mul!($dψ1, $A, $ψ)
bench2 = @benchmark QuantumOpticsBase.mul!($dψ2, $B, $ψ)

That shows a slow down of a factor of 2 - 3. Note that this will get worse with a larger number of Hilbert spaces. Still, it might be worth looking into optimizations here.

Answer 2 · 2023-04-15T15:12:03.000Z

Could one give the identityoperator its own type? Then LazyTensor could dispatch on this and choose not to store them as part of its operators field?

Answer 3 · 2023-04-17T16:53:29.000Z

You can just define

const IdentityOperator{BL,BR} = Operator{BL,BR,<:FillArrays.Eye}

and dispatch on that. And yes, then filtering them out on construction would keep things as performant as they are now. However, my idea would be to not do that, but rather skip over them in the LazyTensor mul! method. It might be possible to look into improving that method since you don't have to bother with the indices of LazyTensor anymore. But I'm not sure, I'd have to look into the mul! method again, which isn't the simplest piece of code.

Answer 4 · 2023-04-17T17:54:53.000Z

I like this approach! Also seems like FillArrays is a pretty lightweight dependency. I think I can add this special case to LazyTensor mul!() without much trouble.

Answer 5 · 2023-04-17T18:07:58.000Z

Big question for me is whether we (1) provide lazy_identityoperator() or (2) have identityoperator() return a FillArray.

Option 1 is clearly safer as it does not break any code that relies on identityoperator() returning a sparse matrix operator, but it does introduce new API that we might deprecate later if we decide to move to option 2.

Most code that uses identityoperator() should not actually break with option 2, as operations like a * identityoperator() should still produce exactly the same results as before, so I think I lean slightly towards option 2. Thoughts?

Answer 6 · 2023-04-17T18:28:01.000Z

Big question for me is whether we (1) provide lazy_identityoperator() or (2) have identityoperator() return a FillArray.

Option 1 is clearly safer as it does not break any code that relies on identityoperator() returning a sparse matrix operator, but it does introduce new API that we might deprecate later if we decide to move to option 2.

Most code that uses identityoperator() should not actually break with option 2, as operations like a * identityoperator() should still produce exactly the same results as before, so I think I lean slightly towards option 2. Thoughts?

I would argue for option 2 since this makes the contagious LazyTensor in #86 more efficient and natural to combine with the standard syntax (of course only if it's trivial to make sure nothing is broken by option 2).

Answer 7 · 2023-04-17T18:35:42.000Z

Yeah, I agree that providing minimal surprise when tensoring LazyTensors with identities is nice. As for breakage, I think external code is really the only issue. The fix is pretty trivial if it does cause problems - just do sparse(identityoperator()). @david-pl do you think this, plus @mabuni1998's other changes, are worth a minor version bump?

Answer 8 · 2023-04-17T20:51:26.000Z

In favor of option 2 as well. Maybe we can start keeping a human-focused changelog for this type of stuff: https://keepachangelog.com/en/1.0.0/

I started having one midway through some of my projects (just marking the first tracked version as "changes not logged before this date"), and it has been a valuable reference

Answer 9 · 2023-04-18T17:57:24.000Z

@amilsted yeah, I'd say this together with #86 qualifies for a breaking change, so option 2 should be fine. In any case, the Eye should behave mostly like a matrix, so things shouldn't be too bad I hope. Would be great to do it all in one minor version bump! Do you maybe have time to implement it?

Answer 10 · 2023-04-18T18:10:10.000Z

Yeah, I'll give it a shot.

Answer 11 · 2023-04-21T21:29:36.000Z

Implementation in #94 gets speed back up to where it should be.