cmcaine/EvilArrays.jl

Not conforming to the AbstractArray interface?

Closed this issue · 6 comments

I think AbstractArrays are required to accept Int indices (source).

Thanks for your interest!

This is intended: for any Int index it will throw KeyError, which is the same behaviour as if you just don't know the right keys. This supports the intended use case of forcing users to use eachindex or similar methods.

This function works on any AbstractVector{Int} but fails on EvilArrays:

function my_safe_sum(v::AbstractVector{Int})
    out = 0
    for i in firstindex(v):lastindex(v)
        out += v[i]
    end
    out
end
julia> my_safe_sum(EvilArray([1,2,3]))
ERROR: MethodError: no method matching isless(::EvilArrays.EvilArrayIndex{Int64}, ::EvilArrays.EvilArrayIndex{Int64})
Closest candidates are:
  isless(::Any, ::Missing) at missing.jl:88
  isless(::Missing, ::Any) at missing.jl:87
Stacktrace:
 [1] <(x::EvilArrays.EvilArrayIndex{Int64}, y::EvilArrays.EvilArrayIndex{Int64})
   @ Base ./operators.jl:343
 [2] <=(x::EvilArrays.EvilArrayIndex{Int64}, y::EvilArrays.EvilArrayIndex{Int64})
   @ Base ./operators.jl:392
 [3] >=(x::EvilArrays.EvilArrayIndex{Int64}, y::EvilArrays.EvilArrayIndex{Int64})
   @ Base ./operators.jl:416
 [4] (::Colon)(start::EvilArrays.EvilArrayIndex{Int64}, stop::EvilArrays.EvilArrayIndex{Int64})
   @ Base ./range.jl:7
 [5] my_safe_sum(v::EvilArray{Int64, 1, Vector{Int64}})
   @ Main ./REPL[4]:3
 [6] top-level scope
   @ REPL[9]:1

This is likely because axes does not return a tuple of AbstractUnitRange{<:Integer} as required.

This is a problem because tests using this package would report a false positive failure for this valid function.

A similar problem occurs in this slightly more justified use of firstidnex and lastindex rather than eachindex:

function sum_all_but_last_three(v::AbstractVector{Int})
    out = 0
    for i in firstindex(v):lastindex(v)-3
        out += v[i]
    end
    out
end
julia> sum_all_but_last_three(EvilArray([1,2,3,4,5]))
ERROR: MethodError: no method matching -(::EvilArrays.EvilArrayIndex{Int64}, ::Int64)
Closest candidates are:
  -(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:86
  -(::LinearAlgebra.UniformScaling, ::Number) at /Applications/Julia-1.9.0-DEV-7ad0e3deae.app/Contents/Resources/julia/share/julia/stdlib/v1.9/LinearAlgebra/src/uniformscaling.jl:146
  -(::Base.TwicePrecision, ::Number) at twiceprecision.jl:304
  ...
Stacktrace:
 [1] sum_all_but_last_three(v::EvilArray{Int64, 1, Vector{Int64}})
   @ Main ./REPL[11]:3
 [2] top-level scope
   @ REPL[15]:1

Yeah, you're right. Thanks for opening the issue.

This is related to #2: Does the abstract array interface require contiguous indices once you've found one valid index? Base arrays and OffsetArrays conform to this, but it was unclear to me (and apparently some others) if this is actually required.

At the time I wrote this package I thought contiguous indices were not required, but looking at axes again does make it clear that some kind of unit range of Ints is required and there's no way of interpreting something non-contiguous as a unit range.

I think actions from this issue are:

  1. Implement this package in terms of OffsetArrays
  2. Open a PR on julia to update the docs for the AbstractArray interface

Does the abstract array interface require contiguous indices once you've found one valid index?

axes must "Return a tuple of AbstractUnitRange{<:Integer} of valid indices". An AbstractUnitRange has to be contiguous, but if we are reading that in the spirit of Evil Arrays, this tuple need not be all the valid indices of the AbstractArray. For example, an AbstractVector may support indexing at indices 1, 2, 3, 5 and 7, and have axes defined by (rand([2:3, Base.OneTo(2), -4:-5, 7:7]),), which will return a tuple of AbstractUnitRange{<:Integer} of valid indices.

I guess the question is how evil to be? We could provide an object that apparently changes shape and contents as you access it, but how useful is that for the intended purpose of the package?

I think an array whose size and shape can change whenever an async task is allowed to run might be useful for testing async functions; or for testing multiprocess or multithread code a function that returns an array and a lock and will throw errors if you try to do anything to it without locking it first.

Feel free to open PRs for either of those use cases. Personally, I won't do that kind of thing unless people actually start using this package :)

We could provide an object that apparently changes shape and contents as you access it, but how useful is that for the intended purpose of the package?

Not very useful.

I think an array whose size and shape can change whenever an async task is allowed to run might be useful for testing async functions;

Maybe, but I'm skeptical it would often be useful

or for testing multiprocess or multithread code a function that returns an array and a lock and will throw errors if you try to do anything to it without locking it first.

That sounds useful!

Feel free to open PRs for either of those use cases. Personally, I won't do that kind of thing unless people actually start using this package :)

I'm afraid I don't have the bandwidth right now, but segfaults in methods that use @inbounds on AbstractArrays when only expecting Arrays is a large eco-system wide issue, so it would be great if this package and/or related tools make it easier to test that a function works on any instance that conforms to the interface. Thanks for your work!