Why does a subtype of AbstractArray result in imprecise matrix operations in Julia?

I'm currently working on a subtype of AbstractArray in Julia that stores a vector alongside the array itself. You can think of the vector as column "names", except that its element type is restricted to subtypes of AbstractFloat. It is therefore similar to the NamedArrays.jl package, but restricted to labelling the columns (in the case of matrices) with floats.

The struct that I've created so far (following the manual's guide on subtyping AbstractArray) is defined as follows:

struct FooArray{T, N, AT, VT} <: AbstractArray{T, N}
    data::AT  # the wrapped array
    vec::VT   # the vector of column "names"
    function FooArray(data::AbstractArray{T1, N}, vec::AbstractVector{T2}) where {T1 <: AbstractFloat, T2 <: AbstractFloat, N}
        length(vec) == size(data, 2) || error("Inconsistent dimensions")
        new{T1, N, typeof(data), typeof(vec)}(data, vec)
    end
end
# Forward the AbstractArray interface to the wrapped array:
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, i::Int) = getindex(fooarr.data, i)
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, I::Vararg{Int, 2}) = getindex(fooarr.data, I...)
@inline Base.@propagate_inbounds Base.size(fooarr::FooArray) = size(fooarr.data)
Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()

This already seems to be enough to create objects of type FooArray and do some simple math with them. However, I've observed that some essential operations, such as matrix-vector multiplication, give slightly different results. For example, the following should return a vector of exact 0.0s, but:

R = rand(100, 3)
S = FooArray(R, collect(1.0:3.0))
y = rand(100)
S'y - R'y
3-element Vector{Float64}:
 -7.105427357601002e-15
 0.0
 3.552713678800501e-15

While the differences are very small, they can quickly add up over many different calculations, leading to significant errors.

Where do these differences come from?

A look at the calculations via the @code_llvm macro reveals that apparently different matmul methods from LinearAlgebra are used (along with other minor differences):

@code_llvm S'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:111 within `*'
   ...
@code_llvm R'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:106 within `*'
   ...
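
For a quicker view of the dispatch, without reading LLVM IR, the @which macro from InteractiveUtils reports the method that gets selected; the exact file and line it prints will depend on the Julia version:

using InteractiveUtils  # loaded automatically in the REPL; provides @which

@which S'y  # should point at the generic matmul method in LinearAlgebra
@which R'y  # should point at the BLAS-backed method for strided arrays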

Redefining the adjoint and * functions for our FooArray type produces the expected, correct result:

import Base: *

# adjoint must keep the wrapper so that the custom * method below is dispatched:
Base.adjoint(a::FooArray) = FooArray(a.data', zeros(size(a.data, 1)))
*(a::FooArray{T, 2, AT, VT} where {AT, VT}, b::AbstractVector{S}) where {T, S} = a.data * b
S'y - R'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

However, this solution (which is also what NamedArrays does here) would require defining and maintaining all sorts of functions, not just the standard ones in Base, adding more and more dependencies just to avoid this small error margin.

Is there any simpler way to get rid of this issue without redefining every operation and possibly many other functions from other packages?

I'm using Julia version 1.6.1 on Windows 64-bit system.

Catnip asked 9/11, 2021 at 13:53. Comments (5):
Isn't this just 0.30000000000000004.com? – Biennial
I can't run the posted code. At first I just thought it was a missed correction at fooarr.mat, but after I changed it to fooarr.data, attempting S'y - R'y causes a StackOverflowError due to getindex(... i::Int64) recursion. – Hydra
@Biennial The intent for the type seems to be wrapping the matrix, as the redefinitions forward from arr to arr.data. The first approach tries to do that element by element through getindex, so it doesn't seem like there should be a discrepancy. – Hydra
Side note: a type name starting with a lowercase letter is a bit irritating, stylistically. – Shaeffer
@Hydra Indeed, there were some errors in the code. I've edited the original post; the code should now be runnable. – Catnip

Yes, the implementation of matrix multiplication varies depending upon your array type. The builtin Array will use BLAS, whereas your custom FooArray falls back to a generic implementation, and due to the non-associativity of floating-point arithmetic, these different approaches will indeed yield slightly different values. Note that they may differ from the ground truth, even for the builtin Array!

julia> using Random; Random.seed!(0); R = rand(100, 3); y = rand(100);

julia> R'y - Float64.(big.(R)'big.(y))
3-element Vector{Float64}:
 -3.552713678800501e-15
  0.0
  0.0
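
To see that non-associativity in isolation, here is a minimal sketch (the particular values are chosen only for illustration):

julia> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
false

julia> ((0.1 + 0.2) + 0.3) - (0.1 + (0.2 + 0.3))
1.1102230246251565e-16

A summation algorithm that merely groups the same additions differently can therefore change the last few bits of the result.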

You may be able to implement your custom array as a DenseArray, which will ensure that it uses the same (BLAS-enabled) codepath. You just need to implement a few more methods, most importantly strides and unsafe_convert:

julia> struct FooArray{T, N} <: DenseArray{T, N}
          data::Array{T, N}
       end
       Base.getindex(fooarr::FooArray, i::Int) = fooarr.data[i]
       Base.size(fooarr::FooArray) = size(fooarr.data)
       Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()
       # strides and unsafe_convert let LinearAlgebra hand a raw pointer
       # to the underlying data over to BLAS:
       Base.strides(fooarr::FooArray) = strides(fooarr.data)
       Base.unsafe_convert(P::Type{Ptr{T}}, fooarr::FooArray{T}) where {T} = Base.unsafe_convert(P, fooarr.data)

julia> R = rand(100, 3); S = FooArray(R); y = rand(100)
       R'y - S'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> R = rand(100, 1000); S = FooArray(R); y = rand(100)
       R'y == S'y
true
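
As a quick sanity check of why this works: because FooArray is now a DenseArray, it falls under Julia's StridedArray union, which is what LinearAlgebra's BLAS-backed methods dispatch on:

julia> S isa StridedArray
true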
Kelila answered 9/11, 2021 at 21:09. Comments (4):
So if BLAS is not doing the operations in the generic linear order, what is it doing? Is it parallel? – Hydra
@Hydra I believe both implementations traverse the array in a SIMD-friendly, cache-optimized access pattern, which simultaneously improves both performance and accuracy over the naive triple for loop. They just do so slightly differently. – Kelila
@Kelila Thank you very much for the explanation. Interesting that subtyping DenseArray uses BLAS, while subtyping AbstractArray does not. The difference in computation time is also very noticeable for larger matrices, so using BLAS instead of the generic multiplication is preferable anyway. – Catnip
A small addendum to this solution: some functions that use mapreduce, such as sum or mean, can still exhibit a similar error; e.g. sum(R, dims = 1) - sum(S, dims = 1) is still not exactly zero. A very simple fix is to enable fast linear indexing: import Base: has_fast_linear_indexing; has_fast_linear_indexing(fooarr::FooArray) = true – Catnip
