Why does a subtype of AbstractArray result in imprecise matrix operations in Julia?

I'm currently working on a subtype of AbstractArray in Julia that stores a vector alongside the array itself. You can think of the vector as column "names", except that its element type is restricted to subtypes of AbstractFloat. It is therefore similar to the NamedArrays.jl package, but restricted to labelling the columns (in the case of matrices) with floats.

The struct that I've created so far (following the manual's guide on subtyping AbstractArray) is defined as follows:

struct FooArray{T, N, AT, VT} <: AbstractArray{T, N}
    data::AT  # the wrapped array
    vec::VT   # the vector of column "names"
    function FooArray(data::AbstractArray{T1, N}, vec::AbstractVector{T2}) where {T1 <: AbstractFloat, T2 <: AbstractFloat, N}
        length(vec) == size(data, 2) || error("Inconsistent dimensions")
        new{T1, N, typeof(data), typeof(vec)}(data, vec)
    end
end
# Forward the AbstractArray interface to the wrapped array:
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, i::Int) = getindex(fooarr.data, i)
@inline Base.@propagate_inbounds Base.getindex(fooarr::FooArray, I::Vararg{Int, 2}) = getindex(fooarr.data, I...)
@inline Base.@propagate_inbounds Base.size(fooarr::FooArray) = size(fooarr.data)
Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()

This already seems to be enough to create objects of type FooArray and do some simple math with them. However, I've observed that some essential operations, such as matrix-vector multiplication, give slightly different results. For example, the following should return a vector of exact 0.0s, but:

R = rand(100, 3)
S = FooArray(R, collect(1.0:3.0))
y = rand(100)
S'y - R'y
3-element Vector{Float64}:
 -7.105427357601002e-15
 0.0
 3.552713678800501e-15

While the differences are very small, they can quickly add up over many different calculations, leading to significant errors.

Where do these differences come from?

A look at the calculations via the @code_llvm macro reveals that apparently different matmul methods from LinearAlgebra are used (along with other minor differences):

@code_llvm S'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:111 within `*'
   ...
@code_llvm R'y
   ...
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\LinearAlgebra\src\matmul.jl:106 within `*'
   ...
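
For a quicker view of the dispatch, without reading LLVM IR, the @which macro from InteractiveUtils reports the method that gets selected; the exact file and line it prints will depend on the Julia version:

using InteractiveUtils  # loaded automatically in the REPL; provides @which

@which S'y  # should point at the generic matmul method in LinearAlgebra
@which R'y  # should point at the BLAS-backed method for strided arrays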

Redefining the adjoint and * functions for our FooArray type produces the expected, correct result:

import Base: *

# adjoint must keep the wrapper so that the custom * method below is dispatched:
Base.adjoint(a::FooArray) = FooArray(a.data', zeros(size(a.data, 1)))
*(a::FooArray{T, 2, AT, VT} where {AT, VT}, b::AbstractVector{S}) where {T, S} = a.data * b
S'y - R'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

However, this solution (which is also what NamedArrays does here) would require defining and maintaining all sorts of functions, not just the standard ones in Base, adding more and more dependencies just to avoid this small error margin.

Is there any simpler way to get rid of this issue without redefining every operation and possibly many other functions from other packages?

I'm using Julia version 1.6.1 on Windows 64-bit system.

Catnip asked 9/11, 2021 at 13:53. Comments (5):
Isn't this just 0.30000000000000004.com? – Biennial
I can't run the posted code. At first I just thought it was a missed correction at fooarr.mat, but after I changed it to fooarr.data, attempting S'y - R'y causes a StackOverflowError due to getindex(... i::Int64) recursion. – Hydra
@Biennial The intent for the type seems to be wrapping the matrix, as the redefinitions forward from arr to arr.data. The first approach tries to do that element by element through getindex, so it doesn't seem like there should be a discrepancy. – Hydra
Side note: a type name starting with a lowercase letter is a bit irritating, stylistically. – Shaeffer
@Hydra Indeed, there were some errors in the code. I've edited the original post; the code should now be runnable. – Catnip

Yes, the implementation of matrix multiplication varies depending upon your array type. The builtin Array will use BLAS, whereas your custom FooArray falls back to a generic implementation, and due to the non-associativity of floating-point arithmetic, these different approaches will indeed yield slightly different values. Note that they may differ from the ground truth, even for the builtin Array!

julia> using Random; Random.seed!(0); R = rand(100, 3); y = rand(100);

julia> R'y - Float64.(big.(R)'big.(y))
3-element Vector{Float64}:
 -3.552713678800501e-15
  0.0
  0.0
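
To see that non-associativity in isolation, here is a minimal sketch (the particular values are chosen only for illustration):

julia> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
false

julia> ((0.1 + 0.2) + 0.3) - (0.1 + (0.2 + 0.3))
1.1102230246251565e-16

A summation algorithm that merely groups the same additions differently can therefore change the last few bits of the result.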

You may be able to implement your custom array as a DenseArray, which will ensure that it uses the same (BLAS-enabled) codepath. You just need to implement a few more methods, most importantly strides and unsafe_convert:

julia> struct FooArray{T, N} <: DenseArray{T, N}
          data::Array{T, N}
       end
       Base.getindex(fooarr::FooArray, i::Int) = fooarr.data[i]
       Base.size(fooarr::FooArray) = size(fooarr.data)
       Base.IndexStyle(::Type{<:FooArray}) = IndexLinear()
       # strides and unsafe_convert let LinearAlgebra hand a raw pointer
       # to the underlying data over to BLAS:
       Base.strides(fooarr::FooArray) = strides(fooarr.data)
       Base.unsafe_convert(P::Type{Ptr{T}}, fooarr::FooArray{T}) where {T} = Base.unsafe_convert(P, fooarr.data)

julia> R = rand(100, 3); S = FooArray(R); y = rand(100)
       R'y - S'y
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> R = rand(100, 1000); S = FooArray(R); y = rand(100)
       R'y == S'y
true
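
As a quick sanity check of why this works: because FooArray is now a DenseArray, it falls under Julia's StridedArray union, which is what LinearAlgebra's BLAS-backed methods dispatch on:

julia> S isa StridedArray
true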
Kelila answered 9/11, 2021 at 21:09. Comments (4):
So if BLAS is not doing the operations in the generic linear order, what is it doing? Is it parallel? – Hydra
@Hydra I believe both implementations traverse the array in a SIMD-friendly, cache-optimized access pattern, which simultaneously improves both performance and accuracy over the naive triple for loop. They just do so slightly differently. – Kelila
@Kelila Thank you very much for the explanation. Interesting that subtyping DenseArray uses BLAS, while subtyping AbstractArray does not. The difference in computation time is also very noticeable for larger matrices, so using BLAS instead of the generic multiplication is preferable anyway. – Catnip
A small addendum to this solution: some functions that use mapreduce, such as sum or mean, can still exhibit a similar error; e.g. sum(R, dims = 1) - sum(S, dims = 1) is still not exactly zero. A very simple fix is to enable fast linear indexing: import Base: has_fast_linear_indexing; has_fast_linear_indexing(fooarr::FooArray) = true – Catnip
