Check if all the elements of a Julia array are equal

B

4

26

The shortest way I can think of to test whether all the elements in an array arr are equal is all(arr[1] .== arr). While this is certainly short, it seems a bit inelegant. Is there a built-in function that does this?

I suspect there's something along the lines of ==(arr...), but that doesn't work because the == operator can only take two arguments. I'm not sure how Julia parses expressions like arr[1] == arr[2] == arr[3], but is there some way to adapt this to an array with an arbitrary number of elements?

Broucek answered 30/11, 2017 at 2:13 Comment(1)

This function is now implemented in the Julia Base library as allequal since v1.8; docs.julialang.org/en/v1/base/collections/#Base.allequal – Diogenes 11/5, 2023 at 16:40

S

18

Great question @tparker and great answer @ColinTBowers. While trying to think about them both, it occurred to me to try the straight-forward old-school Julian way-of-the-for-loop. The result was faster on the important input of a long vector of identical elements, so I'm adding this note. Also, the function name allequal seems to be appropriate enough to mention. So here are the variants:

allequal_1(x) = all(y->y==x[1],x)

# allequal_2(x) used to be erroneously defined as foldl(==,x)   

@inline function allequal_3(x)
    length(x) < 2 && return true
    e1 = x[1]
    i = 2
    @inbounds for i=2:length(x)
        x[i] == e1 || return false
    end
    return true
end

And the benchmark:

julia> using BenchmarkTools

julia> v = fill(1,10_000_000);  # long vector of 1s

julia> allequal_1(v)
true

julia> allequal_3(v)
true

julia> @btime allequal_1($v);
  9.573 ms (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  6.853 ms (0 allocations: 0 bytes)

UPDATE: Another important case to benchmark is when there is a short-circuit opportunity. So (as requested in commment):

julia> v[100] = 2
2

julia> allequal_1(v),allequal_2(v),allequal_3(v)
(false, false, false)

julia> @btime allequal_1($v);
  108.946 ns (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  68.221 ns (0 allocations: 0 bytes)

All things being equal, a for version should get to be allequal in Base.

Synchroscope answered 30/11, 2017 at 16:54 Comment(9)

Thanks, foldl is exactly the function I was looking for. Do you think there's reason beyond the @inbounds for why the third implementation is faster? – Broucek 30/11, 2017 at 20:55

I really like how Julia Base is constructed by combining and building on top of general functions, making many implementations high-level and abstract. Look at reduce.jl for example. I would prefer that an allequal(though it should not actually be in Base) would be like that, built on top of some generic construct like all, instead of making some hyper-optimized custom loop. This way, improving the performance of all will improve every function that depends on it. – Wessex 1/12, 2017 at 8:34

BTW. foldl does not appear to short-circuit, so that is definitely not optimal here. – Wessex 1/12, 2017 at 8:46

@Wessex You are right it doesn't short-circuit (against my recollection - fixing answer). But perhaps (and in concordance to the Base philosophy in your comment), a short-circuit foldl (and other reductions) should be defined, based on a zero element of op. With another signature to foldl which has a zero element definition, or simply a specialized signature for the common binary ops. – Synchroscope 1/12, 2017 at 10:35

Might be worth adding a benchmark from the same machine for a long vector where one of the early elements is different, to see how big a deal the lack of short-circuiting is. – Broucek 4/12, 2017 at 17:7

Note that the optimality of allequal_2 is beside the point, because it's incorrect! For example allequal_2([0,0,0]) evaluates to false. – Per 4/8, 2019 at 2:19

@Per Thanks for taking note. I will edit the answer. Probably got confused by true == 1 and false == 0 which have questionable C-like origins. – Synchroscope 4/8, 2019 at 11:25

@DanGetz, what is the need to specify i = 2 before running the for i=2:length(x) loop? – Explosion 12/9, 2023 at 5:11

@Explosion no real need. Maybe it was left from a different version. – Synchroscope 12/9, 2023 at 7:14

P

18

all is the right solution, but you want the method all(p, itr) for predicate p and iterable itr, since it will employ short-circuiting behaviour (break as soon as a false is found). So:

all(y->y==x[1], x)

To see the difference, you can run the following little speed test:

for n = 100000:250000:1100000
    x = rand(1:2, n);
    @time all(x .== x[1]);
    @time all(y->y==x[1], x);
    println("------------------------")
end

Ignore the first iteration as it is timing compile time.

  0.000177 seconds (22 allocations: 17.266 KiB)
  0.006155 seconds (976 allocations: 55.062 KiB)
------------------------
  0.000531 seconds (23 allocations: 47.719 KiB)
  0.000003 seconds (1 allocation: 16 bytes)
------------------------
  0.000872 seconds (23 allocations: 78.219 KiB)
  0.000001 seconds (1 allocation: 16 bytes)
------------------------
  0.001210 seconds (23 allocations: 108.781 KiB)
  0.000001 seconds (1 allocation: 16 bytes)
------------------------
  0.001538 seconds (23 allocations: 139.281 KiB)
  0.000002 seconds (1 allocation: 16 bytes)

The first solution is fairly obviously O(n), while the second is O(1) at best and O(n) at worst (depending on the data generating process for itr).

Pinnate answered 30/11, 2017 at 3:49 Comment(3)

Wow, great answer! Although I'm surprised that there's no more compact syntax than manually comparing with the first element. – Broucek 30/11, 2017 at 5:1

@Broucek allsame(x) = all(y->y==x[1],x). One of the nice things about the language is how easy it is to write these shortcuts yourself and have them sitting in your own core package. More generally, there is a strong push in the Julia community to keep Base as trim as possible. Nice, simple, function names for every type of operation can be in packages, and if enough people like it, the package becomes popular, and so on... but Base remains lean and fast. – Pinnate 30/11, 2017 at 5:28

Beautiful that this even works for the empty array: allsame([]) == true, as it should be, as there are no different elements inside an empty array. – Medora 6/12, 2017 at 11:49

S

18

Great question @tparker and great answer @ColinTBowers. While trying to think about them both, it occurred to me to try the straight-forward old-school Julian way-of-the-for-loop. The result was faster on the important input of a long vector of identical elements, so I'm adding this note. Also, the function name allequal seems to be appropriate enough to mention. So here are the variants:

allequal_1(x) = all(y->y==x[1],x)

# allequal_2(x) used to be erroneously defined as foldl(==,x)   

@inline function allequal_3(x)
    length(x) < 2 && return true
    e1 = x[1]
    i = 2
    @inbounds for i=2:length(x)
        x[i] == e1 || return false
    end
    return true
end

And the benchmark:

julia> using BenchmarkTools

julia> v = fill(1,10_000_000);  # long vector of 1s

julia> allequal_1(v)
true

julia> allequal_3(v)
true

julia> @btime allequal_1($v);
  9.573 ms (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  6.853 ms (0 allocations: 0 bytes)

UPDATE: Another important case to benchmark is when there is a short-circuit opportunity. So (as requested in commment):

julia> v[100] = 2
2

julia> allequal_1(v),allequal_2(v),allequal_3(v)
(false, false, false)

julia> @btime allequal_1($v);
  108.946 ns (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  68.221 ns (0 allocations: 0 bytes)