Check if all the elements of a Julia array are equal
Asked Answered
B

4

26

The shortest way I can think of to test whether all the elements in an array arr are equal is all(arr[1] .== arr). While this is certainly short, it seems a bit inelegant. Is there a built-in function that does this?

I suspect there's something along the lines of ==(arr...), but that doesn't work because the == operator can only take two arguments. I'm not sure how Julia parses expressions like arr[1] == arr[2] == arr[3], but is there some way to adapt this to an array with an arbitrary number of elements?

Broucek answered 30/11, 2017 at 2:13 Comment(1)
This function is now implemented in the Julia Base library as allequal since v1.8; docs.julialang.org/en/v1/base/collections/#Base.allequalDiogenes
S
18

Great question @tparker and great answer @ColinTBowers. While trying to think about them both, it occurred to me to try the straight-forward old-school Julian way-of-the-for-loop. The result was faster on the important input of a long vector of identical elements, so I'm adding this note. Also, the function name allequal seems to be appropriate enough to mention. So here are the variants:

allequal_1(x) = all(y->y==x[1],x)

# allequal_2(x) used to be erroneously defined as foldl(==,x)   

@inline function allequal_3(x)
    length(x) < 2 && return true
    e1 = x[1]
    i = 2
    @inbounds for i=2:length(x)
        x[i] == e1 || return false
    end
    return true
end

And the benchmark:

julia> using BenchmarkTools

julia> v = fill(1,10_000_000);  # long vector of 1s

julia> allequal_1(v)
true

julia> allequal_3(v)
true

julia> @btime allequal_1($v);
  9.573 ms (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  6.853 ms (0 allocations: 0 bytes)

UPDATE: Another important case to benchmark is when there is a short-circuit opportunity. So (as requested in commment):

julia> v[100] = 2
2

julia> allequal_1(v),allequal_2(v),allequal_3(v)
(false, false, false)

julia> @btime allequal_1($v);
  108.946 ns (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  68.221 ns (0 allocations: 0 bytes)

All things being equal, a for version should get to be allequal in Base.

Synchroscope answered 30/11, 2017 at 16:54 Comment(9)
Thanks, foldl is exactly the function I was looking for. Do you think there's reason beyond the @inbounds for why the third implementation is faster?Broucek
I really like how Julia Base is constructed by combining and building on top of general functions, making many implementations high-level and abstract. Look at reduce.jl for example. I would prefer that an allequal(though it should not actually be in Base) would be like that, built on top of some generic construct like all, instead of making some hyper-optimized custom loop. This way, improving the performance of all will improve every function that depends on it.Wessex
BTW. foldl does not appear to short-circuit, so that is definitely not optimal here.Wessex
@Wessex You are right it doesn't short-circuit (against my recollection - fixing answer). But perhaps (and in concordance to the Base philosophy in your comment), a short-circuit foldl (and other reductions) should be defined, based on a zero element of op. With another signature to foldl which has a zero element definition, or simply a specialized signature for the common binary ops.Synchroscope
Might be worth adding a benchmark from the same machine for a long vector where one of the early elements is different, to see how big a deal the lack of short-circuiting is.Broucek
Note that the optimality of allequal_2 is beside the point, because it's incorrect! For example allequal_2([0,0,0]) evaluates to false.Per
@Per Thanks for taking note. I will edit the answer. Probably got confused by true == 1 and false == 0 which have questionable C-like origins.Synchroscope
@DanGetz, what is the need to specify i = 2 before running the for i=2:length(x) loop?Explosion
@Explosion no real need. Maybe it was left from a different version.Synchroscope
P
18

all is the right solution, but you want the method all(p, itr) for predicate p and iterable itr, since it will employ short-circuiting behaviour (break as soon as a false is found). So:

all(y->y==x[1], x)

To see the difference, you can run the following little speed test:

for n = 100000:250000:1100000
    x = rand(1:2, n);
    @time all(x .== x[1]);
    @time all(y->y==x[1], x);
    println("------------------------")
end

Ignore the first iteration as it is timing compile time.

  0.000177 seconds (22 allocations: 17.266 KiB)
  0.006155 seconds (976 allocations: 55.062 KiB)
------------------------
  0.000531 seconds (23 allocations: 47.719 KiB)
  0.000003 seconds (1 allocation: 16 bytes)
------------------------
  0.000872 seconds (23 allocations: 78.219 KiB)
  0.000001 seconds (1 allocation: 16 bytes)
------------------------
  0.001210 seconds (23 allocations: 108.781 KiB)
  0.000001 seconds (1 allocation: 16 bytes)
------------------------
  0.001538 seconds (23 allocations: 139.281 KiB)
  0.000002 seconds (1 allocation: 16 bytes)

The first solution is fairly obviously O(n), while the second is O(1) at best and O(n) at worst (depending on the data generating process for itr).

Pinnate answered 30/11, 2017 at 3:49 Comment(3)
Wow, great answer! Although I'm surprised that there's no more compact syntax than manually comparing with the first element.Broucek
@Broucek allsame(x) = all(y->y==x[1],x). One of the nice things about the language is how easy it is to write these shortcuts yourself and have them sitting in your own core package. More generally, there is a strong push in the Julia community to keep Base as trim as possible. Nice, simple, function names for every type of operation can be in packages, and if enough people like it, the package becomes popular, and so on... but Base remains lean and fast.Pinnate
Beautiful that this even works for the empty array: allsame([]) == true, as it should be, as there are no different elements inside an empty array.Medora
S
18

Great question @tparker and great answer @ColinTBowers. While trying to think about them both, it occurred to me to try the straight-forward old-school Julian way-of-the-for-loop. The result was faster on the important input of a long vector of identical elements, so I'm adding this note. Also, the function name allequal seems to be appropriate enough to mention. So here are the variants:

allequal_1(x) = all(y->y==x[1],x)

# allequal_2(x) used to be erroneously defined as foldl(==,x)   

@inline function allequal_3(x)
    length(x) < 2 && return true
    e1 = x[1]
    i = 2
    @inbounds for i=2:length(x)
        x[i] == e1 || return false
    end
    return true
end

And the benchmark:

julia> using BenchmarkTools

julia> v = fill(1,10_000_000);  # long vector of 1s

julia> allequal_1(v)
true

julia> allequal_3(v)
true

julia> @btime allequal_1($v);
  9.573 ms (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  6.853 ms (0 allocations: 0 bytes)

UPDATE: Another important case to benchmark is when there is a short-circuit opportunity. So (as requested in commment):

julia> v[100] = 2
2

julia> allequal_1(v),allequal_2(v),allequal_3(v)
(false, false, false)

julia> @btime allequal_1($v);
  108.946 ns (1 allocation: 16 bytes)

julia> @btime allequal_3($v);
  68.221 ns (0 allocations: 0 bytes)

All things being equal, a for version should get to be allequal in Base.

Synchroscope answered 30/11, 2017 at 16:54 Comment(9)
Thanks, foldl is exactly the function I was looking for. Do you think there's reason beyond the @inbounds for why the third implementation is faster?Broucek
I really like how Julia Base is constructed by combining and building on top of general functions, making many implementations high-level and abstract. Look at reduce.jl for example. I would prefer that an allequal(though it should not actually be in Base) would be like that, built on top of some generic construct like all, instead of making some hyper-optimized custom loop. This way, improving the performance of all will improve every function that depends on it.Wessex
BTW. foldl does not appear to short-circuit, so that is definitely not optimal here.Wessex
@Wessex You are right it doesn't short-circuit (against my recollection - fixing answer). But perhaps (and in concordance to the Base philosophy in your comment), a short-circuit foldl (and other reductions) should be defined, based on a zero element of op. With another signature to foldl which has a zero element definition, or simply a specialized signature for the common binary ops.Synchroscope
Might be worth adding a benchmark from the same machine for a long vector where one of the early elements is different, to see how big a deal the lack of short-circuiting is.Broucek
Note that the optimality of allequal_2 is beside the point, because it's incorrect! For example allequal_2([0,0,0]) evaluates to false.Per
@Per Thanks for taking note. I will edit the answer. Probably got confused by true == 1 and false == 0 which have questionable C-like origins.Synchroscope
@DanGetz, what is the need to specify i = 2 before running the for i=2:length(x) loop?Explosion
@Explosion no real need. Maybe it was left from a different version.Synchroscope
G
15

Just a slight improvent: allsame(x) = all(y -> y == first(x), x) is more general than allsame(x) = all(y -> y == x[1], x), and works even when x is something other than an AbstractArray, e.g. a generator.

Gabbert answered 22/8, 2018 at 11:44 Comment(3)
In particular, it works for arrays with non-standard indices, such as zero-based. BTW, if memory serves, you can write ==(first(x)) instead of y -> y == first(x). That one's new.Wessex
What's a generator?Broucek
It's a Julia object that can be iterated to produce values without needing to allocate a container for them or writing them in memory. For example, a comprehension like c = [x^2 for x in 1:100] allocates a vector of the squares of the first 100 integers. In contrast, a generator g = (x^2 for x in 1:100) produces the same values sequentially, without allocating the vector. You can do things like for s in g; $show s; end to print them. Generators turn out to be a very powerful performance tool.Gabbert
C
3

As of Julia 1.8, you can use either:

allequal(arr)

Or before 1.8, you can simply check:

length(unique(arr)) == 1
Checklist answered 2/12, 2023 at 7:18 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.