Testing equality of arrays in C#

11

40

I have two arrays. For example:

int[] Array1 = new[] {1, 2, 3, 4, 5, 6, 7, 8, 9};
int[] Array2 = new[] {9, 1, 4, 5, 2, 3, 6, 7, 8};

What is the best way to determine if they have the same elements?

Delorenzo answered 16/3, 2009 at 6:32 Comment(5)
Are you actually dealing with numbers or is that just for the example? - Alessandraalessandria
Can you use a List<T> instead (it already has a Contains method)? - Lilith
@ed It wasn't about a simple Contains, but about determining whether both arrays have the same elements. Re-read the question and see the answers :) - Trough
@Simucal I just used integers for the example here. In my scenario, it could be an array of objects. - Delorenzo
Possible duplicate of "Comparing two collections for equality irrespective of the order of items in them". - Daysidayspring
H
21

Assuming that the values in the array are unique, you can implement a performant solution using LINQ:

// create a LINQ query that matches each item in ar1 with 
// its counterpart in ar2. we select "1" because we're only 
// interested in the count of pairs, not the values.
var q = from a in ar1 
        join b in ar2 on a equals b 
        select 1;

// if lengths of the arrays are equal and the count of matching pairs 
// is equal to the array length, then they must be equivalent.
bool equals = ar1.Length == ar2.Length && q.Count() == ar1.Length;

// when q.Count() is called, the join in the query gets translated
// to a hash-based lookup code which can run faster than nested
// for loops. 
Herrah answered 16/3, 2009 at 6:47 Comment(13)
Thanks for this code snippet. Works like a charm for my scenario! - Delorenzo
I find this to be much slower than just doing a loop and comparing each item. The loop may not be as nice looking, but it is much, much faster. - Bis
I'm not sure about that. As far as I know, LINQ-to-Objects generates intermediate hash tables to process joins, which is much faster than a straight loop. - Herrah
Here is my reference: #1562122 - Herrah
This doesn't work when both of the arrays have duplicate values. If both arrays contain {2,2}, the join will have 4 elements instead of 2 and the expression will be false instead of true. - Polarimeter
@ssg I'm guessing you haven't actually researched any LINQ vs loop benchmark analyses. It's a generally accepted fact that LINQ is always going to be slower (certainly not faster) in almost any scenario. Its strength is maintainability, not performance. See here, or here, or here. - Amenity
@b1nary.atr0phy None of the links you provided is about joins. See https://mcmap.net/q/247517/-is-linq-join-operator-using-nested-loop-merge-or-hashset-joins for details about its behavior. - Herrah
LINQ has the potential to be faster than a raw loop, but it depends on how you use LINQ and the size of the data being processed. For large loops involving lookups, LINQ joins can be significantly faster as they use a hashtable; however, for small amounts of data the performance gain is less than the cost of setting up the hashtable in the first place. Generally LINQ is quite readable, but when you start using joins it arguably becomes less so. As with all code, use whatever syntax you find to be the most readable until such a time as empirical testing demonstrates the need for optimization. - Discombobulate
I know this is very old, but the LINQ code does not work for my scenario. I changed it to a for loop comparing each element, provided the lengths of the arrays are equal, and it is blazingly fast. - Outstay
@Outstay I'm glad that for loops work for your use case. It might be slow for large arrays due to the polynomial complexity of nested loops. - Herrah
@Sedat Kapanoglu I'm not sure this applies, but in other code, I have a for loop that goes through an array searching for "white space" in a "visually readable" JSON file and removes it. I've had the size of that array be on the order of 2GB and it completes in far less time than you might think. With this code, if the arrays are the same length, there's really no complexity since the index of the equivalence relation for each array is the same. If an inequality is detected at any index, then the arrays are not equal. - Outstay
@Outstay Removing whitespace is an O(N) operation, so not comparable to nested for loops. You only need an array of ~50,000 elements in your nested for loops to reach the iteration count of your 2GB whitespace removal. - Herrah
@Sedat Kapanoglu OK, you are right. But to determine equality of two arrays you can stop immediately if the arrays are different lengths. In my case, the arrays are in the same order so there is no need for multiple nested loops. If the arrays are ordered differently, perhaps an array is not an ideal way to store the data. - Outstay
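A possible way around the duplicate-values problem raised in the comments is to count occurrences instead of joining. The following is only a sketch, not the answer author's method: the class and method names (MultisetComparison, SameElements) are made up for illustration, and it assumes the arrays contain no null elements, because Dictionary keys cannot be null.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class MultisetComparison
{
    // Returns true when both sequences contain the same elements with the
    // same number of occurrences, regardless of order.
    // Assumption: no element is null (Dictionary keys cannot be null).
    public static bool SameElements<T>(IEnumerable<T> first, IEnumerable<T> second,
                                       IEqualityComparer<T> comparer = null)
    {
        var counts = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);

        // Count how often each element occurs in the first sequence.
        foreach (T item in first)
        {
            int n;
            counts.TryGetValue(item, out n);   // n stays 0 when the key is missing
            counts[item] = n + 1;
        }

        // Consume one occurrence for each element of the second sequence.
        foreach (T item in second)
        {
            int n;
            if (!counts.TryGetValue(item, out n) || n == 0)
            {
                return false;                  // element missing or used up
            }
            counts[item] = n - 1;
        }

        // Any leftover occurrence means the first sequence had extra elements.
        foreach (int remaining in counts.Values)
        {
            if (remaining != 0)
            {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, MultisetComparison.SameElements(Array1, Array2) makes one pass over each array plus one pass over the counts, and two arrays that both contain {2, 2} compare as equal.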
L
104

You could also use SequenceEqual, provided the IEnumerable objects are sorted first.

int[] a1 = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };    
int[] a2 = new[] { 9, 1, 4, 5, 2, 3, 6, 7, 8 };    

bool equals = a1.OrderBy(a => a).SequenceEqual(a2.OrderBy(a => a));
Loving answered 1/4, 2009 at 2:35 Comment(1)
Yep, sort/order then SequenceEqual. - Lenee
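Since the asker mentions that the real data may be an array of objects, here is a rough generalization of the sort-then-compare idea. It is a sketch under assumptions: the names SortedComparison and SameElementsSorted are invented for this example, and the element type must either be comparable by default or come with a suitable IComparer<T>. Elements are compared with the same comparer that sorted them, so sorting and equality stay consistent.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class SortedComparison
{
    // Sorts both sequences with the same comparer, then compares them
    // position by position. Runs in O(n log n) because of the sorting.
    public static bool SameElementsSorted<T>(IEnumerable<T> first, IEnumerable<T> second,
                                             IComparer<T> comparer = null)
    {
        comparer = comparer ?? Comparer<T>.Default;

        List<T> a = first.OrderBy(x => x, comparer).ToList();
        List<T> b = second.OrderBy(x => x, comparer).ToList();

        if (a.Count != b.Count)
        {
            return false;
        }

        for (int i = 0; i < a.Count; i++)
        {
            // Two elements match when the comparer ranks them as equal.
            if (comparer.Compare(a[i], b[i]) != 0)
            {
                return false;
            }
        }
        return true;
    }
}
```

For example, passing StringComparer.OrdinalIgnoreCase as the comparer would treat { "b", "A" } and { "a", "B" } as having the same elements.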
C
13

Will the values always be unique? If so, how about (after checking equal length):

var set = new HashSet<int>(array1);
bool allThere = array2.All(set.Contains);
Crosby answered 16/3, 2009 at 6:42 Comment(1)
Marc, I could also compare via IStructuralEquatable (tuples and arrays). So when should I choose IStructuralEquatable vs. SequenceEqual? - Pesce
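If only the set of distinct values matters, HashSet<T> also has a built-in SetEquals method that does the check in one call. A minimal sketch, with SetComparison and SameDistinctElements as made-up names; note that it ignores duplicates entirely, so it answers a slightly different question than a length check plus All.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class SetComparison
{
    // True when both sequences contain the same distinct values.
    // Duplicates and ordering are ignored.
    public static bool SameDistinctElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        return new HashSet<T>(first).SetEquals(second);
    }
}
```

Because duplicates are ignored, { 1, 1, 2 } and { 1, 2, 2 } count as the same set here; add a length or count check if that is not what you want.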
T
7
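// Intersect is a set operation: it yields the distinct values common to both
// arrays, so this check assumes the values in each array are unique.
var shared = arr1.Intersect(arr2);
bool equals = arr1.Length == arr2.Length && shared.Count() == arr1.Length;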
Trough answered 16/3, 2009 at 6:53 Comment(0)
U
7

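Use the LINQ extension methods (extension methods are new in C# 3.0). If the number of elements in the intersection of the two arrays equals the number in their union, then the arrays contain the same set of values. Note that Intersect and Union are set operations, so duplicates are ignored: {1, 1, 2} and {1, 2, 2} compare as equal.

bool equals = arrayA.Intersect(arrayB).Count() == arrayA.Union(arrayB).Count();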

Succinct.

Usance answered 27/4, 2009 at 6:53 Comment(0)
L
6

For the most efficient approach (Reflectored from Microsoft code), see Stack Overflow question Comparing two collections for equality irrespective of the order of items in them.

Libertine answered 20/10, 2010 at 16:46 Comment(0)
S
5

.NET Framework 4.0 introduced the IStructuralEquatable interface, which helps to compare types such as arrays or tuples by their contents:

using System;
using System.Collections;          // IStructuralEquatable, IStructuralComparable
using System.Collections.Generic;  // EqualityComparer<T>

class Program
{
    static void Main()
    {
        int[] array1 = { 1, 2, 3 };
        int[] array2 = { 1, 2, 3 };
        IStructuralEquatable structuralEquator = array1;
        Console.WriteLine(array1.Equals(array2));                                            // False
        Console.WriteLine(structuralEquator.Equals(array2, EqualityComparer<int>.Default));  // True

        // string arrays
        string[] a1 = "a b c d e f g".Split();
        string[] a2 = "A B C D E F G".Split();
        IStructuralEquatable structuralEquator1 = a1;
        bool areEqual = structuralEquator1.Equals(a2, StringComparer.InvariantCultureIgnoreCase);

        Console.WriteLine("Arrays of strings are equal:" + areEqual);

        // tuples
        var firstTuple = Tuple.Create(1, "aaaaa");
        var secondTuple = Tuple.Create(1, "AAAAA");
        IStructuralEquatable structuralEquator2 = firstTuple;
        bool areTuplesEqual = structuralEquator2.Equals(secondTuple, StringComparer.InvariantCultureIgnoreCase);

        Console.WriteLine("Are tuples equal:" + areTuplesEqual);
        IStructuralComparable sc1 = firstTuple;
        int comparisonResult = sc1.CompareTo(secondTuple, StringComparer.InvariantCultureIgnoreCase);
        Console.WriteLine("Tuples comparison result:" + comparisonResult); // 0
    }
}
Selfliquidating answered 29/12, 2012 at 21:23 Comment(0)
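A related shortcut, in case the cast to IStructuralEquatable feels clumsy: System.Collections also exposes StructuralComparisons.StructuralEqualityComparer, which can be called directly. The wrapper below is just a sketch (StructuralDemo and StructurallyEqual are names invented here). Like the interface above, it compares element by element in order, so it does not solve the "same elements in any order" case from the question.

```csharp
using System.Collections;

// Hypothetical wrapper, not part of the original answer.
public static class StructuralDemo
{
    // Compares two arrays (or tuples) by their contents, in order.
    // Nested arrays are compared structurally as well, because the
    // structural comparer passes itself down to each element comparison.
    public static bool StructurallyEqual(object first, object second)
    {
        return StructuralComparisons.StructuralEqualityComparer.Equals(first, second);
    }
}
```

Note that elements are boxed along the way, so for large arrays of value types a typed loop or SequenceEqual is likely cheaper.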
U
1

I have found the solution detailed here to be a very clean way, though a bit verbose for some people.

The best thing is that it works for other IEnumerables as well.

Unfriended answered 16/3, 2009 at 6:40 Comment(2)
That link describes SequenceEqual (in .NET 3.5), which will return false on this data since the elements are in a different order. - Crosby
For reference, it would be (using extension methods): bool areEqual = array1.SequenceEqual(array2); - Crosby
S
1

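This will check that each array contains the same values in the same order. It assumes both arrays have the same length (if ar2 is shorter than ar1, indexing ar2[i] throws), and Assert.AreEqual comes from a unit testing framework such as NUnit.

int[] ar1 = { 1, 1, 5, 2, 4, 6, 4 };
int[] ar2 = { 1, 1, 5, 2, 4, 6, 4 };

// keep only the elements of ar1 that match ar2 at the same index
var query = ar1.Where((b, i) => b == ar2[i]);

// every position matched if the number of matches equals the length
Assert.AreEqual(ar1.Length, query.Count());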
Sigman answered 12/1, 2013 at 9:26 Comment(1)
That looks like a method from NUnit and is not a general solution. - Indignation
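For an order-sensitive check that does not depend on a test framework, something along these lines might do. It is a sketch only: OrderedComparison and SameOrder are hypothetical names, and the optional comparer parameter is an addition for illustration. The length check runs first, so mismatched lengths return false instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class OrderedComparison
{
    // True when both arrays have the same length and equal elements
    // at every index. Two null references are treated as equal.
    public static bool SameOrder<T>(T[] first, T[] second,
                                    IEqualityComparer<T> comparer = null)
    {
        if (ReferenceEquals(first, second))
        {
            return true;                    // same instance, or both null
        }
        if (first == null || second == null)
        {
            return false;
        }

        // Cheap length check before walking the elements.
        return first.Length == second.Length
            && first.SequenceEqual(second, comparer ?? EqualityComparer<T>.Default);
    }
}
```

Called as OrderedComparison.SameOrder(ar1, ar2), it returns a bool that can then be passed to whatever assertion or branching logic you use.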
A
1
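// Compares two arrays element by element, in order.
public static bool ValueEquals(Array array1, Array array2)
{
    if (array1 == null && array2 == null)
    {
        return true;
    }

    if (array1 == null || array2 == null)
    {
        return false;
    }

    if (array1.Length != array2.Length)
    {
        return false;
    }

    // Array.Equals is reference equality, so this is only a shortcut
    // for the case where both variables point to the same array.
    if (array1.Equals(array2))
    {
        return true;
    }

    // GetValue(int) works for one-dimensional arrays only.
    for (int index = 0; index < array1.Length; index++)
    {
        if (!Equals(array1.GetValue(index), array2.GetValue(index)))
        {
            return false;
        }
    }

    return true;
}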
Asocial answered 17/11, 2016 at 8:43 Comment(0)
T
0

There are, of course, many ways to compare arrays structurally. To add to the answers above, you can write your own custom comparer. Say you want to treat two arrays as equal when both contain only even elements: you define the comparison based on the business rules of your application, which is why "equality" is so subjective here.

Here is a way to do it by writing your own comparer. You have to write the custom equality logic yourself because, by default, Equals() on an array compares references, which gives different results. Note that the example takes little care over GetHashCode(): the comparer says that any two even numbers are equal, yet it returns their ordinary hash codes, which breaks the rule that "if two values x and y evaluate as equal, they MUST have the same hash code". Don't worry too much about that here, because the example only compares integers through Equals; it would matter if the comparer were used with a hash-based collection. With that said, here is the example:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApp5
{
    class EvenComparer : EqualityComparer<int>
    {


        // Two values are considered equal when both are even.
        public override bool Equals(int x, int y)
        {
            return x % 2 == 0 && y % 2 == 0;
        }

        public override int GetHashCode(int obj)
        {
            return obj.GetHashCode();
        }
    }


    class Program
    {
        static void Main(string[] args)
        {

            //If you want to check whether the arrays are equal in the sense of containing the same elements in the same order

            int[] Array1 =  { 2, 4, 6};
            int[] Array2 =  {8, 10, 12 };



            string[] arr1 = { "Jar Jar Binks", "Kill! Kill!", "Aaaaargh!" };
            string[] arr2 = { "Jar Jar Binks", "Kill! Kill!", "Aaaaargh!" };

            bool areEqual = (arr1 as IStructuralEquatable).Equals(arr2, StringComparer.Ordinal);
            bool areEqual2 = (Array1 as IStructuralEquatable).Equals(Array2, new EvenComparer());

            Console.WriteLine(areEqual);
            Console.WriteLine(areEqual2);


            Console.WriteLine(Array1.GetHashCode());
            Console.WriteLine(Array2.GetHashCode());

        }
    }
}

After reading the answers I realized that nobody mentioned that you have to add the

   using System.Collections;

directive, or you will get a "missing using directive or assembly reference" error when using the IStructuralEquatable interface.

Hope it helps someone someday

Troxler answered 18/6, 2018 at 16:11 Comment(0)
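To illustrate the hash code rule mentioned in the answer above, here is a variation whose GetHashCode agrees with its Equals. It is a sketch, not the answer's comparer: ParityComparer is a made-up name, and it deliberately changes the rule to "same parity" (even with even, odd with odd) so that every value is also equal to itself.

```csharp
using System.Collections.Generic;

// Hypothetical variation on the EvenComparer idea, for illustration only.
public sealed class ParityComparer : EqualityComparer<int>
{
    // Two integers are equal when they have the same parity.
    public override bool Equals(int x, int y)
    {
        return (x % 2 == 0) == (y % 2 == 0);
    }

    // Values that compare as equal must share a hash code,
    // so there are only two buckets: 0 for even, 1 for odd.
    public override int GetHashCode(int obj)
    {
        return obj % 2 == 0 ? 0 : 1;
    }
}
```

Because equal values now hash alike, the comparer can also be handed to hash-based collections: new HashSet<int>(new[] { 2, 4, 7, 9 }, new ParityComparer()) keeps one even and one odd value. It still works with (Array1 as IStructuralEquatable).Equals(Array2, new ParityComparer()) in the same way as the comparer in the answer.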
