Comparing two byte arrays in .NET
Asked Answered
C

28

652

How can I do this fast?

Sure I can do this:

static bool ByteArrayCompare(byte[] a1, byte[] a2)
{
    if (a1.Length != a2.Length)
        return false;

    for (int i=0; i<a1.Length; i++)
        if (a1[i]!=a2[i])
            return false;

    return true;
}

But I'm looking for either a BCL function or some highly optimized proven way to do this.

java.util.Arrays.equals((sbyte[])(Array)a1, (sbyte[])(Array)a2);

works nicely, but it doesn't look like that would work for x64.

Note my super-fast answer here.

Confounded answered 4/9, 2008 at 7:33 Comment(3)
"This kinda counts on the fact that the arrays start qword aligned." That's a big if. You should fix the code to reflect that.Clawson
return a1.Length == a2.Length && !a1.Where((t, i) => t != a2[i]).Any();Rennold
for those needing a greater than/less than type comparison, check out this answer https://mcmap.net/q/64995/-how-to-compare-two-byte-arrays-with-greater-than-or-less-than-operator-value-in-c-or-linqConfounded
C
87

Edit: modern fast way is to use a1.SequenceEquals(a2)

User gil suggested unsafe code which spawned this solution:

// Copyright (c) 2008-2013 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe bool UnsafeCompare(byte[] a1, byte[] a2) {
  unchecked {
    if(a1==a2) return true;
    if(a1==null || a2==null || a1.Length!=a2.Length)
      return false;
    fixed (byte* p1=a1, p2=a2) {
      byte* x1=p1, x2=p2;
      int l = a1.Length;
      for (int i=0; i < l/8; i++, x1+=8, x2+=8)
        if (*((long*)x1) != *((long*)x2)) return false;
      if ((l & 4)!=0) { if (*((int*)x1)!=*((int*)x2)) return false; x1+=4; x2+=4; }
      if ((l & 2)!=0) { if (*((short*)x1)!=*((short*)x2)) return false; x1+=2; x2+=2; }
      if ((l & 1)!=0) if (*((byte*)x1) != *((byte*)x2)) return false;
      return true;
    }
  }
}

which does 64-bit based comparison for as much of the array as possible. This kind of counts on the fact that the arrays start qword aligned. It'll work if not qword aligned, just not as fast as if it were.

It performs about seven timers faster than the simple `for` loop. Using the J# library performed equivalently to the original `for` loop. Using .SequenceEqual runs around seven times slower; I think just because it is using IEnumerator.MoveNext. I imagine LINQ-based solutions being at least that slow or worse.

Confounded answered 10/1, 2012 at 18:11 Comment(15)
Nice solution. But one (small) hint: A compare if references a1 and a2 are equal may speed up things if one gives the same array for a1 and b1.Fraxinella
BTW: Tried casting to Guid for a 128-bit compare, but that made it run slower.Confounded
New test data on .NET 4 x64 release: IStructualEquatable.equals ~180x slower, SequenceEqual 15x slower, SHA1 hash compare 11x slower, bitconverter ~same, unsafe 7x faster, pinvoke 11x faster. Pretty cool that unsafe is only a little bit slower than P/Invoke on memcmp.Confounded
I understand your code but not your comment on "qword aligned". What is it and why does it impact the performance?Docilla
This link gives good detail about why memory alignment matters ibm.com/developerworks/library/pa-dalign - so, an optimization could be to check alignment and if both arrays are off alignment by the same amount, do byte compares until they are both on a qword boundary.Confounded
@Hafthor: interesting article. It's quite outdated though (Feb 2005). I wonder how things are nowadays (eg if some processors still ask the OS to take care of the alignment).Docilla
wouldnt this give false when both a1 and a2 are null?Canvasback
@Confounded You should integrate your "new test data" into your answer, for better visibility. I think it really puts things into perspective!Robbyrobbyn
@Confounded Also, how do you use Bitconverter for comparing arrays?Robbyrobbyn
@CristiDiaconescu I loopified KevinDriedger's answer. What I should probably do is make the test suite and my results available on github and link to it on my answer.Confounded
@RüdigerStevens idea to add a simple if (a1 == a2) return true; to the beginning would also make this return true if both are null, addressing @Canvasback concern about nulls comparing unequal.Penny
@Confounded unsafe can be faster than pinvoke with loop unrolling. See my answer.Synergetic
Regarding the assumption that pinned GC objects be 64-bit aligned, it's highly likely to be the case, but do you know of any official mentions or guarantees? It might just follow from GC heap alignment, which would be a reliable agent to bet on for rigorous enforcement. By chance, did you check x86 under stress? Also, a minor point regarding your statement that misalignment will run "but just not as fast". To be fair you might note that by incessantly insisting on askew access, your code perpetuates quite a perverse penalty in this (fully-precluded) case, maybe even worse than byte access.Kamilahkamillah
@RüdigerStevens With full and due deference to the original poster (wecome to undo of course), I added the change you proposed. As you note, this adds reference equality as a true-positive for abandoning the compare (avoiding pinning altogether in this case), formalizes a new correctness assumption that null==null, while also avoiding NullReferenceException cases. Unchanged from before, null is never equal to any byte[0].Kamilahkamillah
This unsafe code was ever so slightly faster than the PInvoke solution. However my arrays are always 32 bytes in length. And it was also the same speed to simply do if ((a[0] == b[0]) && (a[1] == b[1]) && ... && (a[31] == b[31])).Parochialism
C
720

You can use Enumerable.SequenceEqual method.

using System;
using System.Linq;
...
var a1 = new int[] { 1, 2, 3};
var a2 = new int[] { 1, 2, 3};
var a3 = new int[] { 1, 2, 4};
var x = a1.SequenceEqual(a2); // true
var y = a1.SequenceEqual(a3); // false

If you can't use .NET 3.5 for some reason, your method is OK.
Compiler\run-time environment will optimize your loop so you don't need to worry about performance.

Crumb answered 4/9, 2008 at 7:53 Comment(12)
But doesn't SequenceEqual take longer to process than an unsafe comparison? Especially when your doing 1000's of comparisons?Dimeter
Yes, this runs about 50x slower than the unsafe comparison.Confounded
Thank you for comparing performance - for me it makes decision clear - SequenceEqual is safe but slow in this case.Khichabia
This is really raising the dead here, but slow is really a bad word to use here. 50x slower sounds bad, but it's not often you're comparing enough data for it to make a difference, and if you are, you really need to benchmark this for your own case, for a myriad of reasons. For example, note the creator of the unsafe answer notes a difference of 7x slow, as opposed to 50x slower (the unsafe method's speed also depends on the alignment of data). In the rare cases where these numbers matter, P/Invoke is even faster.Catty
So the slower implementation gets over 300 likes? I would suggest hooking the msvcrt.dll as that would be the fastest way to get the job done.Marvismarwin
Fastest is not the most important thing to a business. Maintainability is much "faster" than the savings on this code will amount to in 99% of cases. I am using SequenceEqual and my entire code is < 1ms. Those µs you are saving will never add up to the 5 minutes of lack of readability of P/Invoke.Sloat
@Sloat I understand your point but don't fully agree with it. You rarely come back to a part of your code that you are pretty sure that it works and try to understand it again. The same with your colleagues. Commenting such part of the code with something like /* very well tested blazing fast array comparison method */ is pretty much enough and won't damage your business. on the other hand, trying to optimize an application where the programmer was careless through the entire development... it's basically reviewing all pieces of the code ;)Pythoness
Fast also really depends on the size of the data and how often your doing this. I am comparing a byte array with maybee 10 bytes in it, outside of a loop. If I was trying to compare 10mb bitmaps perhaps it would be worth looking into the p/invoke solution. Premature Optimization and Micro Optimizations can be killerAtaraxia
The P/Invoke method is obviously also not portable; this is. The time I spent writing that sentence far exceeds the time savings of using P/Invoke for this purpose in our application, portability aside.Insect
The question was how to do it faster than an ordinary for loop. This is simply not an answer to the question. But obviously, lots of people came here looking for "how do I write code fast" and SequenceEqual is great for that. (As for P/invoke, not only is it less portable, it takes time to cross the boundary between C# and C++ and you'll want to avoid that cost if the array is small.)Apothecium
I found this post really helpful and would like to provide an update under .NET 6.0 (Windows). After testing both invoke/memcmp and SequenceEqual I found that there was virtually no difference in performance. For what I needed (millions of comparisons) SequenceEqual did the job and does not require direct access to WinAPI.Helbonna
Not to be a grumpy old man, but "Fastest is not the most important thing to a business." is exactly why we've gotten to the point where we are as an industry IMO... (Electron, looking at you!)Ursola
Y
272

P/Invoke powers activate!

[DllImport("msvcrt.dll", CallingConvention=CallingConvention.Cdecl)]
static extern int memcmp(byte[] b1, byte[] b2, long count);

static bool ByteArrayCompare(byte[] b1, byte[] b2)
{
    // Validate buffers are the same length.
    // This also ensures that the count does not exceed the length of either buffer.  
    return b1.Length == b2.Length && memcmp(b1, b2, b1.Length) == 0;
}
Yongyoni answered 18/9, 2009 at 15:49 Comment(15)
P/Invoke yaay - this proved to be fastest by far on bitmaps at least: #2031717Highborn
Nice solution, but unless you pin the byte arrays into place you're asking for big time problems.Polak
Pinning is not necessary in this case. The marshaller performs automatic pinning when calling native code with PInvoke. Reference: #2218944Preengage
P/Invoke may elicit boos but it is by far the fastest of all the solutions presented, including an implementation I came up with that uses unsafe pointer-sized comparisons. There are a few optimizations you can make though before calling out to native code including reference equality and comparing the first and last elements.Connotative
Why the boo? Poster wanted a fast implementation and an optimized assembly language compare can't be beat. I don't know how to get a "REPE CMPSD" out of .NET without P/INVOKE.Treat
Another perk of memcmp solution is that it can be extended to implement full ordering of byte arrays, not just compare for equality. Big +1.Syphilology
I had some issues w/ the exact implementation of this method; resolved them w/ information found here: pinvoke.net/default.aspx/msvcrt.memcmpJenisejenkel
about 11x faster than simple for loop. About 50% faster than unsafe .NET code.Confounded
This is fast, but it may also be different from using a for loop. This implementation will compare memory instead of using the Equals comparison. In the case of a byte[], the result should be the same, but this solution cannot be generalized to arbitrary arrays.Ineluctable
This is ingenious. Since these are compiler intrinsics I never thought you could use P/Invoke to call them, moreover the speedup is due to superscalar/vectorization optimizations. That's hard to get anywhere else... great find!Haiku
Nitpick: MSVCR.dll is not supposed to be used by user code. To use the MSVCR, you would have to distribute the runtime use the version you distribute. (msdn.microsoft.com/en-us/library/… and blogs.msdn.com/b/oldnewthing/archive/2014/04/11/10516280.aspx)Zig
Note that the last parameter of memcmp is an IntPtr, not a long, because if the program is running at 32 bits it is 32 bits, at 64 bits it is 64 bits. Clearly then you have to memcmp(b1, b2, (IntPtr)b1.Length)Hodess
The Function should be called ByteArrayEquals. Equals methods return true/false. Compare methods return <0, 0, >0 for less than , equals or greater.Febrifacient
Damn, I love P/Invoke.Klinges
Can avoid the unmanaged transition in a special case… return b1.Length == b2.Length && (b1.Length == 0 || memcmp(b1, b2, b1.Length) == 0);Kamilahkamillah
P
196

Span<T> offers an extremely competitive alternative without having to throw confusing and/or non-portable fluff into your own application's code base:

// byte[] is implicitly convertible to ReadOnlySpan<byte>
static bool ByteArraysEqual(ReadOnlySpan<byte> a1, ReadOnlySpan<byte> a2)
{
    return a1.SequenceEqual(a2);
}

The (guts of the) implementation as of .NET 8.0.0 can be found here... sometimes. In the releases since I first posted this answer, the team has added a conditional path that will sometimes use an optimized CLR intrinsic whose guts can be found here instead. You can see the initial motivation and implementation in dotnet/runtime#83945. It's pretty compelling stuff, though at least in the .NET 8.0.0 version, it won't help most people coming to this page, since the intrinsic part only kicks in when the JIT compiler can "see" the length as a constant.

I've revised @EliArbel's gist to add this method as SpansEqual, drop most of the less interesting performers in others' benchmarks, run it with different array sizes, output graphs, and mark SpansEqual as the baseline so that it reports how the different methods compare to SpansEqual. I've also switched to using Linux for these comparisons in the future, which should be mostly transparent except for PInvokeMemcmp, which now uses my system's libc instead of msvcrt.

The below numbers are from the results, lightly edited to remove "Error" column.

| Method        | ByteCount  | Mean              | StdDev          | Ratio | RatioSD |
|-------------- |----------- |------------------:|----------------:|------:|--------:|
| SpansEqual    | 15         |          2.039 ns |       0.0006 ns |  1.00 |    0.00 |
| LongPointers  | 15         |          2.254 ns |       0.0040 ns |  1.11 |    0.00 |
| Unrolled      | 15         |          8.276 ns |       0.0048 ns |  4.06 |    0.00 |
| PInvokeMemcmp | 15         |          5.065 ns |       0.0077 ns |  2.48 |    0.00 |
|               |            |                   |                 |       |         |
| SpansEqual    | 1026       |         14.363 ns |       0.0861 ns |  1.00 |    0.00 |
| LongPointers  | 1026       |         36.848 ns |       0.0675 ns |  2.57 |    0.02 |
| Unrolled      | 1026       |         20.619 ns |       0.0094 ns |  1.44 |    0.01 |
| PInvokeMemcmp | 1026       |         11.995 ns |       0.0347 ns |  0.84 |    0.00 |
|               |            |                   |                 |       |         |
| SpansEqual    | 1048585    |     16,410.211 ns |      30.1120 ns |  1.00 |    0.00 |
| LongPointers  | 1048585    |     38,607.983 ns |     165.0060 ns |  2.35 |    0.01 |
| Unrolled      | 1048585    |     28,857.431 ns |      19.9256 ns |  1.76 |    0.00 |
| PInvokeMemcmp | 1048585    |     14,869.920 ns |     119.6501 ns |  0.91 |    0.01 |
|               |            |                   |                 |       |         |
| SpansEqual    | 2147483591 | 38,986,402.263 ns | 133,647.1174 ns |  1.00 |    0.00 |
| LongPointers  | 2147483591 | 76,738,513.333 ns |  18,573.6696 ns |  1.97 |    0.01 |
| Unrolled      | 2147483591 | 56,399,801.524 ns | 188,024.0981 ns |  1.45 |    0.01 |
| PInvokeMemcmp | 2147483591 | 39,772,546.319 ns |  10,138.7623 ns |  1.02 |    0.00 |

I was surprised to see SpansEqual not come out on top for the max-array-size methods, but the difference is so minor that I don't think it'll ever matter. After refreshing to run on .NET 6.0.4 with my newer hardware, SpansEqual now comfortably outperforms all others at all array sizes. And after refreshing to run on .NET 8.0.0 (and, more pertinently here, to use my Linux system's libc instead of my old Windows system's msvcrt), PInvokeMemcmp takes a slight edge at certain array sizes, though not the largest or the smallest size.

My system info:

BenchmarkDotNet v0.13.11, EndeavourOS
AMD Ryzen 9 7950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Penny answered 3/2, 2018 at 15:45 Comment(8)
I'd never thought I will be using Span<T> or something close to it in all the stuff that I do. Thanks to you I can now brag about this to my co-workers.Unrequited
Is SequenceEqual especially implemented as a Span method? Thought it was just one of the IEnumerable extension methods.Patrickpatrilateral
@Patrickpatrilateral yes, {ReadOnly,}Span<T> has its own version of SequenceEqual (same name because it has the same contract as the corresponding IEnumerable<T> extension method, it's just faster). Note that {ReadOnly,}Span<T> can't use IEnumerable<T> extension methods because of the restrictions on ref struct types.Penny
Nice to know. I'm currently still stuck on .NET 4.5, so I sadly can't use the Spans yet.Patrickpatrilateral
@Sentinel the System.Memory package has "portable" / "slow" Span<T> implementations for netstandard1.1 and above (so play with this interactive chart to see which those are). "Fast" Span<T> is only available in .NET Core 2.1, at this moment, but note that for SequenceEqual<T> specifically, there should be very little difference between "fast" and "slow" / "portable" (though netstandard2.0 targets should see a slight improvement because they have the vectorized code path).Penny
@JoeAmenta Ok thanks for the info. I will look into this more. I am doing a lot of work playing with protocols, bytes, and porting quite a bit of stuff from golang I suppose, which has these array span constructs. Span would be a lovely addition to multiplatform .NET.Saari
install-package system.memoryAppendectomy
This is now fundamentally the correct answer in .NET 6 onwards, the underlying implementation of Span<T>.SequenceEquals has vectorised fast paths if the CPU supports it. This will beat all other methods in 99.9% of cases.Fiester
A
166

There's a new built-in solution for this in .NET 4 - IStructuralEquatable

static bool ByteArrayCompare(byte[] a1, byte[] a2) 
{
    return StructuralComparisons.StructuralEqualityComparer.Equals(a1, a2);
}
Amena answered 15/11, 2011 at 17:38 Comment(4)
According to this blog post that's actually very slow.Perspex
Crazy slow. About 180x slower than simple for loop.Confounded
Why not just StructuralComparisons.StructuralEqualityComparer.Equals(a1, a2). No NullReferenceExceptions here.Freemon
@Freemon Thanks, Can't argue with a one liner ! The previous solution was slightly more efficient since it saved the cast to IStructuralEquatable (an array is statically known to be IStructuralEquatable), but indeed your suggestions makes the method work for null arguments as well.Amena
C
87

Edit: modern fast way is to use a1.SequenceEquals(a2)

User gil suggested unsafe code which spawned this solution:

// Copyright (c) 2008-2013 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe bool UnsafeCompare(byte[] a1, byte[] a2) {
  unchecked {
    if(a1==a2) return true;
    if(a1==null || a2==null || a1.Length!=a2.Length)
      return false;
    fixed (byte* p1=a1, p2=a2) {
      byte* x1=p1, x2=p2;
      int l = a1.Length;
      for (int i=0; i < l/8; i++, x1+=8, x2+=8)
        if (*((long*)x1) != *((long*)x2)) return false;
      if ((l & 4)!=0) { if (*((int*)x1)!=*((int*)x2)) return false; x1+=4; x2+=4; }
      if ((l & 2)!=0) { if (*((short*)x1)!=*((short*)x2)) return false; x1+=2; x2+=2; }
      if ((l & 1)!=0) if (*((byte*)x1) != *((byte*)x2)) return false;
      return true;
    }
  }
}

which does 64-bit based comparison for as much of the array as possible. This kind of counts on the fact that the arrays start qword aligned. It'll work if not qword aligned, just not as fast as if it were.

It performs about seven timers faster than the simple `for` loop. Using the J# library performed equivalently to the original `for` loop. Using .SequenceEqual runs around seven times slower; I think just because it is using IEnumerator.MoveNext. I imagine LINQ-based solutions being at least that slow or worse.

Confounded answered 10/1, 2012 at 18:11 Comment(15)
Nice solution. But one (small) hint: A compare if references a1 and a2 are equal may speed up things if one gives the same array for a1 and b1.Fraxinella
BTW: Tried casting to Guid for a 128-bit compare, but that made it run slower.Confounded
New test data on .NET 4 x64 release: IStructualEquatable.equals ~180x slower, SequenceEqual 15x slower, SHA1 hash compare 11x slower, bitconverter ~same, unsafe 7x faster, pinvoke 11x faster. Pretty cool that unsafe is only a little bit slower than P/Invoke on memcmp.Confounded
I understand your code but not your comment on "qword aligned". What is it and why does it impact the performance?Docilla
This link gives good detail about why memory alignment matters ibm.com/developerworks/library/pa-dalign - so, an optimization could be to check alignment and if both arrays are off alignment by the same amount, do byte compares until they are both on a qword boundary.Confounded
@Hafthor: interesting article. It's quite outdated though (Feb 2005). I wonder how things are nowadays (eg if some processors still ask the OS to take care of the alignment).Docilla
wouldnt this give false when both a1 and a2 are null?Canvasback
@Confounded You should integrate your "new test data" into your answer, for better visibility. I think it really puts things into perspective!Robbyrobbyn
@Confounded Also, how do you use Bitconverter for comparing arrays?Robbyrobbyn
@CristiDiaconescu I loopified KevinDriedger's answer. What I should probably do is make the test suite and my results available on github and link to it on my answer.Confounded
@RüdigerStevens idea to add a simple if (a1 == a2) return true; to the beginning would also make this return true if both are null, addressing @Canvasback concern about nulls comparing unequal.Penny
@Confounded unsafe can be faster than pinvoke with loop unrolling. See my answer.Synergetic
Regarding the assumption that pinned GC objects be 64-bit aligned, it's highly likely to be the case, but do you know of any official mentions or guarantees? It might just follow from GC heap alignment, which would be a reliable agent to bet on for rigorous enforcement. By chance, did you check x86 under stress? Also, a minor point regarding your statement that misalignment will run "but just not as fast". To be fair you might note that by incessantly insisting on askew access, your code perpetuates quite a perverse penalty in this (fully-precluded) case, maybe even worse than byte access.Kamilahkamillah
@RüdigerStevens With full and due deference to the original poster (wecome to undo of course), I added the change you proposed. As you note, this adds reference equality as a true-positive for abandoning the compare (avoiding pinning altogether in this case), formalizes a new correctness assumption that null==null, while also avoiding NullReferenceException cases. Unchanged from before, null is never equal to any byte[0].Kamilahkamillah
This unsafe code was ever so slightly faster than the PInvoke solution. However my arrays are always 32 bytes in length. And it was also the same speed to simply do if ((a[0] == b[0]) && (a[1] == b[1]) && ... && (a[31] == b[31])).Parochialism
P
32

If you are not opposed to doing it, you can import the J# assembly "vjslib.dll" and use its Arrays.equals(byte[], byte[]) method...

Don't blame me if someone laughs at you though...


EDIT: For what little it is worth, I used Reflector to disassemble the code for that, and here is what it looks like:

public static bool equals(sbyte[] a1, sbyte[] a2)
{
  if (a1 == a2)
  {
    return true;
  }
  if ((a1 != null) && (a2 != null))
  {
    if (a1.Length != a2.Length)
    {
      return false;
    }
    for (int i = 0; i < a1.Length; i++)
    {
      if (a1[i] != a2[i])
      {
        return false;
      }
    }
    return true;
  }
  return false;
}
Permeate answered 4/9, 2008 at 7:48 Comment(0)
E
28

.NET 3.5 and newer have a new public type, System.Data.Linq.Binary that encapsulates byte[]. It implements IEquatable<Binary> that (in effect) compares two byte arrays. Note that System.Data.Linq.Binary also has implicit conversion operator from byte[].

MSDN documentation:System.Data.Linq.Binary

Reflector decompile of the Equals method:

private bool EqualsTo(Binary binary)
{
    if (this != binary)
    {
        if (binary == null)
        {
            return false;
        }
        if (this.bytes.Length != binary.bytes.Length)
        {
            return false;
        }
        if (this.hashCode != binary.hashCode)
        {
            return false;
        }
        int index = 0;
        int length = this.bytes.Length;
        while (index < length)
        {
            if (this.bytes[index] != binary.bytes[index])
            {
                return false;
            }
            index++;
        }
    }
    return true;
}

Interesting twist is that they only proceed to byte-by-byte comparison loop if hashes of the two Binary objects are the same. This, however, comes at the cost of computing the hash in constructor of Binary objects (by traversing the array with for loop :-) ).

The above implementation means that in the worst case you may have to traverse the arrays three times: first to compute hash of array1, then to compute hash of array2 and finally (because this is the worst case scenario, lengths and hashes equal) to compare bytes in array1 with bytes in array 2.

Overall, even though System.Data.Linq.Binary is built into BCL, I don't think it is the fastest way to compare two byte arrays :-|.

Etti answered 1/5, 2009 at 13:14 Comment(0)
A
23

I posted a similar question about checking if byte[] is full of zeroes. (SIMD code was beaten so I removed it from this answer.) Here is fastest code from my comparisons:

static unsafe bool EqualBytesLongUnrolled (byte[] data1, byte[] data2)
{
    if (data1 == data2)
        return true;
    if (data1.Length != data2.Length)
        return false;

    fixed (byte* bytes1 = data1, bytes2 = data2) {
        int len = data1.Length;
        int rem = len % (sizeof(long) * 16);
        long* b1 = (long*)bytes1;
        long* b2 = (long*)bytes2;
        long* e1 = (long*)(bytes1 + len - rem);

        while (b1 < e1) {
            if (*(b1) != *(b2) || *(b1 + 1) != *(b2 + 1) || 
                *(b1 + 2) != *(b2 + 2) || *(b1 + 3) != *(b2 + 3) ||
                *(b1 + 4) != *(b2 + 4) || *(b1 + 5) != *(b2 + 5) || 
                *(b1 + 6) != *(b2 + 6) || *(b1 + 7) != *(b2 + 7) ||
                *(b1 + 8) != *(b2 + 8) || *(b1 + 9) != *(b2 + 9) || 
                *(b1 + 10) != *(b2 + 10) || *(b1 + 11) != *(b2 + 11) ||
                *(b1 + 12) != *(b2 + 12) || *(b1 + 13) != *(b2 + 13) || 
                *(b1 + 14) != *(b2 + 14) || *(b1 + 15) != *(b2 + 15))
                return false;
            b1 += 16;
            b2 += 16;
        }

        for (int i = 0; i < rem; i++)
            if (data1 [len - 1 - i] != data2 [len - 1 - i])
                return false;

        return true;
    }
}

Measured on two 256MB byte arrays:

UnsafeCompare                           : 86,8784 ms
EqualBytesSimd                          : 71,5125 ms
EqualBytesSimdUnrolled                  : 73,1917 ms
EqualBytesLongUnrolled                  : 39,8623 ms
Authoritarian answered 23/10, 2015 at 17:7 Comment(5)
I confirm. I also ran the tests. This is faster than the answer that uses memcmp unsafe call.Sandpaper
@AmberdeBlack Are you sure? Did you test with tiny arrays?Mellifluent
@Authoritarian Are you sure this is faster than memcmp, cause my testing shows otherwise?Mellifluent
I got virtually identical performance between this and memcmp, so +1 for a fully managed solution.Potemkin
Is there a difference in performance between using ulong* and long*?Scarbrough
L
12
 using System.Linq; //SequenceEqual

 byte[] ByteArray1 = null;
 byte[] ByteArray2 = null;

 ByteArray1 = MyFunct1();
 ByteArray2 = MyFunct2();

 if (ByteArray1.SequenceEqual<byte>(ByteArray2) == true)
 {
    MessageBox.Show("Match");
 }
 else
 {
   MessageBox.Show("Don't match");
 }
Lanchow answered 6/1, 2011 at 16:15 Comment(2)
That's what I've been using. But it umm... sounds like a sequential comparison you'd otherwise do using a simple loop, hence not very fast. It'd be nice to reflect it and see what's actually doing. Judging by the name, it's nothing fancy.Shirlyshiroma
Yes, but already mentioned in the accepted answer. btw, you could remove the type specification there.Canvasback
H
12

Let's add one more!

Recently Microsoft released a special NuGet package, System.Runtime.CompilerServices.Unsafe. It's special because it's written in IL, and provides low-level functionality not directly available in C#.

One of its methods, Unsafe.As<T>(object) allows casting any reference type to another reference type, skipping any safety checks. This is usually a very bad idea, but if both types have the same structure, it can work. So we can use this to cast a byte[] to a long[]:

bool CompareWithUnsafeLibrary(byte[] a1, byte[] a2)
{
    if (a1.Length != a2.Length) return false;

    var longSize = (int)Math.Floor(a1.Length / 8.0);
    var long1 = Unsafe.As<long[]>(a1);
    var long2 = Unsafe.As<long[]>(a2);

    for (var i = 0; i < longSize; i++)
    {
        if (long1[i] != long2[i]) return false;
    }

    for (var i = longSize * 8; i < a1.Length; i++)
    {
        if (a1[i] != a2[i]) return false;
    }

    return true;
}

Note that long1.Length would still return the original array's length, since it's stored in a field in the array's memory structure.

This method is not quite as fast as other methods demonstrated here, but it is a lot faster than the naive method, doesn't use unsafe code or P/Invoke or pinning, and the implementation is quite straightforward (IMO). Here are some BenchmarkDotNet results from my machine:

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4870HQ CPU 2.50GHz, ProcessorCount=8
Frequency=2435775 Hz, Resolution=410.5470 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0

                 Method |          Mean |    StdDev |
----------------------- |-------------- |---------- |
          UnsafeLibrary |   125.8229 ns | 0.3588 ns |
          UnsafeCompare |    89.9036 ns | 0.8243 ns |
           JSharpEquals | 1,432.1717 ns | 1.3161 ns |
 EqualBytesLongUnrolled |    43.7863 ns | 0.8923 ns |
              NewMemCmp |    65.4108 ns | 0.2202 ns |
            ArraysEqual |   910.8372 ns | 2.6082 ns |
          PInvokeMemcmp |    52.7201 ns | 0.1105 ns |

I've also created a gist with all the tests.

Havener answered 3/4, 2017 at 17:57 Comment(1)
It doesn't use the unsafe keyword, yet it calls unsafe code anyways by using the System.Runtime.CompilerServices.UnsafeFaircloth
S
10

I developed a method that slightly beats memcmp() (plinth's answer) and very slighly beats EqualBytesLongUnrolled() (Arek Bulski's answer) on my PC. Basically, it unrolls the loop by 4 instead of 8.

Update 30 Mar. 2019:

Starting in .NET core 3.0, we have SIMD support!

This solution is fastest by a considerable margin on my PC:

#if NETCOREAPP3_0
using System.Runtime.Intrinsics.X86;
#endif
…

public static unsafe bool Compare(byte[] arr0, byte[] arr1)
{
    if (arr0 == arr1)
    {
        return true;
    }
    if (arr0 == null || arr1 == null)
    {
        return false;
    }
    if (arr0.Length != arr1.Length)
    {
        return false;
    }
    if (arr0.Length == 0)
    {
        return true;
    }
    fixed (byte* b0 = arr0, b1 = arr1)
    {
#if NETCOREAPP3_0
        if (Avx2.IsSupported)
        {
            return Compare256(b0, b1, arr0.Length);
        }
        else if (Sse2.IsSupported)
        {
            return Compare128(b0, b1, arr0.Length);
        }
        else
#endif
        {
            return Compare64(b0, b1, arr0.Length);
        }
    }
}
#if NETCOREAPP3_0
public static unsafe bool Compare256(byte* b0, byte* b1, int length)
{
    byte* lastAddr = b0 + length;
    byte* lastAddrMinus128 = lastAddr - 128;
    const int mask = -1;
    while (b0 < lastAddrMinus128) // unroll the loop so that we are comparing 128 bytes at a time.
    {
        if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0), Avx.LoadVector256(b1))) != mask)
        {
            return false;
        }
        if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 32), Avx.LoadVector256(b1 + 32))) != mask)
        {
            return false;
        }
        if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 64), Avx.LoadVector256(b1 + 64))) != mask)
        {
            return false;
        }
        if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 96), Avx.LoadVector256(b1 + 96))) != mask)
        {
            return false;
        }
        b0 += 128;
        b1 += 128;
    }
    while (b0 < lastAddr)
    {
        if (*b0 != *b1) return false;
        b0++;
        b1++;
    }
    return true;
}
public static unsafe bool Compare128(byte* b0, byte* b1, int length)
{
    byte* lastAddr = b0 + length;
    byte* lastAddrMinus64 = lastAddr - 64;
    const int mask = 0xFFFF;
    while (b0 < lastAddrMinus64) // unroll the loop so that we are comparing 64 bytes at a time.
    {
        if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0), Sse2.LoadVector128(b1))) != mask)
        {
            return false;
        }
        if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 16), Sse2.LoadVector128(b1 + 16))) != mask)
        {
            return false;
        }
        if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 32), Sse2.LoadVector128(b1 + 32))) != mask)
        {
            return false;
        }
        if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 48), Sse2.LoadVector128(b1 + 48))) != mask)
        {
            return false;
        }
        b0 += 64;
        b1 += 64;
    }
    while (b0 < lastAddr)
    {
        if (*b0 != *b1) return false;
        b0++;
        b1++;
    }
    return true;
}
#endif
public static unsafe bool Compare64(byte* b0, byte* b1, int length)
{
    byte* lastAddr = b0 + length;
    byte* lastAddrMinus32 = lastAddr - 32;
    while (b0 < lastAddrMinus32) // unroll the loop so that we are comparing 32 bytes at a time.
    {
        if (*(ulong*)b0 != *(ulong*)b1) return false;
        if (*(ulong*)(b0 + 8) != *(ulong*)(b1 + 8)) return false;
        if (*(ulong*)(b0 + 16) != *(ulong*)(b1 + 16)) return false;
        if (*(ulong*)(b0 + 24) != *(ulong*)(b1 + 24)) return false;
        b0 += 32;
        b1 += 32;
    }
    while (b0 < lastAddr)
    {
        if (*b0 != *b1) return false;
        b0++;
        b1++;
    }
    return true;
}
Synergetic answered 23/6, 2016 at 22:23 Comment(5)
My measurements differs for .NET 462 can the NETCORE:Wagshul
The code crashes when comparing two 0-length arrays, because pinning returns null.Kamilahkamillah
memcmp is not just an equity comparer. It provides infomation which object is bigger or smaller. Can you adopt your algorithm for this purpose and check the performance?Gaul
Is it faster than Span and memcpy?Condescending
@Condescending On .NET core 3 and modern CPU, it should be 2-3 times faster for large arrays.Synergetic
B
8

I would use unsafe code and run the for loop comparing Int32 pointers.

Maybe you should also consider checking the arrays to be non-null.

Busty answered 4/9, 2008 at 7:45 Comment(0)
C
7

If you look at how .NET does string.Equals, you see that it uses a private method called EqualsHelper which has an "unsafe" pointer implementation. .NET Reflector is your friend to see how things are done internally.

This can be used as a template for byte array comparison which I did an implementation on in blog post Fast byte array comparison in C#. I also did some rudimentary benchmarks to see when a safe implementation is faster than the unsafe.

That said, unless you really need killer performance, I'd go for a simple fr loop comparison.

Copperas answered 9/8, 2009 at 20:36 Comment(0)
M
7

For those of you that care about order (i.e. want your memcmp to return an int like it should instead of nothing), .NET Core 3.0 (and presumably .NET Standard 2.1 aka .NET 5.0) will include a Span.SequenceCompareTo(...) extension method (plus a Span.SequenceEqualTo) that can be used to compare two ReadOnlySpan<T> instances (where T: IComparable<T>).

In the original GitHub proposal, the discussion included approach comparisons with jump table calculations, reading a byte[] as long[], SIMD usage, and p/invoke to the CLR implementation's memcmp.

Going forward, this should be your go-to method for comparing byte arrays or byte ranges (as should using Span<byte> instead of byte[] for your .NET Standard 2.1 APIs), and it is sufficiently fast enough that you should no longer care about optimizing it (and no, despite the similarities in name it does not perform as abysmally as the horrid Enumerable.SequenceEqual).

#if NETCOREAPP3_0_OR_GREATER
// Using the platform-native Span<T>.SequenceEqual<T>(..)
public static int Compare(byte[] range1, int offset1, byte[] range2, int offset2, int count)
{
    var span1 = range1.AsSpan(offset1, count);
    var span2 = range2.AsSpan(offset2, count);

    return span1.SequenceCompareTo(span2);
    // or, if you don't care about ordering
    // return span1.SequenceEqual(span2);
}
#else
// The most basic implementation, in platform-agnostic, safe C#
public static bool Compare(byte[] range1, int offset1, byte[] range2, int offset2, int count)
{
    // Working backwards lets the compiler optimize away bound checking after the first loop
    for (int i = count - 1; i >= 0; --i)
    {
        if (range1[offset1 + i] != range2[offset2 + i])
        {
            return false;
        }
    }

    return true;
}
#endif
Morganatic answered 12/6, 2019 at 0:48 Comment(0)
H
5

I did some measurements using attached program .net 4.7 release build without the debugger attached. I think people have been using the wrong metric since what you are about if you care about speed here is how long it takes to figure out if two byte arrays are equal. i.e. throughput in bytes.

StructuralComparison :              4.6 MiB/s
for                  :            274.5 MiB/s
ToUInt32             :            263.6 MiB/s
ToUInt64             :            474.9 MiB/s
memcmp               :           8500.8 MiB/s

As you can see, there's no better way than memcmp and it's orders of magnitude faster. A simple for loop is the second best option. And it still boggles my mind why Microsoft cannot simply include a Buffer.Compare method.

[Program.cs]:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace memcmp
{
    class Program
    {
        static byte[] TestVector(int size)
        {
            var data = new byte[size];
            using (var rng = new System.Security.Cryptography.RNGCryptoServiceProvider())
            {
                rng.GetBytes(data);
            }
            return data;
        }

        static TimeSpan Measure(string testCase, TimeSpan offset, Action action, bool ignore = false)
        {
            var t = Stopwatch.StartNew();
            var n = 0L;
            while (t.Elapsed < TimeSpan.FromSeconds(10))
            {
                action();
                n++;
            }
            var elapsed = t.Elapsed - offset;
            if (!ignore)
            {
                Console.WriteLine($"{testCase,-16} : {n / elapsed.TotalSeconds,16:0.0} MiB/s");
            }
            return elapsed;
        }

        [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
        static extern int memcmp(byte[] b1, byte[] b2, long count);

        static void Main(string[] args)
        {
            // how quickly can we establish if two sequences of bytes are equal?

            // note that we are testing the speed of different comparsion methods

            var a = TestVector(1024 * 1024); // 1 MiB
            var b = (byte[])a.Clone();

            // was meant to offset the overhead of everything but copying but my attempt was a horrible mistake... should have reacted sooner due to the initially ridiculous throughput values...
            // Measure("offset", new TimeSpan(), () => { return; }, ignore: true);
            var offset = TimeZone.Zero

            Measure("StructuralComparison", offset, () =>
            {
                StructuralComparisons.StructuralEqualityComparer.Equals(a, b);
            });

            Measure("for", offset, () =>
            {
                for (int i = 0; i < a.Length; i++)
                {
                    if (a[i] != b[i]) break;
                }
            });

            Measure("ToUInt32", offset, () =>
            {
                for (int i = 0; i < a.Length; i += 4)
                {
                    if (BitConverter.ToUInt32(a, i) != BitConverter.ToUInt32(b, i)) break;
                }
            });

            Measure("ToUInt64", offset, () =>
            {
                for (int i = 0; i < a.Length; i += 8)
                {
                    if (BitConverter.ToUInt64(a, i) != BitConverter.ToUInt64(b, i)) break;
                }
            });

            Measure("memcmp", offset, () =>
            {
                memcmp(a, b, a.Length);
            });
        }
    }
}
Haiku answered 28/9, 2017 at 14:23 Comment(2)
is the memcmp call dependent on msvc something tied to Visual C++ or can it use clang too ?Copywriter
You can import almost any function as long as there's some metadata to bind to it. The reason I use msvcrt is that it ships with the CLR. But there's nothing special about it. You can DllImport anything. Just make sure the marshalling and calling conventions match.Haiku
M
4

Couldn't find a solution I'm completely happy with (reasonable performance, but no unsafe code/pinvoke) so I came up with this, nothing really original, but works:

    /// <summary>
    /// 
    /// </summary>
    /// <param name="array1"></param>
    /// <param name="array2"></param>
    /// <param name="bytesToCompare"> 0 means compare entire arrays</param>
    /// <returns></returns>
    public static bool ArraysEqual(byte[] array1, byte[] array2, int bytesToCompare = 0)
    {
        if (array1.Length != array2.Length) return false;

        var length = (bytesToCompare == 0) ? array1.Length : bytesToCompare;
        var tailIdx = length - length % sizeof(Int64);

        //check in 8 byte chunks
        for (var i = 0; i < tailIdx; i += sizeof(Int64))
        {
            if (BitConverter.ToInt64(array1, i) != BitConverter.ToInt64(array2, i)) return false;
        }

        //check the remainder of the array, always shorter than 8 bytes
        for (var i = tailIdx; i < length; i++)
        {
            if (array1[i] != array2[i]) return false;
        }

        return true;
    }

Performance compared with some of the other solutions on this page:

Simple Loop: 19837 ticks, 1.00

*BitConverter: 4886 ticks, 4.06

UnsafeCompare: 1636 ticks, 12.12

EqualBytesLongUnrolled: 637 ticks, 31.09

P/Invoke memcmp: 369 ticks, 53.67

Tested in linqpad, 1000000 bytes identical arrays (worst case scenario), 500 iterations each.

Mellifluent answered 29/3, 2016 at 17:32 Comment(2)
yeah, I noted that in the comment of https://mcmap.net/q/63825/-comparing-two-byte-arrays-in-net that my testing shows this is actually a little bit slower than the simple for loop I had in the original question.Confounded
are you sure? In my testing it is 4 times faster? Nothing beats good old native code though, even with marshaling overhead.Mellifluent
W
4

It seems that EqualBytesLongUnrolled is the best from the above suggested.

Skipped methods (Enumerable.SequenceEqual,StructuralComparisons.StructuralEqualityComparer.Equals), were not-patient-for-slow. On 265MB arrays I have measured this:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-3770 CPU 3.40GHz, ProcessorCount=8
Frequency=3323582 ticks, Resolution=300.8802 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1590.0

Type=CompareMemoriesBenchmarks  Mode=Throughput  

                 Method |      Median |    StdDev | Scaled | Scaled-SD |
----------------------- |------------ |---------- |------- |---------- |
             NewMemCopy |  30.0443 ms | 1.1880 ms |   1.00 |      0.00 |
 EqualBytesLongUnrolled |  29.9917 ms | 0.7480 ms |   0.99 |      0.04 |
          msvcrt_memcmp |  30.0930 ms | 0.2964 ms |   1.00 |      0.03 |
          UnsafeCompare |  31.0520 ms | 0.7072 ms |   1.03 |      0.04 |
       ByteArrayCompare | 212.9980 ms | 2.0776 ms |   7.06 |      0.25 |

OS=Windows
Processor=?, ProcessorCount=8
Frequency=3323582 ticks, Resolution=300.8802 ns, Timer=TSC
CLR=CORE, Arch=64-bit ? [RyuJIT]
GC=Concurrent Workstation
dotnet cli version: 1.0.0-preview2-003131

Type=CompareMemoriesBenchmarks  Mode=Throughput  

                 Method |      Median |    StdDev | Scaled | Scaled-SD |
----------------------- |------------ |---------- |------- |---------- |
             NewMemCopy |  30.1789 ms | 0.0437 ms |   1.00 |      0.00 |
 EqualBytesLongUnrolled |  30.1985 ms | 0.1782 ms |   1.00 |      0.01 |
          msvcrt_memcmp |  30.1084 ms | 0.0660 ms |   1.00 |      0.00 |
          UnsafeCompare |  31.1845 ms | 0.4051 ms |   1.03 |      0.01 |
       ByteArrayCompare | 212.0213 ms | 0.1694 ms |   7.03 |      0.01 |
Wagshul answered 26/9, 2016 at 11:24 Comment(0)
D
3

For comparing short byte arrays the following is an interesting hack:

if(myByteArray1.Length != myByteArray2.Length) return false;
if(myByteArray1.Length == 8)
   return BitConverter.ToInt64(myByteArray1, 0) == BitConverter.ToInt64(myByteArray2, 0); 
else if(myByteArray.Length == 4)
   return BitConverter.ToInt32(myByteArray2, 0) == BitConverter.ToInt32(myByteArray2, 0); 

Then I would probably fall out to the solution listed in the question.

It'd be interesting to do a performance analysis of this code.

Doth answered 18/9, 2009 at 15:29 Comment(1)
int i=0; for(;i<a1.Length-7;i+=8) if(BitConverter.ToInt64(a1,i)!=BitConverter.ToInt64(a2,i)) return false; for(;i<a1.Length;i++) if(a1[i]!=a2[i]) return false; return true; // a little bit slower than simple for loop.Confounded
J
3

I have not seen many linq solutions here.

I am not sure of the performance implications, however I generally stick to linq as rule of thumb and then optimize later if necessary.

public bool CompareTwoArrays(byte[] array1, byte[] array2)
 {
   return !array1.Where((t, i) => t != array2[i]).Any();
 }

Please do note this only works if they are the same size arrays. an extension could look like so

public bool CompareTwoArrays(byte[] array1, byte[] array2)
 {
   if (array1.Length != array2.Length) return false;
   return !array1.Where((t, i) => t != array2[i]).Any();
 }
Jobey answered 2/2, 2018 at 6:53 Comment(1)
The whole point of the question is a faster solution that the function posted in the question.Low
R
2

I thought about block-transfer acceleration methods built into many graphics cards. But then you would have to copy over all the data byte-wise, so this doesn't help you much if you don't want to implement a whole portion of your logic in unmanaged and hardware-dependent code...

Another way of optimization similar to the approach shown above would be to store as much of your data as possible in a long[] rather than a byte[] right from the start, for example if you are reading it sequentially from a binary file, or if you use a memory mapped file, read in data as long[] or single long values. Then, your comparison loop will only need 1/8th of the number of iterations it would have to do for a byte[] containing the same amount of data. It is a matter of when and how often you need to compare vs. when and how often you need to access the data in a byte-by-byte manner, e.g. to use it in an API call as a parameter in a method that expects a byte[]. In the end, you only can tell if you really know the use case...

Remembrance answered 7/8, 2009 at 11:9 Comment(1)
The accepted answer recasts the byte buffer as a long buffer and compares it as you describe.Confounded
I
1

Sorry, if you're looking for a managed way you're already doing it correctly and to my knowledge there's no built in method in the BCL for doing this.

You should add some initial null checks and then just reuse it as if it where in BCL.

Impulse answered 4/9, 2008 at 7:45 Comment(1)
You were right when you wrote that, however in 2010 (.NET 4.0) a BCL method came, see Ohad Schneider's answer. At the time of the question, .NET 3.5 had Linq (see aku's answer).Brentbrenton
C
1

I settled on a solution inspired by the EqualBytesLongUnrolled method posted by ArekBulski with an additional optimization. In my instance, array differences in arrays tend to be near the tail of the arrays. In testing, I found that when this is the case for large arrays, being able to compare array elements in reverse order gives this solution a huge performance gain over the memcmp based solution. Here is that solution:

public enum CompareDirection { Forward, Backward }

private static unsafe bool UnsafeEquals(byte[] a, byte[] b, CompareDirection direction = CompareDirection.Forward)
{
    // returns when a and b are same array or both null
    if (a == b) return true;

    // if either is null or different lengths, can't be equal
    if (a == null || b == null || a.Length != b.Length)
        return false;

    const int UNROLLED = 16;                // count of longs 'unrolled' in optimization
    int size = sizeof(long) * UNROLLED;     // 128 bytes (min size for 'unrolled' optimization)
    int len = a.Length;
    int n = len / size;         // count of full 128 byte segments
    int r = len % size;         // count of remaining 'unoptimized' bytes

    // pin the arrays and access them via pointers
    fixed (byte* pb_a = a, pb_b = b)
    {
        if (r > 0 && direction == CompareDirection.Backward)
        {
            byte* pa = pb_a + len - 1;
            byte* pb = pb_b + len - 1;
            byte* phead = pb_a + len - r;
            while(pa >= phead)
            {
                if (*pa != *pb) return false;
                pa--;
                pb--;
            }
        }

        if (n > 0)
        {
            int nOffset = n * size;
            if (direction == CompareDirection.Forward)
            {
                long* pa = (long*)pb_a;
                long* pb = (long*)pb_b;
                long* ptail = (long*)(pb_a + nOffset);
                while (pa < ptail)
                {
                    if (*(pa + 0) != *(pb + 0) || *(pa + 1) != *(pb + 1) ||
                        *(pa + 2) != *(pb + 2) || *(pa + 3) != *(pb + 3) ||
                        *(pa + 4) != *(pb + 4) || *(pa + 5) != *(pb + 5) ||
                        *(pa + 6) != *(pb + 6) || *(pa + 7) != *(pb + 7) ||
                        *(pa + 8) != *(pb + 8) || *(pa + 9) != *(pb + 9) ||
                        *(pa + 10) != *(pb + 10) || *(pa + 11) != *(pb + 11) ||
                        *(pa + 12) != *(pb + 12) || *(pa + 13) != *(pb + 13) ||
                        *(pa + 14) != *(pb + 14) || *(pa + 15) != *(pb + 15)
                    )
                    {
                        return false;
                    }
                    pa += UNROLLED;
                    pb += UNROLLED;
                }
            }
            else
            {
                long* pa = (long*)(pb_a + nOffset);
                long* pb = (long*)(pb_b + nOffset);
                long* phead = (long*)pb_a;
                while (phead < pa)
                {
                    if (*(pa - 1) != *(pb - 1) || *(pa - 2) != *(pb - 2) ||
                        *(pa - 3) != *(pb - 3) || *(pa - 4) != *(pb - 4) ||
                        *(pa - 5) != *(pb - 5) || *(pa - 6) != *(pb - 6) ||
                        *(pa - 7) != *(pb - 7) || *(pa - 8) != *(pb - 8) ||
                        *(pa - 9) != *(pb - 9) || *(pa - 10) != *(pb - 10) ||
                        *(pa - 11) != *(pb - 11) || *(pa - 12) != *(pb - 12) ||
                        *(pa - 13) != *(pb - 13) || *(pa - 14) != *(pb - 14) ||
                        *(pa - 15) != *(pb - 15) || *(pa - 16) != *(pb - 16)
                    )
                    {
                        return false;
                    }
                    pa -= UNROLLED;
                    pb -= UNROLLED;
                }
            }
        }

        if (r > 0 && direction == CompareDirection.Forward)
        {
            byte* pa = pb_a + len - r;
            byte* pb = pb_b + len - r;
            byte* ptail = pb_a + len;
            while(pa < ptail)
            {
                if (*pa != *pb) return false;
                pa++;
                pb++;
            }
        }
    }

    return true;
}
Counsel answered 25/2, 2018 at 16:19 Comment(0)
I
1

This is similar to others, but the difference here is that there is no falling through to the next highest number of bytes I can check at once, e.g. if I have 63 bytes (in my SIMD example) I can check the equality of the first 32 bytes, and then the last 32 bytes, which is faster than checking 32 bytes, 16 bytes, 8 bytes, and so on. The first check you enter is the only check you will need to compare all of the bytes.

This does come out on top in my tests, but just by a hair.

The following code is exactly how I tested it in airbreather/ArrayComparePerf.cs.

public unsafe bool SIMDNoFallThrough()    #requires  System.Runtime.Intrinsics.X86
{
    if (a1 == null || a2 == null)
        return false;

    int length0 = a1.Length;

    if (length0 != a2.Length) return false;

    fixed (byte* b00 = a1, b01 = a2)
    {
        byte* b0 = b00, b1 = b01, last0 = b0 + length0, last1 = b1 + length0, last32 = last0 - 31;

        if (length0 > 31)
        {
            while (b0 < last32)
            {
                if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0), Avx.LoadVector256(b1))) != -1)
                    return false;
                b0 += 32;
                b1 += 32;
            }
            return Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(last0 - 32), Avx.LoadVector256(last1 - 32))) == -1;
        }

        if (length0 > 15)
        {
            if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0), Sse2.LoadVector128(b1))) != 65535)
                return false;
            return Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(last0 - 16), Sse2.LoadVector128(last1 - 16))) == 65535;
        }

        if (length0 > 7)
        {
            if (*(ulong*)b0 != *(ulong*)b1)
                return false;
            return *(ulong*)(last0 - 8) == *(ulong*)(last1 - 8);
        }

        if (length0 > 3)
        {
            if (*(uint*)b0 != *(uint*)b1)
                return false;
            return *(uint*)(last0 - 4) == *(uint*)(last1 - 4);
        }

        if (length0 > 1)
        {
            if (*(ushort*)b0 != *(ushort*)b1)
                return false;
            return *(ushort*)(last0 - 2) == *(ushort*)(last1 - 2);
        }

        return *b0 == *b1;
    }
}

If no SIMD is preferred, the same method applied to the the existing LongPointers algorithm:

public unsafe bool LongPointersNoFallThrough()
{
    if (a1 == null || a2 == null || a1.Length != a2.Length)
        return false;
    fixed (byte* p1 = a1, p2 = a2)
    {
        byte* x1 = p1, x2 = p2;
        int l = a1.Length;
        if ((l & 8) != 0)
        {
            for (int i = 0; i < l / 8; i++, x1 += 8, x2 += 8)
                if (*(long*)x1 != *(long*)x2) return false;
            return *(long*)(x1 + (l - 8)) == *(long*)(x2 + (l - 8));
        }
        if ((l & 4) != 0)
        {
            if (*(int*)x1 != *(int*)x2) return false; x1 += 4; x2 += 4;
            return *(int*)(x1 + (l - 4)) == *(int*)(x2 + (l - 4));
        }
        if ((l & 2) != 0)
        {
            if (*(short*)x1 != *(short*)x2) return false; x1 += 2; x2 += 2;
            return *(short*)(x1 + (l - 2)) == *(short*)(x2 + (l - 2));
        }
        return *x1 == *x2;
    }
}
Impanation answered 22/9, 2021 at 7:43 Comment(0)
F
0

This is almost certainly much slower than any other version given here, but it was fun to write.

static bool ByteArrayEquals(byte[] a1, byte[] a2) 
{
    return a1.Zip(a2, (l, r) => l == r).All(x => x);
}
Febrifacient answered 1/9, 2017 at 17:56 Comment(0)
M
-2

If you are looking for a very fast byte array equality comparer, I suggest you take a look at this STSdb Labs article: Byte array equality comparer. It features some of the fastest implementations for byte[] array equality comparing, which are presented, performance tested and summarized.

You can also focus on these implementations:

BigEndianByteArrayComparer - fast byte[] array comparer from left to right (BigEndian) BigEndianByteArrayEqualityComparer - - fast byte[] equality comparer from left to right (BigEndian) LittleEndianByteArrayComparer - fast byte[] array comparer from right to left (LittleEndian) LittleEndianByteArrayEqualityComparer - fast byte[] equality comparer from right to left (LittleEndian)

Manganin answered 2/7, 2014 at 11:7 Comment(0)
T
-2

Use SequenceEquals for this to comparison.

Tennyson answered 23/4, 2015 at 18:44 Comment(0)
O
-3

The short answer is this:

    public bool Compare(byte[] b1, byte[] b2)
    {
        return Encoding.ASCII.GetString(b1) == Encoding.ASCII.GetString(b2);
    }

In such a way you can use the optimized .NET string compare to make a byte array compare without the need to write unsafe code. This is how it is done in the background:

private unsafe static bool EqualsHelper(String strA, String strB)
{
    Contract.Requires(strA != null);
    Contract.Requires(strB != null);
    Contract.Requires(strA.Length == strB.Length);

    int length = strA.Length;

    fixed (char* ap = &strA.m_firstChar) fixed (char* bp = &strB.m_firstChar)
    {
        char* a = ap;
        char* b = bp;

        // Unroll the loop

        #if AMD64
            // For the AMD64 bit platform we unroll by 12 and
            // check three qwords at a time. This is less code
            // than the 32 bit case and is shorter
            // pathlength.

            while (length >= 12)
            {
                if (*(long*)a     != *(long*)b)     return false;
                if (*(long*)(a+4) != *(long*)(b+4)) return false;
                if (*(long*)(a+8) != *(long*)(b+8)) return false;
                a += 12; b += 12; length -= 12;
            }
       #else
           while (length >= 10)
           {
               if (*(int*)a != *(int*)b) return false;
               if (*(int*)(a+2) != *(int*)(b+2)) return false;
               if (*(int*)(a+4) != *(int*)(b+4)) return false;
               if (*(int*)(a+6) != *(int*)(b+6)) return false;
               if (*(int*)(a+8) != *(int*)(b+8)) return false;
               a += 10; b += 10; length -= 10;
           }
       #endif

        // This depends on the fact that the String objects are
        // always zero terminated and that the terminating zero is not included
        // in the length. For odd string sizes, the last compare will include
        // the zero terminator.
        while (length > 0)
        {
            if (*(int*)a != *(int*)b) break;
            a += 2; b += 2; length -= 2;
        }

        return (length <= 0);
    }
}
Over answered 9/6, 2015 at 18:33 Comment(6)
In my tests, the conversion to a string destroys the advantage of the faster compare. This was about 2.5 times slower than a simple for loop.Tiffany
When i did the same the simple for was about 8 times slower. Can you write your code here?Over
Will this break if a byte contains a null (0) value?Tipcat
-1 As well as being slow because of the conversion to string as pointed out by @DougClutter, this will fail if the byte array contains non-ASCII data. To get the right result it would need to use iso-8859-1.Chyack
Very bad approach. While the comparison itself is indeed fast, to get there you need to needlessly burden the garbage collector with yet another allocation, as well as do the allocation itself, copy the data (which requires a loop anyway because each byte must be widened to a char), and the result doesn't even handle characters that are outside the ASCII range! So, no, it's all bad.Gamine
Compare(new byte[]{128}, new byte[]{ 255 }) == true not buggy at all...Low
M
-3

Since many of the fancy solutions above don't work with UWP and because I love Linq and functional approaches I pressent you my version to this problem. To escape the comparison when the first difference occures, I chose .FirstOrDefault()

public static bool CompareByteArrays(byte[] ba0, byte[] ba1) =>
    !(ba0.Length != ba1.Length || Enumerable.Range(1,ba0.Length)
        .FirstOrDefault(n => ba0[n] != ba1[n]) > 0);
Minster answered 7/2, 2018 at 14:43 Comment(2)
-1 because this code is broken and apparently untested. This throws an IndexOutOfRangeException when comparing non-empty arrays because you're accessing elements 1 through ba0.Length when it should be 0 through ba0.Length - 1. If you fix that with Enumerable.Range(0, ba0.Length) it still incorrectly returns true for arrays of equal length where only the first elements differ because you can't distinguish between the first elements satisfying predicate and no elements satisfying predicate; FirstOrDefault<int>() returns 0 in both cases.Ahern
The lesson here kids: don't bring a knife to a gun fightMoravia

© 2022 - 2024 — McMap. All rights reserved.