Performance of Arrays vs. Lists
Asked Answered
J

15

256

Say you need to have a list/array of integers which you need to iterate over frequently, and I mean extremely often. The reasons may vary, but say it's at the heart of the innermost loop of a high-volume processing application.

In general, one would opt for using Lists (List<T>) due to their flexibility in size. On top of that, the MSDN documentation claims Lists use an array internally and should perform just as fast (a quick look with Reflector confirms this). Nevertheless, there is some overhead involved.

Did anyone actually measure this? Would iterating 6M times through a list take the same time as it would through an array?

Jejune answered 18/1, 2009 at 10:20 Comment(2)
Performance issues aside, I prefer usage of Arrays over Lists for their fixed size (in cases where changing the number of items is not required, of course). When reading existing code, I find it helpful to quickly know that an item is forced to have fixed size, rather than having to inspect the code further down in the function.Rog
T[] vs. List<T> can make a big performance difference. I just optimized an extremely (nested) loop intensive application to move from lists to arrays on .NET 4.0. I was expecting maybe 5% to 10% improvement but got over 40% speedup! No other changes than moving directly from list to array. All enumerations were done with foreach statements. Based on Marc Gravell's answer, it looks like foreach with List<T> is particularly bad.Taproom
C
277

Very easy to measure...

In a small amount of tight-loop processing code where I know the length is fixed, I use arrays for that extra tiny bit of micro-optimisation; arrays can be marginally faster if you use the indexer / for form - but IIRC it depends on the type of data in the array. But unless you need to micro-optimise, keep it simple and use List<T> etc.

Of course, this only applies if you are reading all of the data; a dictionary would be quicker for key-based lookups.

Here's my results using "int" (the second number is a checksum to verify they all did the same work):

(edited to fix bug)

List/for: 1971ms (589725196)
Array/for: 1864ms (589725196)
List/foreach: 3054ms (589725196)
Array/foreach: 1860ms (589725196)

based on the test rig:

using System;
using System.Collections.Generic;
using System.Diagnostics;
static class Program
{
    static void Main()
    {
        List<int> list = new List<int>(6000000);
        Random rand = new Random(12345);
        for (int i = 0; i < 6000000; i++)
        {
            list.Add(rand.Next(5000));
        }
        int[] arr = list.ToArray();

        int chk = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = list.Count;
            for (int i = 0; i < len; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            for (int i = 0; i < arr.Length; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in list)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in arr)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        Console.ReadLine();
    }
}
Careworn answered 18/1, 2009 at 10:23 Comment(8)
Interesting detail: Here are the RELEASE/DEBUG times on my rig (.NET 3.5 SP1): 0.92, 0.80, 0.96, 0.93; which tells me that there is some intelligence in the VM to optimize the Array/for loop approximately 10% better than the other cases.Congregational
Yes, there is JIT optimisation for array/for. Actually, I was under the impression that it included the Length case (since it knows it is fixed), hence why I didn't pull it out first (unlike List where I did). Thanks for the update.Careworn
oh, also the difference between for and foreach is greatly diminished if you actually don't do double the work in this case! (start with correcting the initialisation to list.Add(rand.Next());)Congregational
I intended to use rand.Next() - oops! However, that makes exactly zero difference to the for/foreach issue.Careworn
Weird. My very similar tests show no significant difference between for and foreach when using arrays. Will investigate thoroughly in a blog post with a benchmark which other people can run and send me results for...Pandorapandour
foreach (int i in arr) { chk += list[i]; } Why are you accessing list[i] inside both the foreach loops, instead of doing chk += i; ? That is why this is taking much longer, I believeEcstasy
I get dramatically different results if using a different variable for chk for each test (chk1, chk2, chk3, chk4). List/for=1303ms, Array/for=433ms. Any ideas why?Sims
The link mentioned in the above comment by Jon to Jon Skeet's blog was broken. Below is the updated link. codeblog.jonskeet.uk/2009/01/29/…Skerry
M
150

Short answer:

In .NET, List<T> and arrays (T[]) have essentially the same speed/performance, because List<T> is a wrapper around an array.

Once more: a List is an array inside! In .NET, List<T> is the equivalent of the ArrayList<T> of other languages.


Details of what to use in which cases:

  • Use an Array:

    • As often as possible. It's fast and takes the smallest amount of RAM for the same amount of information.
    • If you know the exact number of cells needed
    • If the data stored is < 85,000 bytes (85,000 / 4 bytes ≈ 21,250 elements for int data), so it stays below the large object heap threshold
    • If you need high random-access speed
  • Use a List:

    • If you need to add cells to the end of the list (often)
    • If you need to add cells at the beginning/middle of the list (NOT often)
    • If the data stored is < 85,000 bytes (85,000 / 4 bytes ≈ 21,250 elements for int data)
    • If you need high random-access speed
  • Use a LinkedList:

    • If you need to add cells at the beginning/middle/end of the list (often)

    • If you need only sequential access (forward/backward)

    • If you need to store LARGE items, but the item count is low

    • Better not to use it for a large number of items, as it uses additional memory for the links.

      If you are not sure that you need a LinkedList -- YOU DON'T NEED IT.

      Just do not use it.


More details:

[Image: color legend for the comparison chart]

[Image: Array vs List vs LinkedList comparison chart]

Much more details:

https://mcmap.net/q/10282/-when-should-i-use-a-list-vs-a-linkedlist

Maximo answered 12/3, 2017 at 21:28 Comment(10)
I'm slightly confused by your assertion that list prepend is relatively fast but insertion is slow. Insertion is also linear time, and faster by 50% on average than prepend.Pulverize
@MikeMarynowski in C# a List is a wrapper around an array. So in the case of insertion into a list you will have linear time only up to some point. After that, the system will create a new, bigger array and will need time to copy the items from the old one.Maximo
Same thing with prepends.Pulverize
A prepend operation is just an insert at 0. It's the worst case insert in terms of performance, so if insert is slow, prepend is even slower.Pulverize
Both insert and prepend is O(n) (amortized). A prepend IS an insert, but it's the slowest possible insert because it has to move ALL the items in the list one spot up. An insert in a random location only has to move up the items that are at a higher index than the insertion point, so 50% of the items on average.Pulverize
A pretty good rule-of-thumb is "never use linked list" :-). Something like a List or Queue is almost always faster, unless you need insertion in middle AND somehow maintain a pointer there. Of course there are cases where a linked list is genuinely the best choice, but it is exceedingly rare, in my experience.Excitant
Yeah, this chart doesn't account for cache locality. In the real world, unless you're working on your linked list so frequently that you're not getting any cache misses (spoiler, you're probably not), then List is going to win by a considerable margin in nearly every situation for any operation. Cache misses are the worst.Homebody
Linked lists are anything but great for sequential processing. You have so much indirection that your cpu has literally no idea where to look for the next chunk of data. It might just be a constant time penalty, but unless you have a really critical need to use a linked list you'll be far better off not messing with your cpus memory prefetching.Deranged
Probably would be best to just remove prepending from that table; it's merely a special-case of inserting. And, as @MikeMarynowski was saying, it's inconsistent to say that a List<> is fast at prepending but slow at inserting, as prepending is the worst-case inserting.Cloyd
Practically, a List<> can be good if modifications are only additions, or performance isn't critical.Cloyd
L
30

I think the performance will be quite similar. The overhead involved when using a List vs an Array is, IMHO, in adding items to the list, and in the list having to increase the size of its internal array when that array's capacity is reached.

Suppose you have a List with a Capacity of 10; the List will then increase its capacity once you want to add the 11th element. You can decrease the performance impact by initializing the Capacity of the list to the number of items it will hold.
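As a rough illustration of that growth cost (a sketch, not part of the benchmark below; it assumes the usual System and System.Collections.Generic usings, and the exact growth policy is an implementation detail), compare a list that grows on demand with one that is pre-sized:

// Sketch only: shows the reallocation behaviour described above.
// Current implementations roughly double Capacity each time it is exceeded.
var growing = new List<int>();            // starts small, reallocates as it fills up
var presized = new List<int>(6000000);    // one allocation up front

for (int i = 0; i < 6000000; i++)
{
    growing.Add(i);    // occasionally triggers "allocate bigger array + copy"
    presized.Add(i);   // never needs to grow
}

Console.WriteLine(growing.Capacity);      // some value >= 6000000, reached by repeated doubling
Console.WriteLine(presized.Capacity);     // 6000000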

But, in order to figure out if iterating over a List is as fast as iterating over an array, why don't you benchmark it?

int numberOfElements = 6000000;

List<int> theList = new List<int> (numberOfElements);
int[] theArray = new int[numberOfElements];

for( int i = 0; i < numberOfElements; i++ )
{
    theList.Add (i);
    theArray[i] = i;
}

Stopwatch chrono = new Stopwatch ();

chrono.Start ();

int j;

for( int i = 0; i < numberOfElements; i++ )
{
    j = theList[i];
}

chrono.Stop ();
Console.WriteLine (String.Format("iterating the List took {0} msec", chrono.ElapsedMilliseconds));

chrono.Reset();

chrono.Start();

for( int i = 0; i < numberOfElements; i++ )
{
    j = theArray[i];
}

chrono.Stop ();
Console.WriteLine (String.Format("iterating the array took {0} msec", chrono.ElapsedMilliseconds));

Console.ReadLine();

On my system, iterating over the array took 33 msec; iterating over the list took 66 msec.

To be honest, I didn't expect the difference to be that large. So, I've put my iteration in a loop: now, I execute both iterations 1000 times. The results are:

iterating the List took 67146 msec
iterating the array took 40821 msec

Now, the variation is not that large anymore, but still ...

Therefore, I've started up .NET Reflector, and the getter of the indexer of the List class looks like this:

public T get_Item(int index)
{
    if (index >= this._size)
    {
        ThrowHelper.ThrowArgumentOutOfRangeException();
    }
    return this._items[index];
}

As you can see, when you use the indexer of the List, the List first checks that the index does not go outside the bounds of the internal array. This additional check comes with a cost.
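For the array access, by contrast, the JIT can usually remove the equivalent check when the loop condition uses the array's own Length. This is JIT behaviour that varies between runtime versions, so treat the sketch below as a rule of thumb rather than a guarantee:

// Sketch: the per-element bounds check is typically elided in this exact pattern,
// because the JIT can prove i is always within range.
for (int i = 0; i < theArray.Length; i++)
{
    j = theArray[i];   // no extra check needed here
}

// The List indexer shown above cannot benefit in the same way: every theList[i]
// call still compares the index against _size before touching the internal array.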

Longstanding answered 18/1, 2009 at 10:32 Comment(3)
Hi Frederik, Thanks! How would you explain that your list took double the time of the arrays. Not what you would expect. Did you try to increase the number of elements?Jejune
Wouldn't return this._items[index]; already throw an exception if the index was out of range? Why does .NET have this extra check when the end result is the same with or without it?Ruyle
@John Mercier the check is against the Size of the list (number of items currently contained), which is different to and probably less than the capacity of the _items array. The array is allocated with excess capacity to make adding future items quicker by not requiring re-allocation for every addition.Wrestling
T
24

If you are just getting a single value out of either (not in a loop), then both do bounds checking (you're in managed code, remember); it's just that the list does it twice. See the notes later for why this is likely not a big deal.

If you are using your own for(int i = 0; i < x.Length/Count; i++) loop, then the key difference is as follows:

  • Array:
    • bounds checking is removed
  • Lists
    • bounds checking is performed

If you are using foreach then the key difference is as follows:

  • Array:
    • no object is allocated to manage the iteration
    • bounds checking is removed
  • List via a variable known to be List.
    • the iteration management variable is stack allocated
    • bounds checking is performed
  • List via a variable known to be IList.
    • the iteration management variable is heap allocated
    • bounds checking is performed; also, the List's values may not be altered during the foreach, whereas the array's can be.

The bounds checking is often no big deal (especially if you are on a CPU with a deep pipeline and branch prediction - the norm for most these days), but only your own profiling can tell you if it is an issue. If you are in parts of your code where you are avoiding heap allocations (good examples are libraries or hashcode implementations), then ensuring the variable is typed as List, not IList, will avoid that pitfall. As always, profile if it matters.
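As a small sketch of that last point (illustrative names, assuming the usual usings): the same foreach compiles very differently depending on the static type of the variable:

List<int> concrete = new List<int> { 1, 2, 3 };
IList<int> viaInterface = concrete;
int sum = 0;

// Static type is List<int>: the compiler uses the struct List<int>.Enumerator,
// so no heap allocation is needed for the iteration.
foreach (int value in concrete)
{
    sum += value;
}

// Static type is IList<int>: GetEnumerator() is called through the interface,
// the struct enumerator gets boxed, and the iteration allocates on the heap.
foreach (int value in viaInterface)
{
    sum += value;
}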

Temporize answered 21/1, 2009 at 13:11 Comment(0)
C
12

[See also this question]

I've modified Marc's answer to use actual random numbers and actually do the same work in all cases.

Results:

         for      foreach
Array : 1575ms     1575ms (+0%)
List  : 1630ms     2627ms (+61%)
         (+3%)     (+67%)

(Checksum: -1000038876)

Compiled as Release under VS 2008 SP1. Running without debugging, .NET 3.5 SP1.

Code:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        List<int> list = new List<int>(6000000);
        Random rand = new Random(1);
        for (int i = 0; i < 6000000; i++)
        {
            list.Add(rand.Next());
        }
        int[] arr = list.ToArray();

        int chk = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = list.Count;
            for (int i = 0; i < len; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = arr.Length;
            for (int i = 0; i < len; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in list)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in arr)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
        Console.WriteLine();

        Console.ReadLine();
    }
}
Congregational answered 18/1, 2009 at 10:20 Comment(2)
That's weird - I've just run your exact code, built from the command line (3.5SP1) with /o+ /debug- and my results are: list/for: 1524; array/for: 1472; list/foreach:4128; array/foreach:1484.Pandorapandour
You say this was compiled as release - can I just check that you did run it rather than debugging it? Silly question, I know, but I can't explain the results otherwise...Pandorapandour
L
4

I was worried that the benchmarks posted in other answers would still leave room for the compiler to optimize, eliminate or merge loops, so I wrote one that:

  • Used unpredictable inputs (random)
  • Runs a calculation with the result printed to the console
  • Modifies the input data each repetition

The result was that a direct array has about 250% better performance than an array accessed through an IList:

  • 1 billion array accesses: 4000 ms
  • 1 billion list accesses: 10000 ms
  • 100 million array accesses: 350 ms
  • 100 million list accesses: 1000 ms

Here's the code:

static void Main(string[] args) {
  const int TestPointCount = 1000000;
  const int RepetitionCount = 1000;

  Stopwatch arrayTimer = new Stopwatch();
  Stopwatch listTimer = new Stopwatch();

  Point2[] points = new Point2[TestPointCount];
  var random = new Random();
  for (int index = 0; index < TestPointCount; ++index) {
    points[index].X = random.NextDouble();
    points[index].Y = random.NextDouble();
  }

  for (int repetition = 0; repetition <= RepetitionCount; ++repetition) {
    if (repetition > 0) { // first repetition is for cache warmup
      arrayTimer.Start();
    }
    doWorkOnArray(points);
    if (repetition > 0) { // first repetition is for cache warmup
      arrayTimer.Stop();
    }

    if (repetition > 0) { // first repetition is for cache warmup
      listTimer.Start();
    }
    doWorkOnList(points);
    if (repetition > 0) { // first repetition is for cache warmup
      listTimer.Stop();
    }
  }

  Console.WriteLine("Ignore this: " + points[0].X + points[0].Y);
  Console.WriteLine(
    string.Format(
      "{0} accesses on array took {1} ms",
      RepetitionCount * TestPointCount, arrayTimer.ElapsedMilliseconds
    )
  );
  Console.WriteLine(
    string.Format(
      "{0} accesses on list took {1} ms",
      RepetitionCount * TestPointCount, listTimer.ElapsedMilliseconds
    )
  );

}

private static void doWorkOnArray(Point2[] points) {
  var random = new Random();

  int pointCount = points.Length;

  Point2 accumulated = Point2.Zero;
  for (int index = 0; index < pointCount; ++index) {
    accumulated.X += points[index].X;
    accumulated.Y += points[index].Y;
  }

  accumulated /= pointCount;

  // make use of the result somewhere so the optimizer can't eliminate the loop
  // also modify the input collection so the optimizer can merge the repetition loop
  points[random.Next(0, pointCount)] = accumulated;
}

private static void doWorkOnList(IList<Point2> points) {
  var random = new Random();

  int pointCount = points.Count;

  Point2 accumulated = Point2.Zero;
  for (int index = 0; index < pointCount; ++index) {
    accumulated.X += points[index].X;
    accumulated.Y += points[index].Y;
  }

  accumulated /= pointCount;

  // make use of the result somewhere so the optimizer can't eliminate the loop
  // also modify the input collection so the optimizer can merge the repetition loop
  points[random.Next(0, pointCount)] = accumulated;
}
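Point2 isn't defined in the snippet; a minimal stand-in sufficient to compile and run it (an assumption about the author's type, which only needs X, Y, a Zero value and division by a scalar, plus the usual System and System.Diagnostics usings) could look like this:

// Assumed stand-in for the author's Point2 type; it must be a struct so that
// points[index].X = ... works directly on the array elements above.
struct Point2 {
  public double X;
  public double Y;

  public static readonly Point2 Zero = new Point2();

  // Needed for "accumulated /= pointCount;" above.
  public static Point2 operator /(Point2 p, double divisor) {
    return new Point2 { X = p.X / divisor, Y = p.Y / divisor };
  }
}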
Laggard answered 16/2, 2017 at 11:0 Comment(0)
F
2

The measurements are nice, but you are going to get significantly different results depending on what you're doing exactly in your inner loop. Measure your own situation. If you're using multi-threading, that alone is a non-trivial activity.

Fullblown answered 18/1, 2009 at 11:41 Comment(0)
L
2

Indeed, if you perform some complex calculations inside the loop, then the difference between the array indexer and the list indexer may be so marginal that, in the end, it doesn't matter.

Longstanding answered 18/1, 2009 at 11:45 Comment(0)
A
2

Do not attempt to add capacity by resizing the array one element at a time.

Performance

List For Add: 1ms
Array For Add: 2397ms

Random rand = new Random();
Stopwatch watch;

#region --> List For Add <--

List<int> intList = new List<int>();
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 60000; rpt++)
{
    intList.Add(rand.Next());
}
watch.Stop();
Console.WriteLine("List For Add: {0}ms", watch.ElapsedMilliseconds);

#endregion

#region --> Array For Add <--

int[] intArray = new int[0];
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 60000; rpt++)
{
    // growing the array by one element per add forces a full copy every time
    Array.Resize(ref intArray, intArray.Length + 1);
    intArray[rpt] = rand.Next();
}
watch.Stop();
Console.WriteLine("Array For Add: {0}ms", watch.ElapsedMilliseconds);

#endregion
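If you really must grow a raw array incrementally, the usual trick (also suggested in the comments below) is to double the capacity and track the logical count yourself, which is essentially what List<int> does internally. A sketch, reusing rand and watch from the snippet above:

int[] buffer = new int[4];
int count = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 60000; rpt++)
{
    if (count == buffer.Length)
    {
        // Doubling keeps the cost of each add amortised O(1) instead of O(n).
        Array.Resize(ref buffer, buffer.Length * 2);
    }
    buffer[count++] = rand.Next();
}
watch.Stop();
Console.WriteLine("Array Doubling Add: {0}ms", watch.ElapsedMilliseconds);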
Abadan answered 24/12, 2015 at 12:56 Comment(2)
I get resizing an array 60k times is going to be slow... Surely in real world usage though, the approach would just be to check how many extra slots you need, resize it to length + 60k, and then zip through the inserts.Allanadale
Resizing an array is very fast if you double the size each time you find you need more space. I found it seems to take about the same time doing that as it does just resizing it once after the initial declaration. That gives you the flexibility of a list and most of the speed of an array.Wardlaw
E
2

Here's one that uses Dictionaries, IEnumerable:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class Program
{
    static void Main()
    {
        List<int> list = new List<int>(6000000);

        for (int i = 0; i < 6000000; i++)
        {
                list.Add(i);
        }
        Console.WriteLine("Count: {0}", list.Count);

        int[] arr = list.ToArray();
        IEnumerable<int> Ienumerable = list.ToArray();
        Dictionary<int, bool> dict = list.ToDictionary(x => x, y => true);

        int chk = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = list.Count;
            for (int i = 0; i < len; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            for (int i = 0; i < arr.Length; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in Ienumerable)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Ienumerable/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in dict.Keys)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Dict/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);


        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in list)
            {
                chk += i;
            }
        }

        watch.Stop();
        Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in arr)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);



        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in Ienumerable)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Ienumerable/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in dict.Keys)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Dict/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        Console.ReadLine();
    }
}
Expedient answered 29/1, 2016 at 20:34 Comment(0)
K
1

In some brief tests I have found a combination of the two to be better in what I would call reasonably intensive Math:

Type: List<double[]>

Time: 00:00:05.1861300

Type: List<List<double>>

Time: 00:00:05.7941351

Type: double[rows * columns]

Time: 00:00:06.0547118

Running the Code:

int rows = 10000;
int columns = 10000;

IMatrix Matrix = new IMatrix(rows, columns);

Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();


for (int r = 0; r < Matrix.Rows; r++)
    for (int c = 0; c < Matrix.Columns; c++)
        Matrix[r, c] = Math.E;

for (int r = 0; r < Matrix.Rows; r++)
    for (int c = 0; c < Matrix.Columns; c++)
        Matrix[r, c] *= -Math.Log(Math.E);


stopwatch.Stop();
TimeSpan ts = stopwatch.Elapsed;

Console.WriteLine(ts.ToString());
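IMatrix is the author's own class and isn't shown; a hypothetical version backed by List<double[]> (the layout that came out fastest above), with just the [row, column] indexer the test needs and assuming using System.Collections.Generic, might look like this:

// Illustrative assumption only -- the author's real IMatrix may differ.
class IMatrix
{
    private readonly List<double[]> rows;

    public int Rows { get; private set; }
    public int Columns { get; private set; }

    public IMatrix(int rowCount, int columnCount)
    {
        Rows = rowCount;
        Columns = columnCount;
        rows = new List<double[]>(rowCount);
        for (int r = 0; r < rowCount; r++)
            rows.Add(new double[columnCount]);   // one array per row

    }

    public double this[int r, int c]
    {
        get { return rows[r][c]; }
        set { rows[r][c] = value; }
    }
}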

I do wish we had some top notch Hardware Accelerated Matrix Classes like the .NET Team have done with the System.Numerics.Vectors Class!

C# could be the best ML Language with a bit more work in this area!

Khudari answered 11/4, 2017 at 10:10 Comment(0)
K
0

Since List<> uses arrays internally, the basic performance should be the same. Two reasons why the List might be slightly slower:

  • To look up an element in the list, a method of List is called, which does the lookup in the underlying array, so you need an additional method call there. On the other hand, the compiler might recognize this and optimize the "unnecessary" call away.
  • The compiler might do some special optimizations if it knows the size of the array that it can't do for a list of unknown length. This might bring some performance improvement if you only have a few elements in your list.

To check if it makes any difference for you, it's probably best to adjust the posted timing functions to a list of the size you're planning to use and see how the results look for your specific case.

Kinnikinnick answered 18/1, 2009 at 11:28 Comment(0)
G
0

Since I had a similar question, this got me off to a fast start.

My question is a bit more specific: 'what is the fastest method for a reflexive array implementation?'

The testing done by Marc Gravell shows a lot, but not exactly access timing. His timings include looping over the arrays and lists as well. Since I also came up with a third method that I wanted to test, a Dictionary, just for comparison, I extended his test code.

First, I do a test using a constant, which gives me a certain timing including the loop. This is a 'bare' timing, excluding the actual access. Then I do a test accessing the subject structure; this gives me an 'overhead included' timing: looping plus the actual access.

The difference between the 'bare' timing and the 'overhead included' timing gives me an indication of the 'structure access' timing.

But how accurate is this timing? During the test, Windows will do some time slicing for sure. I have no information about that time slicing, but I assume it is evenly distributed over the test and on the order of tens of msec, which means the accuracy of the timing should be on the order of +/- 100 msec or so. A rough estimate, and a source of systematic measurement error.

Also, the tests were done in 'Debug' mode with no optimisation. Otherwise the compiler might change the actual test code.

So, I get two results, one for a constant, marked '(c)', and one for access, marked '(n)', and the difference 'dt' tells me how much time the actual access takes.

And this are the results:

          Dictionary(c)/for: 1205ms (600000000)
          Dictionary(n)/for: 8046ms (589725196)
 dt = 6841

                List(c)/for: 1186ms (1189725196)
                List(n)/for: 2475ms (1779450392)
 dt = 1289

               Array(c)/for: 1019ms (600000000)
               Array(n)/for: 1266ms (589725196)
 dt = 247

 Dictionary[key](c)/foreach: 2738ms (600000000)
 Dictionary[key](n)/foreach: 10017ms (589725196)
 dt = 7279

            List(c)/foreach: 2480ms (600000000)
            List(n)/foreach: 2658ms (589725196)
 dt = 178

           Array(c)/foreach: 1300ms (600000000)
           Array(n)/foreach: 1592ms (589725196)
 dt = 292


 dt +/-.1 sec   for    foreach
 Dictionary     6.8       7.3
 List           1.3       0.2
 Array          0.2       0.3

 Same test, different system:
 dt +/- .1 sec  for    foreach
 Dictionary     14.4   12.0
       List      1.7    0.1
      Array      0.5    0.7

With better estimates on the timing errors (how to remove the systematic measurement error due to time slicing?) more could be said about the results.

It looks like List/foreach has the fastest access, but the overhead is killing it.

The difference between List/for and List/foreach is strange. Maybe some caching is involved?

Further, for access to an array it does not matter whether you use a for loop or a foreach loop. The timing results and their accuracy make the results 'comparable'.

Using a dictionary is by far the slowest; I only considered it because on the left side (the indexer) I have a sparse list of integers and not a range as is used in these tests.

Here is the modified test code.

Dictionary<int, int> dict = new Dictionary<int, int>(6000000);
List<int> list = new List<int>(6000000);
Random rand = new Random(12345);
for (int i = 0; i < 6000000; i++)
{
    int n = rand.Next(5000);
    dict.Add(i, n);
    list.Add(n);
}
int[] arr = list.ToArray();

int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    int len = dict.Count;
    for (int i = 0; i < len; i++)
    {
        chk += 1; // dict[i];
    }
}
watch.Stop();
long c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("         Dictionary(c)/for: {0}ms ({1})", c_dt, chk);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    int len = dict.Count;
    for (int i = 0; i < len; i++)
    {
        chk += dict[i];
    }
}
watch.Stop();
long n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("         Dictionary(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
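// Note: chk is not reset before the two List tests below, which is why their checksums
// above (1189725196 and 1779450392) differ from the 600000000 / 589725196 pattern of the
// other tests; the timings themselves are unaffected.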

watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    int len = list.Count;
    for (int i = 0; i < len; i++)
    {
        chk += 1; // list[i];
    }
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("               List(c)/for: {0}ms ({1})", c_dt, chk);

watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    int len = list.Count;
    for (int i = 0; i < len; i++)
    {
        chk += list[i];
    }
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("               List(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    for (int i = 0; i < arr.Length; i++)
    {
        chk += 1; // arr[i];
    }
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("              Array(c)/for: {0}ms ({1})", c_dt, chk);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    for (int i = 0; i < arr.Length; i++)
    {
        chk += arr[i];
    }
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Array(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in dict.Keys)
    {
        chk += 1; // dict[i]; ;
    }
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Dictionary[key](c)/foreach: {0}ms ({1})", c_dt, chk);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in dict.Keys)
    {
        chk += dict[i]; ;
    }
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Dictionary[key](n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in list)
    {
        chk += 1; // i;
    }
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("           List(c)/foreach: {0}ms ({1})", c_dt, chk);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in list)
    {
        chk += i;
    }
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("           List(n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in arr)
    {
        chk += 1; // i;
    }
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("          Array(c)/foreach: {0}ms ({1})", c_dt, chk);

chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
    foreach (int i in arr)
    {
        chk += i;
    }
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Array(n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
Gynaeco answered 5/3, 2017 at 15:4 Comment(0)
I
0

I have two clarifications to add to @Marc Gravell's answer.

Tests were done in .NET 6 in x64 release.

Test code is at end.

Array and List not tested in the same way

To test the array and the List under the same conditions, the "for" loop should be modified as well.

for (int i = 0; i < arr.Length; i++)

New version:

int len = arr.Length;
for (int i = 0; i < len; i++)

Bottleneck: List/foreach

The bottleneck with List (the List/foreach test) can be fixed.

Change it to:

list.ForEach(x => chk += x);

Test run on a laptop with Windows 10 Pro 21H1 x64 and a Core i7-10510U

List/for Count out: 1495ms (589725196)
List/for Count in: 1706ms (589725196)
Array/for Count out: 945ms (589725196)
Array/for Count in: 1072ms (589725196)
List/foreach: 2114ms (589725196)
List/foreach fixed: 1210ms (589725196)
Array/foreach: 1179ms (589725196)

Results interpretation

Array/for is faster than in the original test (about 12% less time).

List/foreach fixed is faster than List/for.

List/foreach fixed is close to Array/foreach.

I have run this test several times. The results change, but the orders of magnitude remain the same.

The results of this test show that you really have to have a great need for performance to be forced to use an array.

Depending on the method used to iterate the List, the time can nearly be halved.

This test is partial. There is no random-access, direct-access, or write-access test, etc.

Did I get some parts wrong or do you have any other ideas for improving performance?

Test code :

using System;
using System.Collections.Generic;
using System.Diagnostics;
static class Program
{
    static void Main()
    {
        List<int> list = new List<int>(6000000);
        Random rand = new Random(12345);
        for (int i = 0; i < 6000000; i++)
        {
            list.Add(rand.Next(5000));
        }
        int[] arr = list.ToArray();

        int chk = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = list.Count;
            for (int i = 0; i < len; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for Count out: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            for (int i = 0; i < list.Count; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for Count in: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = arr.Length;
            for (int i = 0; i < len; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for Count out: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            for (int i = 0; i < arr.Length; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for Count in: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in list)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            list.ForEach(i => chk += i);
        }
        watch.Stop();
        Console.WriteLine("List/foreach fixed: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in arr)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        Console.ReadLine();
    }
}
Interception answered 28/11, 2021 at 21:27 Comment(0)
R
0
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class Program
{
    static long[] longs = new long[500000];
    static long[] longs2 = { };
    static List<long> listLongs = new List<long>();

    static void Main(string[] args)
    {
        Console.CursorVisible = false;
        Stopwatch time = new Stopwatch();

        time.Start();
        for (int f = 50000000; f < 50255000; f++)
        {
            listLongs.Add(f);
        }

        //List  Time: 1ms    Count : 255000
        Console.WriteLine("List Time: " + time.ElapsedMilliseconds + " | Count: " + listLongs.Count());

        time.Restart();
        time.Start();
        for (long i = 1; i < 500000; i++)
        {
            longs[i] = i * 200;
        }

        //Array Time: 2ms Length: 500000 (Unrealistic Data)
        Console.WriteLine("Array Time: " + time.ElapsedMilliseconds + " | Length: " + longs.Length);

        time.Restart();
        time.Start();
        for (int i = 50000000; i < 50055000; i++)
        {
            longs2 = longs2.Append(i).ToArray();
        }

        //Array Append Time: 17950ms Length: 55000
        Console.WriteLine("Array Append Time: " + time.ElapsedMilliseconds + " | Length: " + longs2.Length);

        Console.ReadLine();
    }
}
Type          Time       Len
Array         2ms        500000
List          1ms        255000
Array Append  17950ms    55000

If you plan on constantly appending small amounts of data to an array, then a List is faster.

It really comes down to how you are going to use the array.

Resurrection answered 8/6, 2022 at 15:23 Comment(0)
