Fast way to convert a two dimensional array to a List ( one dimensional )

B

3

35

I have a two-dimensional array and I need to convert it to a List of the same element type. I don't want to use a for or foreach loop that takes each element and adds it to the List. Is there some other way to do it?

Bentley answered 27/2, 2011 at 9:26 Comment(9)

What is that list supposed to contain? – Pickaninny 27/2, 2011 at 9:28

Is your 2D array rectangular(T[,]) or jagged(T[][])? – Seaden 27/2, 2011 at 9:30

The list contains doubles, and the 2D array is T[,]. – Bentley 27/2, 2011 at 9:35

Why do you want to avoid loops? – Seaden 27/2, 2011 at 9:41

And do you want a short or a fast solution? – Seaden 27/2, 2011 at 10:9

Yanshof: Could you address the comments in your accepted answer? You claim you want a "fast" solution (given the title) but you've accepted the slowest of the three answers. – Mientao 27/2, 2011 at 10:15

@CodesInChaos: I would vote for short. – Wicopy 5/9, 2013 at 18:38

@JonSkeet: I get too many downvotes because of your above comment :) please remove it – Wicopy 5/9, 2013 at 18:39

@naveen: No - I stand by my comment. It's the slowest of the approaches, so anyone who is looking for an efficient solution should not use it. – Mientao 5/9, 2013 at 19:7

W

42

If you are looking for a one-liner to convert double[,] to List<double>, here it is:

double[,] d = new double[,]
{
    {1.0, 2.0},
    {11.0, 22.0},
    {111.0, 222.0},
    {1111.0, 2222.0},
    {11111.0, 22222.0}
};
List<double> lst = d.Cast<double>().ToList();

But if you are looking for something efficient, I'd recommend against using this code.
Please follow either of the two answers mentioned below; both implement much better techniques.
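As a small check of what the one-liner produces, here is a minimal sketch (the class name CastDemo and the helper Flatten are made up for illustration). It shows that Cast<double>() walks a rectangular array in row-major order, so the last index varies fastest:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CastDemo
{
    // Hypothetical helper: flattens a rectangular array with the one-liner above.
    public static List<double> Flatten(double[,] source)
    {
        // Array only implements the non-generic IEnumerable,
        // so Cast<double>() is needed before ToList().
        return source.Cast<double>().ToList();
    }

    static void Main()
    {
        double[,] d = { { 1.0, 2.0, 3.0 }, { 4.0, 5.0, 6.0 } };
        List<double> flat = Flatten(d);
        Console.WriteLine(string.Join(", ", flat)); // prints "1, 2, 3, 4, 5, 6"
    }
}
```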

Wicopy answered 27/2, 2011 at 9:43 Comment(11)

Aside from everything else, that will end up boxing every double in the array... it'll perform poorly. – Mientao 27/2, 2011 at 9:48

I think OP accepts this answer because he wants a "clearer"(I mean easy for him to implement and understand) way, not the real fastest way. – Muskellunge 27/2, 2011 at 9:52

@Danny: I'm not really sure how this method is any clearer or easier to understand than a for loop, which the OP explicitly wishes to avoid. Not to mention the title says "Fast". – Siliqua 27/2, 2011 at 9:53

In my quick benchmark of a 1000 x 1000 array, this performs over 30 times as slowly as the for loop or the Buffer.BlockCopy solution. I'm pretty surprised it's been accepted, given the "Fast" part of the title. – Mientao 27/2, 2011 at 9:59

(A longer test changed the timings slightly, but it's still easily an order of magnitude slower.) – Mientao 27/2, 2011 at 10:9

@downvoters: thanks for letting me know that I know less than JonSkeet. :) Please understand that I am not deleting the answer, because OP used this code somewhere and is happy with it. Tell me, how many of you downvoters work at enterprise level? funny – Wicopy 5/9, 2013 at 15:24

@naveen ToList is certainly the approach I'd use most of the time since it's usually fast enough. I assume the downvoters prefer other solutions because the question explicitly asks for a fast way and your solution is relatively slow. – Seaden 5/9, 2013 at 17:23

@CodesInChaos: fast way in India means, finish the code fast. Coding is a highly profitable thing here compared to other jobs. I still believe the guy wanted a solution he could implement fast, not execute fast. – Wicopy 5/9, 2013 at 18:32

@naveen: I see no real evidence of that, and given that the OP can use any of the answers by just copying and pasting them and adjusting to his variable names, they're all equally "fast" by that definition. Even if you think that's the most likely intention, your answer provides no indication of the inefficiency involved which would be appropriate in order to serve all readers rather than just the original poster. – Mientao 5/9, 2013 at 19:9

@JonSkeet: I realize that. The downvote irks me a bit, that's all :) – Wicopy 6/9, 2013 at 5:51

Well maybe you should improve your answer then? Even just explaining that it is slow (and why) would be better. Your answer does not help to answer the question which apparently many people believe was asked (myself included), so those downvotes are reasonable. Look at it this way: you're still massively rep-positive for an answer which doesn't provide an efficient solution. – Mientao 6/9, 2013 at 5:54

M

63

Well, you can make it use a "blit" sort of copy, although it does mean making an extra copy :(

double[] tmp = new double[array.GetLength(0) * array.GetLength(1)];    
Buffer.BlockCopy(array, 0, tmp, 0, tmp.Length * sizeof(double));
List<double> list = new List<double>(tmp);

If you're happy with a single-dimensional array of course, just ignore the last line :)

Buffer.BlockCopy is implemented as a native method which I'd expect to use extremely efficient copying after validation. The List<T> constructor which accepts an IEnumerable<T> is optimized for the case where the argument implements ICollection<T> (which IList<T> extends), as double[] does. It will create a backing array of the right size, and ask the source collection to copy itself into that array. Hopefully that will use Buffer.BlockCopy or something similar too.
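The two steps can be packaged as small helpers. This is only a sketch: BlockCopyDemo, ToFlatArray and ToList are names invented here, and it uses array.Length, which for a rectangular array is the total element count, i.e. the same value as GetLength(0) * GetLength(1). Note that Buffer.BlockCopy counts in bytes and only accepts arrays of primitive types.

```csharp
using System;
using System.Collections.Generic;

static class BlockCopyDemo
{
    // Hypothetical helper: copies a rectangular array into a new 1D array.
    public static double[] ToFlatArray(double[,] source)
    {
        double[] flat = new double[source.Length];
        // Offsets and count are in bytes, not elements.
        Buffer.BlockCopy(source, 0, flat, 0, flat.Length * sizeof(double));
        return flat;
    }

    // Hypothetical helper: same as above, then wraps the result in a List<double>.
    // The constructor copies the array again into the list's own backing store.
    public static List<double> ToList(double[,] source)
    {
        return new List<double>(ToFlatArray(source));
    }

    static void Main()
    {
        double[,] d = { { 1.0, 2.0 }, { 11.0, 22.0 }, { 111.0, 222.0 } };
        Console.WriteLine(string.Join(", ", ToList(d))); // prints "1, 2, 11, 22, 111, 222"
    }
}
```

If a plain double[] is enough, call ToFlatArray and skip the second copy.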

Here's a quick benchmark of the three approaches (for loop, Cast<double>().ToList(), and Buffer.BlockCopy):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        double[,] source = new double[1000, 1000];
        int iterations = 1000;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingCast(source);
        }
        sw.Stop();
        Console.WriteLine("LINQ: {0}", sw.ElapsedMilliseconds);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingForLoop(source);
        }
        sw.Stop();
        Console.WriteLine("For loop: {0}", sw.ElapsedMilliseconds);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingBlockCopy(source);
        }
        sw.Stop();
        Console.WriteLine("Block copy: {0}", sw.ElapsedMilliseconds);
    }


    static List<double> UsingCast(double[,] array)
    {
        return array.Cast<double>().ToList();
    }

    static List<double> UsingForLoop(double[,] array)
    {
        int width = array.GetLength(0);
        int height = array.GetLength(1);
        List<double> ret = new List<double>(width * height);
        for (int i = 0; i < width; i++)
        {
            for (int j = 0; j < height; j++)
            {
                ret.Add(array[i, j]);
            }
        }
        return ret;
    }

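    static List<double> UsingBlockCopy(double[,] array)
    {
        // Flat buffer with one slot per element of the 2D array.
        double[] tmp = new double[array.GetLength(0) * array.GetLength(1)];
        // Buffer.BlockCopy takes its count in bytes, hence the sizeof(double).
        Buffer.BlockCopy(array, 0, tmp, 0, tmp.Length * sizeof(double));
        // The List<double> constructor copies tmp into its own backing array.
        List<double> list = new List<double>(tmp);
        return list;
    }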
}

Results (times in milliseconds):

LINQ: 253463
For loop: 9563
Block copy: 8697

EDIT: Having changed the for loop to call array.GetLength() on each iteration, the for loop and the block copy take around the same time.
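The modified loop is not shown above. One plausible reading of the edit is the variant sketched below, where the loop conditions call GetLength directly instead of caching the bounds in locals. The names ForLoopVariant and UsingForLoopGetLength are assumptions, and no timing is claimed for this sketch:

```csharp
using System;
using System.Collections.Generic;

static class ForLoopVariant
{
    // Hypothetical variant of UsingForLoop: the bounds are re-read
    // from the array on every iteration of each loop.
    public static List<double> UsingForLoopGetLength(double[,] array)
    {
        List<double> ret = new List<double>(array.GetLength(0) * array.GetLength(1));
        for (int i = 0; i < array.GetLength(0); i++)
        {
            for (int j = 0; j < array.GetLength(1); j++)
            {
                ret.Add(array[i, j]);
            }
        }
        return ret;
    }

    static void Main()
    {
        double[,] d = { { 1.0, 2.0 }, { 3.0, 4.0 } };
        Console.WriteLine(string.Join(", ", UsingForLoopGetLength(d))); // prints "1, 2, 3, 4"
    }
}
```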

Mientao answered 27/2, 2011 at 9:44 Comment(10)

The main problem with that one is that it can leave a big temporary array on the large object heap. – Seaden 27/2, 2011 at 9:51

@CodeInChaos: Absolutely. It's a pain we can't tell List<T> to just use the given array :( I think it's still likely to be faster than looping though. – Mientao 27/2, 2011 at 9:57

The problem with telling List<T> to use a certain array is that we could tell several lists to use the same array. Not sure how big a problem that'd be in practice. – Seaden 27/2, 2011 at 10:2

@CodeInChaos: Yup, that's why there's no way of doing it. It's probably the right decision on the part of the BCL team - it's just irritating for things like this :) – Mientao 27/2, 2011 at 10:10

One interesting observation on the looping solution is that it's twice as slow if one swaps the inner and outer loop. Most likely due to CPU caches working better if you read/write sequentially. – Seaden 27/2, 2011 at 11:5

@CodeInChaos: Yes, that's a fairly well-known phenomenon. I don't think the reason is so much related to CPU caches as it is the physical location of the data in memory. You have a 2D array indexed in row-major order, it's much faster to iterate through the array sequentially than it is to jump around. Read more about it here and here. – Siliqua 27/2, 2011 at 11:48

Iterating sequentially is faster because when transferring from the RAM to the cache several sequential entries are fetched at the same time. I'd guess that if the size of your array entries is a multiple of the cache-line size the advantage of serial access disappears. – Seaden 27/2, 2011 at 12:12
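For anyone who wants to try the loop-order effect themselves, a rough sketch follows. TraversalOrderDemo and its methods are invented names, the 2000 x 2000 size is arbitrary, and a single Stopwatch run like this is only indicative, not a proper benchmark:

```csharp
using System;
using System.Diagnostics;

static class TraversalOrderDemo
{
    // Last index varies fastest: matches the in-memory (row-major) layout.
    public static double SumRowMajor(double[,] a)
    {
        double sum = 0;
        for (int i = 0; i < a.GetLength(0); i++)
            for (int j = 0; j < a.GetLength(1); j++)
                sum += a[i, j];
        return sum;
    }

    // First index varies fastest: each step jumps a whole row ahead in memory.
    public static double SumColumnMajor(double[,] a)
    {
        double sum = 0;
        for (int j = 0; j < a.GetLength(1); j++)
            for (int i = 0; i < a.GetLength(0); i++)
                sum += a[i, j];
        return sum;
    }

    static void Main()
    {
        double[,] a = new double[2000, 2000];

        Stopwatch sw = Stopwatch.StartNew();
        double s1 = SumRowMajor(a);
        sw.Stop();
        Console.WriteLine("Row-major: {0} ms (sum {1})", sw.ElapsedMilliseconds, s1);

        sw = Stopwatch.StartNew();
        double s2 = SumColumnMajor(a);
        sw.Stop();
        Console.WriteLine("Column-major: {0} ms (sum {1})", sw.ElapsedMilliseconds, s2);
    }
}
```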

@JonSkeet: Is there a way in which you can alter the block copy so that it only copies a specific range from the source array? For instance, if I wanted all of the data between indexes (10,20) through (20, 20)? – Malayopolynesian 9/12, 2011 at 19:31

@LamdaComplex: I don't believe so - because that isn't copying a block of data - that's copying 11 separate values. – Mientao 9/12, 2011 at 19:32
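To illustrate the distinction, here is a sketch with two hypothetical helpers (RangeCopyDemo, CopyRowSegment and CopyColumnSegment are not from the thread). A run of elements within one row is contiguous in memory, so Buffer.BlockCopy can take it with a byte offset. A run down one column, such as (10,20) through (20,20), is not contiguous and has to be copied element by element:

```csharp
using System;

static class RangeCopyDemo
{
    // Copies `count` consecutive elements of one row, starting at (row, startCol).
    public static double[] CopyRowSegment(double[,] source, int row, int startCol, int count)
    {
        int cols = source.GetLength(1);
        if (row < 0 || row >= source.GetLength(0) ||
            startCol < 0 || count < 0 || startCol + count > cols)
        {
            throw new ArgumentOutOfRangeException();
        }
        double[] result = new double[count];
        // Byte offset of element (row, startCol) in row-major layout.
        int srcOffsetBytes = (row * cols + startCol) * sizeof(double);
        Buffer.BlockCopy(source, srcOffsetBytes, result, 0, count * sizeof(double));
        return result;
    }

    // Copies rows firstRow..lastRow (inclusive) of a single column.
    // These elements are one full row apart in memory, so a plain loop is used.
    public static double[] CopyColumnSegment(double[,] source, int col, int firstRow, int lastRow)
    {
        double[] result = new double[lastRow - firstRow + 1];
        for (int r = firstRow; r <= lastRow; r++)
        {
            result[r - firstRow] = source[r, col];
        }
        return result;
    }

    static void Main()
    {
        // 3 x 4 array where element (r, c) holds r * 10 + c.
        double[,] a = new double[3, 4];
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 4; c++)
                a[r, c] = r * 10 + c;

        Console.WriteLine(string.Join(", ", CopyRowSegment(a, 1, 1, 2)));    // prints "11, 12"
        Console.WriteLine(string.Join(", ", CopyColumnSegment(a, 2, 0, 2))); // prints "2, 12, 22"
    }
}
```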

@JonSkeet: I was really thinking this would help me. I'm actually tackling a slightly different problem involving 3D arrays and grabbing a volume of data from it and flattening it into a 1D array as fast as possible. #8449135 – Malayopolynesian 9/12, 2011 at 19:35
