Algorithm speed-up using List<T>.Sort and IEnumerable

For my project, I first load an image from file, and put every pixel into a 2D pixels[,] array. Then, I want to examine each pixel and split them up into "bins" based on how they are colored, and then sort each bin. So I have a Bin object, which encapsulates a List<Pixel>, and I have a List<Bin> containing a (relatively small) number of bins.

My problem is that when I try to fill these bins from very large images (1920x1200 = 2.3 million pixels, eg), the algorithm I'm using is slower than I'd like, and I've traced the problem down to some C# language-specific features which seem to be taking way longer than I was expecting. I'd like some advice on how to better use C# to remove these bottlenecks.

After loading an image, I call a function called "fillBinsFromSource", which takes an enumerable list of pixels, finds which bin they belong in, and puts them there:

public void fillBinsFromSource(IEnumerable<Pixel> source)
{
    Stopwatch s = new Stopwatch();
    foreach (Pixel p in source)
    {
        s.Start();
        // algorithm removed for brevity
        // involves a binary search of all Bins and a List.Add call
        s.Stop();
    }
}

For large images, I expect my algorithm to take a while, but when I put a Stopwatch outside the function call, the whole call takes about twice as long as the time accrued on s. That means the foreach enumeration itself is taking up half the time of this function (about 800 ms of the 1.6 seconds for a 1920x1200 image).

The reason I need to pass an enumerable list is because sometimes users will add only a small region of a picture, not an entire picture. The time-consuming call passes down several iterators, first from a list of images, then from each image in the list, like so (simplified):

class ImageList : IEnumerable<Pixel>
{
    private List<Image> imageList;
    public IEnumerator<Pixel> GetEnumerator()
    {
        foreach (Image i in imageList)
            foreach (Pixel p in i)
                yield return p;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

class Image : IEnumerable<Pixel>
{
    private Pixel[,] pixels; // all pixels in the image        
    private List<Pixel> selectedPixels;// all pixels in the user's selection

    public IEnumerator<Pixel> GetEnumerator()
    {
        if (selectedPixels == null)
            for (int i = 0; i < pixels.GetLength(0); i++)
                for (int j = 0; j < pixels.GetLength(1); j++)
                    yield return pixels[i, j];
        else
            for (int i = 0; i < selectedPixels.Count; i++)
                yield return selectedPixels[i];
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Then, finally, I call this:

ImageList list; // pretend it contains only 1 image, and it's large
fillBinsFromSource(list);

Question 1) Because of the need to enumerate over both the 2D array of pixels AND the selected region, depending on what the user has chosen, the enumeration is just really slow. How can I speed this up?

Then, after filling all these bins with Pixel objects, I sort them. I call List<Pixel>.Sort() and rely on IComparable, like so:

ImageList list; // pretend it contains only 1 image, and it's large
fillBinsFromSource(list);
foreach(Bin b in allBins)
    b.Sort(); // calls List<Pixel>.Sort()


class Pixel : IComparable
{
    // store both HSV and RGB
    float h, s, v;
    byte r, g, b;

    // we sort by HSV's value property
    public int CompareTo(object obj)
    {
        // this is much faster than calling CompareTo on a float
        Pixel rhs = obj as Pixel;
        if (v < rhs.v)
            return -1;
        else if (v > rhs.v)
            return 1;
        return 0;
    }
}

Question 2) Suppose allBins has 7 elements; sorting 7 separate lists which have a total of 2.3 million Pixels in them takes about 2 seconds. Sorting one list of 2.3 million random ints takes under 200 milliseconds. I can appreciate that there's some level of speed-up using primitive types, but is it really over 10x slower to use IComparable? Are there any speedups to be had here?

I apologize for the long question, if anyone has any advice for me, I'd appreciate it!

Farnsworth answered 30/11, 2012 at 22:45 Comment(8)
Are the selected pixels a rectangular region, or an arbitrary noncontiguous set? – Flushing
Arbitrary and noncontiguous. They use selection tools I've programmed (freeform draw, rectangular, etc.) to select 1 or more regions. – Farnsworth
One thought I had would be to define a delegate or some callback method; then, instead of iterating over the pixels in fillBinsFromSource, I would simply do the basic loop in the Image class and use the callback. This seems particularly (and unnecessarily) messy from a design point of view, but it would avoid using any iterators at all. I would like to avoid muddying up my design, if possible. – Farnsworth
Keep in mind that when iterating through an enumeration, if the enumeration originates from a query expression (like a LINQ query), the query expression will execute for every item in every iteration. If the query expression is expensive, it will impact iteration. You can avoid the expression cost using .ToList(), but then you're doubling your memory use. For small lists it's advantageous, but for 2.3 million items the memory use could be prohibitive. – Flushing
Similarly, if the pixel enumerations originate from a query that pulls the pixels out of a bitmap using GetPixel(x,y), performance of the iterator will be gated by the horrifically slow performance of GetPixel(x,y). Don't use GetPixel. Ever. – Flushing
If the iterator itself is consuming much of the execution time (no way to know this except by profiling), consider changing the enumeration from an enumeration of pixels to an enumeration of scanlines (groups of contiguous pixels). This should reduce the number of items in the enumeration and the number of iterations by at least an order of magnitude in most situations. Your core loop would take each scanline from the enumeration and run a tight code loop over the contiguous pixels. – Flushing
Yes, I had never used the Bitmap class before this project and had falsely assumed GetPixel was decently implemented. Before recently implementing the fix involving LockBits, simply loading the image into the initial pixel array took about 5 seconds. Fortunately that's been fixed, and loading the image is separate from the code above. – Farnsworth
I would use for instead of foreach. In the performance tests I have seen, foreach is slower than for in all but one case, where it was almost the same. Since you want to sort the data, use a SortedDictionary (or, if you are accessing the contents often after filling, use a SortedDictionary and convert it to a SortedList once filled). – Korrie

You really need to profile your code and see what is slow.

Most obvious:

  • Pixel does not implement the generic IComparable<Pixel>, so the default comparer's Compare has to do much more work (see the sketch after this list).
  • Try making Pixel a value type. Most likely you'll see a drop in performance, but maybe not.
  • Consider passing 2D ranges (rectangles) and accessing Pixel objects directly by index instead of going through iterators, if you find that performance is still below what you find acceptable. Iterators are nice, but not free.
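For the first point, a minimal sketch of what the generic implementation might look like, reusing the fields from the question:

using System;

class Pixel : IComparable<Pixel>
{
    // store both HSV and RGB, as in the question
    float h, s, v;
    byte r, g, b;

    // the strongly typed overload avoids the object parameter and the
    // "as Pixel" cast on every comparison
    public int CompareTo(Pixel other)
    {
        if (v < other.v)
            return -1;
        if (v > other.v)
            return 1;
        return 0;
    }
}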
Basidiomycete answered 30/11, 2012 at 22:57 Comment(3)
Being fairly new to C#, I actually had no idea there was an IComparable<T>. It makes sense now that I've looked it up, though. Changing Pixel to implement IComparable<T> sped the sorting up from about 2 seconds to 1.3 seconds. It's still about 7x slower than int sorting (I still find it hard to believe this is the cost of using IComparable), but it's certainly an improvement, so thanks! I will look further into your value type suggestion. – Farnsworth
Again, I had no idea there was such a fundamental difference between struct and class in C#. Changing Pixel to be a struct (value type) causes a few problems for me with some other code (which will, coincidentally, be fixed by the delegate idea in the comments on my main post), but simply changing Pixel to a struct increased the speed further, down to around 1 second. Together with IComparable<Pixel>, I cut my sort time in half! Does anyone have further suggestions to cut down the cost of using IComparable even more? – Farnsworth
@XenoScholar, side note: please make sure to search for and read about "C# struct vs class differences" (i.e. links from https://mcmap.net/q/870748/-class-vs-ref-struct) - value types often exhibit surprising behaviors if you are not prepared. And avoid making mutable value types, to save yourself some debugging time :). – Basidiomycete

All kinds of indirection, like a visitor pattern or virtual inheritance, are poison if you want raw metal performance. Virtual calls, allocation, and unpredictable branching do a lot of damage to the kind of algorithm where there is a small, tight inner loop in which 99.99% of the time is spent.

Why? Because the CPU likes to execute many (dozens) of instructions in parallel. It can do that only if it can peek ahead in the instruction stream. The aforementioned things prevent that.

You really need to get the innermost loop right. Don't allocate there, don't call virtual functions (or interface methods or delegates).

Probably, your innermost loop should process a rectangle of a given image with a hard-coded kernel. Instead of implementing your processing function per-pixel, implement it per-rectangle.
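A minimal sketch of that shape, assuming the pixels still live in a 2D array; the method name and the bin lookup are placeholders:

// process one rectangle of the image in a single tight loop,
// indexing the array directly instead of going through iterators
void FillBinsFromRect(Pixel[,] pixels, int x0, int y0, int width, int height)
{
    for (int x = x0; x < x0 + width; x++)
    {
        for (int y = y0; y < y0 + height; y++)
        {
            Pixel p = pixels[x, y];
            // binary search of the bins and List.Add, as in the original algorithm
        }
    }
}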

In contrast, it doesn't matter how you provide the stream of images. Use LINQ there all you want. It is a low-volume operation, because there are millions of pixels per image.

Emilie answered 30/11, 2012 at 23:48 Comment(3)
Or, for non-rectangular regions, consider grouping runs of contiguous pixels into scanlines instead of an enumeration of individual pixels scattered all over. – Flushing
I think I'm going to restructure my variables a bit. The reason I use IEnumerable everywhere is because I store pixels as a 2D array and selectedPixels as a List<T>. I should probably consider making functions that convert these both to 1D arrays when I want to do massive looping, then do for (int i = 0; i < image.IterableList.Length; i++) or something similar. I should probably also reconsider how I store these values... I can probably store pixels as a 1D array and provide an overloaded 2D indexer if I ever need 2D access in a non-iterating piece of code. – Farnsworth
Yes, 2D arrays are slower because they always perform bounds checking. You also might want to switch to unsafe code for the core of the algorithm. It might well be worth it. – Emilie

Instead of using an iterator, or even building up an array / list of pixels to begin with, you could use the visitor pattern. Images, Image Lists and other objects representing arbitrary selections could all accept a visitor class with a single method VisitPixel and call that method for each pixel the object represents. The visitor class would then be responsible for binning all the pixels as they're visited.

This might eliminate the need to extract all of your pixels into a separate array. It might also eliminate the creation of iterators in favor of simple for loops.
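A rough sketch of that idea; the interface and method names here are made up for illustration:

interface IPixelVisitor
{
    void VisitPixel(Pixel p);
}

class BinningVisitor : IPixelVisitor
{
    public void VisitPixel(Pixel p)
    {
        // find the right bin for p and add it, as fillBinsFromSource does now
    }
}

// Image.Accept(IPixelVisitor) would then run plain for loops over either
// pixels[,] or selectedPixels and call visitor.VisitPixel(p) for each pixel.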

Alexei Levenkov has some good points with regard to your second question. It might be even faster to use the Sort overload that takes an instance of IComparer<T>.
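A sketch of that overload in use; whether it actually beats IComparable<Pixel> here would need measuring, and it assumes the v field is visible to the comparer:

using System.Collections.Generic;

class PixelValueComparer : IComparer<Pixel>
{
    public int Compare(Pixel a, Pixel b)
    {
        // same ordering as the question: by the HSV value component
        if (a.v < b.v) return -1;
        if (a.v > b.v) return 1;
        return 0;
    }
}

// usage (assuming Bin exposes its List<Pixel>):
// bin.Pixels.Sort(new PixelValueComparer());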

Dardanus answered 30/11, 2012 at 23:23 Comment(0)
