Natural Sort Order in C#

Asked 29/10, 2008 at 22:4 Answered 8/12, 2023 at 10:47

148

Anyone have a good resource or provide a sample of a natural order sort in C# for an FileInfo array? I am implementing the IComparer interface in my sorts.

Rugg answered 29/10, 2008 at 22:4 Comment(0)

164

The easiest thing to do is just P/Invoke the built-in function in Windows, and use it as the comparison function in your IComparer:

[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);

Michael Kaplan has some examples of how this function works here, and the changes that were made for Vista to make it work more intuitively. The plus side of this function is that it will have the same behaviour as the version of Windows it runs on, however this does mean that it differs between versions of Windows so you need to consider whether this is a problem for you.

So a complete implementation would be something like:

[SuppressUnmanagedCodeSecurity]
internal static class SafeNativeMethods
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    public static extern int StrCmpLogicalW(string psz1, string psz2);
}

public sealed class NaturalStringComparer : IComparer<string>
{
    public int Compare(string a, string b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a, b);
    }
}

public sealed class NaturalFileInfoNameComparer : IComparer<FileInfo>
{
    public int Compare(FileInfo a, FileInfo b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a.Name, b.Name);
    }
}

Summerlin answered 29/10, 2008 at 22:9 Comment(10)

Great answer. Caveat: This won't work with Win2000, for those few folks still running things on that operating system. On the other hand, there's enough hints between Kaplan's blog and the MSDN documentation to create a similar function. – Monopolize 29/10, 2008 at 22:30

This is not portable, only works in Win32, but does not work in Linux / MacOS / Silverlight / Windows Phone / Metro – Rattan 29/7, 2012 at 9:36

@Rattan - He said .NET not Mono, so Linux/OSX isn't really a concern. Windows Phone/Metro didn't exist in 2008 when this answer was posted. And how often do you do file operations in Silverlight? So for the OP, and probably most other people, it was a suitable answer. In any case, you're free to provide a better answer; that's how this site works. – Summerlin 30/7, 2012 at 7:45

the title said C#, not just MS.NET in win32. so wider platform support can be relevant today. – Rattan 30/7, 2012 at 11:14

This does not mean that the original answer was wrong. I just add additional info with up-to-date info – Rattan 30/7, 2012 at 15:10

FYI, if you inherit from Comparer<T> instead of implementing IComparer<T>, you get a built-in implementation of the IComparer (non-generic) interface that calls your generic method, for use in APIs that use that instead. It's basically free to do too: just delete the "I" and change public int Compare(...) to public override int Compare(...). Same for IEqualityComparer<T> and EqualityComparer<T>. – Cutie 27/6, 2015 at 1:40

One more thing to add here: BCL's StringComparer.* (i.e. .Ordinal) .Equals(null, null) returns 'true' (aka '0') while StrCmpLogicalW(null, null) returns -2 / 'false'. So the NaturalStringComparer implementation is not logically complete / correct imho. – Suffix 5/7, 2018 at 19:38

All the RegEx approaches, even the most voted by users gives results that differs a lot by Windows natural sort, and most users here forgot to test it on special chars. Calling 'StrCmpLogicalW' is the way to go if you need a EXACT functionality of Windows natural sort, regardless of its rare circumstancial bug commented in one of the answers in this thread. Thanks for this solution! – Phifer 15/3, 2021 at 11:8

Just got an Error

ArgumentException: Unable to sort because the IComparer.Compare() method returns inconsistent results. Either a value does not compare equal to itself, or one value repeatedly compared to another value yields different results. IComparer: 'NaturalStringComparer'.

after using this class successfully for some time. This happened because I have NULL values inside my list to sort. – Rebato 12/9, 2021 at 8:24

Since 2012, there is the method Comparer<string>.Create(SafeNativeMethods.StrCmpLogicalW) which means you can now avoid writing the classes implementing IComparer<> by hand (still using your SafeNativeMethods type, as it is seen). – Luciana 11/4, 2023 at 15:20

Just thought I'd add to this (with the most concise solution I could find):

public static IOrderedEnumerable<T> OrderByAlphaNumeric<T>(this IEnumerable<T> source, Func<T, string> selector)
{
    int max = source
        .SelectMany(i => Regex.Matches(selector(i), @"\d+").Cast<Match>().Select(m => (int?)m.Value.Length))
        .Max() ?? 0;

    return source.OrderBy(i => Regex.Replace(selector(i), @"\d+", m => m.Value.PadLeft(max, '0')));
}

The above pads any numbers in the string to the max length of all numbers in all strings and uses the resulting string to sort.

The cast to (int?) is to allow for collections of strings without any numbers (.Max() on an empty enumerable throws an InvalidOperationException).

Darien answered 30/7, 2012 at 11:40 Comment(11)

+1 Not only is it the most concise it is the fastest I have seen. except for the accepted answer but I cannot use that one because of machine dependencies. It sorted over 4 million values in about 35 seconds. – Profile 4/10, 2012 at 20:58

This is both beautiful and impossible to read. I assume that the benefits of Linq will mean (at least) best average and best-case performance, so I think I'm going to go with it. Despite the lack of clarity. Thanks very much @Matthew Horsley – Niu 16/1, 2014 at 11:29

This is very good, but there is a bug for certain decimal numbers, my example was sorting of k8.11 vs k8.2. To fix this I implemented the following regex: \d+([\.,]\d)? – Platitude 10/3, 2016 at 10:26

You also need to take the length of the second group (decimal point + decimals) into account when you pad in this code m.Value.PadLeft(max, '0') – Platitude 10/3, 2016 at 12:52

Sorry concise maybe, but the performance is terrible. Taking more than a second to sort 1M short strings comparatively a very long time indeed. – Fazeli 1/12, 2016 at 0:40

I think you can use .DefaultIfEmpty().Max() instead of casting to int?. Also it is worth to do a source.ToList() to avoid re-enumerating the enumerable. – Equanimous 22/2, 2018 at 10:37

How would you order in descending order here? – Iain 28/2, 2018 at 16:22

What is the selector parameter here? – Geostatic 6/7, 2019 at 21:24

Interestingly it does not work on Linux, even if I add InvariantCulture to the OrderBy I need an implementation that gives the same sorting on Linux as in Windows. In particular dot is before underscore on Windows but the other way around on Linux. Hopefully one of the other solutions here will work :) – Labefaction 25/10, 2019 at 7:13

When I tested this solution I noticed that it is not gracefully handling "-". A list of {"3", "4", "-2", "1", "5"} gets sorted to {"1", "-2", "3", "4", "5"}. – Egeria 14/5, 2020 at 13:27

That's correct order @Anketam, at least on Windows 10. – Smatter 24/11, 2020 at 5:39

None of the existing implementations looked great so I wrote my own. The results are almost identical to the sorting used by modern versions of Windows Explorer (Windows 7/8). The only differences I've seen are 1) although Windows used to (e.g. XP) handle numbers of any length, it's now limited to 19 digits - mine is unlimited, 2) Windows gives inconsistent results with certain sets of Unicode digits - mine works fine (although it doesn't numerically compare digits from surrogate pairs; nor does Windows), and 3) mine can't distinguish different types of non-primary sort weights if they occur in different sections (e.g. "e-1é" vs "é1e-" - the sections before and after the number have diacritic and punctuation weight differences).

public static int CompareNatural(string strA, string strB) {
    return CompareNatural(strA, strB, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase);
}

public static int CompareNatural(string strA, string strB, CultureInfo culture, CompareOptions options) {
    CompareInfo cmp = culture.CompareInfo;
    int iA = 0;
    int iB = 0;
    int softResult = 0;
    int softResultWeight = 0;
    while (iA < strA.Length && iB < strB.Length) {
        bool isDigitA = Char.IsDigit(strA[iA]);
        bool isDigitB = Char.IsDigit(strB[iB]);
        if (isDigitA != isDigitB) {
            return cmp.Compare(strA, iA, strB, iB, options);
        }
        else if (!isDigitA && !isDigitB) {
            int jA = iA + 1;
            int jB = iB + 1;
            while (jA < strA.Length && !Char.IsDigit(strA[jA])) jA++;
            while (jB < strB.Length && !Char.IsDigit(strB[jB])) jB++;
            int cmpResult = cmp.Compare(strA, iA, jA - iA, strB, iB, jB - iB, options);
            if (cmpResult != 0) {
                // Certain strings may be considered different due to "soft" differences that are
                // ignored if more significant differences follow, e.g. a hyphen only affects the
                // comparison if no other differences follow
                string sectionA = strA.Substring(iA, jA - iA);
                string sectionB = strB.Substring(iB, jB - iB);
                if (cmp.Compare(sectionA + "1", sectionB + "2", options) ==
                    cmp.Compare(sectionA + "2", sectionB + "1", options))
                {
                    return cmp.Compare(strA, iA, strB, iB, options);
                }
                else if (softResultWeight < 1) {
                    softResult = cmpResult;
                    softResultWeight = 1;
                }
            }
            iA = jA;
            iB = jB;
        }
        else {
            char zeroA = (char)(strA[iA] - (int)Char.GetNumericValue(strA[iA]));
            char zeroB = (char)(strB[iB] - (int)Char.GetNumericValue(strB[iB]));
            int jA = iA;
            int jB = iB;
            while (jA < strA.Length && strA[jA] == zeroA) jA++;
            while (jB < strB.Length && strB[jB] == zeroB) jB++;
            int resultIfSameLength = 0;
            do {
                isDigitA = jA < strA.Length && Char.IsDigit(strA[jA]);
                isDigitB = jB < strB.Length && Char.IsDigit(strB[jB]);
                int numA = isDigitA ? (int)Char.GetNumericValue(strA[jA]) : 0;
                int numB = isDigitB ? (int)Char.GetNumericValue(strB[jB]) : 0;
                if (isDigitA && (char)(strA[jA] - numA) != zeroA) isDigitA = false;
                if (isDigitB && (char)(strB[jB] - numB) != zeroB) isDigitB = false;
                if (isDigitA && isDigitB) {
                    if (numA != numB && resultIfSameLength == 0) {
                        resultIfSameLength = numA < numB ? -1 : 1;
                    }
                    jA++;
                    jB++;
                }
            }
            while (isDigitA && isDigitB);
            if (isDigitA != isDigitB) {
                // One number has more digits than the other (ignoring leading zeros) - the longer
                // number must be larger
                return isDigitA ? 1 : -1;
            }
            else if (resultIfSameLength != 0) {
                // Both numbers are the same length (ignoring leading zeros) and at least one of
                // the digits differed - the first difference determines the result
                return resultIfSameLength;
            }
            int lA = jA - iA;
            int lB = jB - iB;
            if (lA != lB) {
                // Both numbers are equivalent but one has more leading zeros
                return lA > lB ? -1 : 1;
            }
            else if (zeroA != zeroB && softResultWeight < 2) {
                softResult = cmp.Compare(strA, iA, 1, strB, iB, 1, options);
                softResultWeight = 2;
            }
            iA = jA;
            iB = jB;
        }
    }
    if (iA < strA.Length || iB < strB.Length) {
        return iA < strA.Length ? 1 : -1;
    }
    else if (softResult != 0) {
        return softResult;
    }
    return 0;
}

The signature matches the Comparison<string> delegate:

string[] files = Directory.GetFiles(@"C:\");
Array.Sort(files, CompareNatural);

Here's a wrapper class for use as IComparer<string>:

public class CustomComparer<T> : IComparer<T> {
    private Comparison<T> _comparison;

    public CustomComparer(Comparison<T> comparison) {
        _comparison = comparison;
    }

    public int Compare(T x, T y) {
        return _comparison(x, y);
    }
}

Example:

string[] files = Directory.EnumerateFiles(@"C:\")
    .OrderBy(f => f, new CustomComparer<string>(CompareNatural))
    .ToArray();

Here's a good set of filenames I use for testing:

Func<string, string> expand = (s) => { int o; while ((o = s.IndexOf('\\')) != -1) { int p = o + 1;
    int z = 1; while (s[p] == '0') { z++; p++; } int c = Int32.Parse(s.Substring(p, z));
    s = s.Substring(0, o) + new string(s[o - 1], c) + s.Substring(p + z); } return s; };
string encodedFileNames =
    "KDEqLW4xMiotbjEzKjAwMDFcMDY2KjAwMlwwMTcqMDA5XDAxNyowMlwwMTcqMDlcMDE3KjEhKjEtISox" +
    "LWEqMS4yNT8xLjI1KjEuNT8xLjUqMSoxXDAxNyoxXDAxOCoxXDAxOSoxXDA2NioxXDA2NyoxYSoyXDAx" +
    "NyoyXDAxOCo5XDAxNyo5XDAxOCo5XDA2Nio9MSphMDAxdGVzdDAxKmEwMDF0ZXN0aW5nYTBcMzEqYTAw" +
    "Mj9hMDAyIGE/YTAwMiBhKmEwMDIqYTAwMmE/YTAwMmEqYTAxdGVzdGluZ2EwMDEqYTAxdnNmcyphMSph" +
    "MWEqYTF6KmEyKmIwMDAzcTYqYjAwM3E0KmIwM3E1KmMtZSpjZCpjZipmIDEqZipnP2cgMT9oLW4qaG8t" +
    "bipJKmljZS1jcmVhbT9pY2VjcmVhbT9pY2VjcmVhbS0/ajBcNDE/ajAwMWE/ajAxP2shKmsnKmstKmsx" +
    "KmthKmxpc3QqbTAwMDNhMDA1YSptMDAzYTAwMDVhKm0wMDNhMDA1Km0wMDNhMDA1YSpuMTIqbjEzKm8t" +
    "bjAxMypvLW4xMipvLW40P28tbjQhP28tbjR6P28tbjlhLWI1Km8tbjlhYjUqb24wMTMqb24xMipvbjQ/" +
    "b240IT9vbjR6P29uOWEtYjUqb245YWI1Km/CrW4wMTMqb8KtbjEyKnAwMCpwMDEqcDAxwr0hKnAwMcK9" +
    "KnAwMcK9YSpwMDHCvcK+KnAwMipwMMK9KnEtbjAxMypxLW4xMipxbjAxMypxbjEyKnItMDAhKnItMDAh" +
    "NSpyLTAwIe+8lSpyLTAwYSpyLe+8kFwxIS01KnIt77yQXDEhLe+8lSpyLe+8kFwxISpyLe+8kFwxITUq" +
    "ci3vvJBcMSHvvJUqci3vvJBcMWEqci3vvJBcMyE1KnIwMCEqcjAwLTUqcjAwLjUqcjAwNSpyMDBhKnIw" +
    "NSpyMDYqcjQqcjUqctmg2aYqctmkKnLZpSpy27Dbtipy27Qqctu1KnLfgN+GKnLfhCpy34UqcuClpuCl" +
    "rCpy4KWqKnLgpasqcuCnpuCnrCpy4KeqKnLgp6sqcuCppuCprCpy4KmqKnLgqasqcuCrpuCrrCpy4Kuq" +
    "KnLgq6sqcuCtpuCtrCpy4K2qKnLgrasqcuCvpuCvrCpy4K+qKnLgr6sqcuCxpuCxrCpy4LGqKnLgsasq" +
    "cuCzpuCzrCpy4LOqKnLgs6sqcuC1puC1rCpy4LWqKnLgtasqcuC5kOC5lipy4LmUKnLguZUqcuC7kOC7" +
    "lipy4LuUKnLgu5UqcuC8oOC8pipy4LykKnLgvKUqcuGBgOGBhipy4YGEKnLhgYUqcuGCkOGClipy4YKU" +
    "KnLhgpUqcuGfoOGfpipy4Z+kKnLhn6UqcuGgkOGglipy4aCUKnLhoJUqcuGlhuGljCpy4aWKKnLhpYsq" +
    "cuGnkOGnlipy4aeUKnLhp5UqcuGtkOGtlipy4a2UKnLhrZUqcuGusOGutipy4a60KnLhrrUqcuGxgOGx" +
    "hipy4bGEKnLhsYUqcuGxkOGxlipy4bGUKnLhsZUqcuqYoFwx6pilKnLqmKDqmKUqcuqYoOqYpipy6pik" +
    "KnLqmKUqcuqjkOqjlipy6qOUKnLqo5UqcuqkgOqkhipy6qSEKnLqpIUqcuqpkOqplipy6qmUKnLqqZUq" +
    "cvCQkqAqcvCQkqUqcvCdn5gqcvCdn50qcu+8kFwxISpy77yQXDEt77yVKnLvvJBcMS7vvJUqcu+8kFwx" +
    "YSpy77yQXDHqmKUqcu+8kFwx77yO77yVKnLvvJBcMe+8lSpy77yQ77yVKnLvvJDvvJYqcu+8lCpy77yV" +
    "KnNpKnPEsSp0ZXN02aIqdGVzdNmi2aAqdGVzdNmjKnVBZS0qdWFlKnViZS0qdUJlKnVjZS0xw6kqdWNl" +
    "McOpLSp1Y2Uxw6kqdWPDqS0xZSp1Y8OpMWUtKnVjw6kxZSp3ZWlhMSp3ZWlhMip3ZWlzczEqd2Vpc3My" +
    "KndlaXoxKndlaXoyKndlacOfMSp3ZWnDnzIqeSBhMyp5IGE0KnknYTMqeSdhNCp5K2EzKnkrYTQqeS1h" +
    "Myp5LWE0KnlhMyp5YTQqej96IDA1MD96IDIxP3ohMjE/ejIwP3oyMj96YTIxP3rCqTIxP1sxKl8xKsKt" +
    "bjEyKsKtbjEzKsSwKg==";
string[] fileNames = Encoding.UTF8.GetString(Convert.FromBase64String(encodedFileNames))
    .Replace("*", ".txt?").Split(new[] { "?" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(n => expand(n)).ToArray();

Belch answered 13/8, 2011 at 1:45 Comment(6)

The digit sections need to be compared section-wise, i.e., 'abc12b' should be less than 'abc123'. – Pralltriller 18/2, 2013 at 22:14

You could try the following data: public string[] filenames = { "-abc12.txt", "abc12.txt", "1abc_2.txt", "a0000012.txt", "a0000012c.txt", "a000012.txt", "a000012b.txt", "a012.txt", "a0000102.txt", "abc1_2.txt", "abc12.txt", "abc12b.txt", "abc123.txt", "abccde.txt", "b0000.txt", "b00001.txt", "b0001.txt", "b001.txt", "c0000.txt", "c0000c.txt", "c00001.txt", "c000b.txt", "d0.20.2b.txt", "d0.1000c.txt", "d0.2000y.txt", "d0.20000.2b.txt", "e0000120000012", "e0000012000012"}; – Pralltriller 21/2, 2013 at 1:59

@XichenLi Thanks for the good test case. If you let Windows Explorer sort those files, you'll get different results depending on which version of Windows you're using. My code sorts those names identically to Server 2003 (and presumably XP), but different than Windows 8. If I get a chance I'll try to figure out how Windows 8 is doing it and update my code. – Belch 21/2, 2013 at 15:40

Has bug. Index Out Of Range – Rattan 18/3, 2013 at 6:23

Great solution! When I benchmarked it in a normal scenario with about 10,000 files, it was faster than Matthew's regex example, and about the same performance as StrCmpLogicalW(). There is a minor bug in the above code: the "while (strA[jA] == zeroA) jA++;" and "while (strB[jB] == zeroB) jB++;" should be "while (jA < strA.Length && strA[jA] == zeroA) jA++;" and "while (jB < strB.Length && strB[jB] == zeroB) jB++;". Otherwise, strings containing only zeros will throw an exception. – Afrikaner 8/12, 2015 at 4:27

Kuroki, I just found out the same problem. I wish I had paid attention to your comment. It occurs when there are two strings in which a number starts at the same position but one ends with 0 (e.g., {"0", "1"} or {"a.000", "a.001"}. The fix is, of course, adding the check as you suggested. – Medicaid 9/1, 2016 at 9:29

Matthews Horsleys answer is the fastest method which doesn't change behaviour depending on which version of windows your program is running on. However, it can be even faster by creating the regex once, and using RegexOptions.Compiled. I also added the option of inserting a string comparer so you can ignore case if needed, and improved readability a bit.

    public static IEnumerable<T> OrderByNatural<T>(this IEnumerable<T> items, Func<T, string> selector, StringComparer stringComparer = null)
    {
        var regex = new Regex(@"\d+", RegexOptions.Compiled);

        int maxDigits = items
                      .SelectMany(i => regex.Matches(selector(i)).Cast<Match>().Select(digitChunk => (int?)digitChunk.Value.Length))
                      .Max() ?? 0;

        return items.OrderBy(i => regex.Replace(selector(i), match => match.Value.PadLeft(maxDigits, '0')), stringComparer ?? StringComparer.CurrentCulture);
    }

Use by

var sortedEmployees = employees.OrderByNatural(emp => emp.Name);

This takes 450ms to sort 100,000 strings compared to 300ms for the default .net string comparison - pretty fast!

Nigelniger answered 11/3, 2014 at 10:46 Comment(1)

This is worth reading wrt the above - Compilation and Reuse in Regular Expressions – Woodnote 14/6, 2016 at 15:37

Pure C# solution for linq orderby:

http://zootfroot.blogspot.com/2009/09/natural-sort-compare-with-linq-orderby.html

public class NaturalSortComparer<T> : IComparer<string>, IDisposable
{
    private bool isAscending;

    public NaturalSortComparer(bool inAscendingOrder = true)
    {
        this.isAscending = inAscendingOrder;
    }

    #region IComparer<string> Members

    public int Compare(string x, string y)
    {
        throw new NotImplementedException();
    }

    #endregion

    #region IComparer<string> Members

    int IComparer<string>.Compare(string x, string y)
    {
        if (x == y)
            return 0;

        string[] x1, y1;

        if (!table.TryGetValue(x, out x1))
        {
            x1 = Regex.Split(x.Replace(" ", ""), "([0-9]+)");
            table.Add(x, x1);
        }

        if (!table.TryGetValue(y, out y1))
        {
            y1 = Regex.Split(y.Replace(" ", ""), "([0-9]+)");
            table.Add(y, y1);
        }

        int returnVal;

        for (int i = 0; i < x1.Length && i < y1.Length; i++)
        {
            if (x1[i] != y1[i])
            {
                returnVal = PartCompare(x1[i], y1[i]);
                return isAscending ? returnVal : -returnVal;
            }
        }

        if (y1.Length > x1.Length)
        {
            returnVal = 1;
        }
        else if (x1.Length > y1.Length)
        { 
            returnVal = -1; 
        }
        else
        {
            returnVal = 0;
        }

        return isAscending ? returnVal : -returnVal;
    }

    private static int PartCompare(string left, string right)
    {
        int x, y;
        if (!int.TryParse(left, out x))
            return left.CompareTo(right);

        if (!int.TryParse(right, out y))
            return left.CompareTo(right);

        return x.CompareTo(y);
    }

    #endregion

    private Dictionary<string, string[]> table = new Dictionary<string, string[]>();

    public void Dispose()
    {
        table.Clear();
        table = null;
    }
}

Sino answered 12/3, 2010 at 14:52 Comment(5)

That code is ultimately from codeproject.com/KB/recipes/NaturalComparer.aspx (which is not LINQ-oriented). – Prearrange 31/8, 2010 at 21:19

The blog post credits Justin Jones (codeproject.com/KB/string/NaturalSortComparer.aspx) for the IComparer, not Pascal Ganaye. – Sino 1/9, 2010 at 9:59

Minor note, this solution ignores spaces which is not the same as what windows does, and is not as good as Matthew Horsley's code below. So you might get 'string01' 'string 01' 'string 02' 'string02' for example (which looks ugly). If you remove the stripping of spaces, it orders the strings backwards i.e. 'string01' comes before 'string 01', which may or may not be acceptable. – Nigelniger 7/3, 2014 at 17:43

This worked for addresses, ie "1 Smith Rd", "10 Smith Rd", "2 Smith Rd", etc - Sorted naturally. Yes! Nice one! – Personage 12/8, 2015 at 15:18

By the way, I noticed (and comments on that linked page also seem to indicate) that the Type argument <T> is completely unnecessary. – Abortifacient 2/6, 2016 at 19:20

My solution:

void Main()
{
    new[] {"a4","a3","a2","a10","b5","b4","b400","1","C1d","c1d2"}.OrderBy(x => x, new NaturalStringComparer()).Dump();
}

public class NaturalStringComparer : IComparer<string>
{
    private static readonly Regex _re = new Regex(@"(?<=\D)(?=\d)|(?<=\d)(?=\D)", RegexOptions.Compiled);

    public int Compare(string x, string y)
    {
        x = x.ToLower();
        y = y.ToLower();
        if(string.Compare(x, 0, y, 0, Math.Min(x.Length, y.Length)) == 0)
        {
            if(x.Length == y.Length) return 0;
            return x.Length < y.Length ? -1 : 1;
        }
        var a = _re.Split(x);
        var b = _re.Split(y);
        int i = 0;
        while(true)
        {
            int r = PartCompare(a[i], b[i]);
            if(r != 0) return r;
            ++i;
        }
    }

    private static int PartCompare(string x, string y)
    {
        int a, b;
        if(int.TryParse(x, out a) && int.TryParse(y, out b))
            return a.CompareTo(b);
        return x.CompareTo(y);
    }
}

Results:

1
a2
a3
a4
a10
b4
b5
b400
C1d
c1d2

Federicofedirko answered 24/7, 2012 at 5:9 Comment(4)

I like it. It's easy to understand and doesn't require Linq. – Carpology 2/7, 2016 at 20:31

This throws an IndexOutOfRangeException for an array {"1", "01"}. – Wendell 30/3, 2021 at 6:39

Yeah, we definitely need while (i < a.Length && i < b.Length) instead of while(true). It'll then fall out of the while when comparing "1" with "01", but also when comparing "1" with "1 a", so we need a little thought about tie-breakers afterwards too. – Sinuation 18/11, 2021 at 10:46

Added a bit of completion as noted in other comments, and handling for files as well at this gist - gist.github.com/ChuckSavage/e1c61839300abd2d90873b01da9d7575 – Filmore 14/1, 2022 at 0:33

Here's a version for .NET Core 2.1+ / .NET 5.0+, using spans to avoid allocations

public class NaturalSortStringComparer : IComparer<string>
{
    public static NaturalSortStringComparer Ordinal { get; } = new NaturalSortStringComparer(StringComparison.Ordinal);
    public static NaturalSortStringComparer OrdinalIgnoreCase { get; } = new NaturalSortStringComparer(StringComparison.OrdinalIgnoreCase);
    public static NaturalSortStringComparer CurrentCulture { get; } = new NaturalSortStringComparer(StringComparison.CurrentCulture);
    public static NaturalSortStringComparer CurrentCultureIgnoreCase { get; } = new NaturalSortStringComparer(StringComparison.CurrentCultureIgnoreCase);
    public static NaturalSortStringComparer InvariantCulture { get; } = new NaturalSortStringComparer(StringComparison.InvariantCulture);
    public static NaturalSortStringComparer InvariantCultureIgnoreCase { get; } = new NaturalSortStringComparer(StringComparison.InvariantCultureIgnoreCase);

    private readonly StringComparison _comparison;

    public NaturalSortStringComparer(StringComparison comparison)
    {
        _comparison = comparison;
    }

    public int Compare(string x, string y)
    {
        // Let string.Compare handle the case where x or y is null
        if (x is null || y is null)
            return string.Compare(x, y, _comparison);

        var xSegments = GetSegments(x);
        var ySegments = GetSegments(y);

        while (xSegments.MoveNext() && ySegments.MoveNext())
        {
            int cmp;

            // If they're both numbers, compare the value
            if (xSegments.CurrentIsNumber && ySegments.CurrentIsNumber)
            {
                var xValue = long.Parse(xSegments.Current);
                var yValue = long.Parse(ySegments.Current);
                cmp = xValue.CompareTo(yValue);
                if (cmp != 0)
                    return cmp;
            }
            // If x is a number and y is not, x is "lesser than" y
            else if (xSegments.CurrentIsNumber)
            {
                return -1;
            }
            // If y is a number and x is not, x is "greater than" y
            else if (ySegments.CurrentIsNumber)
            {
                return 1;
            }

            // OK, neither are number, compare the segments as text
            cmp = xSegments.Current.CompareTo(ySegments.Current, _comparison);
            if (cmp != 0)
                return cmp;
        }

        // At this point, either all segments are equal, or one string is shorter than the other

        // If x is shorter, it's "lesser than" y
        if (x.Length < y.Length)
            return -1;
        // If x is longer, it's "greater than" y
        if (x.Length > y.Length)
            return 1;

        // If they have the same length, they're equal
        return 0;
    }

    private static StringSegmentEnumerator GetSegments(string s) => new StringSegmentEnumerator(s);

    private struct StringSegmentEnumerator
    {
        private readonly string _s;
        private int _start;
        private int _length;

        public StringSegmentEnumerator(string s)
        {
            _s = s;
            _start = -1;
            _length = 0;
            CurrentIsNumber = false;
        }

        public ReadOnlySpan<char> Current => _s.AsSpan(_start, _length);
        
        public bool CurrentIsNumber { get; private set; }

        public bool MoveNext()
        {
            var currentPosition = _start >= 0
                ? _start + _length
                : 0;

            if (currentPosition >= _s.Length)
                return false;

            int start = currentPosition;
            bool isFirstCharDigit = Char.IsDigit(_s[currentPosition]);

            while (++currentPosition < _s.Length && Char.IsDigit(_s[currentPosition]) == isFirstCharDigit)
            {
            }

            _start = start;
            _length = currentPosition - start;
            CurrentIsNumber = isFirstCharDigit;

            return true;
        }
    }
}

Winstonwinstonn answered 24/2, 2021 at 16:7 Comment(7)

This is fantastic. Why is it not part of the .NET libraries? I imagine a natural sort would be a fairly common requirement when performing sorts. – Netta 29/4, 2022 at 22:59

Seems logical to keep isFirstCharDigit as a field in StringSegmentEnumerator, then you don't need to TryParse unnecessarily. Also the series of if in the big while could be simplified to

cmp = yIsNumber.CompareTo(xIsNumber); if (cmp != 0) return cmp; cmp = xValue.CompareTo(yValue); if (cmp != 0) return cmp; cmp = xSegments.Current.CompareTo(....

and the final length check could be simplified to return x.Length.CompareTo(y.Length); – Mcclean 22/7, 2022 at 0:46

@Mcclean thanks for the suggestions. I'll look into it, but please don't make drastic changes to the code without prior agreement. – Winstonwinstonn 28/7, 2022 at 13:26

Sorry about that. After no response to my comment for a few days, I assumed this answer had been abandoned. Then check over the changes and decide for yourself. I think there are possibly other improvements, such as removing _currentPosition and calculating it off _start + _length, but didn't want to go all out. – Mcclean 28/7, 2022 at 13:32

@Mcclean I see your point about isFirstCharDigit, but it's only relevant because you added the ParseNumber method. In fact, I think it might be even better to store the number directly. Regarding the other changes: the trick with .HasValue.CompareTo(...) is clever, but to be honest I think it makes the code harder to understand. And you're right about _currentPosition, it can be easily calculated, making the struct smaller. – Winstonwinstonn 28/7, 2022 at 18:30

It's quite standard to waterfall the results of CompareTo and very easy to understand. The boilerplate is cmp = x1.CompareTo(y1); if (cmp != 0) return cmp; cmp = x2.CompareTo(y2); if (cmp != 0) return cmp; etc. Either way, great answer, I think it's worth upvoting, and improving where possible. – Mcclean 28/7, 2022 at 19:41

Thanks for greate solution! To avoid NumberOverflowException for long number string, add check like if (isFirstCharDigit && (currentPosition - start) >= 18) break; – Digestible 19/4 at 3:25

You do need to be careful -- I vaguely recall reading that StrCmpLogicalW, or something like it, was not strictly transitive, and I have observed .NET's sort methods to sometimes get stuck in infinite loops if the comparison function breaks that rule.

A transitive comparison will always report that a < c if a < b and b < c. There exists a function that does a natural sort order comparison that does not always meet that criterion, but I can't recall whether it is StrCmpLogicalW or something else.

Atabrine answered 1/9, 2009 at 19:24 Comment(4)

Do you have any proof of this statement? After googling around, I can't find any indication that it is true. – Prearrange 31/8, 2010 at 21:3

I've experienced those infinite loops with StrCmpLogicalW. – Supernational 28/12, 2010 at 18:48

@mhenry: See connect.microsoft.com/VisualStudio/feedback/details/236900/… and codeproject.com/KB/cs/StringCompareBug.aspx – Kingcup 25/1, 2011 at 15:1

Visual Studio feedback item 236900 no longer exists, but here's a more up-to-date one that confirms the problem: connect.microsoft.com/VisualStudio/feedback/details/774540/… It also gives a work-around: CultureInfo has a property CompareInfo, and the object it returns can supply you with SortKey objects. These, in turn, can be compared and guarantee transitivity. – Atabrine 8/5, 2015 at 20:23

This is my code to sort a string having both alpha and numeric characters.

First, this extension method:

public static IEnumerable<string> AlphanumericSort(this IEnumerable<string> me)
{
    return me.OrderBy(x => Regex.Replace(x, @"\d+", m => m.Value.PadLeft(50, '0')));
}

Then, simply use it anywhere in your code like this:

List<string> test = new List<string>() { "The 1st", "The 12th", "The 2nd" };
test = test.AlphanumericSort();

How does it works ? By replaceing with zeros:

  Original  | Regex Replace |      The      |   Returned
    List    | Apply PadLeft |    Sorting    |     List
            |               |               |
 "The 1st"  |  "The 001st"  |  "The 001st"  |  "The 1st"
 "The 12th" |  "The 012th"  |  "The 002nd"  |  "The 2nd"
 "The 2nd"  |  "The 002nd"  |  "The 012th"  |  "The 12th"

Works with multiples numbers:

 Alphabetical Sorting | Alphanumeric Sorting
                      |
 "Page 21, Line 42"   | "Page 3, Line 7"
 "Page 21, Line 5"    | "Page 3, Line 32"
 "Page 3, Line 32"    | "Page 21, Line 5"
 "Page 3, Line 7"     | "Page 21, Line 42"

Hope that's will help.

Dalia answered 27/10, 2016 at 17:26 Comment(2)

"How does it works ? By replaceing with zeros... "The 001st"" I mean, to be fair, you're using a lot more zeroes than that ("The 00000000000000000000000000000000000000000000000001st"), and it's essentially the same answer as Matthew Horsley's (afaict, though yours is naïve about the max number length, which could be bad), but it's good code golf and an excellent explanation. ;^D – Shinny 12/11, 2020 at 22:34

The nice thing is a variation of this can be added to LINQ queries: .OrderBy(j => Regex.Replace(j.MyField, @"\d+", m => m.Value.PadLeft(50, '0'))) Works like a charm. – Allembracing 29/10, 2021 at 16:55

Here's a relatively simple example that doesn't use P/Invoke and avoids any allocation during execution.

Feel free to use the code from here, or if it's easier there's a NuGet package:

https://www.nuget.org/packages/NaturalSort

https://github.com/drewnoakes/natural-sort

internal sealed class NaturalStringComparer : IComparer<string>
{
    public static NaturalStringComparer Instance { get; } = new NaturalStringComparer();

    public int Compare(string x, string y)
    {
        // sort nulls to the start
        if (x == null)
            return y == null ? 0 : -1;
        if (y == null)
            return 1;

        var ix = 0;
        var iy = 0;

        while (true)
        {
            // sort shorter strings to the start
            if (ix >= x.Length)
                return iy >= y.Length ? 0 : -1;
            if (iy >= y.Length)
                return 1;

            var cx = x[ix];
            var cy = y[iy];

            int result;
            if (char.IsDigit(cx) && char.IsDigit(cy))
                result = CompareInteger(x, y, ref ix, ref iy);
            else
                result = cx.CompareTo(y[iy]);

            if (result != 0)
                return result;

            ix++;
            iy++;
        }
    }

    private static int CompareInteger(string x, string y, ref int ix, ref int iy)
    {
        var lx = GetNumLength(x, ix);
        var ly = GetNumLength(y, iy);

        // shorter number first (note, doesn't handle leading zeroes)
        if (lx != ly)
            return lx.CompareTo(ly);

        for (var i = 0; i < lx; i++)
        {
            var result = x[ix++].CompareTo(y[iy++]);
            if (result != 0)
                return result;
        }

        return 0;
    }

    private static int GetNumLength(string s, int i)
    {
        var length = 0;
        while (i < s.Length && char.IsDigit(s[i++]))
            length++;
        return length;
    }
}

It doesn't ignore leading zeroes, so 01 comes after 2.

Corresponding unit test:

public class NumericStringComparerTests
{
    [Fact]
    public void OrdersCorrectly()
    {
        AssertEqual("", "");
        AssertEqual(null, null);
        AssertEqual("Hello", "Hello");
        AssertEqual("Hello123", "Hello123");
        AssertEqual("123", "123");
        AssertEqual("123Hello", "123Hello");

        AssertOrdered("", "Hello");
        AssertOrdered(null, "Hello");
        AssertOrdered("Hello", "Hello1");
        AssertOrdered("Hello123", "Hello124");
        AssertOrdered("Hello123", "Hello133");
        AssertOrdered("Hello123", "Hello223");
        AssertOrdered("123", "124");
        AssertOrdered("123", "133");
        AssertOrdered("123", "223");
        AssertOrdered("123", "1234");
        AssertOrdered("123", "2345");
        AssertOrdered("0", "1");
        AssertOrdered("123Hello", "124Hello");
        AssertOrdered("123Hello", "133Hello");
        AssertOrdered("123Hello", "223Hello");
        AssertOrdered("123Hello", "1234Hello");
    }

    private static void AssertEqual(string x, string y)
    {
        Assert.Equal(0, NaturalStringComparer.Instance.Compare(x, y));
        Assert.Equal(0, NaturalStringComparer.Instance.Compare(y, x));
    }

    private static void AssertOrdered(string x, string y)
    {
        Assert.Equal(-1, NaturalStringComparer.Instance.Compare(x, y));
        Assert.Equal( 1, NaturalStringComparer.Instance.Compare(y, x));
    }
}

Inaccurate answered 15/12, 2016 at 15:53 Comment(1)

This seems to be the best solution to be honest. Thank you for the NuGet packet. – Aureliaaurelian 29/2 at 9:9

Adding to Greg Beech's answer (because I've just been searching for that), if you want to use this from Linq you can use the OrderBy that takes an IComparer. E.g.:

var items = new List<MyItem>();

// fill items

var sorted = items.OrderBy(item => item.Name, new NaturalStringComparer());

Turquoise answered 23/9, 2009 at 15:36 Comment(0)

I've actually implemented it as an extension method on the StringComparer so that you could do for example:

StringComparer.CurrentCulture.WithNaturalSort() or
StringComparer.OrdinalIgnoreCase.WithNaturalSort().

The resulting IComparer<string> can be used in all places like OrderBy, OrderByDescending, ThenBy, ThenByDescending, SortedSet<string>, etc. And you can still easily tweak case sensitivity, culture, etc.

The implementation is fairly trivial and it should perform quite well even on large sequences.

I've also published it as a tiny NuGet package, so you can just do:

Install-Package NaturalSort.Extension

The code including XML documentation comments and suite of tests is available in the NaturalSort.Extension GitHub repository.

The entire code is this (if you cannot use C# 7 yet, just install the NuGet package):

public static class StringComparerNaturalSortExtension
{
    public static IComparer<string> WithNaturalSort(this StringComparer stringComparer) => new NaturalSortComparer(stringComparer);

    private class NaturalSortComparer : IComparer<string>
    {
        public NaturalSortComparer(StringComparer stringComparer)
        {
            _stringComparer = stringComparer;
        }

        private readonly StringComparer _stringComparer;
        private static readonly Regex NumberSequenceRegex = new Regex(@"(\d+)", RegexOptions.Compiled | RegexOptions.CultureInvariant);
        private static string[] Tokenize(string s) => s == null ? new string[] { } : NumberSequenceRegex.Split(s);
        private static ulong ParseNumberOrZero(string s) => ulong.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, out var result) ? result : 0;

        public int Compare(string s1, string s2)
        {
            var tokens1 = Tokenize(s1);
            var tokens2 = Tokenize(s2);

            var zipCompare = tokens1.Zip(tokens2, TokenCompare).FirstOrDefault(x => x != 0);
            if (zipCompare != 0)
                return zipCompare;

            var lengthCompare = tokens1.Length.CompareTo(tokens2.Length);
            return lengthCompare;
        }
        
        private int TokenCompare(string token1, string token2)
        {
            var number1 = ParseNumberOrZero(token1);
            var number2 = ParseNumberOrZero(token2);

            var numberCompare = number1.CompareTo(number2);
            if (numberCompare != 0)
                return numberCompare;

            var stringCompare = _stringComparer.Compare(token1, token2);
            return stringCompare;
        }
    }
}

Gillyflower answered 20/11, 2017 at 20:44 Comment(0)

Here is a naive one-line regex-less LINQ way (borrowed from python):

var alphaStrings = new List<string>() { "10","2","3","4","50","11","100","a12","b12" };
var orderedString = alphaStrings.OrderBy(g => new Tuple<int, string>(g.ToCharArray().All(char.IsDigit)? int.Parse(g) : int.MaxValue, g));
// Order Now: ["2","3","4","10","11","50","100","a12","b12"]

Istanbul answered 23/4, 2018 at 13:19 Comment(2)

Removed Dump() and assigned to var and this works like a charm! – Lorelle 8/10, 2018 at 23:26

@ArneS: It was written in LinQPad; and I forgot to remove the Dump(). Thanks for pointing out. – Istanbul 9/10, 2018 at 1:41

Inspired by Michael Parker's solution, here is an IComparer implementation that you can drop in to any of the linq ordering methods:

private class NaturalStringComparer : IComparer<string>
{
    public int Compare(string left, string right)
    {
        int max = new[] { left, right }
            .SelectMany(x => Regex.Matches(x, @"\d+").Cast<Match>().Select(y => (int?)y.Value.Length))
            .Max() ?? 0;

        var leftPadded = Regex.Replace(left, @"\d+", m => m.Value.PadLeft(max, '0'));
        var rightPadded = Regex.Replace(right, @"\d+", m => m.Value.PadLeft(max, '0'));

        return string.Compare(leftPadded, rightPadded);
    }
}

Wolfie answered 13/9, 2018 at 16:40 Comment(0)

Expanding on a couple of the previous answers and making use of extension methods, I came up with the following that doesn't have the caveats of potential multiple enumerable enumeration, or performance issues concerned with using multiple regex objects, or calling regex needlessly, that being said, it does use ToList(), which can negate the benefits in larger collections.

The selector supports generic typing to allow any delegate to be assigned, the elements in the source collection are mutated by the selector, then converted to strings with ToString().

    private static readonly Regex _NaturalOrderExpr = new Regex(@"\d+", RegexOptions.Compiled);

    public static IEnumerable<TSource> OrderByNatural<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        int max = 0;

        var selection = source.Select(
            o =>
            {
                var v = selector(o);
                var s = v != null ? v.ToString() : String.Empty;

                if (!String.IsNullOrWhiteSpace(s))
                {
                    var mc = _NaturalOrderExpr.Matches(s);

                    if (mc.Count > 0)
                    {
                        max = Math.Max(max, mc.Cast<Match>().Max(m => m.Value.Length));
                    }
                }

                return new
                {
                    Key = o,
                    Value = s
                };
            }).ToList();

        return
            selection.OrderBy(
                o =>
                String.IsNullOrWhiteSpace(o.Value) ? o.Value : _NaturalOrderExpr.Replace(o.Value, m => m.Value.PadLeft(max, '0')))
                     .Select(o => o.Key);
    }

    public static IEnumerable<TSource> OrderByDescendingNatural<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        int max = 0;

        var selection = source.Select(
            o =>
            {
                var v = selector(o);
                var s = v != null ? v.ToString() : String.Empty;

                if (!String.IsNullOrWhiteSpace(s))
                {
                    var mc = _NaturalOrderExpr.Matches(s);

                    if (mc.Count > 0)
                    {
                        max = Math.Max(max, mc.Cast<Match>().Max(m => m.Value.Length));
                    }
                }

                return new
                {
                    Key = o,
                    Value = s
                };
            }).ToList();

        return
            selection.OrderByDescending(
                o =>
                String.IsNullOrWhiteSpace(o.Value) ? o.Value : _NaturalOrderExpr.Replace(o.Value, m => m.Value.PadLeft(max, '0')))
                     .Select(o => o.Key);
    }

Alb answered 23/9, 2014 at 20:25 Comment(0)

A version that's easier to read/maintain.

public class NaturalStringComparer : IComparer<string>
{
    public static NaturalStringComparer Instance { get; } = new NaturalStringComparer();

    public int Compare(string x, string y) {
        const int LeftIsSmaller = -1;
        const int RightIsSmaller = 1;
        const int Equal = 0;

        var leftString = x;
        var rightString = y;

        var stringComparer = CultureInfo.CurrentCulture.CompareInfo;

        int rightIndex;
        int leftIndex;

        for (leftIndex = 0, rightIndex = 0;
             leftIndex < leftString.Length && rightIndex < rightString.Length;
             leftIndex++, rightIndex++) {
            var leftChar = leftString[leftIndex];
            var rightChar = rightString[leftIndex];

            var leftIsNumber = char.IsNumber(leftChar);
            var rightIsNumber = char.IsNumber(rightChar);

            if (!leftIsNumber && !rightIsNumber) {
                var result = stringComparer.Compare(leftString, leftIndex, 1, rightString, leftIndex, 1);
                if (result != 0) return result;
            } else if (leftIsNumber && !rightIsNumber) {
                return LeftIsSmaller;
            } else if (!leftIsNumber && rightIsNumber) {
                return RightIsSmaller;
            } else {
                var leftNumberLength = NumberLength(leftString, leftIndex, out var leftNumber);
                var rightNumberLength = NumberLength(rightString, rightIndex, out var rightNumber);

                if (leftNumberLength < rightNumberLength) {
                    return LeftIsSmaller;
                } else if (leftNumberLength > rightNumberLength) {
                    return RightIsSmaller;
                } else {
                    if(leftNumber < rightNumber) {
                        return LeftIsSmaller;
                    } else if(leftNumber > rightNumber) {
                        return RightIsSmaller;
                    }
                }
            }
        }

        if (leftString.Length < rightString.Length) {
            return LeftIsSmaller;
        } else if(leftString.Length > rightString.Length) {
            return RightIsSmaller;
        }

        return Equal;
    }

    public int NumberLength(string str, int offset, out int number) {
        if (string.IsNullOrWhiteSpace(str)) throw new ArgumentNullException(nameof(str));
        if (offset >= str.Length) throw new ArgumentOutOfRangeException(nameof(offset), offset, "Offset must be less than the length of the string.");

        var currentOffset = offset;

        var curChar = str[currentOffset];

        if (!char.IsNumber(curChar))
            throw new ArgumentException($"'{curChar}' is not a number.", nameof(offset));

        int length = 1;

        var numberString = string.Empty;

        for (currentOffset = offset + 1;
            currentOffset < str.Length;
            currentOffset++, length++) {

            curChar = str[currentOffset];
            numberString += curChar;

            if (!char.IsNumber(curChar)) {
                number = int.Parse(numberString);

                return length;
            }
        }

        number = int.Parse(numberString);

        return length;
    }
}

Alyss answered 10/10, 2019 at 18:13 Comment(0)

We had a need for a natural sort to deal with text with the following pattern:

"Test 1-1-1 something"
"Test 1-2-3 something"
...

For some reason when I first looked on SO, I didn't find this post and implemented our own. Compared to some of the solutions presented here, while similar in concept, it could have the benefit of maybe being simpler and easier to understand. However, while I did try to look at performance bottlenecks, It is still a much slower implementation than the default OrderBy().

Here is the extension method I implement:

public static class EnumerableExtensions
{
    // set up the regex parser once and for all
    private static readonly Regex Regex = new Regex(@"\d+|\D+", RegexOptions.Compiled | RegexOptions.Singleline);

    // stateless comparer can be built once
    private static readonly AggregateComparer Comparer = new AggregateComparer();

    public static IEnumerable<T> OrderByNatural<T>(this IEnumerable<T> source, Func<T, string> selector)
    {
        // first extract string from object using selector
        // then extract digit and non-digit groups
        Func<T, IEnumerable<IComparable>> splitter =
            s => Regex.Matches(selector(s))
                      .Cast<Match>()
                      .Select(m => Char.IsDigit(m.Value[0]) ? (IComparable) int.Parse(m.Value) : m.Value);
        return source.OrderBy(splitter, Comparer);
    }

    /// <summary>
    /// This comparer will compare two lists of objects against each other
    /// </summary>
    /// <remarks>Objects in each list are compare to their corresponding elements in the other
    /// list until a difference is found.</remarks>
    private class AggregateComparer : IComparer<IEnumerable<IComparable>>
    {
        public int Compare(IEnumerable<IComparable> x, IEnumerable<IComparable> y)
        {
            return
                x.Zip(y, (a, b) => new {a, b})              // walk both lists
                 .Select(pair => pair.a.CompareTo(pair.b))  // compare each object
                 .FirstOrDefault(result => result != 0);    // until a difference is found
        }
    }
}

The idea is to split the original strings into blocks of digits and non-digits ("\d+|\D+"). Since this is a potentially expensive task, it is done only once per entry. We then use a comparer of comparable objects (sorry, I can't find a more proper way to say it). It compares each block to its corresponding block in the other string.

I would like feedback on how this could be improved and what the major flaws are. Note that maintainability is important to us at this point and we are not currently using this in extremely large data sets.

Zoroastrianism answered 21/3, 2013 at 23:52 Comment(2)

This crashes when it tries to compare strings that are structurally different - eg comparing "a-1" to "a-2" works fine, but comparing "a" to "1" is not, because "a".CompareTo(1) throws an exception. – Dominickdominie 29/10, 2013 at 18:22

@jimrandomh, you are correct. This approach was specific to our patterns. – Zoroastrianism 8/11, 2013 at 16:25

If you want to order by name (natural) with correct order in terms of numeric

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Book
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class BookComparer : IComparer<Book>
{
    public int Compare(Book x, Book y)
    {
        var matchX = Regex.Match(x.Title, @"([^\d]+)(\d+)");
        var matchY = Regex.Match(y.Title, @"([^\d]+)(\d+)");

        var result = string.Compare(matchX.Groups[1].Value, matchY.Groups[1].Value, StringComparison.Ordinal);

        if (result == 0)
        {
            // If the prefixes are the same, compare numeric parts
            result = int.Parse(matchX.Groups[2].Value).CompareTo(int.Parse(matchY.Groups[2].Value));
        }

        return result;
    }
}

class Program
{
    static void Main()
    {
        var list = new List<Book>
        {
            new Book { Id = 49, Title = "Test 10" },
            new Book { Id = 50, Title = "Test 101" },
            new Book { Id = 43, Title = "Test 11" },
            new Book { Id = 48, Title = "Test 100" }
        };

        var ordered = list.OrderBy(m => m, new BookComparer()).Select(x => x.Title);

        var str = string.Join(", ", ordered);
        Console.WriteLine(str);
    }
}

Phelloderm answered 8/12, 2023 at 10:47 Comment(0)

-3

Let me explain my problem and how i was able to solve it.

Problem:- Sort files based on FileName from FileInfo objects which are retrieved from a Directory.

Solution:- I selected the file names from FileInfo and trimed the ".png" part of the file name. Now, just do List.Sort(), which sorts the filenames in Natural sorting order. Based on my testing i found that having .png messes up sorting order. Have a look at the below code

var imageNameList = new DirectoryInfo(@"C:\Temp\Images").GetFiles("*.png").Select(x =>x.Name.Substring(0, x.Name.Length - 4)).ToList();
imageNameList.Sort();

P answered 15/11, 2018 at 16:13 Comment(1)

Can i know the reason for -1 on this answer ? – P 15/11, 2018 at 21:8

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags