Can Excel Sort Differently Than Its Default U.S. Character Set?
Asked Answered
M

2

7

My question is basically the opposite of THIS ONE (which had a database-based solution I can't use here).

I use SAP, which sorts characters this way:

0-9, A-Z, _

but I'm downloading data into Excel and manipulating ranges dependent on correct SAP character set sort order.

How can I force Excel to sort the same way as SAP, with underscore coming last.

After attempting a Custom Sort List of single characters in Excel's Sort feature, Excel still/always sorts like this:

_, 0-9, A-Z

Is there any way to get Excel to sort like SAP? I'm capable of doing Excel macros, if needed.

Alternatively, if anyone knows how to get native SAP tables to sort like Excel in the SAP interface, that would take care of this problem, as well.

Microbe answered 8/5, 2020 at 0:42 Comment(6)
Some general thoughts: the SAP order is generally based on the "code values", and SAP is now mostly based on Unicode, it's why you have the order 0-9, A-Z, _ (the order may also depend on the locale but it's rare). I guess a VBA subroutine could be created to do the same with the function asc() (asc("0") < asc("A") < asc("_")); it could read all cell values, select distinct values, sort them internally, create a "custom order" and do the sorting based on this custom order. Performance is low: with a column of 4000 cells, each having a unique value, the sorting takes 8 seconds.Synchrocyclotron
I realize that Excel sorts based on the locale (regional settings), so a and A have the same weight in French (order in English: _ < 0 < 9 < a91 < A92 < a93). This can be done identically in SAP, if you use ABAP SORT table AS TEXT (order in English: _ < 0 < 9 < a91 < A92 < a93). I guess you can do the same with SQL. Conclusion: it's more simple to change the order of table rows in ABAP or SQL than in Excel.Synchrocyclotron
Thank you, Sandra. Unfortunately, I am using SAP strictly as a user, viewing tables via transaction SE16 and other similar user presentations. I believe I would need to have a Developer's Key, etc. in order to use SQL. I'm not working to present my SAP results to others, just for me (to aid in complex configuration).Microbe
But your answer does perhaps explain why SAP consistently sorts "_" at the bottom, EXCEPT when one clicks a Sort button. Which actually drives me crazy because I need SAP to be consistent. But, no, that does not make this a solution for me. It's not about presentation, it's about entering ranges SAP will interpret later within a transaction, so entering my ranges as a SORT AS TEXT order will result in impossible ranges and inaccurate results. :(Microbe
Is there perhaps a locale I could set Excel to which has "_" at the bottom, do a sort, then reset my locale?Microbe
I think this one's over folks. Sandra invented & provided an excellent solution. I feel this one will stand for all who need this for a long time. Thanks to Sandra and to everyone who supports this site.Microbe
S
2

The principle of the following solution is to insert a new column in which the cells have a formula which calculates a "sortable code" of each cell of the column that you want to sort.

If you sort this new column, the rows will be sorted in the ASCII order (0-9, A-Z, _).

It should be able to handle any number of rows. On my laptop, the calculation of cells takes 1 minute for 130.000 rows. There are two VBA functions, one for ASCII and one for EBCDIC. It's very easy to define other character sets.

Steps:

  • Create a module in your Excel workbook and place the code below.
  • Close the VB editor otherwise it will run slowly.
  • In the worksheet that you want to sort, insert one column for each column you want to sort, for instance let's say the sort is to be done for column A, create a new column B, in the cell B1 insert the formula =SortableCodeASCII(A1) and do the same for all the cells of column B (up to the last row of column A).
  • Make sure that the calculation of formulas is over (it takes 1 minute for 130.000 rows on my laptop), otherwise if you sort, the order will be incorrect because formulas are not yet calculated. You see the progress indicator (percentage) on the status bar at the bottom of the Excel window. If you don't see it, press Ctrl+Alt+F9.
  • Sort on column B. The values in column A should be sorted according to the ASCII order (0-9, A-Z, _)

Good luck!

Option Compare Text 'to make true "a" = "A", "_" < "0", etc.
Option Base 0 'to start arrays at index 0 (LBound(array) = 0)
Dim SortableCharactersASCII() As String
Dim SortableCharactersEBCDIC() As String
Dim SortableCharactersTEST() As String

Sub ResetSortableCode()
    'Run this subroutine if you change anything in the code of this module
    'to regenerate the arrays SortableCharacters*
    Erase SortableCharactersASCII
    Erase SortableCharactersEBCDIC
    Erase SortableCharactersTEST
    Call SortableCodeASCII("")
    Call SortableCodeEBCDIC("")
    Call SortableCodeTEST("")
End Sub

Function SortableCodeASCII(text As String)
    If (Not Not SortableCharactersASCII) = 0 Then
        SortableCharactersASCII = getSortableCharacters( _
            orderedCharacters:=" !""#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}" & ChrW(126) & ChrW(127))
    End If
    SortableCodeASCII = getSortableCode(text, SortableCharactersASCII)
End Function

Function SortableCodeEBCDIC(text As String)
    If (Not Not SortableCharactersEBCDIC) = 0 Then
        SortableCharactersEBCDIC = getSortableCharacters( _
            orderedCharacters:=" ¢.<(+|&!$*);-/¦,%_>?`:#@'=""abcdefghi±jklmnopqr~stuvwxyz^[]{ABCDEFGHI}JKLMNOPQR\STUVWXYZ0123456789")
    End If
    SortableCodeEBCDIC = getSortableCode(text, SortableCharactersEBCDIC)
End Function

Function SortableCodeTEST(text As String)
    If (Not Not SortableCharactersTEST) = 0 Then
        SortableCharactersTEST = getSortableCharacters( _
            orderedCharacters:="ABCDEF 0123456789_")
    End If
    SortableCodeTEST = getSortableCode(text, SortableCharactersTEST)
End Function

Function getSortableCharacters(orderedCharacters As String) As String()

    'Each character X is assigned another character Y so that sort by character Y will
    'sort character X in the desired order.

    maxAscW = 0
    For i = 1 To Len(orderedCharacters)
         If AscW(Mid(orderedCharacters, i, 1)) > maxAscW Then
            maxAscW = AscW(Mid(orderedCharacters, i, 1))
         End If
    Next

    Dim aTemp() As String
    ReDim aTemp(maxAscW)
    j = 0
    For i = 1 To Len(orderedCharacters)
        'Was a character with same "sort weight" previously processed ("a" = "A")
        For i2 = 1 To i - 1
            If AscW(Mid(orderedCharacters, i, 1)) <> AscW(Mid(orderedCharacters, i2, 1)) _
                And Mid(orderedCharacters, i, 1) = Mid(orderedCharacters, i2, 1) Then
                'If two distinct characters are equal when case is ignored (e.g. "a" and "A")
                '(this is possible only because directive "Option Compare Text" is defined at top of module)
                'then only one should be used (either "a" or "A" but not both), so that the Excel sorting
                'does not vary depending on sorting option "Ignore case".
                Exit For
            End If
        Next
        If i2 = i Then
            'NO
            aTemp(AscW(Mid(orderedCharacters, i, 1))) = Format(j, "000")
            j = j + 1
        Else
            'YES "a" has same weight as "A"
            aTemp(AscW(Mid(orderedCharacters, i, 1))) = aTemp(AscW(Mid(orderedCharacters, i2, 1)))
        End If
    Next
    'Last character is for any character of input text which is not in orderedCharacters
    aTemp(maxAscW) = Format(j, "000")

    getSortableCharacters = aTemp

End Function

Function getOrderedCharactersCurrentLocale(numOfChars As Integer) As String

    'Build a string of characters, ordered according to the LOCALE order.
    '    (NB: to order by LOCALE, the directive "Option Compare Text" must be at the beginning of the module)
    'Before sorting, the placed characters are: ChrW(0), ChrW(1), ..., ChrW(numOfChars-1), ChrW(numOfChars).
    'Note that some characters are not used: for those characters which have the same sort weight
    '    like "a" and "A", only the first one is kept.
    'For debug, you may define constdebug=48 so that to use "printable" characters in sOrder:
    '    ChrW(48) ("0"), ChrW(49) ("1"), ..., ChrW(numOfChars+47), ChrW(numOfChars+48).

    sOrder = ""
    constdebug = 0 'Use 48 to help debugging (ChrW(48) = "0")
    i = 34
    Do Until Len(sOrder) = numOfChars
        Select Case constdebug + i
            Case 0, 7, 14, 15: i = i + 1
        End Select
        sCharacter = ChrW(constdebug + i)
        'Search order of character in current locale
        iOrder = 0
        For j = 1 To Len(sOrder)
            If AscW(sCharacter) <> AscW(Mid(sOrder, j, 1)) And sCharacter = Mid(sOrder, j, 1) Then
                'If two distinct characters are equal when case is ignored (e.g. "a" and "A")
                '("a" = "A" can be true only because directive "Option Compare Text" is defined at top of module)
                'then only one should be used (either "a" or "A" but not both), so that the Excel sorting
                'does not vary depending on sorting option "Ignore case".
                iOrder = -1
                Exit For
            ElseIf Mid(sOrder, j, 1) <= sCharacter Then
                'Compare characters based on the LOCALE order, that's possible because
                'the directive "Option Compare Text" has been defined.
                iOrder = j
            End If
        Next
        If iOrder = 0 Then
            sOrder = ChrW(constdebug + i) & sOrder
        ElseIf iOrder = Len(sOrder) Then
            sOrder = sOrder & ChrW(constdebug + i)
        ElseIf iOrder >= 1 Then
            sOrder = Left(sOrder, iOrder) & ChrW(constdebug + i) & Mid(sOrder, iOrder + 1)
        End If
        i = i + 1
    Loop
    'Last character is for any character of input text which is not in orderedCharacters
    sOrder = sOrder & ChrW(constdebug + numOfChars)

    getOrderedCharactersCurrentLocale = sOrder

End Function

Function getSortableCode(text As String, SortableCharacters() As String) As String

    'Used to calculate a sortable text such a way it fits a given order of characters.
    'Example: instead of order _, 0-9, Aa-Zz you may want 0-9, Aa-Zz, _
    'Will work only if Option Compare Text is defined at the beginning of the module.

    getSortableCode = ""
    For i = 1 To Len(text)
        If AscW(Mid(text, i, 1)) < UBound(SortableCharacters) Then
            If SortableCharacters(AscW(Mid(text, i, 1))) <> "" Then
                getSortableCode = getSortableCode & SortableCharacters(AscW(Mid(text, i, 1)))
            Else
                'Character has not an order sequence defined -> last in order
                getSortableCode = getSortableCode & SortableCharacters(UBound(SortableCharacters))
            End If
        Else
            'Character has not an order sequence defined -> last in order
            getSortableCode = getSortableCode & SortableCharacters(UBound(SortableCharacters))
        End If
    Next

    'For two texts "a1" and "A1" having the same sortable code, appending the original text allows using the sort option "Ignore Case"/"Respecter la casse"
    getSortableCode = getSortableCode & " " & text

End Function
Synchrocyclotron answered 11/5, 2020 at 20:32 Comment(8)
I am very sorry for my delay in feedback; my immediate need was over and I didn't have time to test this ... until now. -- And my reply is that this approach is ABSOLUTELY BRILLIANT! I still haven't figured out what exactly you did, but I get the out-of-box concept and it does appear to work perfectly. -- The ONE QUESTION I have is how exactly to call this function when it lives in a central workbook. Just prefix with 'PERSONAL.XLSB'? -- Regardless, you provided me with something that actually fixes this, and I am unspeakably grateful for it. Wish I could repay you.Microbe
Thank you. This approach was already proposed in Stack Overflow or in Super User. What I add to the existing answers is a solution which can be extended to any kind of character sets very easily. If I understand your question correctly, you want an Add-In, which is loaded automatically when you start Excel. Try to see if others have asked the same kind of question in Stack Overflow or Super User. Otherwise, maybe you could ask a new question to explain with more details.Synchrocyclotron
Actually, I'm not yet sure you understood that last question. The code you offered ... I loaded that into VBA, the same as any other Macro. But, the way you call it is the only visible difference. You don't run it like a Macro; instead you use it like a Formula. So are you saying that when you call VBA code from a Formula, you have to load all the code directly into each workbook where you use it? And that you cannot store the code in the central Personal Macro Workbook (like any other Macro) and call the Sub from Workbook X as a formula, drawing it from the Personal Macro Workbook? thxMicrobe
In that case, as I said, please ask a new question.Synchrocyclotron
I would not think that is necessary. It's a follow-up question regarding how best to implement/use the code you posted here. It's relevant to a complete solution for the entire original issue.Microbe
OK, looks like that's the answer then, folks. If you want to use this brilliant code to alter your Excel sorting, just copy that code into the Macros/VBA area of each workbook you need it for and then formula will work. Cheers all!Microbe
I didn't talk about the Personal Macro Workbook. I proposed to use an add-in. I just tried, and the user functions are made available in any workbook. It's discussed here. It took me 2 minutes to do it so no excuse. PS: don't forget to define the code in a module otherwise the functions are not seen.Synchrocyclotron
I'm torn here because I appreciate your clarification that you proposed I create an Add-In, but also feel it unfair to deem this "no excuse." An "arm-chair macro writer" like myself does not necessarily know that Add-Ins can be created by regular users. So "you want an Add-In" meant (to me) "you want to go get something else" because only companies with "real developers" can create Add-Ins. But I had your perfectly good code! So to clarify, when this was first suggested, I interpreted that as a misunderstanding of my request, and went back to the only delivery system I knew (before): PMW. Thx!Microbe
S
0

EDIT: this solution is based on the automatic calculation of a custom order list, but it doesn't work if there are too many distinct values. In my case it worked with a custom order list of maybe a total of 35.000 characters, but it failed for the big list of the original poster.


The following code sorts the requested column(s) by ASCII value, which has this kind of order:

0-9, A-Z, _, a-z

I guess the lower case being separated from the upper case is not an issue as SAP defines values mostly in upper case. If needed, the code can be easily adapted to obtain the custom order 0-9, Aa-Zz, _ (by using UCase and worksheet.Sort.MatchCase = False).

This order is different from the built-in Excel sort order which is based on the locale. For instance, in English, it would be:

_, 0-9, Aa-Zz

The principle is to use a "custom order list" whose values are taken from the Excel column, made unique, and sorted with a QuickSort3 algorithm (subroutine MedianThreeQuickSort1 provided by Ellis Dee at http://www.vbforums.com/showthread.php?473677-VB6-Sorting-algorithms-(sort-array-sorting-arrays)).

Performance notes about the Excel sorting via custom list (I'm not talking about QuickSort3):

  • The more the distinct values in the custom order list, the lower the performance. 4,000 rows having 20 distinct values are sorted immediately, but 4,000 rows having 4,000 distinct values takes 8 seconds to sort!
  • For the same number of distinct values, the performance does not change a lot if there are many rows to sort. 300,000 rows having 6 distinct values takes 3 seconds to sort.
Sub SortByAsciiValue()
  With ActiveSheet.Sort
    .SortFields.Clear
    .SetRange Range("A:A").CurrentRegion
    .SortFields.Add Key:=Columns("A"), Order:=xlAscending, _
        CustomOrder:=DistinctValuesInAsciiOrder(iRange:=Columns("A"), Header:=True)
    .Header = xlYes
    .Apply
  End With
End Sub

Function DistinctValuesInAsciiOrder(iRange As Range, Header As Boolean) As String
    Dim oCell As Range
    Dim oColl As New Collection

    On Error Resume Next
    For Each oCell In iRange.Cells
        Err.Clear
        If Header = True And oCell.Row = iRange.Row Then
        ElseIf oCell.Row > iRange.Worksheet.UsedRange.Rows.Count Then
        Exit For
        Else
        dummy = oColl.Item(oCell.Text)
        If Err.Number <> 0 Then
            oColl.Add oCell.Text, oCell.Text
            totalLength = totalLength + Len(oCell.Text) + 1
        End If
        End If
    Next
    On Error GoTo 0

    If oColl.Count = 0 Then
        Exit Function
    End If

    Dim values() As String
    ReDim values(1)
    ReDim values(oColl.Count - 1 + LBound(values))
    For i = 1 To oColl.Count
        values(i - 1 + LBound(values)) = oColl(i)
    Next
    Call MedianThreeQuickSort1(values)

    ' String concatenation is complex just for better performance (allocate space once)
    DistinctValuesInAsciiOrder = Space(totalLength - 1)
    Mid(DistinctValuesInAsciiOrder, 1, Len(values(LBound(values)))) = values(LBound(values))
    off = 1 + Len(values(LBound(values)))
    For i = LBound(values) + 1 To UBound(values)
        Mid(DistinctValuesInAsciiOrder, off, 1 + Len(values(i))) = "," & values(i)
        off = off + 1 + Len(values(i))
    Next
End Function
Synchrocyclotron answered 8/5, 2020 at 16:8 Comment(11)
Thanks so much, Sandra! Did you just write this?<br/><br/>I will absolutely attempt this because it would be SO worth it to me if it works.<br/><br/>And I'll report back on how slow it runs on 111,405 rows all with unique Col A key fields ... on a 2 year old work-issue laptop. But seconds is basically nothing to me ... I've waited probably 30+ min for some VLOOKUP operations (which I immediately Copy/Pasted Values on when complete) on this 74MB book.Microbe
I also see where I'll need to set the range out to my last "AH" column. A small price to pay.Microbe
Dang it! It ran for 30min before kicking back a Syntax Error (of all things) on line 47 of the QuickSort3 Sub, here: --- If lngLast – plngLeft < plngRight – lngFirst Then --- unless this is obvious to you (knowing this code better), I'll have to find time to debug it. I'm running this in Excel 2016. [No, I never thought to run a Debug>Compile before attempting :( ] OH! I had the tab in Filter mode at the time (but with all filters clear) ... if that matters.Microbe
I noticed, and tried to fix, the 4 nested IFs in that structure lacked END IFs. But 1) that should have been noticed by someone before now and 2) it didn't help anyway ... it just turned each of those lines red, as well.Microbe
The minus sign is an invalid dash character, please remove it and type - manually. I had noticed it too. But the rest of QuickSort3 code is okay for me.Synchrocyclotron
@Microbe - I've waited probably 30+ min for some VLOOKUP operations ... switch to Index/Match. It will perform faster.Brisesoleil
Thanks for the advice, Scott. Do you mean with the =MATCH formula? If so, that doesn't work for my purposes. I don't just want to know the same data is there, I want to bring it over and look at it side-by-side. If not, your advice is perhaps deeper than I've been yet.Microbe
Amazing, Sandra! I never would have seen that about the dash. I've replaced it and it compiles now. I'll try again when I don't need Excel for other things for 30-60min and will let you know.Microbe
@sandra OK, so now: "Overflow" Error 6 on line ".SortFields.Add Key:" of SortByAsciiValue. I'm going to guess this means we've exceeded Excel's limit on custom lists?? I know you did 4K successfully and I'm doing 100K+ unsuccessfully ... briefly, do we know what the limit is? I'm more than willing to 1/2 my data, sort each half, and paste back together. But if the limit is say 5K, I'll have to live with what I have and save this for shorter applications (I do have those occasionally).Microbe
After some Google searching & finding the user interface for "import custom lists," I've learned: 1) custom lists likely have a character limit, not a # of values limit (though I don't know what the current limit is) and 2) this doesn't reach very far into my data because I'm using "table names" and my table names can be 30 char long. After attempting to manually create a long custom list, the result IT SHOWED ME doesn't get very far at all. I'm afraid this probably means that this solution -- while still very, very good and appreciated -- is probably a bust for this question. :(Microbe
@ScottHoltzman I looked into the =INDEX function as well, and while I learned something I didn't know before (thank you), that function seems built around knowing where the data is in the source table that you want to find. Basically what I'm doing here is manual table joins ... merging data from different tables together into the one view where I need them all. I know that's database work, but Access (and SAP itself) consistently fails me with table joining rules and crashing and taking just as long (if not longer). Excel gets the job done, if I walk away or do something else for a while.Microbe

© 2022 - 2024 — McMap. All rights reserved.