Order a "mixed" vector (numbers with letters)

How can I order a vector like

c("7","10a","10b","10c","8","9","11c","11b","11a","12") -> alph

into

alph
[1] "7","8","9","10a","10b","10c","11a","11b","11c","12"

and use it to sort a data.frame, like

V1 <- c("A","A","B","B","C","C","D","D","E","E")
V2 <- 2:1 
V3 <- alph
df <- data.frame(V1,V2,V3)

and order the rows to obtain the following (ordered by V2, then by V3)

 V1 V2  V3
C  1   9
A  1 10a
B  1 10c
D  1 11b
E  1  12
A  2   7
C  2   8
B  2 10b
E  2 11a
D  2 11c
Overuse answered 5/12, 2013 at 9:49 Comment(1)
Do not use data.frame(cbind(...)); just use data.frame(...) directly. By calling cbind you make a character matrix containing V1, V2 and V3, which probably isn't what you want. – Waverley
> library(gtools)
> mixedsort(alph)

[1] "7"   "8"   "9"   "10a" "10b" "10c" "11a" "11b" "11c" "12" 

To sort a data.frame, use mixedorder instead. It returns the row permutation rather than the sorted values:

> mydf <- data.frame(alph, USArrests[seq_along(alph),])
> mydf[mixedorder(mydf$alph),]

            alph Murder Assault UrbanPop Rape
Alabama        7   13.2     236       58 21.2
California     8    9.0     276       91 40.6
Colorado       9    7.9     204       78 38.7
Alaska       10a   10.0     263       48 44.5
Arizona      10b    8.1     294       80 31.0
Arkansas     10c    8.8     190       50 19.5
Florida      11a   15.4     335       80 31.9
Delaware     11b    5.9     238       72 15.8
Connecticut  11c    3.3     110       77 11.1
Georgia       12   17.4     211       60 25.8
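
If adding gtools as a dependency is not an option, the same ordering can be approximated in base R for strings shaped like the ones in the question (digits followed by an optional letter suffix). This is only a sketch under that assumption, not a description of how mixedorder works internally, and the helper name natural_order is made up for illustration:

```r
# Hypothetical helper (not part of gtools): order strings of the form
# <digits><optional suffix> by the number first, then by the suffix.
natural_order <- function(x) {
  num    <- as.integer(sub("^([0-9]+).*$", "\\1", x))  # leading number
  suffix <- sub("^[0-9]+", "", x)                      # whatever follows it ("" if nothing)
  order(num, suffix)
}

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")
alph[natural_order(alph)]
```

Strings that do not start with digits would yield NA for the numeric key, so this should not be treated as a general replacement for mixedorder.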

mixedorder on multiple vectors (columns)

Apparently mixedorder cannot handle multiple vectors. I have written a function that circumvents this by converting every character vector to a factor whose levels are in mixedsort order, and then passing all vectors on to the standard order function.

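# Like order(), but character vectors are compared in mixed (natural) order.
# Requires gtools for mixedsort().
multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
    do.call(order, c(
        lapply(list(...), function(l){
            if(is.character(l)){
                # Levels in mixedsort order; order() then sorts by level position
                factor(l, levels=mixedsort(unique(l)))
            } else {
                # Non-character vectors are passed through unchanged
                l
            }
        }),
        list(na.last = na.last, decreasing = decreasing)
    ))
}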
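
The reason the factor conversion works can be seen in a small base R sketch. Here the levels are written out by hand as a stand-in for what mixedsort(unique(x)) would supply; order() sorts a factor by the position of each value's level, not by the text of its label:

```r
x <- c("10a", "9", "7")

# Levels given explicitly in the desired order (hand-written for this sketch)
f <- factor(x, levels = c("7", "9", "10a"))

as.integer(f)   # level position of each element: 3 2 1
x[order(f)]     # "7" "9" "10a"  (sorted by level position)
x[order(x)]     # "10a" "7" "9"  (plain character comparison)
```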

However, in the example below multi.mixedorder gives the same result as the standard order, since V2 is numeric and has no ties, so V3 never comes into play.

df <- data.frame(
    V1 = c("A","A","B","B","C","C","D","D","E","E"),
    V2 = 19:10,
    V3 = alph,
    stringsAsFactors = FALSE)

df[multi.mixedorder(df$V2, df$V3),]

   V1 V2  V3
10  E 10  12
9   E 11 11a
8   D 12 11b
7   D 13 11c
6   C 14   9
5   C 15   8
4   B 16 10c
3   B 17 10b
2   A 18 10a
1   A 19   7

Notice that

  • 19:10 is equivalent to c(19:10). c combines (concatenates) its arguments into one long vector, but in your case you only have one vector (19:10), so there is nothing to combine. In the case of V1, however, you have 10 vectors of length 1, so there you do need c, as you already use it.
  • You need stringsAsFactors=FALSE so that V1 and V3 are not converted to (incorrectly sorted) factors, which was the default before R 4.0.0.
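
The example above has no ties in V2, so it does not exercise the second sort key. A sketch of the tied case from the question (V2 alternating 2, 1) is given below in base R only; the level vector lev is written by hand and stands in for mixedsort(unique(df$V3)), which is an assumption made to keep the snippet free of dependencies:

```r
alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

df <- data.frame(
    V1 = c("A","A","B","B","C","C","D","D","E","E"),
    V2 = rep(2:1, 5),            # ties: five rows with 1, five with 2
    V3 = alph,
    stringsAsFactors = FALSE)

# Desired V3 order, written out by hand for this sketch
lev <- c("7","8","9","10a","10b","10c","11a","11b","11c","12")

# Sort by V2 first, then by V3 according to the level order
sorted <- df[order(df$V2, factor(df$V3, levels = lev)), ]
sorted
```

With gtools loaded, df[multi.mixedorder(df$V2, df$V3),] should select the same rows without the hand-written levels.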
Waverley answered 5/12, 2013 at 9:50 Comment(8)
I tried this solution but I can't figure out how to use it to sort by two columns (I edited in an example). – Overuse
It appears mixedorder does not support multiple columns (how strange!), but I can hack you a roundabout solution. I have been wanting to do something like this for a while. – Waverley
You're right, my example is really bad, sorry. But try your solution with V2 = 2:1 and it's not working anymore... is it? – Overuse
Don't worry, many things in R are not obvious at first :) And you are correct that it doesn't work if there are ties in V2. I'll take a look at it later today. – Waverley
There! Works now. The trick was to convert all characters to factors, which are sorted correctly with the standard order function. – Waverley
Great! I made some edits to include all your useful comments. Thanks! – Overuse
This helped me, thanks. But ggplot insists on plotting in the unsorted form. – Selwyn
@TiagoBruno Make sure you are using the newest version of ggplot2 that was recently released. The ordering of things has been a pain in older versions, and although I have not tried it in the new version yet, I know it has been massively rewritten and improved, so I hope they have sorted this out. – Waverley
