In base R, you can split each string into characters, sort them, and paste them back together. strsplit returns a list, so use lapply or sapply to iterate over it:
unlist(lapply(strsplit(vector, ""), \(x) paste(sort(x), collapse = "")))
# or (thanks @Robert Hacken!)
sapply(strsplit(vector, ""), \(x) paste(sort(x), collapse = ""))
# [1] "1234556" "12356789" "01233477"
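To make the three steps explicit, here is a minimal sketch that runs them one at a time. The input `x` is a hypothetical example, not the data from the question, and the intermediate names `chars` and `sorted` are only for illustration.

```r
# Hypothetical input (assumption: not the vector from the question)
x <- c("3142", "9075")

# Step 1: split each string into single characters; strsplit returns a list
chars <- strsplit(x, "")

# Step 2: sort the characters within each list element
sorted <- lapply(chars, sort)

# Step 3: collapse each sorted character vector back into one string
result <- vapply(sorted, paste, character(1), collapse = "")
result
# [1] "1234" "0579"
```

The one-liners above do the same thing, with steps 2 and 3 folded into a single anonymous function.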
Among the answers, lapply and vapply look fastest on the example data, but on longer vectors things seem to even out, apart from the utf8ToInt approach, which stays the slowest:
microbenchmark::microbenchmark(
  lapply = unlist(lapply(strsplit(vector, ""), \(x) paste(sort(x), collapse = ""))),
  sapply = sapply(strsplit(vector, ""), \(x) paste(sort(x), collapse = "")),
  Thomas = unname(sapply(vector, \(x) intToUtf8(sort(utf8ToInt(x))))),
  Mael_vapply = vapply(strsplit(vector, NULL), \(x) paste(sort(x), collapse = ''), ''),
  geotheory = stringr::str_split(vector, '') |>
    purrr::map_chr(~ sort(.x) |> paste(collapse=''))
)
expr            min       lq      mean   median       uq      max neval
lapply       69.701  75.6775  95.79383  79.6590  87.3620 1488.421   100
sapply       78.522  88.6830 115.97092  93.4425 102.1400 1942.413   100
Thomas      123.368 143.3830 174.09282 155.7950 165.8135 1900.921   100
Mael_vapply  68.167  76.1265  97.90242  79.9230  84.5445 1672.561   100
geotheory   206.529 224.7750 249.77152 240.1000 264.5795  407.786   100
longvec <- rep(vector, 1e4)
# vector of length 30,000
expr             min        lq      mean    median        uq      max neval
lapply      623.6649  727.6631  940.1815  817.8032 1023.8237 1953.855   100
sapply      589.1852  770.0218  941.9850  836.4549 1013.4217 1963.823   100
Thomas      994.9503 1223.2654 1637.0953 1361.5117 1860.2610 3250.105   100
Mael_vapply 615.7141  759.3814  922.8012  821.4991  998.0893 1807.163   100
geotheory   664.4530  810.4600  984.8990  879.5722 1031.2233 2103.203   100
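Timings are only comparable if every approach returns the same result, so a quick equivalence check before benchmarking may be worthwhile. The sketch below uses base R only and a hypothetical input `x` (not the data from the question); the `via_*` names are made up for illustration.

```r
# Hypothetical input (assumption: digit-only strings, as in the example output)
x <- c("3142", "9075")

# The base R variants from the benchmark, applied to x
via_lapply <- unlist(lapply(strsplit(x, ""), \(s) paste(sort(s), collapse = "")))
via_sapply <- sapply(strsplit(x, ""), \(s) paste(sort(s), collapse = ""))
via_vapply <- vapply(strsplit(x, NULL), \(s) paste(sort(s), collapse = ""), "")
via_utf8   <- unname(sapply(x, \(s) intToUtf8(sort(utf8ToInt(s)))))

# All four should agree for digit-only input; stopifnot errors otherwise
stopifnot(
  identical(via_lapply, via_sapply),
  identical(via_lapply, via_vapply),
  identical(via_lapply, via_utf8)
)
```

One caveat: the utf8ToInt variant orders characters by code point, while sort() on a character vector uses the locale's collation, so the two can disagree on inputs such as mixed-case letters even though they match for digits.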