What are the differences between all of these functions that seem very similar?
stri_join, stri_c, and stri_paste come from the stringi package and are pure aliases of one another.
str_c comes from stringr and is just stringi::stri_join with the ignore_null parameter hardcoded to TRUE, whereas stringi::stri_join defaults to ignore_null = FALSE.
stringr::str_join is a deprecated alias for str_c.
See:
library(stringi)
identical(stri_join, stri_c)
# [1] TRUE
identical(stri_join, stri_paste)
# [1] TRUE
library(stringr)
str_c
# function (..., sep = "", collapse = NULL)
# {
# stri_c(..., sep = sep, collapse = collapse, ignore_null = TRUE)
# }
# <environment: namespace:stringr>
stri_join is very similar to base::paste, with a few differences enumerated below:
1. sep = "" by default
So by default it behaves like paste0 rather than paste, but unlike paste0 it still has a sep argument.
identical(paste0("a","b") , stri_join("a","b"))
# [1] TRUE
identical(paste("a","b") , stri_join("a","b",sep=" "))
# [1] TRUE
identical(paste("a","b", sep="-"), stri_join("a","b", sep="-"))
# [1] TRUE
str_c will behave just like stri_join here.
2. Behavior with NA
If you paste to NA using stri_join, the result is NA, while paste converts NA to "NA".
paste0(c("a","b"),c("c",NA))
# [1] "ac" "bNA"
stri_join(c("a","b"),c("c",NA))
# [1] "ac" NA
str_c will behave just like stri_join here as well.
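If the paste-style "NA" string is actually wanted, one option is to replace the missing values before joining. The following is a minimal sketch using stringi::stri_replace_na; the variable names x and joined are only illustrative.

```r
library(stringi)

x <- c("c", NA)

# stri_replace_na() turns NA into the string "NA" before joining,
# so the missing value no longer propagates to the result
joined <- stri_join(c("a", "b"), stri_replace_na(x))
joined
# [1] "ac"  "bNA"

# Same result as base paste0() on the untouched input
identical(joined, paste0(c("a", "b"), x))
# [1] TRUE
```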
3. Behavior with length 0 arguments
When stri_join encounters a length 0 value, it returns character(0), unless ignore_null is set to TRUE, in which case the value is ignored. This differs from paste, which converts the length 0 value to "" and therefore puts 2 consecutive separators in the output.
stri_join("a",NULL, "b")
# [1] character(0)
stri_join("a",character(0), "b")
# [1] character(0)
paste0("a",NULL, "b")
# [1] "ab"
stri_join("a",NULL, "b", ignore_null = TRUE)
# [1] "ab"
str_c("a",NULL, "b")
# [1] "ab"
paste("a",NULL, "b") # produces double space!
# [1] "a  b"
stri_join("a",NULL, "b", ignore_null = TRUE, sep = " ")
# [1] "a b"
str_c("a",NULL, "b", sep = " ")
# [1] "a b"
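As a side note, and assuming R 4.0.1 or later, paste and paste0 accept a recycle0 argument. With recycle0 = TRUE (and collapse left as NULL), a length 0 argument yields a length 0 result, which matches the default behavior of stri_join. A small sketch; res is just an illustrative name.

```r
library(stringi)

# Assumes R >= 4.0.1, where paste() and paste0() gained `recycle0`
res <- paste0("a", NULL, "b", recycle0 = TRUE)
length(res)
# [1] 0

# Matches what stri_join() returns with its default ignore_null = FALSE
identical(res, stri_join("a", NULL, "b"))
# [1] TRUE
```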
4. stri_join warns more
paste(c("a","b"),c("c","d","e"))
# [1] "a c" "b d" "a e"
paste("a","b", sep = c(" ","-"))
# [1] "a b"
stri_join(c("a","b"),c("c","d","e"), sep = " ")
# [1] "a c" "b d" "a e"
# Warning message:
# In stri_join(c("a", "b"), c("c", "d", "e"), sep = " ") :
# longer object length is not a multiple of shorter object length
stri_join("a","b", sep = c(" ","-"))
# [1] "a b"
# Warning message:
# In stri_join("a", "b", sep = c(" ", "-")) :
# argument `sep` should be one character string; taking the first one
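If these warnings should stop execution instead of being printed, they can be promoted to errors with base R condition handling. This is only a sketch: strict_join is a hypothetical helper name, not something provided by stringi.

```r
library(stringi)

# Hypothetical wrapper: turn any warning raised by stri_join() into an error
strict_join <- function(...) {
  tryCatch(
    stri_join(...),
    warning = function(w) stop(conditionMessage(w), call. = FALSE)
  )
}

# Lengths are compatible, so this works as usual
ok <- strict_join(c("a", "b"), c("c", "d"), sep = " ")
ok
# [1] "a c" "b d"

# Incompatible lengths now fail instead of recycling with a warning
bad <- try(strict_join(c("a", "b"), c("c", "d", "e"), sep = " "), silent = TRUE)
inherits(bad, "try-error")
# [1] TRUE
```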
5. stri_join is faster
microbenchmark::microbenchmark(
stringi = stri_join(rep("a",1000000),rep("b",1000),"c",sep=" "),
base = paste(rep("a",1000000),rep("b",1000),"c")
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# stringi 88.54199 93.4477 97.31161 95.17157 96.8879 131.9737 100 a
# base 166.01024 169.7189 178.31065 171.30910 176.3055 215.5982 100 b