The answer provided by @fabian-werner is great, but objects can have multiple classes, and "factor" may not necessarily be the first one returned by class(yes)
, so I suggest this small modification to check all class attributes:
safe.ifelse <- function(cond, yes, no) {
class.y <- class(yes)
if ("factor" %in% class.y) { # Note the small condition change here
levels.y = levels(yes)
}
X <- ifelse(cond,yes,no)
if ("factor" %in% class.y) { # Note the small condition change here
X = as.factor(X)
levels(X) = levels.y
} else {
class(X) <- class.y
}
return(X)
}
I have also submitted a request with the R Development team to add a documented option to have base::ifelse() preserve attributes based on user selection of which attributes to preserve. The request is here: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16609 - It has already been flagged as "WONTFIX" on the grounds that it has always been the way it is now, but I have provided a follow-up argument on why a simple addition might save a lot of R users headaches. Perhaps your "+1" in that bug thread will encourage the R Core team to take a second look.
EDIT: Here's a better version that allows the user to specify which attributes to preserve, either "cond" (default ifelse() behaviour), "yes", the behaviour as per the code above, or "no", for cases where the attributes of the "no" value are better:
safe_ifelse <- function(cond, yes, no, preserved_attributes = "yes") {
# Capture the user's choice for which attributes to preserve in return value
preserved <- switch(EXPR = preserved_attributes, "cond" = cond,
"yes" = yes,
"no" = no);
# Preserve the desired values and check if object is a factor
preserved_class <- class(preserved);
preserved_levels <- levels(preserved);
preserved_is_factor <- "factor" %in% preserved_class;
# We have to use base::ifelse() for its vectorized properties
# If we do our own if() {} else {}, then it will only work on first variable in a list
return_obj <- ifelse(cond, yes, no);
# If the object whose attributes we want to retain is a factor
# Typecast the return object as.factor()
# Set its levels()
# Then check to see if it's also one or more classes in addition to "factor"
# If so, set the classes, which will preserve "factor" too
if (preserved_is_factor) {
return_obj <- as.factor(return_obj);
levels(return_obj) <- preserved_levels;
if (length(preserved_class) > 1) {
class(return_obj) <- preserved_class;
}
}
# In all cases we want to preserve the class of the chosen object, so set it here
else {
class(return_obj) <- preserved_class;
}
return(return_obj);
} # End safe_ifelse function
if_else()
in the dplyr package that can substitute forifelse
while retaining correct classes of Date objects - it's posted below as a recent answer. I'm bringing attention to it here as it solves this problem by providing a function that is unit-tested and documented in a CRAN package, unlike many other answers that (as of this comment) were ranked ahead of it. – Toothache