How to skip a paste() argument when its value is NA in R

I have a data frame with the columns city, state, and country. I want to build a string of the form "City, State, Country". However, one of my cities has no state (the value is NA), and for that city I want the string to be "City, Country". Here is the code that creates the wrong string:

# define city, state, country
city <- c("Austin", "Knoxville", "Salk Lake City", "Prague")
state <- c("Texas", "Tennessee", "Utah", NA)
country <- c("United States", "United States", "United States", "Czech Rep")
# create data frame
dff <- data.frame(city, state, country)
# create full string
dff["string"] <- paste(city, state, country, sep=", ")

When I display dff["string"], I get the following. Note that the last string contains an unwanted "NA, ":

> dff["string"]
                               string
1        Austin, Texas, United States
2 Knoxville, Tennessee, United States
3 Salk Lake City, Utah, United States
4               Prague, NA, Czech Rep

How do I skip that NA, along with its ", " separator?

Clarhe answered 4/4, 2014 at 5:15 Comment(1)
There is a general discussion of suppressing NAs in paste here, should you have more than one column containing NAs.Guildsman

One option is to just fix it up afterwards:

gsub("NA, ","",dff$string)

#[1] "Austin, Texas, United States"       
#[2] "Knoxville, Tennessee, United States"
#[3] "Salk Lake City, Utah, United States"
#[4] "Prague, Czech Rep"   
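
One gotcha with the gsub approach: the pattern matches "NA, " anywhere in the string, so a value that merely ends in the letters NA gets clipped, and an NA in the last position (with no separator after it) is left alone. Below is a minimal sketch of a stricter pattern, assuming the separator is always ", ". The vector x is made-up data for illustration, not part of the question.

```r
# Hypothetical test strings: a city ending in "NA" and a missing last field
x <- c("Prague, NA, Czech Rep", "HAVANA, NA, Cuba", "Monaco, NA")

# Plain substring replacement also eats the tail of "HAVANA"
naive <- gsub("NA, ", "", x)
# naive[2] is "HAVACuba"; naive[3] is still "Monaco, NA"

# Match NA only as a whole field: at the start or right after ", ",
# and followed by ", " or the end of the string (lookahead needs perl = TRUE)
safer <- gsub("(^|, )NA(?=, |$)", "", x, perl = TRUE)
# If the first field was NA, a leading ", " is left over; strip it
safer <- sub("^, ", "", safer)
# safer is "Prague, Czech Rep" "HAVANA, Cuba" "Monaco"
```

Either way, a fix applied after pasting cannot tell a missing value from a field whose text is literally "NA".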

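Alternative #2 is to use apply on the data frame dff. Restrict it to the three source columns, otherwise a string column already added to dff gets pasted in as well:

apply(dff[c("city", "state", "country")], 1, function(x) paste(na.omit(x), collapse = ", "))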
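
If you would rather work on plain vectors than on a whole data frame, the same idea can be wrapped in a small helper. This is only a sketch: paste_skip_na is a hypothetical name, not a base R function, and it assumes all inputs have the same length.

```r
# Hypothetical helper: paste vectors element-wise, dropping NA values
paste_skip_na <- function(..., sep = ", ") {
  # as.character() also turns factor columns into their labels
  cols <- unname(lapply(list(...), as.character))
  join_row <- function(...) {
    parts <- c(...)
    paste(parts[!is.na(parts)], collapse = sep)
  }
  # mapply walks the vectors in parallel, one "row" at a time
  do.call(mapply, c(list(FUN = join_row), cols, list(USE.NAMES = FALSE)))
}

city <- c("Austin", "Prague")
state <- c("Texas", NA)
country <- c("United States", "Czech Rep")

res <- paste_skip_na(city, state, country)
# res is "Austin, Texas, United States" "Prague, Czech Rep"
```

Because the NA is dropped before pasting, a value that happens to contain the letters "NA" is left untouched.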
Show answered 4/4, 2014 at 5:29 Comment(2)
I was going to give a two-barrelled answer, but somebody took the first part for their own answer ;-)Show
Actually, I'd be interested in the performance of each, as well as possible gotchas....Malkamalkah

Late to the party, but unite() from the tidyr package provides a one-step approach (the na.rm argument needs tidyr 1.0.0 or later):

library(tidyr)
dff %>% unite("string", c(city, state, country), sep=", ", remove = FALSE, na.rm = TRUE)
                              string           city     state       country
1        Austin, Texas, United States         Austin     Texas United States
2 Knoxville, Tennessee, United States      Knoxville Tennessee United States
3 Salk Lake City, Utah, United States Salk Lake City      Utah United States
4                   Prague, Czech Rep         Prague      <NA>     Czech Rep
Enidenigma answered 20/9, 2021 at 14:24 Comment(1)
unite is not in base R. If it is from a package, you need to specify which package the function comes from.Contrayerva
