I have a vector of data in the form 'aaa_9999_1', where the first part is an alphabetic location code, the second is the four-digit year, and the last is a unique point identifier. For example, there are multiple sil_2007_X points, each with a different last digit. I need to split this field on the "_" character and save only the unique ID number into a new vector. I tried:
oss$point <- unlist(strsplit(oss$id, split='_', fixed=TRUE))[3]
based on a response here: R remove part of string. I get a single response of "1". If I just run
strsplit(oss$id, split='_', fixed=TRUE)
I can generate the split list:
> head(oss$point)
[[1]]
[1] "sil" "2007" "1"
[[2]]
[1] "sil" "2007" "2"
[[3]]
[1] "sil" "2007" "3"
[[4]]
[1] "sil" "2007" "4"
[[5]]
[1] "sil" "2007" "5"
[[6]]
[1] "sil" "2007" "6"
Adding the [3] at the end just gives me the [[3]] result: “sil” “2007” “3”. What I want is a vector of the 3rd part (the unique number) of all records. I feel like I’m close to understanding this, but it is taking too much time (like most of a day) on a deadline project. Thanks for any feedback.
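The two behaviours described above can be reproduced with a minimal sketch. The vector `id` below is hypothetical sample data standing in for oss$id, and the use of vapply() is one possible way to pull the third piece from every record, not necessarily the only or best one.

```r
# Hypothetical sample data standing in for oss$id
id <- c("sil_2007_1", "sil_2007_2", "sil_2007_3")

# strsplit() returns a list with one character vector per input string
parts <- strsplit(id, split = "_", fixed = TRUE)

# unlist() flattens the list into one long vector, so [3] is simply the
# third element overall: the last piece of the first id
flat_third <- unlist(parts)[3]

# [3] applied to the list itself returns a sub-list holding the third record
third_record <- parts[3]

# Indexing inside each list entry gives one value per record
point <- vapply(parts, function(p) p[3], character(1))
```

Here `flat_third` is a single string, `third_record` is a one-element list, and `point` has the same length as `id`, which is the shape needed for a new column.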
gsub() would work here; you might just do gsub(".*_.*_", "", mystring)
or even (because regex matching is greedy by default) gsub(".*_", "", mystring)
– Margemargeaux
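A short sketch of the gsub() suggestion, assuming a hypothetical `mystring` vector in the same format as the ids in the question. Both patterns strip everything up to and including the last underscore, because `.*` matches as much as it can before the final `_`.

```r
# Hypothetical sample data in the same 'aaa_9999_n' format
mystring <- c("sil_2007_1", "sil_2007_2", "abc_1999_12")

# Explicitly match both underscore-delimited prefixes
id_two <- gsub(".*_.*_", "", mystring)

# Greedy matching: ".*_" already consumes up to the last underscore
id_one <- gsub(".*_", "", mystring)

# The result is character; convert if a numeric id is wanted
id_num <- as.integer(id_one)
```

Because the replacement is vectorised, the result can be assigned straight to a new column, for example oss$point <- gsub(".*_", "", oss$id).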