I have a column in a dataset looking like this:
cluster_id
1
1
1
1
NA
1
NA
NA
2
NA
2
NA
3
NA
NA
3
cluster_id <- c("1","1","1","1","NA","1","NA","NA","2","NA","2","NA","3","NA","NA","3")
The rows have already been ordered using a time column. What I want is to replace the NAs that fall within a cluster ID, i.e. if there's a row with 2, then an NA, and then a 2 again, I want that NA to become 2. NAs that sit between two different cluster IDs should stay NA. Example:
cluster_id cluster_id_new
1 1
1 1
1 1
1 1
NA 1
1 1
NA NA
NA NA
2 2
NA 2
2 2
NA NA
3 3
NA 3
NA 3
3 3
I found the zoo::na.locf function in this post, which seems to be close to what I want, but I also need to take into consideration the value after the NA.
Any thoughts?
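One possible approach, sketched in base R under a few assumptions: fill forward, fill backward, and keep a filled value only where both directions agree. The helper names `locf` and `fill_within_cluster` are made up for illustration, and the code assumes the string "NA" in the example vector is meant as a missing value. `locf` here is a stand-in for what `zoo::na.locf(x, na.rm = FALSE)` would do, with the backward pass playing the role of `fromLast = TRUE`.

```r
# Forward-fill helper in base R (hypothetical name; roughly what
# zoo::na.locf(x, na.rm = FALSE) does for a plain vector).
locf <- function(x) {
  # Index of the most recent non-NA element at each position (0 if none yet).
  last_seen <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  # Prepend an NA so that index 0 maps to NA (leading NAs stay NA).
  c(NA, x)[last_seen + 1L]
}

# Hypothetical helper: fill an NA only when the nearest non-NA values
# before and after it are the same cluster ID.
fill_within_cluster <- function(x) {
  x[x %in% "NA"] <- NA          # assumption: the string "NA" means missing
  fwd <- locf(x)                # last observation carried forward
  bwd <- rev(locf(rev(x)))      # next observation carried backward
  agree <- !is.na(fwd) & !is.na(bwd) & fwd == bwd
  ifelse(agree, fwd, NA)
}

cluster_id <- c("1","1","1","1","NA","1","NA","NA","2","NA","2","NA","3","NA","NA","3")
cluster_id_new <- fill_within_cluster(cluster_id)
```

With this sketch, an NA at the very start or very end of the vector stays NA, because one of the two fill directions has nothing to carry.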
"NA" (a character value) or NA (missing value) or "1" (character) or 1 (numeric) - and are your data in a single vector like the example, or a data frame? – Carioca
cluster_id had an NA as the very first or very last value. (See my answer--I added an NA at the beginning and end of cluster_id so that the behavior of the solutions was clear.) – Spotter