I have the following dataframe df (dput
below):
> df
id value
1 1 1
2 2 3
3 3 2
4 NA 1
5 NA 3
6 8 4
7 9 2
8 10 1
9 NA 1
10 NA 3
11 15 2
12 16 1
13 NA 3
14 NA 4
15 NA 2
16 20 1
17 21 1
18 22 3
19 NA 2
20 NA 1
21 NA 3
22 66 4
23 67 2
24 68 1
For each consecutive run of non-NA
"id", I would like to create a unique group ID (variable "id2" in the example). For the NA
in between, the "id2" should be NA
.
In the example, the first run of non-NA
"id" (1, 2 and 3), should all belong to group 1 (id2 = 1). The NA
on row 4-5 should have id2 = NA
. The second run of non-NA
"id" (8, 9, 10) should have id2 = 2, and so on.
The desired output should look like this:
id value id2
1 1 1 1
2 2 3 1
3 3 2 1
4 NA 1 NA
5 NA 3 NA
6 8 4 2
7 9 2 2
8 10 1 2
9 NA 1 NA
10 NA 3 NA
11 15 2 3
12 16 1 3
13 NA 3 NA
14 NA 4 NA
15 NA 2 NA
16 20 1 4
17 21 1 4
18 22 3 4
19 NA 2 NA
20 NA 1 NA
21 NA 3 NA
22 66 4 5
23 67 2 5
24 68 1 5
As you can see, id2 shows the unique id I want for the values of id between NA. It may be that there is only a NA
after the values, like for id's 1, 2 and 3. So I was wondering if anyone knows how to create a unique ID for values between NA
's?
dput
of df:
df <- structure(list(id = c(1, 2, 3, NA, NA, 8, 9, 10, NA, NA, 15,
16, NA, NA, NA, 20, 21, 22, NA, NA, NA, 66, 67, 68), value = c(1,
3, 2, 1, 3, 4, 2, 1, 1, 3, 2, 1, 3, 4, 2, 1, 1, 3, 2, 1, 3, 4,
2, 1)), class = "data.frame", row.names = c(NA, -24L))