Repeat vector when its length is not a multiple of desired total length

I have a data frame with 1666 rows. I would like to add a column containing the repeating sequence 1:5, to use with cut() for cross-validation. It would look like this:

   Y      x1       x2       Id1
   1      .15      3.6       1
   0      1.1      2.2       2
   0      .05      3.3       3
   0      .45      2.8       4
   1      .85      3.1       5
   1      1.01     2.9       1
  ...      ...     ...      ...
Yawata answered 6/8, 2012 at 13:42 Comment(0)

Something like this?

df <- data.frame(rnorm(1666))
df$cutter <- rep(1:5, length.out=1666)

tail(df)
     rnorm.1666. cutter
1661  0.11693169      1
1662 -1.12508091      2
1663  0.25441847      3
1664 -0.06045037      4
1665 -0.17242921      5
1666 -0.85366242      1
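
To avoid hardcoding the row count, the same idea can be written with nrow(). This is only a minimal sketch: the column name x and the set.seed() call are assumptions added here for reproducibility and are not part of the original answer.

```r
set.seed(1)                        # assumption: only so the random column is reproducible
df <- data.frame(x = rnorm(1666))  # hypothetical column name `x`

# Recycle 1:5 down the rows; length.out cuts the last cycle short when
# nrow(df) is not a multiple of 5
df$cutter <- rep(1:5, length.out = nrow(df))

# 1666 = 5 * 333 + 1, so group 1 gets 334 rows and groups 2 to 5 get 333 each
table(df$cutter)
```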
Bussell answered 6/8, 2012 at 13:47 Comment(0)

Use the length.out argument of rep() (or rep_len, a "faster simplified version"):

> rep(1:5, length.out = 166) # or rep_len(1:5, 166)
# [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2
# [38] 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4
# [75] 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1
# [112] 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3
# [149] 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1

length.out: non-negative integer. The desired length of the output vector
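
A small sketch of what that means when the requested length is not a multiple of the vector's length (the length 12 is just an illustrative choice): the vector is recycled and the final cycle is cut short.

```r
# 12 is not a multiple of 5, so the third pass through 1:5 stops after 2
x <- rep(1:5, length.out = 12)
x
# [1] 1 2 3 4 5 1 2 3 4 5 1 2

# rep_len() gives the same result for this case
identical(x, rep_len(1:5, 12))
# [1] TRUE
```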

Here is an example using the built-in dataset cars.

str(cars)
'data.frame':   50 obs. of  2 variables:
 $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
 $ dist : num  2 10 4 22 16 10 18 26 34 17 ...

Add grouping column:

cars$group <- rep(1:3, length.out = 50L)

Inspect the result:

head(cars)
  speed dist group
1     4    2     1
2     4   10     2
3     7    4     3
4     7   22     1
5     8   16     2
6     9   10     3

tail(cars)
   speed dist group
45    23   54     3
46    24   70     1
47    24   92     2
48    24   93     3
49    24  120     1
50    25   85     2
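
Since the question mentions cross-validation, it may be worth shuffling the labels so that group membership does not depend on row order. The following is a sketch under assumptions, not something from the question: the random shuffle via sample(), the seed, the column name fold, and the choice of k = 3 are all illustrative.

```r
set.seed(42)  # assumption: arbitrary seed, only for reproducibility
k <- 3        # assumption: number of folds, matching the 1:3 groups above

# Build the balanced label vector first, then permute it with sample()
folds <- sample(rep(1:k, length.out = nrow(cars)))
cars$fold <- folds  # hypothetical column name

# Shuffling does not change the fold sizes: 50 rows give 17, 17 and 16
table(cars$fold)

# Split into training and held-out rows for each fold
for (i in 1:k) {
  train    <- cars[cars$fold != i, ]
  held_out <- cars[cars$fold == i, ]
  stopifnot(nrow(train) + nrow(held_out) == nrow(cars))
}
```

Without the sample() call, the folds follow the row order, which can matter if the data are sorted (cars is ordered by speed).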
Bucket answered 6/8, 2012 at 13:48 Comment(0)