Create a sequence of sequences of numbers

Asked 4/1, 2022 at 13:14 Answered 5/1, 2022 at 5:4

I would like to make the following sequence in R, by using rep or any other function.

c(1, 2, 3, 4, 5, 2, 3, 4, 5, 3, 4, 5, 4, 5, 5)

Basically, c(1:5, 2:5, 3:5, 4:5, 5:5).

Sheridan answered 4/1, 2022 at 13:14 Comment(0)

Use sequence.

sequence(5:1, from = 1:5)
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

The first argument, nvec, is the length of each sequence (5:1); the second, from, is the starting point for each sequence (1:5).

Note: this works only for R >= 4.0.0. From R News 4.0.0:

sequence() [...] gains arguments [e.g. from] to generate more complex sequences.
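
As a small sketch of how the two arguments line up for a general n, the call above can be wrapped in a function. The name seq_of_seqs is hypothetical (it is not part of base R), and the from argument is assumed to be available, i.e. R >= 4.0.0.

```r
# Requires R >= 4.0.0 for the `from` argument of sequence().
# seq_of_seqs is a hypothetical helper name, not a base R function.
seq_of_seqs <- function(n) {
  starts  <- seq_len(n)        # sequence i starts at i ...
  lengths <- rev(seq_len(n))   # ... and has n - i + 1 elements
  # rev(seq_len(n)) is used instead of n:1 because 0:1 would expand to c(0, 1)
  sequence(lengths, from = starts)
}

res <- seq_of_seqs(5)
print(res)
```

For n = 5 this is the same call as sequence(5:1, from = 1:5).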

Lipo answered 4/1, 2022 at 13:20 Comment(1)

@Henrik A very similar question answered some time ago using sequence: https://mcmap.net/q/659147/-using-seq-and-rep-to-create-a-sequence-of-5-integers-that-go-up-by-1-on-each-repetition – Ironbound 4/1, 2022 at 16:9

unlist(lapply(1:5, function(i) i:5))
# [1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

Some speed tests on all the answers provided. Note that the OP mentioned 10K somewhere, if I recall correctly.

s1 <- function(n) { 
  unlist(lapply(1:n, function(i) i:n))
}

s2 <- function(n) {
  unlist(lapply(seq_len(n), function(i) seq(from = i, to = n, by = 1)))
}

s3 <- function(n) {
  vect <- 0:n
  unlist(replicate(n, vect <<- vect[-1]))
}

s4 <- function(n) {
  m <- matrix(1:n, ncol = n, nrow = n, byrow = TRUE)
  m[lower.tri(m)] <- 0
  c(t(m)[t(m != 0)])
}

s5 <- function(n) {
  m <- matrix(seq.int(n), ncol = n, nrow = n)
  m[lower.tri(m, diag = TRUE)]
}

s6 <- function(n) {
  out <- c()
  for (i in 1:n) { 
    out <- c(out, (1:n)[i:n])
  }
  out
}

library(rbenchmark)

n = 5

n = 5L

benchmark(
  "s1" = { s1(n) },
  "s2" = { s2(n) },
  "s3" = { s3(n) },
  "s4" = { s4(n) },
  "s5" = { s5(n) },
  "s6" = { s6(n) },
  replications = 1000,
  columns = c("test", "replications", "elapsed", "relative")
)

Do not be fooled by the "fast" solutions at small n: they make hardly any function calls, and since every call takes time, the small differences are multiplied by the 1000 replications.

  test replications elapsed relative
1   s1         1000    0.05      2.5
2   s2         1000    0.44     22.0
3   s3         1000    0.14      7.0
4   s4         1000    0.08      4.0
5   s5         1000    0.02      1.0
6   s6         1000    0.02      1.0

n = 1000

n = 1000L

benchmark(
  "s1" = { s1(n) },
  "s2" = { s2(n) },
  "s3" = { s3(n) },
  "s4" = { s4(n) },
  "s5" = { s5(n) },
  "s6" = { s6(n) },
  replications = 10,
  columns = c("test", "replications", "elapsed", "relative")
)

As the poster already flagged it as "not to do", we see the for loop becoming very slow compared to every other method at n = 1000L.

  test replications elapsed relative
1   s1           10    0.17    1.000
2   s2           10    0.83    4.882
3   s3           10    0.19    1.118
4   s4           10    1.50    8.824
5   s5           10    0.29    1.706
6   s6           10   28.64  168.471

n = 10000

n = 10000L

benchmark(
  "s1" = { s1(n) },
  "s2" = { s2(n) },
  "s3" = { s3(n) },
  "s4" = { s4(n) },
  "s5" = { s5(n) },
  # "s6" = { s6(n) },
  replications = 10,
  columns = c("test", "replications", "elapsed", "relative")
)

At large n the matrix methods become very slow compared to the others. Using seq inside lapply might be neater, but it comes with a trade-off: calling that function n times increases processing time a lot. seq_len(n), on the other hand, is nicer than 1:n and runs only once. Interestingly, the replicate method is the fastest.

  test replications elapsed relative
1   s1           10    5.44    1.915
2   s2           10    9.98    3.514
3   s3           10    2.84    1.000
4   s4           10   72.37   25.482
5   s5           10   35.78   12.599
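
Before reading too much into the timings, it may be worth confirming that the candidates actually return the same vector. The sketch below restates three of the functions above so that it runs on its own, and adds a hypothetical s7 based on the sequence answer (an assumption: it was not part of the benchmark above and needs R >= 4.0.0).

```r
# Three of the approaches above, restated so this block is self-contained.
s1 <- function(n) unlist(lapply(1:n, function(i) i:n))
s3 <- function(n) {
  vect <- 0:n
  unlist(replicate(n, vect <<- vect[-1]))
}
s5 <- function(n) {
  m <- matrix(seq.int(n), ncol = n, nrow = n)
  m[lower.tri(m, diag = TRUE)]
}
# Hypothetical s7: the sequence() answer, not included in the timings above.
s7 <- function(n) sequence(n:1, from = seq_len(n))

n <- 50L
reference <- s1(n)
results <- list(s3 = s3(n), s5 = s5(n))
# Only try s7 where the `from` argument exists.
if (getRversion() >= "4.0.0") results$s7 <- s7(n)

# TRUE for every method whose output matches s1
agree <- vapply(results, function(x) isTRUE(all.equal(x, reference)), logical(1))
print(agree)
```

If all entries are TRUE, the same s7 could be added as a further line in the benchmark() calls.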

Chian answered 4/1, 2022 at 14:52 Comment(6)

Careful with this. It will misbehave if you change the first argument without remembering to change the second. For example, unlist(lapply(1:10, function(i) i:5)) isn't right. Changing the second argument to function(i) seq(from = i, to = 5, by = 1) is a lot more verbose, but it's safer. The ultimate version is probably something like output <- function(x) unlist(lapply(seq_len(x), function(i) seq(from = i, to = x, by = 1))). – Sambo 4/1, 2022 at 22:42

Hi @Merijn van Tilborg! Perhaps you could include the sequence answer in the timings as well? Cheers – Pooi 5/1, 2022 at 11:15

I would have if I could, but I do not have an R version that supports the from argument. I expect it to be about the same speed as s1 or s2, because the old sequence function is basically a wrapper: function (nvec) unlist(lapply(nvec, seq_len)) – Chian 5/1, 2022 at 11:54

Indeed, but it seems like that is no longer the case, so the timing may actually differ. – Pooi 5/1, 2022 at 12:43

A quick system.time with sequence and n = 10000 suggests that it is about 8-9 times faster than the replicate method. – Pooi 5/1, 2022 at 13:44

This could also be shortened to unlist(lapply(1:5, ':', 5)). – Sitwell 31/10, 2022 at 17:58

Your mention of rep reminded me of replicate, so here's a very stateful solution. I'm presenting this because it's short and unusual, not because it's good. This is very unidiomatic R.

vect <- 0:5
unlist(replicate(5, vect <<- vect[-1]))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

You can do it with a combination of rep and lapply, but it's basically the same as Merijn van Tilborg's answer.

Of course, the truly fearless unidiomatic R user does this and refuses to elaborate further.

mat <- matrix(1:5, ncol = 5, nrow = 5, byrow = TRUE)
mat[lower.tri(mat)] <- 0
c(t(mat)[t(mat != 0)])
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

Sambo answered 4/1, 2022 at 22:54 Comment(4)

Your matrix alternative can be slightly simplified: m = matrix(seq.int(n), ncol = n, nrow = n); m[lower.tri(m, diag = TRUE)] (less unidiomatic though) – Pooi 5/1, 2022 at 0:7

@Pooi Good job. I knew that something was off when I had to call t twice while using byrow=TRUE. – Sambo 5/1, 2022 at 21:21

I fully understand. I have got lost in the maze of upper/lower.tri/byrow/"to t or not to t" soo many times myself. Your unidiomatic contribution is much appreciated. – Pooi 5/1, 2022 at 21:26

The indexing could be golfed with row(m)>=col(m) – Pooi 6/1, 2022 at 10:26

You could use a loop like so:

out=c();for(i in 1:5){ out=c(out, (1:5)[i:5]) }
out
# [1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5

but that's not a good idea!

Why not use a loop?

Using a loop is:

slower,
less memory efficient, and
harder to read and understand.

By contrast, using a vectorised function like sequence is the opposite (faster, more efficient, and easy to read).

Further info

From ?sequence:

The default method for sequence generates the sequence seq(from[i], by = by[i], length.out = nvec[i]) for each element i in the parallel (and recycled) vectors from, by and nvec. It then returns the result of concatenating those sequences.

and about the from argument:

from: each element specifies the first element of a sequence.
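
The description quoted from ?sequence can be written out by hand, which may also serve as a fallback on R versions older than 4.0.0. This is only a sketch of the documented behaviour, not the actual implementation; sequence_compat is a hypothetical name.

```r
# Hypothetical stand-in that follows the quoted ?sequence description:
# build seq(from[i], by = by[i], length.out = nvec[i]) for each i,
# then concatenate the pieces. Map() recycles shorter arguments.
sequence_compat <- function(nvec, from = 1L, by = 1L) {
  pieces <- Map(
    function(n, f, b) seq(from = f, by = b, length.out = n),
    nvec, from, by
  )
  unlist(pieces, use.names = FALSE)
}

compat <- sequence_compat(5:1, from = 1:5)
print(compat)
```

Because it calls seq once per element, this version should be expected to behave more like the lapply solutions than like the built-in sequence in terms of speed.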

Also, since the vector used in the loop is not preallocated, it will require more memory, and will also be slower.
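
For comparison, here is a sketch of what a preallocated loop could look like. The total length is n * (n + 1) / 2, so the result vector can be created once and filled in place. The function name seq_loop_prealloc is hypothetical, and this was not part of any timing in this thread.

```r
# Hypothetical preallocated variant of the loop above.
seq_loop_prealloc <- function(n) {
  out <- integer(n * (n + 1) / 2)  # total number of elements, allocated once
  pos <- 1L                        # next free position in out
  for (i in seq_len(n)) {
    len <- n - i + 1L              # length of the piece i:n
    out[pos:(pos + len - 1L)] <- i:n
    pos <- pos + len
  }
  out
}

filled <- seq_loop_prealloc(5)
print(filled)
```

This avoids growing out with c() on every iteration, although the vectorised sequence call remains the shorter option.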

Tighe answered 5/1, 2022 at 5:4 Comment(0)
