I would like to make the following sequence in R, by using rep or any other function.
c(1, 2, 3, 4, 5, 2, 3, 4, 5, 3, 4, 5, 4, 5, 5)
Basically, c(1:5, 2:5, 3:5, 4:5, 5:5).
Use sequence.
sequence(5:1, from = 1:5)
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
The first argument, nvec, is the length of each sequence (5:1); the second, from, is the starting point for each sequence (1:5).
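To make the roles of nvec and from concrete, here is a small sketch with different values. It is only an illustration under the assumption of R >= 4.0.0, and the numbers are made up for the example: two runs, one of length 3 starting at 10 and one of length 2 starting at 20.

```r
# nvec = c(3, 2): the first run has 3 elements, the second has 2
# from = c(10, 20): the first run starts at 10, the second at 20
runs <- sequence(c(3, 2), from = c(10, 20))
runs
# → 10 11 12 20 21
```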
Note: this works only for R >= 4.0.0. From R News 4.0.0:
sequence() [...] gains arguments [e.g. from] to generate more complex sequences.
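If the code also has to run on older R versions, one possible approach is to check the version and fall back to a base R construction. The helper name tail_sequences below is hypothetical, and the fallback is simply the lapply approach; treat it as a sketch rather than a recommended API.

```r
# Hypothetical helper: uses sequence(from = ...) when available (R >= 4.0.0),
# otherwise falls back to building each run with lapply
tail_sequences <- function(n) {
  if (getRversion() >= "4.0.0") {
    sequence(n:1, from = 1:n)
  } else {
    unlist(lapply(seq_len(n), function(i) i:n))
  }
}

tail_sequences(5)
# → 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
```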
unlist(lapply(1:5, function(i) i:5))
# [1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
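One caveat, raised in the comments on this answer, is that the : operator counts downward when its left operand exceeds the right one, so a wrong range fails silently. The sketch below shows that behaviour and the more verbose seq() form suggested there; the helper name safe_tails is hypothetical.

```r
# `:` counts downward when the start exceeds the end, so no error is raised
6:5
# → 6 5

# seq() with an explicit positive step refuses to count down and raises an error
outcome <- tryCatch(
  seq(from = 6, to = 5, by = 1),
  error = function(e) "error"
)
outcome
# → "error"

# Hypothetical helper name; the body is the version suggested in the comments
safe_tails <- function(n) {
  unlist(lapply(seq_len(n), function(i) seq(from = i, to = n, by = 1)))
}
safe_tails(5)
# → 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
```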
Some speed tests on all the answers provided. Note that the OP mentioned 10K somewhere, if I recall correctly.
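# s1: lapply with the `:` operator
s1 <- function(n) {
  unlist(lapply(1:n, function(i) i:n))
}
# s2: lapply with seq_len() and an explicit seq()
s2 <- function(n) {
  unlist(lapply(seq_len(n), function(i) seq(from = i, to = n, by = 1)))
}
# s3: replicate, dropping the first element on every call via <<-
s3 <- function(n) {
  vect <- 0:n
  unlist(replicate(n, vect <<- vect[-1]))
}
# s4: row-wise matrix, zero the lower triangle, keep the non-zero entries
s4 <- function(n) {
  m <- matrix(1:n, ncol = n, nrow = n, byrow = TRUE)
  m[lower.tri(m)] <- 0
  c(t(m)[t(m != 0)])
}
# s5: column-wise matrix, keep the lower triangle including the diagonal
s5 <- function(n) {
  m <- matrix(seq.int(n), ncol = n, nrow = n)
  m[lower.tri(m, diag = TRUE)]
}
# s6: for loop that grows the result vector
s6 <- function(n) {
  out <- c()
  for (i in 1:n) {
    out <- c(out, (1:n)[i:n])
  }
  out
}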
library(rbenchmark)
n = 5
n = 5L
benchmark(
"s1" = { s1(n) },
"s2" = { s2(n) },
"s3" = { s3(n) },
"s4" = { s4(n) },
"s5" = { s5(n) },
"s6" = { s6(n) },
replications = 1000,
columns = c("test", "replications", "elapsed", "relative")
)
Do not be fooled by the "fast" solutions at this size: they call hardly any function that takes time to be called, and the differences are multiplied by the 1000 replications.
  test replications elapsed relative
1   s1         1000    0.05      2.5
2   s2         1000    0.44     22.0
3   s3         1000    0.14      7.0
4   s4         1000    0.08      4.0
5   s5         1000    0.02      1.0
6   s6         1000    0.02      1.0
n = 1000
n = 1000L
benchmark(
"s1" = { s1(n) },
"s2" = { s2(n) },
"s3" = { s3(n) },
"s4" = { s4(n) },
"s5" = { s5(n) },
"s6" = { s6(n) },
replications = 10,
columns = c("test", "replications", "elapsed", "relative")
)
As the poster already mentioned as "not to do", we see the for loop becoming pretty slow compared to any other method, on n = 1000L
  test replications elapsed relative
1   s1           10    0.17    1.000
2   s2           10    0.83    4.882
3   s3           10    0.19    1.118
4   s4           10    1.50    8.824
5   s5           10    0.29    1.706
6   s6           10   28.64  168.471
n = 10000
n = 10000L
benchmark(
"s1" = { s1(n) },
"s2" = { s2(n) },
"s3" = { s3(n) },
"s4" = { s4(n) },
"s5" = { s5(n) },
# "s6" = { s6(n) },
replications = 10,
columns = c("test", "replications", "elapsed", "relative")
)
At large n, the matrix approaches become very slow compared to the other methods. Using seq inside lapply might be neater, but it comes with a trade-off: calling that function n times increases processing time a lot. seq_len(n), on the other hand, is nicer than 1:n and is only run once. Interestingly, the replicate method is the fastest here.
  test replications elapsed relative
1   s1           10    5.44    1.915
2   s2           10    9.98    3.514
3   s3           10    2.84    1.000
4   s4           10   72.37   25.482
5   s5           10   35.78   12.599
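The sequence() answer is not part of the timings above. A minimal sketch for timing it against s1 with base R's system.time could look like the following. The name s_seq is hypothetical, R >= 4.0.0 is assumed, and no timings are shown because they depend on the machine.

```r
# Approach from the lapply answer (s1 in the benchmark above)
s1 <- function(n) unlist(lapply(1:n, function(i) i:n))

# Hypothetical name: sequence()-based variant, needs R >= 4.0.0
s_seq <- function(n) sequence(n:1, from = 1:n)

# Raise towards 10000L to mirror the larger benchmark; memory use grows as n^2
n <- 1000L

# Both approaches should agree before their timings are compared
stopifnot(all(s1(n) == s_seq(n)))

# Ten repetitions each, similar to replications = 10 above
system.time(for (k in 1:10) s1(n))
system.time(for (k in 1:10) s_seq(n))
```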
unlist(lapply(1:10, function(i) i:5)) isn't right. Changing the second argument to function(i) seq(from = i, to = 5, by = 1) is a lot more verbose, but it's safer. The ultimate version is probably something like output <- function(x) unlist(lapply(seq_len(x), function(i) seq(from = i, to = x, by = 1))). – Sambo
[...] sequence answer in the timings as well? Cheers – Pooi
[...] R: sequence function (nvec) unlist(lapply(nvec, seq_len)) – Chian
system.time with sequence and n = 10000 suggests that it is about 8-9 times faster than the replicate method. – Pooi
unlist(lapply(1:5, ':', 5)). – Sitwell
Your mention of rep reminded me of replicate, so here's a very stateful solution. I'm presenting this because it's short and unusual, not because it's good. This is very unidiomatic R.
vect <- 0:5
unlist(replicate(5, vect <<- vect[-1]))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
You can do it with a combination of rep and lapply, but it's basically the same as Merijn van Tilborg's answer.
Of course, the truly fearless unidiomatic R user does this and refuses to elaborate further.
mat <- matrix(1:5, ncol = 5, nrow = 5, byrow = TRUE)
mat[lower.tri(mat)] <- 0
c(t(mat)[t(mat != 0)])
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
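For readers who would like the matrix trick spelled out, here is a sketch of a closely related variant built from the suggestions in the comments: fill the matrix column-wise and select the entries on and below the diagonal with row(m) >= col(m). It is a reading of how the idea works, not the answer's own code.

```r
n <- 5L

# Column-wise fill: every column holds 1..n
m <- matrix(seq_len(n), nrow = n, ncol = n)

# TRUE on and below the diagonal, so column j keeps rows j..n, i.e. values j..n
keep <- row(m) >= col(m)

# Logical indexing extracts in column-major order: 1..n, then 2..n, and so on
tri <- m[keep]
tri
# → 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
```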
m = matrix(seq.int(n), ncol = n, nrow = n); m[lower.tri(m, diag = TRUE)] (less unidiomatic though) – Pooi
[...] t twice while using byrow=TRUE. – Sambo
[...] upper/lower.tri/byrow/"to t or not to t" soo many times myself. Your unidiomatic contribution is much appreciated. – Pooi
row(m)>=col(m) – Pooi
You could use a loop like so:
out <- c()
for (i in 1:5) {
  out <- c(out, (1:5)[i:5])
}
out
# [1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
but that's not a good idea!
Using a loop is slower, less efficient, and harder to read.
By contrast, using a vectorised function like sequence is the opposite (faster, more efficient, and easy to read).
From ?sequence:
The default method for sequence generates the sequence seq(from[i], by = by[i], length.out = nvec[i]) for each element i in the parallel (and recycled) vectors from, by and nvec. It then returns the result of concatenating those sequences.
and about the from argument:
from: each element specifies the first element of a sequence.
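The documented behaviour can be mirrored in a few lines of R. The helper below, sequence_by_hand, is a hypothetical name and only a sketch of the quoted definition; it is not how base R implements sequence internally.

```r
# Hypothetical helper that follows the documented definition:
# one seq(from[i], by = by[i], length.out = nvec[i]) per element, then concatenate
sequence_by_hand <- function(nvec, from = 1L, by = 1L) {
  pieces <- Map(
    function(f, b, n) seq(f, by = b, length.out = n),
    from, by, nvec  # Map recycles the shorter arguments
  )
  unlist(pieces)
}

sequence_by_hand(5:1, from = 1:5)
# → 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
```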
Also, since the vector used in the loop is not preallocated, it will require more memory, and will also be slower.
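If a loop is unavoidable for some reason, preallocating the result avoids growing the vector on every iteration. This is a sketch with a hypothetical function name, loop_prealloc; it relies on the total length being n + (n - 1) + ... + 1 = n * (n + 1) / 2.

```r
# Hypothetical helper: same loop idea, but writing into a preallocated vector
loop_prealloc <- function(n) {
  out <- integer(n * (n + 1) / 2)  # total length of all runs
  pos <- 1L                        # next free position in out
  for (i in seq_len(n)) {
    len <- n - i + 1L              # the run i:n has this many elements
    out[pos:(pos + len - 1L)] <- i:n
    pos <- pos + len
  }
  out
}

loop_prealloc(5)
# → 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
```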
sequence: https://mcmap.net/q/659147/-using-seq-and-rep-to-create-a-sequence-of-5-integers-that-go-up-by-1-on-each-repetition – Ironbound