loop R multiple samples from single dataset

I am trying to write a simple loop in R: I have a large dataset, and I want to draw several smaller samples from it and export each one to Excel.

I thought it would work like this, but it doesn't:

 idorg <- c(1,2,3,4,5)
 x <- c(14,20,21,16,17)
 y <- c(31,21,20,50,13)
 dataset <- cbind (idorg,x,y)


 for (i in 1:4)
 {
 attempt[i] <- dataset[sample(1:nrow(dataset), 3, replace=FALSE),]
 write.table(attempt[i], "C:/Users/me/Desktop/WWD/Excel/dataset[i].xls", sep='\t')
 }

In Stata you would need to preserve and restore your data when doing a loop like this, but is this also necessary in R?

Rotman answered 16/10, 2012 at 7:17 Comment(1)
Why is this being voted to close? IMO this is a perfectly suitable Q for this site. - Alveraalverez

You have the following problems:

  1. attempt is not declared, so attempt[i] cannot be assigned to. Either create it before the loop as a container to fill within the loop (if you want to keep the samples; a list works well, since each sample is itself a matrix), or use attempt as a plain temporary variable.
  2. The file name is taken literally; you need to use paste() or sprintf() to include the value of the variable i in the file name.
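
Both fixes can be sketched together as follows. This is only a minimal sketch under a few assumptions: `attempts` is a list created before the loop so that every sample is kept, `out_dir` is a hypothetical output folder (a temporary directory here, so the code runs anywhere), and the `dataset_%d.txt` naming pattern is made up for illustration.

```r
# Toy data from the question
idorg <- c(1, 2, 3, 4, 5)
x <- c(14, 20, 21, 16, 17)
y <- c(31, 21, 20, 50, 13)
dataset <- cbind(idorg, x, y)

# Assumption: out_dir is a placeholder; point it at your own folder
out_dir <- tempdir()

# Declare the containers before the loop so that slot i can be assigned to
attempts <- vector("list", 4)
file_names <- character(4)

for (i in 1:4) {
  # Draw 3 distinct rows and keep the sample in slot i of the list
  attempts[[i]] <- dataset[sample(nrow(dataset), 3, replace = FALSE), ]
  # sprintf() substitutes the value of i for %d;
  # paste0("dataset_", i, ".txt") would build the same name
  file_names[i] <- file.path(out_dir, sprintf("dataset_%d.txt", i))
  write.table(attempts[[i]], file_names[i], sep = "\t")
}
```

Note the double brackets: `attempts[[i]]` stores a whole matrix in one list slot, whereas `attempts[i] <- ...` on a plain vector would try to squeeze nine values into a single element.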

Here is a working version of the code:

idorg <- c(1,2,3,4,5)
x <- c(14,20,21,16,17)
y <- c(31,21,20,50,13)
dataset <- cbind (idorg,x,y)

for (i in 1:4)  {
  attempt <- dataset[sample(1:nrow(dataset), 3, replace=FALSE),]
  write.table(attempt, sprintf( "C:/Users/me/Desktop/WWD/Excel/dataset[%d].xls", i ), sep='\t')
}

Will Excel be able to read such a tab-separated table? I'm not sure; I would write a comma-separated table instead and save it as .csv.
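
A minimal sketch of the CSV variant, using the same toy data. The path `csv_file` is hypothetical (it points into a temporary directory here), and `row.names = FALSE` is an optional choice that leaves out the row-name column.

```r
# Same toy data as in the question, built in one call
dataset <- cbind(idorg = c(1, 2, 3, 4, 5),
                 x = c(14, 20, 21, 16, 17),
                 y = c(31, 21, 20, 50, 13))

# Assumption: csv_file is a placeholder path
csv_file <- file.path(tempdir(), "dataset_sample.csv")

# Draw one sample of 3 rows and write it as comma-separated values
attempt <- dataset[sample(nrow(dataset), 3, replace = FALSE), ]
write.csv(attempt, csv_file, row.names = FALSE)

# Read the file back to check what was written
check <- read.csv(csv_file)
```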

Presurmise answered 16/10, 2012 at 7:23 Comment(0)

Unlike Stata, you don't need to preserve and restore your data for this kind of operation in R.
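
A quick way to check this for yourself, sketched with the toy data from the question: subsetting with `[` returns a new object, so the original is left untouched. The name `original` is introduced here only for the comparison.

```r
dataset <- cbind(idorg = c(1, 2, 3, 4, 5),
                 x = c(14, 20, 21, 16, 17),
                 y = c(31, 21, 20, 50, 13))

# Keep a copy so we can compare after the loop
original <- dataset

for (i in 1:4) {
  # Each sample is a new 3-row matrix; dataset itself is never modified
  attempt <- dataset[sample(nrow(dataset), 3, replace = FALSE), ]
}

unchanged <- identical(dataset, original)
print(unchanged)  # TRUE
```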

I think January's solution solves your problem, but I wanted to share another alternative: using lapply() to get a list of all the samples of the dataset:

set.seed(1) # So you can reproduce these results
temp <- setNames(lapply(1:4,
                        function(i) dataset[sample(1:nrow(dataset),
                                                   3, replace = FALSE), ]),
                 paste0("attempt.", 1:4))

This creates a list named "temp" that contains four matrices (dataset was built with cbind(), so each sample is a matrix rather than a data.frame).

temp
# $attempt.1
#      idorg  x  y
# [1,]     2 20 21
# [2,]     5 17 13
# [3,]     4 16 50
# 
# $attempt.2
#      idorg  x  y
# [1,]     5 17 13
# [2,]     1 14 31
# [3,]     3 21 20
# 
# $attempt.3
#      idorg  x  y
# [1,]     5 17 13
# [2,]     3 21 20
# [3,]     2 20 21
# 
# $attempt.4
#      idorg  x  y
# [1,]     1 14 31
# [2,]     5 17 13 
# [3,]     4 16 50

Lists are very convenient in R. You can now use lapply() to do other things with the samples: for example, lapply(temp, rowSums) gives the row sums of each one. Or, if you want to write separate CSV files (readable by Excel), you can do something like this:

lapply(names(temp), function(x) write.csv(temp[[x]],
                             file = paste0(x, ".csv")))
Gladdie answered 16/10, 2012 at 9:48 Comment(0)
