foreach loop (R/doParallel package) fails with a large number of iterations

I have the following R code:

library(doParallel)
cl <- makeCluster(detectCores()-4, outfile = "")
registerDoParallel(cl)

calc <- function(i) {
  ...
  # returns a data frame
}

system.time(
  res <- foreach(i = 1:106800, .verbose = TRUE) %dopar% calc(i)
)

stopCluster(cl)

If I run the loop over 1:5, it finishes successfully. The same is true for 106000:106800. But it fails for the full range 1:106800, and even for 100000:106800 (these are not the exact numbers I am working with, but they are easier to read), with this error message:

...
got results for task 6813
numValues: 6814, numResults: 6813, stopped: TRUE
returning status FALSE
got results for task 6814
numValues: 6814, numResults: 6814, stopped: TRUE
calling combine function
evaluating call object to combine results:
  fun(accum, result.6733, result.6734, result.6735, result.6736, 
    result.6737, result.6738, result.6739, result.6740, result.6741, 
    result.6742, result.6743, result.6744, result.6745, result.6746, 
    result.6747, result.6748, result.6749, result.6750, result.6751, 
    result.6752, result.6753, result.6754, result.6755, result.6756, 
    result.6757, result.6758, result.6759, result.6760, result.6761, 
    result.6762, result.6763, result.6764, result.6765, result.6766, 
    result.6767, result.6768, result.6769, result.6770, result.6771, 
    result.6772, result.6773, result.6774, result.6775, result.6776, 
    result.6777, result.6778, result.6779, result.6780, result.6781, 
    result.6782, result.6783, result.6784, result.6785, result.6786, 
    result.6787, result.6788, result.6789, result.6790, result.6791, 
    result.6792, result.6793, result.6794, result.6795, result.6796, 
    result.6797, result.6798, result.6799, result.6800, result.6801, 
    result.6802, result.6803, result.6804, result.6805, result.6806, 
    result.6807, result.6808, result.6809, result.6810, result.6811, 
    result.6812, result.6813, result.6814)
returning status TRUE
Error in calc(i) : 
  task 1 failed - "object of type 'S4' is not subsettable"

I have no idea why I get this error message. Unfortunately, I cannot provide a reproducible example, because I cannot trigger the error with simpler code. Is a single job failing? If so, how can I find out which one? Or are there any other ideas on how to troubleshoot this?

Implosion answered 18/8, 2015 at 8:54

Comments (6)
Can you check with tryCatch to see whether a single job is failing? - Ambagious
Where would you put the tryCatch? - Implosion
Inside the foreach loop. - Ambagious
I think I found the problem: after adding a library to the foreach call, it seems to run. I just don't understand why this was not a problem with only a few jobs... - Implosion
Ah, yes. You need to specify the packages the code depends on so that they are loaded in each spawned worker process. - Ambagious
But why does it not complain with fewer jobs? - Implosion
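
The two suggestions from the comments can be sketched together as follows. This is only a minimal illustration, not the original code: the body of calc() is a hypothetical stand-in that fails on purpose for one index, the cluster size of 2 and the range 1:10 are arbitrary, and "stats" is a harmless placeholder for whichever package the real calc() needs. The tryCatch inside the loop turns an error into an ordinary return value that records the index, and the .packages argument of foreach asks each worker to load the listed packages before running the task.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)  # assumption: two workers are enough for the demo
registerDoParallel(cl)

# Hypothetical stand-in for the real calc(): fails for one specific index.
calc <- function(i) {
  if (i == 7) stop("simulated failure")
  data.frame(i = i, value = i^2)
}

# .packages lists packages to load on every worker; "stats" is a placeholder
# for the package that the real calc() depends on.
res <- foreach(i = 1:10, .packages = "stats") %dopar% {
  # Return the error as a value instead of letting it abort the whole loop.
  tryCatch(
    calc(i),
    error = function(e) list(failed = TRUE, i = i, msg = conditionMessage(e))
  )
}

stopCluster(cl)

# Successful tasks return data frames; anything else is a recorded failure.
failed <- Filter(function(r) !is.data.frame(r), res)
failed_idx <- vapply(failed, function(r) r$i, integer(1))
print(failed_idx)
```

Inspecting failed_idx and the stored msg fields should show whether one task or many are failing, which is a reasonable first step before deciding which packages to pass via .packages.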
