error: object '.doSnowGlobals' not found?
I'm trying to parallelize code across 4 nodes (type = "SOCK"). Here is my code:

library(itertools)
library(foreach)
library(doParallel)
library(parallel)

workers <- c("<node1-ip>", "<node2-ip>", "<node3-ip>", "<node4-ip>")  # IP addresses of the 4 nodes
cl <- makePSOCKcluster(workers, master = "<master-ip>")  # IP address of the master
registerDoParallel(cl)

z <- read.csv("ProcessedData.csv", header=TRUE, as.is=TRUE)
z <- as.matrix(z)


system.time({
  chunks <- getDoParWorkers()
  b <- foreach (these = isplitIndices(nrow(z),
                                      chunks=chunks),
                .combine = c) %dopar% {
                  a <- rep(0, length(these))
                  for (i in 1:length(these)) {
                    a[i] <- mean(z[these[i],])
                  }
                  a
                }
})

I get this error:

4 nodes produced errors; first error: object '.doSnowGlobals' not found.

This code runs fine with doMC, i.e. using the same machine's cores, but when I try to use other computers for parallel computing I get the above error. The error persists when I switch to registerDoSNOW.

Do snow and doSNOW work on a cluster? I can create nodes on the localhost using snow, but not on the cluster. Is anyone out there using snow this way?

Barbaric answered 1/8, 2014 at 11:44 Comment(0)

To set the library path on each worker you can run:

clusterEvalQ(cl, .libPaths("Your library path"))
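To confirm the change took effect, you can ask each worker to report its search path (a quick check, assuming `cl` is your cluster object):

```r
# Each worker should now list the added path first.
clusterEvalQ(cl, .libPaths())
```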
Donatelli answered 19/10, 2016 at 18:34 Comment(2)
Perfect. Why is this necessary on some systems but not on others? – Lobworm
This actually helped when I had the problem using an SSH server from my university – Unlike

You can get this error if any of the workers is unable to load the doParallel package. You can reproduce it by installing doParallel into a non-default directory and pointing only the master to it via .libPaths:

> .libPaths('~/R/lib.test')
> library(doParallel)
> cl <- makePSOCKcluster(3, outfile='')
starting worker pid=26240 on localhost:11566 at 13:47:59.470
starting worker pid=26248 on localhost:11566 at 13:47:59.667
starting worker pid=26256 on localhost:11566 at 13:47:59.864
> registerDoParallel(cl)
> foreach(i=1:10) %dopar% i
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Warning: namespace ‘doParallel’ is not available and has been replaced
by .GlobalEnv when processing object ‘’
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  3 nodes produced errors; first error: object '.doSnowGlobals' not found

The warning happens when a function from doParallel is deserialized on a worker. The error happens when that function is executed and tries to access .doSnowGlobals, which is defined in the doParallel namespace, not in .GlobalEnv.

You can also verify that doParallel is available on the workers by executing:

> clusterEvalQ(cl, library(doParallel))
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  3 nodes produced errors; first error: there is no package called ‘doParallel’
Angwantibo answered 15/9, 2014 at 18:15 Comment(1)
This explains why it happens, but how do you fix it? – Winslow

A specific case of @Steve Weston's answer is when your workers can't load a given package (e.g. doParallel) because it is installed inside a Packrat project. Install the packages into the system library, or somewhere else the workers will be able to find them.
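A minimal sketch of that workaround. The library path below is a placeholder, not the answer's actual path; adjust it to wherever the packages are installed on the worker machines:

```r
library(parallel)

cl <- makePSOCKcluster(2)

# Prepend a library location the workers can actually reach
# ("/usr/lib/R/site-library" is an example system path, not prescriptive).
clusterEvalQ(cl, .libPaths(c("/usr/lib/R/site-library", .libPaths())))

# Check whether doParallel now loads on every worker
# (returns TRUE per worker on success, FALSE otherwise).
clusterEvalQ(cl, requireNamespace("doParallel", quietly = TRUE))

stopCluster(cl)
```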

Grearson answered 27/4, 2016 at 21:48 Comment(1)
I found from this thread that I needed to re-specify the library paths once inside the foreach loop. The workers were then able to find the required packages along with .doSnowGlobals. After a bit more research into my issue, packrat itself was not the problem; rather, I had loaded R with packrat and then used setwd() to a subfolder. When a worker is launched it does not find the .Rprofile in the project directory and therefore does not load the packrat libraries. – Freshwater

I encountered the same problem today and tried all the answers above; none of them worked for me. Then I simply reinstalled the doSNOW package and, magically, the problem was solved.

Valeriavalerian answered 21/1, 2019 at 6:44 Comment(0)

None of these fixes worked for me. In my particular case, I use a custom R library location. Parallel processing worked when my working directory was the base directory containing my custom library folder, but it failed if I used setwd() to change the working directory.

The custom library location was not being passed on to the worker nodes, so they were looking for packages in R's default library directory, where they were not installed. The fix by @Nat did not work for me; the worker nodes still could not find my custom library folder. What did work was:

Before sending jobs to the nodes, capture the master's library paths:

paths <- .libPaths()

I then sent jobs to the nodes along with the argument paths, and inside the worker function I simply called:

.libPaths(paths)
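That approach can be sketched in foreach terms as follows. The worker body here (squaring a number) is purely illustrative, not from the original post:

```r
library(foreach)
library(doParallel)

cl <- makePSOCKcluster(2)
registerDoParallel(cl)

# Capture the master's library paths before dispatching work.
paths <- .libPaths()

result <- foreach(i = 1:4, .combine = c) %dopar% {
  # Restore the custom library location on the worker
  # before any further package lookups happen.
  .libPaths(paths)
  i^2
}

stopCluster(cl)
result  # c(1, 4, 9, 16)
```

Because `paths` is an ordinary variable referenced in the loop body, foreach exports it to the workers automatically, so each worker can restore the master's library search path before doing its share of the work.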
Winslow answered 2/6, 2022 at 5:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.