Parallelizing keras models in R using doParallel

I'm trying to ensemble several neural networks using keras for R. To do so, I would like to parallelize the training of the different networks with a foreach loop.

library(keras)

models <- list()
x_bagged <- list()
y_bagged <- list()

n_nets <- 2
bag_frac <- 0.7
len <- nrow(x_train)

for (i in 1:n_nets) {
  # Draw a bagged subsample of the training data for this network
  sam <- sample(len, floor(bag_frac * len), replace = FALSE)
  x_bagged[[i]] <- x_train[sam, ]
  y_bagged[[i]] <- y_train[sam]

  models[[i]] <- keras_model_sequential()

  # custom_activation is defined elsewhere in my script
  models[[i]] %>%
    layer_dense(units = 100, input_shape = ncol(x_train), activation = "relu",
                kernel_initializer = "glorot_normal") %>%
    layer_batch_normalization() %>%
    layer_dense(units = 100, activation = custom_activation,
                kernel_initializer = "glorot_normal") %>%
    layer_dense(units = 1, activation = "linear",
                kernel_initializer = "glorot_normal")

  models[[i]] %>% compile(
    loss = "MSE",
    optimizer = optimizer_sgd(lr = 0.01)
  )
}


library(foreach)
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)
nep <- 10

foreach(i = 1:n_nets, .packages = c("keras")) %dopar% {
  models[[i]] %>% keras::fit(
    x_bagged[[i]], y_bagged[[i]],
    epochs = nep,
    validation_split = 0.1,
    batch_size = 256,
    verbose = 1
  )
}
stopCluster(cl)

I have no problems running the code using %do% instead of %dopar%; however, when I try to fit the nets simultaneously on multiple cores, I get the following error:

Error in {: task 1 failed - "'what' must be a function or character string" Traceback:

  1. foreach(i = 1:n_reti, .packages = c("keras")) %dopar% { . models[[i]] %>% keras::fit(x_bagged[[i]], y_bagged[[i]], .
    epochs = nep, validation_split = 0.1, batch_size = 256, .
    verbose = 1) . }
  2. e$fun(obj, substitute(ex), parent.frame(), e$data)

Does anyone know how I can overcome this error? Is there an alternative way to parallelize the training of the models in R?

Thank you in advance!

Boddie answered 9/7, 2018 at 9:34

Comments (3):
It would be easier to help with a minimal, reproducible example that allows us to reproduce your error (see also link). – Biagio
I got this mistake when I tried to predict from an unfitted model. – Althing
I'm having a similar issue now. I suspect it's because the TensorFlow backend wasn't designed to support parallel processing, and using reticulate as an intermediary makes it even more complex. – Mureil

Although this question is quite old, I ran into the same issue, so I'm posting the solution here. The problem is that a keras model object cannot be shipped to the workers as-is: it is just an R wrapper around a Python object living in the main process, so the standard R serialization that foreach uses to transfer data to the nodes cannot handle it. A quick workaround is to serialize the models with keras::serialize_model() before sending them to the workers and to unserialize them locally on each node:

library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)
nep <- 10

# Serialize the models (the list built above) before sending them to the workers
models_par <- lapply(models, keras::serialize_model)

# Iterate over indices so each worker picks up its matching bagged data
fitted_par <- foreach(i = seq_along(models_par), .packages = c("keras")) %dopar% {

  # Unserialize locally on the worker
  model_local <- keras::unserialize_model(models_par[[i]])
  model_local %>% keras::fit(
    x_bagged[[i]], y_bagged[[i]],
    epochs = nep,
    validation_split = 0.1,
    batch_size = 256,
    verbose = 1
  )

  # Serialize again before sending the fitted model back to the master
  keras::serialize_model(model_local)
}
stopCluster(cl)

# Back on the master, restore the fitted models
models <- lapply(fitted_par, keras::unserialize_model)
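
The same pattern should also work for prediction (see the comments below): ship the serialized model, unserialize it on the worker, and return plain numeric output, which R can serialize normally. A minimal sketch, assuming models holds fitted keras models and x_new (a placeholder name, not from the question) is the data to score:

library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# Serialize the fitted models on the master
models_ser <- lapply(models, keras::serialize_model)

# x_new is a placeholder for whatever new data you want to score
preds <- foreach(m = models_ser, .packages = c("keras")) %dopar% {
  # Rebuild the model locally on the worker
  model_local <- keras::unserialize_model(m)
  # predict() returns a plain matrix, which travels back to the master safely
  predict(model_local, x_new)
}
stopCluster(cl)

# Combine the ensemble members, e.g. by averaging
pred_avg <- Reduce(`+`, preds) / length(preds)

The key point in both cases is that only raw serialized vectors and numeric matrices ever cross the socket connection; the live Python-backed model object never does.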
Gaul answered 11/3, 2020 at 18:49

Comments (2):
I am trying to do predictions from a keras model (model %>% predict(newdata)) using your example, serializing the model beforehand and then unserializing it in the foreach loop, but I get the following error: "Error in unserialize(socklist[[n]]) : error reading from connection". It looks like multiple processes are trying to unserialize at the same time... – Honeymoon
Were you able to make predictions with foreach and doParallel? I have run into the same issue. – Grandam
