What triggers "Ancestor must be an environment" error?

I am running a parallelized calculation using foreach to work on many time series simultaneously. Among those calculations (within a function called compute_slope()) I do something like this:

lBd <- floor(TMax^delta) # lower bound
uBd <-  ceiling(m * TMax^delta) # upper bound
    
# process is a tibble with columns `n` and `variance`
process %>% 
  dplyr::filter(between(n, lBd, uBd)) %>% 
  lm(data = ., log(variance) ~ log(n)) %>% 
  coefficients() %>% 
  .[2]

This is pretty straightforward: with parameters TMax, delta and m I truncate a time series on the left and on the right (using filter()) and then run a linear regression on the truncated series. Most of the time everything works out nicely, but sometimes (I suspect the error is more likely for longer time series, i.e. larger TMax, though even that has been irregular) I get

✖ Problem with `filter()` input `..1`.
ℹ Input `..1` is `between(n, lBd, uBd)`.
✖ `ancestor` must be an environment"
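
For reference, the truncate-and-regress step can also be written without pipes or dplyr, which makes it easier to test in isolation. This is only a sketch: the name compute_slope_base and the synthetic data are assumptions for illustration, and the argument order (process, m, delta, TMax) is taken from the map2_dbl() call shown in the traceback further down.

```r
# Hypothetical base-R variant of the slope computation (not the original function)
compute_slope_base <- function(process, m, delta, TMax) {
  lBd <- floor(TMax^delta)        # lower bound
  uBd <- ceiling(m * TMax^delta)  # upper bound
  # keep only rows with lBd <= n <= uBd
  truncated <- process[process$n >= lBd & process$n <= uBd, , drop = FALSE]
  fit <- lm(log(variance) ~ log(n), data = truncated)
  unname(coef(fit)[2])            # slope of the log-log regression
}

# Synthetic data following an exact power law, so the slope should be close to 0.5
process <- data.frame(n = 1:1000)
process$variance <- 3 * process$n^0.5
slope <- compute_slope_base(process, m = 2, delta = 0.5, TMax = 1000)
```

If this version runs reliably in a sequential loop over the same inputs, the regression step itself is unlikely to be the source of the error.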

I have no clue how to interpret this error. I have also tried to reproduce this "ancestor" error, so far without luck. For instance, I have tried

library(tidyverse)
# This is the straightforward use-case and should work (it does here)
mpg %>% filter(between(hwy, 30, 31))
#> # A tibble: 11 x 11
#>    manufacturer model    displ  year   cyl trans   drv     cty   hwy fl    class
#>    <chr>        <chr>    <dbl> <int> <int> <chr>   <chr> <int> <int> <chr> <chr>
#>  1 audi         a4         2    2008     4 manual~ f        20    31 p     comp~
#>  2 audi         a4         2    2008     4 auto(a~ f        21    30 p     comp~
#>  3 chevrolet    malibu     2.4  2008     4 auto(l~ f        22    30 r     mids~
#>  4 hyundai      sonata     2.4  2008     4 auto(l~ f        21    30 r     mids~
#>  5 hyundai      sonata     2.4  2008     4 manual~ f        21    31 r     mids~
#>  6 nissan       altima     2.5  2008     4 auto(a~ f        23    31 r     mids~
#>  7 toyota       camry      2.4  2008     4 manual~ f        21    31 r     mids~
#>  8 toyota       camry      2.4  2008     4 auto(l~ f        21    31 r     mids~
#>  9 toyota       camry s~   2.4  2008     4 manual~ f        21    31 r     comp~
#> 10 toyota       camry s~   2.4  2008     4 auto(s~ f        22    31 r     comp~
#> 11 toyota       corolla    1.8  1999     4 auto(l~ f        24    30 r     comp~

# bounds are undefined
mpg %>% filter(between(hwy, x, 31))
#> Error: Problem with `filter()` input `..1`.
#> i Input `..1` is `between(hwy, x, 31)`.
#> x object 'x' not found


# bounds are functions
mpg %>% filter(between(hwy, slice, 31))
#> Error: Problem with `filter()` input `..1`.
#> i Input `..1` is `between(hwy, slice, 31)`.
#> x cannot coerce type 'closure' to vector of type 'double'

In each case, a different (interpretable) error message was created. I suspect that the error message results from something weird happening as part of the parallel processing but I am not sure what that could be. In any case, examples for this ancestor error would be appreciated. Maybe from there I can work my way back to what goes awry in my calculations.

Update

I still cannot figure out what is going on with the parallelizations even after adding a traceback to the script. This is what it delivers

Error in { : 
  task 34 failed - "Problem with `mutate()` column `grid_estimates`.
ℹ `grid_estimates = map(data, ~estimate_var_on_grid(process = ., TMax = TMax, grid = grid))`.
✖ Problem with `mutate()` column `slope`.
ℹ `slope = map2_dbl(m, delta, ~compute_slope(process, .x, .y, TMax))`.
✖ could not find function "::""
Calls: compute_metrics_on_stable_splits ... tibble -> tibble_quos -> eval_tidy -> %dopar% -> <Anonymous>
11: (function () 
    traceback(2))()
10: stop(simpleError(msg, call = expr))
9: e$fun(obj, substitute(ex), parent.frame(), e$data)
8: foreach(i = itx, .packages = c("tidyverse", "yardstick", "rsample"), 
       .export = #vector of exports removed for legibility
) %dopar% {
       i %>% 
         pull(splits) %>% 
         .[[1]] %>% 
         train_and_test(., train_grid = grid, my_mset = my_mset, 
                   method = method, TMax = TMax_eval)
       }
   }
7: eval_tidy(xs[[j]], mask)
6: tibble_quos(xs, .rows, .name_repair)
5: tibble(metrics = .)
4: list2(...)
3: bind_cols(select(splits, alpha), .)
2: foreach(i = itx, .packages = c("tidyverse", "yardstick", "rsample"), 
       .export = #vector of exports removed for legibility
) %dopar% {
       i %>% 
         pull(splits) %>% 
         .[[1]] %>% 
         train_and_test(., train_grid = grid, my_mset = my_mset, 
                   method = method, TMax = TMax_eval)
       }
   } %>% 
     tibble(metrics = .) %>% 
     bind_cols(select(splits, alpha), .)
1: compute_metrics_on_stable_splits(method = method, grid = grid, 
       my_mset = metric_set(accuracy, mcc, sens, spec), TMax_eval = TMax_eval, 
       v = 40)

The error is now could not find function "::", which is as weird as the ancestor error. At other times I also received

'rho' must be an environment not pairlist: detected in C-level eval

Apparently, the error can differ even though the code in the script stays the same. At this point any clue would be appreciated. What is weird is that the exact same code sometimes fails, with a changing error message, and sometimes completes (and if I didn't need to run more computations with this script, I would already be happy with the results I get when the code finishes successfully).
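
Because the message changes from run to run, it may help to record the task index, the message and the call stack for each failing task instead of relying on the traceback of the main process. The following is a minimal sketch in base R; run_task is a hypothetical helper, not part of foreach or of the original script.

```r
# Hypothetical helper: evaluate one task and return either its value or a
# record describing the failure, including the call stack at the error.
run_task <- function(task_id, fun, ...) {
  stack <- NULL
  tryCatch(
    withCallingHandlers(
      list(task = task_id, ok = TRUE, value = fun(...)),
      error = function(e) {
        # Runs before the stack unwinds, so the failing frames are still visible
        stack <<- vapply(sys.calls(), function(cl) deparse(cl)[1], character(1))
      }
    ),
    error = function(e) {
      list(task = task_id, ok = FALSE,
           message = conditionMessage(e), stack = stack)
    }
  )
}

good <- run_task(1, function(x) log(x), 10)
bad  <- run_task(2, function(x) stop("boom"), 10)
```

Calling such a wrapper inside the loop body would return one record per task, so failures from different runs can be compared afterwards.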

Session Info

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.2 (Ootpa)

Matrix products: default
BLAS/LAPACK: /pfs/data5/software_uc2/all/toolkit/Intel_OneAPI/mkl/2021.4.0/lib/intel64/libmkl_intel_lp64.so.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] yardstick_0.0.9   doParallel_1.0.16 iterators_1.0.13  foreach_1.5.1
 [5] forcats_0.5.1     stringr_1.4.0     dplyr_1.0.7       purrr_0.3.4
 [9] readr_2.1.1       tidyr_1.1.4       tibble_3.1.6      ggplot2_3.3.5
[13] tidyverse_1.3.1

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.1 haven_2.4.3      colorspace_2.0-2 vctrs_0.3.8
 [5] generics_0.1.1   utf8_1.2.2       rlang_0.4.12     pillar_1.6.4
 [9] glue_1.5.1       withr_2.4.3      DBI_1.1.1        dbplyr_2.1.1
[13] modelr_0.1.8     readxl_1.3.1     lifecycle_1.0.1  plyr_1.8.6
[17] munsell_0.5.0    gtable_0.3.0     cellranger_1.1.0 rvest_1.0.2
[21] codetools_0.2-18 tzdb_0.2.0       fansi_0.5.0      broom_0.7.10
[25] Rcpp_1.0.7       scales_1.1.1     backports_1.4.0  jsonlite_1.7.2
[29] fs_1.5.1         hms_1.1.1        stringi_1.7.6    grid_4.1.2
[33] cli_3.1.0        tools_4.1.2      magrittr_2.0.1   crayon_1.4.2
[37] pkgconfig_2.0.3  ellipsis_0.3.2   xml2_1.3.3       pROC_1.18.0
[41] reprex_2.0.1     lubridate_1.8.0  assertthat_0.2.1 httr_1.4.2
[45] rstudioapi_0.13  R6_2.5.1         compiler_4.1.2
Kemberlykemble answered 6/12, 2021 at 9:58 Comment(12)
Could you list the packages that you are using (sessionInfo())? Perhaps a grep on their source code might find the culprit. Also, it's useful to add a traceback. - Hyaena
@Kemberlykemble in the second-to-last error it states that x is not defined in your environment, and in the last error you have used the function slice instead of a value. - Solidarity
@Isa, yes, this is the point. I was trying to find out what has to go wrong for the "ancestor" error to appear. The examples I was trying out yielded different errors. - Kemberlykemble
@DonaldSeinen, this is a great idea. I was already thinking about doing something like this, but I don't know how I can get to the source code of all packages (including internal functions) in order to do this. Is there something like a "grab source code" function? - Kemberlykemble
@Kemberlykemble see this post for finding source code of functions, here is between and filter, but neither contains the ancestor error. Have you managed to narrow it down? Again, please add the sessionInfo() to reduce the size of the haystack a bit. - Hyaena
I added the sessionInfo() in case it helps. Still no luck finding the source of the error message. - Kemberlykemble
Take a look here: #30249083. If you use %>% inside a %dopar% loop, you have to add a reference to load package dplyr (or magrittr, which dplyr loads). - Veronica
Tidyverse was already exported to the workers, but to be sure I also added dplyr and magrittr to the list of packages in foreach. Didn't help. I also rewrote the filter function with base R syntax to avoid dplyr. Now I get a bad generic call environment error message. Don't know if that helps... - Kemberlykemble
The %>% operator makes debugging harder. I suggest rewriting your code without it to see if it leads to more understandable messages. And does the error happen without the foreach? Or with only one core? - Oscitancy
The error happens only with dopar. And only sometimes. The exact same script sometimes completes without an error and sometimes errors out after a few hours. I am not sure if there is a solution to this, but I also don't understand how such a thing is possible. - Kemberlykemble
@Kemberlykemble I've also sometimes received this error with one script I run. It was in dplyr::mutate IIRC. Did you ever try reporting it as an R bug or asking other forums? - Godart
@MichaelMcFarlane I tried asking on Github but got no answer. Nevertheless, I just added an "answer" that helped me. Maybe this could help you too. - Kemberlykemble

I am not sure whether this is a definitive answer, or whether the problem might reappear at any time, but by now I believe that the problem comes from the parallel computation using foreach. It is probably not specific to the foreach package but rather a consequence of race conditions, and I suspect that it is also a Heisenbug.

That being said, what helped the most was making sure that the foreach loop does not terminate when an error occurs in one of the workers. More precisely, I set the .errorhandling argument to "pass". If an error occurs in an iteration, the loop then simply stores the error of that iteration in the result list and collects the results of the other iterations in the same list.

In principle, the code looks like this

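results <- foreach(
  i = itx, # itx is an iterator created via iter()
  .errorhandling = "pass", # keep the error object of a failed iteration instead of stopping
  .packages = c("tidyverse", "yardstick", "rsample") # packages as in the traceback above
  # .export = ... (vector of exports omitted here)
) %dopar% {
  # Code for parallel computation here
}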

Interestingly, once I had added the .errorhandling option, no more errors occurred and I could run the script multiple consecutive times without a hitch. Hence my belief that this is a Heisenbug.
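
A minimal sketch of how the result list could be inspected when .errorhandling = "pass" is used. The loop body and the simulated failure are made up for illustration, and registerDoSEQ() is used only so that the example runs without a cluster; with a real parallel backend the returned list has the same shape.

```r
library(foreach)

# Sequential backend so the sketch runs without a cluster (assumption for illustration)
registerDoSEQ()

results <- foreach(i = 1:5, .errorhandling = "pass") %dopar% {
  if (i == 3) stop("simulated failure in task ", i)
  i^2
}

# Failed iterations are returned as error condition objects
failed <- vapply(results, inherits, logical(1), what = "error")
which(failed)                                 # indices of failed tasks
conditionMessage(results[[which(failed)[1]]]) # message of the first failure
ok <- unlist(results[!failed])                # values of the successful tasks
```

Checking for error objects after the loop matters here, because with "pass" a failed task no longer stops the script and could otherwise go unnoticed.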

Kemberlykemble answered 10/3, 2022 at 18:4 Comment(0)
