RcppArmadillo's sample() is ambiguous after updating R

I commonly work with a short Rcpp function that takes as input a matrix where each row contains K probabilities that sum to 1. The function then randomly samples for each row an integer between 1 and K corresponding to the provided probabilities. This is the function:

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadilloExtensions/sample.h>

using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
  int n = x.nrow();
  IntegerVector result(n);
  for ( int i = 0; i < n; ++i ) {
    result[i] = RcppArmadillo::sample(choice_set, 1, false, x(i, _))[0];
  }
  return result;
}

I recently updated R and all packages. Now I cannot compile this function anymore. The reason is not clear to me. Running

library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp("sample_matrix.cpp")

throws the following error:

error: call of overloaded 'sample(Rcpp::IntegerVector&, int, bool, Rcpp::Matrix<14>::Row)' is ambiguous

This basically tells me that my call to RcppArmadillo::sample() is ambiguous. Can anyone enlighten me as to why this is the case?

Peerage answered 16/12, 2019 at 10:42 Comment(0)

There are two things happening here, and two parts to your problem and hence the answer.

The first is "meta": why now? Well, we had a bug left in the sample() code / setup, which Christian kindly fixed for the most recent RcppArmadillo release (and it is all documented there). In short, the interface for the very probability argument giving you trouble here was changed because it was not safe for re-use / repeated use. It is now.

Second, the error message. You didn't say what compiler or version you use, but mine (currently g++-9.3) is actually pretty helpful with the error. It is still C++, so some interpretative dance is needed, but in essence it is clearly stating that you called with Rcpp::Matrix<14>::Row and no interface is provided for that type. Which is correct. sample() offers a few interfaces, but none for a Row object, so the compiler cannot decide which of them the row should be converted for. So the fix is, once again, simple: add a line to aid the compiler by making the row a NumericVector, and all is good.
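To illustrate the mechanism, here is a minimal sketch in plain C++ with no Rcpp involved. The types VecA, VecB and Row and the function pick() are hypothetical stand-ins invented for this example; they are not the real Rcpp or RcppArmadillo declarations. The sketch only assumes that a row-like type can convert to more than one of the argument types a set of overloads accepts, which is the situation an "ambiguous" error describes.

```cpp
#include <string>

// Hypothetical stand-ins, NOT the real Rcpp / Armadillo types.
struct VecA { int n; };   // plays the role of Rcpp::NumericVector
struct VecB { int n; };   // plays the role of another accepted vector type

// Plays the role of the matrix row: convertible to both, identical to neither.
struct Row {
    int n;
    operator VecA() const { return VecA{n}; }
    operator VecB() const { return VecB{n}; }
};

// Two overloads, as a library might offer for its probability argument.
std::string pick(VecA) { return "VecA overload"; }
std::string pick(VecB) { return "VecB overload"; }

std::string demo() {
    Row r{3};
    // pick(r);     // rejected: both overloads need a conversion, so the call is ambiguous
    VecA z = r;     // explicit conversion, analogous to NumericVector z(x(i, _))
    return pick(z); // exactly one overload matches now
}
```

Uncommenting the pick(r) line should make the compiler reject the call as ambiguous, while naming the target type first leaves only one candidate.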

Fixed code

#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>

// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
  int n = x.nrow();
  IntegerVector result(n);
  for ( int i = 0; i < n; ++i ) {
    // materialize the row as a NumericVector so the sample() call is unambiguous
    Rcpp::NumericVector z(x(i, _));
    result[i] = RcppArmadillo::sample(choice_set, 1, false, z)[0];
  }
  return result;
}

Example

R> Rcpp::sourceCpp("answer.cpp")        # no need for library(Rcpp)   
R> 
Dissemblance answered 16/12, 2019 at 13:2 Comment(3)
Why does this still give different results compared to base::sample with the same set.seed() / set.seed(, sample.kind="Rounding")? See my recent answer. - Biserrate
Presumably because the implementation is not identical. It is also non-trivial, but the files for all three approaches (there is also a sample in Rcpp now) are open source, so someone with time and interest -- maybe you? -- could drill down and debug. - Dissemblance
I've implemented the function using Rcpp::sample, and it gives identical results now. I think it's perfect, thanks a ton! - Biserrate
