I would like to access R from within a Python program. I am aware of Rpy2, pyrserve and PypeR.
What are the advantages or disadvantages of these three options?
I would like to access R from within a Python program. I am aware of Rpy2, pyrserve and PypeR.
What are the advantages or disadvantages of these three options?
I know one of the 3 better than the others, but in the order given in the question:
rpy2:
pyrserve:
pyper:
edit: Windows support for rpy2
From the paper in the Journal of Statistical Software on PypeR:
RPy presents a simple and efficient way of accessing R from Python. It is robust and very convenient for frequent interaction operations between Python and R. This package allows Python programs to pass Python objects of basic data types to R functions and return the results in Python objects. Such features make it an attractive solution for the cases in which Python and R interact frequently. However, there are still limitations of this package as listed below.
Performance:
RPy may not behave very well for large-size data sets or for computation-intensive duties. A lot of time and memory are inevitably consumed in producing the Python copy of the R data because in every round of a conversation RPy converts the returned value of an R expression into a Python object of basic types or NumPy array. RPy2, a recently developed branch of RPy, uses Python objects to refer to R objects instead of copying them back into Python objects. This strategy avoids frequent data conversions and improves speed. However, memory consumption remains a problem. [...] When we were implementing WebArray (Xia et al. 2005), an online platform for microarray data analysis, a job consumed roughly one quarter more computational time if running R through RPy instead of through R's command-line user interface. Therefore, we decided to run R in Python through pipes in subsequent developments, e.g., WebArrayDB (Xia et al. 2009), which retained the same performance as achieved when running R independently. We do not know the exact reason for such a difference in performance, but we noticed that RPy directly uses the shared library of R to run R scripts. In contrast, running R through pipes means running the R interpreter directly.
Memory:
R has been denounced for its uneconomical use of memory. The memory used by large- size R objects is rarely released after these objects are deleted. Sometimes the only way to release memory from R is to quit R. RPy module wraps R in a Python object. However, the R library will stay in memory even if the Python object is deleted. In other words, memory used by R cannot be released until the host Python script is terminated.
Portability:
As a module with extensions written in C, the RPy source package has to be compiled with a specific R version on POSIX (Portable Operating System Interface for Unix) systems, and the R must be compiled with the shared library enabled. Also, the binary distributions for Windows are bound to specic combinations of different versions of Python/R, so it is quite frequent that a user has difficulty in finding a distribution that ts the user's software environment.
From a developer's prospective, we used to use rpy/rpy2 to provide statistical and drawing functions to our Python-based application. It has caused huge problems in delivering our application because rpy/rpy2 needs to be compiled for specific combinations of Python and R, which makes it infeasible for us to provide binary distributions that work out of box unless we bundle R as well. Because rpy/rpy2 are not particularly easy to install, we ended up replacing relevant parts with native Python modules such as matplotlib. We would have switched to pyrserve if we had to use R because we could start a R server locally and connect to it without worrying about the version of R.
in pyper, i can't pass large matrix from python to r instance with assign(). however, i don't have issue with rpy2. it is just my experience.
pip installable
pre-alpha alternative called widediaper
available at github.com/endrebak/widediaper but it is still very alpha. Of course, it is so very simple that it just might work; it sure does for me. –
Pence © 2022 - 2024 — McMap. All rights reserved.