Security issues with CRAN packages [closed]

[ Edit: June 2013 ] A paper has appeared on arXiv describing this issue in greater detail and suggesting some solutions: http://arxiv.org/abs/1303.4808. It will appear in the Journal of Statistical Software later in 2013.

I have a cron job on my Ubuntu servers that downloads and installs every source package from CRAN. However, on the same server I started to notice some irregular activity. It might be totally unrelated, but it got me thinking about whether some CRAN packages might contain malicious code.

The process of creating and publishing a CRAN package is extremely easy, maybe a little too easy. You upload your package to the FTP server, and Kurt runs a check and publishes it. With the volume of R packages being uploaded every day, it is reasonable to assume that no extensive auditing of each package takes place. Also, packages are not signed with a private key, as most distro packages are. Even the email address in the DESCRIPTION file is rarely verified.

Now it would not be very hard to include some code that installs a rootkit, either at compile time or at run time. Compile time is probably the more vulnerable stage, because I install my packages using sudo, which I should probably stop doing. But a lot can be done at run time as well. The Linux kernel has had several security vulnerabilities lately, and I have confirmed for myself that it can be extremely easy to obtain root via a privilege escalation exploit, even on a completely up-to-date system. As R usually has internet access, the malicious code does not even have to be included in the package; it can simply be downloaded from somewhere using wget or download.file().

That said, are R users considering this at all? Or is the philosophy mostly that you should only download packages from people you trust? Still, without signed packages that is not very reliable. What could be a safer approach to installing CRAN packages? I have considered something like using a separate machine for building packages and then copying the binaries, and always running R in a sandbox. That is a little cumbersome, though.
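
To make the signing concern concrete, here is a minimal sketch of one partial mitigation: checking a downloaded source tarball against a digest obtained from a source you already trust before installing it. This is not something CRAN provides, and a checksum is not a signature; it only detects tampering if the expected digest itself is trustworthy. The function names `sha256_of` and `tarball_matches` are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large tarballs are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_matches(path, expected_hex):
    """Return True only if the file's digest equals the trusted digest.

    `expected_hex` is assumed to come from a channel you trust
    (hypothetical here; the sketch does not fetch it for you).
    """
    actual = sha256_of(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A build script could call `tarball_matches(path, expected_hex)` and hand the file to the installer only when it returns `True`, ideally while running as an unprivileged user rather than under sudo.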

Ardath answered 14/1, 2012 at 21:13 Comment(13)
A first step would be to create a specific user, say "R", to build and install R and R packages: malicious code would not have direct root access (the escalation problems remain, of course).Friction
Vote to close: Not an R programming question, more an R policy question, therefore doesn't belong on Stack Overflow; also too discussion-oriented.Stopper
There is the question of how many R users would install the malicious package. I certainly wouldn't replace a long-standing and useful package like ggplot2 or lme4 with a package of new and unknown usefulness. The biggest risk I can see would be in specific areas, where packages don't already exist, and by definition there are only a few people working in that area. Wouldn't they all know each other, so a package created by someone they have never heard of would raise eyebrows?Ripon
It would be cute to create a malicious package that was spelled nearly the same as a popular package -- for example there is no longer a ggplot on CRAN, just ggplot2 ... although I guess you'd have to get Kurt Hornik to accept it ...Melanism
This is pretty nonsensical, Jeroen. Vincent is correct in pointing out that you can simply run this as a user rather than root (and, if you insist on "all" CRAN packages, you should probably use a chroot or virtual machine anyway; see my answer to your other question). And Spacedman also raises a valid point as this has nuttin' to do with R per se. You'd have the issue with CPAN, PyPI, and whatever other public repo you look at. So I vote to close too.Laryngeal
Hmz. I could move it to serverfault but there are not a lot of R users over there.Ardath
@Dirk I do think the question is somewhat specific to R, as CRAN is tightly integrated with R. It is hard to use R without touching CRAN. Also, I am pretty sure that creating an R user is not going to solve the problem. MySQL runs under user 'mysql', but if you run a buggy CMS with SQL injection vulnerabilities, you'll be hacked within an hour. I am going to look into chroot. Would this work at runtime or only during installation of packages?Ardath
Runtime. It's like a virtual machine, only simpler. The BSD variants have something called jail. This is really just up to you as your own sysadmin to set your machine up in a way that cannot harm you, and has nothing to do with R packages per se or how they are distributed. Remember, you decided to set up a cron job to install (semi-)random code.Laryngeal
Isn't there also a problem in installing every R package? Seems like it would be just as bad to install every package for Debian or Red Hat. Why install that which you don't need, don't use, and don't monitor?Haskins
I do need everything :-) It's for my opencpu framework.Ardath
@Jeroen I would recommend moving to Serverfault. There are reasonable R administration topics to be discussed. It doesn't hurt to pop into the SO R chatroom when you've posted an R question outside of SO & CV, to let folks know to look over at SF or another site.Haskins
As for this question, I wouldn't say that this is the biggest risk, not least because code served on CRAN is associated with a maintainer/author. Other risks in production environments include locking down shell access, locking down ports/servers, and a number of other things.Haskins
@Jeroen, are you asking about Linux only, (missing tags and title), or all OSes? R installs on Windows are often painful and sometimes it's easier to install as administrator; that's in theory a security hole.Sox
