The recent questions regarding the use of require versus :: raised the question of which programming styles people use when programming in R, and what their advantages and disadvantages are. Browsing through source code or around the net, you see a lot of different styles on display.
The main trends in my code:
Heavy vectorization: I play a lot with the indices (and nested indices), which sometimes results in rather obscure code but is generally a lot faster than other solutions. E.g.:
x[x < 5] <- 0
instead of
x <- ifelse(x < 5, 0, x)
I tend to nest functions to avoid overloading the memory with temporary objects that I need to clean up. Especially with functions manipulating large datasets, this can be a real burden. E.g.:
y <- cbind(x, as.numeric(factor(x)))
instead of
y <- as.numeric(factor(x)) ; z <- cbind(x, y)
I write a lot of custom functions, even if I use the code only once, e.g. in an sapply. I believe it keeps the code more readable without creating objects that can remain lying around.
I avoid loops at all costs, as I consider vectorization to be a lot cleaner (and faster).
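To make the comparison concrete, here is a rough sketch of the habits listed above, using base R only. The sample data and the names a, b, nested, stepwise, groups, and ranges are made up for illustration and are not taken from real code. It only checks that each pair of styles gives the same result; it says nothing about speed.

```r
# Hypothetical sample data, chosen only for illustration
x <- c(1, 7, 3, 9, 5)

# 1. Indexed assignment versus ifelse().
#    Note the argument order: ifelse(test, yes, no).
a <- x
a[a < 5] <- 0
b <- ifelse(x < 5, 0, x)
identical(a, b)

# 2. Nested call versus a temporary object that must be removed by hand.
nested <- cbind(x, as.numeric(factor(x)))
tmp <- as.numeric(factor(x))
stepwise <- cbind(x, tmp)
rm(tmp)
# Column names differ ("" versus "tmp"), so compare the values only.
identical(unname(nested), unname(stepwise))

# 3. One-off anonymous function in sapply() versus an explicit loop.
groups <- list(g1 = c(1, 2, 3), g2 = c(4, 5), g3 = 6)
ranges <- sapply(groups, function(v) max(v) - min(v))

ranges_loop <- numeric(length(groups))
names(ranges_loop) <- names(groups)
for (i in seq_along(groups)) {
  ranges_loop[i] <- max(groups[[i]]) - min(groups[[i]])
}
identical(ranges, ranges_loop)
```

The loop version needs a preallocated result and an index variable that both stay in the workspace afterwards, which is the clutter the anonymous function avoids.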
Yet, I've noticed that opinions on this differ, and some people tend to back away from what they would call my "Perl" way of programming (or even "Lisp", with all those brackets flying around in my code. I wouldn't go that far though).
What do you consider good coding practice in R?
What is your programming style, and how do you see its advantages and disadvantages?
x[x < 5] <- 0), esp. on grouped data, I'd lean towards data.table's := operator. Is your priority fast code, dense compact code, or legibility at slight performance penalty? Also, please show some examples of your custom functions so people can comment. – Titania