Create a cluster of co-workers' Windows 7 PCs for parallel processing in R?

I am running the termstrc yield curve analysis package in R across 10 years of daily bond price data for 5 different countries. This is highly compute-intensive: it takes 3,200 seconds per country with a standard lapply, and if I use foreach and %dopar% (with doSNOW) on my 2009 i7 Mac, using all 4 cores (8 with hyperthreading), I get this down to 850 seconds. I need to re-run this analysis every time I add a country (to compute inter-country spreads), and I have 19 countries to go, with many more credit yield curves to come in the future, so the time taken is starting to look like a major issue. By the way, the termstrc analysis function in question is called from R but is written in C.
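For concreteness, my current setup looks roughly like the sketch below. It is simplified: fit_country_day() is a made-up stand-in for the real termstrc call, and daily_prices is dummy data.

    library(doSNOW)   # also attaches snow and foreach

    # Hypothetical stand-in for the termstrc-based fit of one day's bond prices;
    # the real function takes ~1.2 s per day.
    fit_country_day <- function(day_prices) { Sys.sleep(1.2); length(day_prices) }

    daily_prices <- replicate(20, rnorm(50), simplify = FALSE)  # dummy data

    cl <- makeCluster(8, type = "SOCK")  # 8 local workers on the i7
    registerDoSNOW(cl)
    results <- foreach(day = daily_prices) %dopar% fit_country_day(day)
    stopCluster(cl)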

Now, we're a small company of 12 people (read: limited budget), all equipped with 8 GB RAM, i7 PCs, at least half of which are used for mundane word processing / email / browsing tasks, that is, at most 5% of their capacity. They are all networked over gigabit (but not 10-gigabit) Ethernet.

Could I cluster some of these under-used PCs using MPI and run my R analysis across them? Would the network be affected? Each iteration of the yield curve analysis function takes about 1.2 seconds, so I'm assuming that if the granularity of the parallel processing is to pass a whole function iteration to each cluster node, 1.2 seconds should be quite large compared with the Gigabit Ethernet latency.
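To make the "how" part concrete, is something along these lines workable? Just a sketch: the hostnames are invented, and I've switched to parallel/doParallel here only because makePSOCKcluster has a manual = TRUE option that, as far as I understand, sidesteps the lack of ssh on the Windows 7 boxes. It reuses fit_country_day() and daily_prices from the sketch above.

    library(doParallel)   # attaches parallel and foreach

    # Invented co-worker hostnames, 4 workers per machine so their PCs keep some headroom.
    hosts <- rep(c("pc-alice", "pc-bob", "pc-carol"), each = 4)

    # manual = TRUE prints the command to start each worker by hand on the remote
    # PCs, so no ssh/rsh daemon is needed on Windows 7.
    cl <- makePSOCKcluster(hosts, manual = TRUE, port = 11234)
    registerDoParallel(cl)
    results <- foreach(day = daily_prices) %dopar% fit_country_day(day)
    stopCluster(cl)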

Can this be done? How? And what would the impact be on my co-workers? Can they continue to read their emails while I'm taxing their machines?

I note that Open MPI seems not to support Windows anymore, while MPICH seems to. Which would you use, if any?

Perhaps run an Ubuntu virtual machine on each PC?

Evangelicalism asked 23/2, 2013 at 9:29 Comment(3)
Virtual machines are notorious memory hogs, not to mention that they are essentially just a layer on top of another layer (think of the I/O flow-through). Your coworkers won't thank you when they notice that 50% of their memory is being carved out for something that you couldn't possibly use efficiently, even if all they are doing is Word/email. Even Chrome can get up to 2 GB nowadays on 64-bit systems if you open enough windows.Suzannasuzanne
Gotcha - though I doubt they would even notice, to be honest. It just seems a waste to see 99% of CPU cycles idling when I have good use for them! BTW, VMware Fusion on my Mac exacts about a 25% performance penalty versus "native" R (that is, running the same routine on Win 64 in a VM, with 4 processors and 8 of 16 GB RAM assigned), so it's not that bad, though I agree on the RAM.Evangelicalism
Did you find a working answer to your question? I'm working on the same problem here.Nellnella

Yes, you can. There are a number of ways. One of the easiest is to use Redis as a backend (as easy as calling sudo apt-get install redis-server on an Ubuntu machine; rumor has it that you can run a Redis backend on a Windows machine too).

By using the doRedis package, you can very easily enqueue jobs on a task queue in Redis, and then use one, two, ... idle workers to query the queue. Best of all, you can easily mix operating systems, so yes, your co-workers' Windows machines qualify. Moreover, you can use one, two, three, ... workers as you see fit, and scale up or down as needed. The queue does not know or care; it simply supplies jobs.

On top of that, the vignette in the doRedis package has working examples with a mix of Linux and Windows clients to make a bootstrapping example go faster.
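A minimal sketch of the pattern, assuming a reachable Redis server (the queue name "jobs" and the host name "redis-box" are placeholders; your real job would be the yield curve fit rather than this toy expression):

    ## On the master (your machine):
    library(doRedis)
    registerDoRedis(queue = "jobs", host = "redis-box")
    results <- foreach(i = 1:100) %dopar% sqrt(i)   # toy job; each iteration becomes a task in Redis
    removeQueue("jobs")

    ## On each co-worker's PC (any OS with R, doRedis and network access to Redis):
    library(doRedis)
    redisWorker(queue = "jobs", host = "redis-box")  # blocks and processes jobs from the queue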

Infract answered 23/2, 2013 at 19:40 Comment(3)
This looks very interesting. Indeed, I googled around on Redis and found that it's probably going to solve another problem that I have, namely sharing large amounts of time-series data amongst many computers (please tell me if I'm misguided here). On the original question: will I be able, using doRedis, to ensure that the R instances on the other PCs don't hog all of their CPU resources? Can I, for example, limit them to 4 of the 8 cores? I ask because if I give doSNOW all 8 cores on my Mac or PC, nothing else runs acceptably anymore, despite the multitasking OS.Evangelicalism
Yes, each client should be able to control its own limits; see the sketch after these comments.Infract
I will add that I have been happily using doRedis since you answered the question (so for about a year), and it works very well indeed (though sometimes I have to manually shut down the R sessions that it creates on the co-worker machines once the jobs are over).Evangelicalism
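For instance, to give a doRedis worker machine only half of its cores, you can start just that many worker processes on it (a sketch; the queue and host names are the same placeholders as above):

    library(doRedis)
    # Start 4 worker processes on this 8-core PC; the other 4 cores stay free
    # for the machine's owner.
    startLocalWorkers(n = 4, queue = "jobs", host = "redis-box")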

Perhaps not the answer you were looking for, but this is one of those situations where an alternative is so much better that it's hard to ignore.

The cost of AWS clusters is ridiculously low for exactly these types of computing problems. You pay only for what you use. I can guarantee you that you will save money (at the very least in opportunity costs) by not spending the time trying to convert 12 Windows machines into a cluster. For your purposes, you could probably even do this for free (IIRC, they still offer free computing time on clusters).

References:

Some of these instances are so powerful that you probably wouldn't even need to figure out how to set up your work on a cluster (given your current description). As you can see from the references, costs are ridiculously low, ranging from $1 to $4 per hour of compute time.

Suzannasuzanne answered 23/2, 2013 at 10:38 Comment(2)
Wow - hadn't even thought about the cloud. Okay - I'll give this a shot. At the kind of price points that you're talking about, it would indeed be interesting.Evangelicalism
Having thought about this: because a large part of my work involves parameterizing the function and re-running it, it is quite possible to do 5 hours of work a day on this even on a big cloud-based parallel installation. Let's say $2.50 per hour × 5 hours = $12.50 per day, 20 days per month: we're talking $250 per month. I wouldn't describe that as "ridiculously" low, though I guess if I'm getting tons of computing power for it, it will indeed be cost-effective.Evangelicalism

What about OpenCL?

This would require rewriting the C code, but would allow potentially large speedups. The GPU has immense computing power.

Noguchi answered 17/6, 2013 at 22:59 Comment(2)
I would dearly love to use OpenCL. I am back to taking 2 hours per country for optimization, using 5 × 4-core computers clustered with doRedis. Don't get me wrong, doRedis is great, as it would otherwise take over 9 hours, but it seems to me that massive teraflops of computing horsepower are being left idle. I think I would need the uniroot function to use OpenCL. What are the ways of using OpenCL in R without being an in-depth C programmer, anyway?Evangelicalism
I don't know, sorry. I have never used OpenCL - just heard about it. What you could do is look for which parts of the algorithm are the biggest computing hogs (by profiling), and see if there are GPU-accelerated libraries available for any of them.Noguchi
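A minimal sketch of that profiling step with base R's Rprof, assuming a hypothetical fit_country_day() wrapper around the termstrc call (time spent inside its C code will show up against the R function that calls it):

    Rprof("curve_profile.out")                        # start the sampling profiler
    results <- lapply(daily_prices, fit_country_day)  # one country's run
    Rprof(NULL)                                       # stop profiling

    # "by.self" ranks functions by the time spent in their own code,
    # which points at the biggest computing hogs.
    summaryRprof("curve_profile.out")$by.self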
