Should I prefer Hadoop or Condor when working with R?

I am looking for ways to distribute work across multiple computers on my university's computer grid.

The grid currently runs Condor and also offers Hadoop.

My question is thus: should I try to interface R with Hadoop or with Condor for my projects?

For the sake of discussion, let's assume we are talking about embarrassingly parallel tasks.

P.S. I've seen the resources described in the CRAN task views.

Miranda answered 4/11, 2010 at 10:21 Comment(4)
I doubt that Hadoop is running on top of Condor; Hadoop has its own file system (HDFS) and MapReduce framework. - Dumm
Thanks khmarbaise. I wasn't aware of the underlying system, so your comment is helpful. - Miranda
Hadoop does run on top of Condor. You can use Condor to match Hadoop workers to machines, which then start up and process your Hadoop workloads. Condor's scheduling system is far more powerful than anything Hadoop offers natively. See: hadoopblog.blogspot.com/2009/07/hadoop-and-condor.html - Waldon
Condor also has built-in support for HDFS as of the 7.4.x release: cs.wisc.edu/condor/manual/v7.4/3_13Setting_Up.html#33968 - Waldon

You can do both.

You can use HDFS for your data sets and Condor for your job scheduling: use Condor to place executors on machines, and HDFS plus Hadoop's MapReduce features to process your data (assuming your problem maps onto MapReduce). That way you're using the most appropriate tool for each job. Condor is a job scheduler, and as such does that work better than Hadoop. HDFS and the MapReduce framework are things Condor doesn't have, but they are really helpful for jobs running on Condor to use.

I would personally look at using HDFS to share data among jobs that run as discrete Condor jobs. Especially in a university environment, where shared compute resources are not 100% reliable and can come and go at will, Condor's resilience in this type of setup is going to make getting work done a whole lot easier.
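For the embarrassingly parallel case, running each task as its own Condor job could look roughly like the sketch below. It only generates a Condor submit description that queues N independent Rscript runs. The script name `simulate.R`, the output file names, and the Rscript path are placeholders I made up for illustration, and it assumes R is installed at the same path on every execute machine, so adjust it to your grid.

```python
def condor_submit_description(script, n_tasks, rscript="/usr/bin/Rscript"):
    """Build a Condor submit description that queues n_tasks independent R runs.

    Assumptions (hypothetical, adapt to your grid): Rscript lives at the same
    path on every execute machine, and the R script reads its task index from
    its first command-line argument.
    """
    lines = [
        "universe = vanilla",
        f"executable = {rscript}",
        # Rscript is already installed on the workers, so do not ship it
        "transfer_executable = false",
        # $(Process) expands to 0 .. n_tasks-1, giving each job its own index
        f"arguments = {script} $(Process)",
        # Ship the R script itself to the execute machine
        f"transfer_input_files = {script}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # One stdout/stderr file per task, plus a shared event log
        "output = task.$(Process).out",
        "error = task.$(Process).err",
        "log = tasks.log",
        f"queue {n_tasks}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a description for 100 independent tasks
    print(condor_submit_description("simulate.R", 100), end="")
```

You would save the output to a file and hand it to `condor_submit`. Inside the R script, `commandArgs(trailingOnly = TRUE)` gives you the task index, which you can use to pick a parameter set or a chunk of the input data.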

Waldon answered 3/12, 2010 at 17:7 Comment(0)
