Using multicore in R on a Pentium 4 HT machine

I am using a Pentium 4 HT machine at the office to run R. Some of my code uses the plyr package, and I usually have to wait 6-7 minutes for the script to finish, while my processor appears to be only half utilized.

I have heard that the multicore package in R makes better use of multicore processors. Is it suitable for my case?

Thanks!

Tumultuous answered 23/8, 2010 at 7:40 Comment(2)
You've accepted a solution, and I'm curious: how are you going to implement this? - Titled
I accepted it because it cleared up some of my misunderstanding. I don't think there's anything I can implement here, thanks. - Tumultuous

There are a bunch of packages out there for parallel computing in R. See doMPI, doSNOW, doMC and doSMP. They are all front ends for other programs that do the actual parallelization (MPI/OpenMPI, the multicore package, ...). On Windows, I've had good experience with doSMP, while on Linux doMC looks promising (some Windows support is emerging, but some people have doubts about the emulation of "fork").

That being said, I concur with Vince's comments about the need to rewrite plyr functions to use the power of parallel computing. You could write your own function that emulates plyr (or edit plyr) and uses %dopar% (see the foreach package as well).

Two "CPU usage history" windows could mean two cores or multi-threading. For instance, I have an i7-920 processor with 4 cores, but I see 8 history windows, because each core is multi-threaded.

Excuse my vocabulary and/or logic, but I would be that fish in Vince's post when it comes to this sort of thing.

Veliger answered 23/8, 2010 at 9:16 Comment(5)
Upvote for knowing about hardware... but would that make you the fish? I'm lost in my own analogy :-) - Titled
Thanks. I think mine is the 'multi-threading' case, so my case is a "NO GO" for multi-core stuff, right? - Tumultuous
@Tumultuous No. Intel's "hyper-threading" is about emulating two cores with one (the core does the second task while the first is blocked by something), so basically it is just multi-core (light, but still). Multi-threading means that your program runs in a few processes that the OS can then distribute among cores to run concurrently. - Seaborg
@Tumultuous - YES, at least in my opinion. I just edited my post to explain why. If you factor in what I've talked about and it's still worth it, go for it! (Or if you just want to learn!) - Titled
Er, that's yes to the "NO GO" part. - Titled

This may sound like a silly question, but does your processor have more than one core? It was my understanding that P4s didn't, but I know as much about hardware as a fish does about astrophysics.

When you say your "processor is only half utilized", do you mean that you are monitoring two cores and only one is being used, or that a single core is half used? If it's the latter, your application is probably memory-bound (and probably hitting swap space), not CPU-bound, so parallelization won't help.

Also, it doesn't look like the plyr package uses the multicore package, so you would have to explicitly rewrite parts of plyr to get parallelization. But, if parts of plyr were embarrassingly parallel, I bet they'd already be parallelized.

So I don't think your problem is CPU-bound; I think it's memory-bound (and hitting swap). Monitor your memory, and maybe move the job to a machine with more memory.

Hope this helps!

Edit:

@Vince As I wrote on romunov's answer; HT core will execute 2 processes faster than one (yet slower than 2 cores), so it is worth making parallel. Also even memory bound process will also take 100% of core. (my emphasis)

Worth making parallel? There's much more that goes into that equation. Countless times when exploring Python's multiprocessing and threading modules I've rewritten entire programs - even "easily parallelizable" ones - and they've run slower. Why? There are fixed costs to opening new threads, processes, shuffling data around to different processes, etc. It's just not that simple; parallelization, in my experience, has never been the magic bullet it's being talked about here. I think these answers are misleading.
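
The fixed-cost point can be sketched with Python's standard `multiprocessing` module. This is only an illustration under assumptions: the function names (`square`, `run_serial`, `run_parallel`, `timed`), the workload size and the worker count are all made up here, and the timings depend entirely on the machine, so none are shown. On a task this small, starting the workers and shipping data between processes will often cost more than the work itself.

```python
import time
from multiprocessing import Pool


def square(x):
    # Deliberately tiny task (hypothetical workload): far less work per item
    # than the cost of sending the item to another process.
    return x * x


def run_serial(values):
    # Plain single-process version.
    return [square(v) for v in values]


def run_parallel(values, workers=2):
    # Starting the pool and pickling arguments/results between processes
    # are the fixed costs discussed above.
    with Pool(processes=workers) as pool:
        return pool.map(square, values)


def timed(fn, *args):
    # Return the function's result together with its wall clock time in seconds.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    values = list(range(10_000))
    serial_result, serial_secs = timed(run_serial, values)
    parallel_result, parallel_secs = timed(run_parallel, values)
    assert serial_result == parallel_result
    print(f"serial:   {serial_secs:.4f} s")
    print(f"parallel: {parallel_secs:.4f} s")
```

Whether the parallel version wins depends on how much work each item carries relative to that overhead, which is why it is worth measuring before rewriting anything.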

First off, we're talking about parallelizing a task that takes "6-7 minutes". Unless the OP knows his/her data is going to grow a lot, parallelization isn't even worth the wall clock time it takes to program. In the time it takes to implement the parallel version, perhaps he/she could have done 100 non-parallel runs. In my work environment, that wall clock time matters. These calculations need to be factored into the runtime equation (unless you're doing it for learning/fun).
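
That trade-off can be written down as a rough break-even calculation. The sketch below is in Python, and every number in it is an assumption for illustration: a 6-minute serial run (the low end of the OP's "6-7 minutes"), a hypothetical 1.5x speedup, and a hypothetical 4 hours spent writing the parallel version. The helper name `break_even_runs` is made up as well.

```python
import math


def break_even_runs(serial_minutes, speedup, dev_minutes):
    """Runs needed before the time saved repays the time spent programming."""
    if speedup <= 1.0:
        # No speedup (or a slowdown) never pays back the programming time.
        return math.inf
    saved_per_run = serial_minutes - serial_minutes / speedup
    return math.ceil(dev_minutes / saved_per_run)


# Assumed inputs: 6-minute run, 1.5x speedup, 240 minutes of programming.
print(break_even_runs(6, 1.5, 240))  # prints 120
```

With these made-up inputs the parallel version only pays off after 120 runs. A smaller speedup or a longer rewrite pushes that number up quickly.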

Second, if it is hitting swap space, the largest slowdown isn't the CPU, it's disk I/O. Even if there were an easy way to shuffle around plyr code to get some parts parallelized (which I doubt), doing so on an I/O-bound process would speed things up trivially compared to adding more memory.

As an example, I once ran a command from the reshape package that demonstrated this exact behavior. It was on a multicore OS X machine with 4GB of memory, and in seconds it was crawling (well, my whole computer was crawling!) with 60-70% CPU across two cores, and all 4GB of memory used. I let it run as an experiment for an hour, then killed R and saw my memory jump back to 3GB free. I shuffled it to a 512GB RAM server (yes, we are lucky enough to have that), and it finished in 7 minutes. No amount of core usage changed.

Titled answered 23/8, 2010 at 8:9 Comment(5)
I used Windows XP's Task Manager to monitor the CPU, and I can see two charts recording CPU usage, so I guess that means it has two cores. Maybe I'm asking a stupid question. Thanks again. - Tumultuous
Multicore should look like this: ixbt.com/cpu/images/intel-pentium-xe-955/taskman.gif - Titled
I have 2 windows for the "CPU usage history", but the curves are not as extreme as yours, with one at the top and 3 at the bottom; both of mine are more or less in the middle. - Tumultuous
And memory usage? I bet it's that :-) - Titled
As I wrote on romunov's answer; HT core will execute 2 processes faster than one (yet slower than 2 cores), so it is worth making parallel. Also even memory bound process will also take 100% of core. - Seaborg