Python, multithreading too slow, multiprocess
I'm a multiprocessing newbie. I know something about threading, but I need to increase the speed of this calculation, hopefully with multiprocessing:

Example description: send a string to a thread, alter the string (plus a benchmark test), and send the result back for printing.

from threading import Thread

class Alter(Thread):
    def __init__(self, word):
        Thread.__init__(self)
        self.word = word
        self.word2 = ''

    def run(self):
        # Alter string + test processing speed
        for i in range(80000):
            self.word2 = self.word2 + self.word

# Send a string to be altered
thread1 = Alter('foo')
thread2 = Alter('bar')
thread1.start()
thread2.start()

#wait for both to finish
while thread1.is_alive() == True: pass
while thread2.is_alive() == True: pass


print(thread1.word2)
print(thread2.word2)

This currently takes about 6 seconds and I need it to go faster.
I have been looking into multiprocessing and cannot find an equivalent to the above code. I think what I'm after is pooling, but the examples I've found have been hard to understand. I would like to take advantage of all cores (8 cores, via multiprocessing.cpu_count()), but I really just have scraps of useful information on multiprocessing and not enough to duplicate the above code. If anyone can point me in the right direction, or better yet provide an example, that would be greatly appreciated. Python 3, please.

Molybdenite answered 8/1, 2012 at 2:44 Comment(6)
Don't busy-wait for the thread to complete. Use Thread.join()! — Epileptic
Why not? I have done this in most of my coding, and if you can provide a good reason, I will change it all :) — Molybdenite
Well, it's at least as good as busy-waiting, and it probably waits passively until the thread is terminated without eating the CPU (although I can't find it in the docs, I'd wager CPython doesn't busy-wait in its Thread.join()). — Epileptic
**This probably depends on the platform anyway. — Epileptic
OK, that makes sense; I think I'll use join() moving forward. — Molybdenite
@Rhys: Busy-waiting like that will hold the GIL and prevent your worker threads from running. Also, if you don't have enough CPUs, then a CPU that could be in your worker thread doing useful work, handling an OS interrupt, or dealing with some other process is instead generating heat. — Autacoid
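Applying the comments' suggestion, the busy-wait loops in the question's code can be replaced with join(), which blocks until each thread finishes without spinning the CPU:

```python
from threading import Thread

class Alter(Thread):
    def __init__(self, word):
        Thread.__init__(self)
        self.word = word
        self.word2 = ''

    def run(self):
        # Alter string + test processing speed
        for i in range(80000):
            self.word2 = self.word2 + self.word

thread1 = Alter('foo')
thread2 = Alter('bar')
thread1.start()
thread2.start()

# join() blocks until the thread terminates -- no busy-wait, no GIL hogging
thread1.join()
thread2.join()

print(len(thread1.word2))
print(len(thread2.word2))
```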

Just replace threading with multiprocessing and Thread with Process. Threads in Python are (almost) never used to gain performance because of the big bad GIL! I explained it in another SO post, with some links to documentation and a great talk about threading in Python.

But the multiprocessing module is intentionally very similar to the threading module. You can almost use it as a drop-in replacement!

The multiprocessing module doesn't, AFAIK, offer functionality to enforce the use of a specific number of cores; it relies on the OS implementation. You could use the Pool object and limit the worker processes to the core count. Or you could look at another MPI library such as pypar. Under Linux you could use a pipe under the shell to start multiple instances on different cores.

Year answered 8/1, 2012 at 2:55 Comment(4)
A good read on how Python handles multiprocessing vs threading on multicore is here. — Edwards
@Don, yes! It seems to work. I must just check how much faster it's running. One thing, though: the code above does not specify the number of cores used ... would this be easy to include? — Molybdenite
Hi @Don, what do you mean by "big bad GIL"? I'm a Python newbie. — Photofluorography
@mrroy: GIL = Global Interpreter Lock; basically you have "real" threads (hardware/OS supported) but you don't gain performance, and may even lose some. Threading in CPython (2.x) is meant for concurrent I/O operations. — Year

© 2022 - 2024 — McMap. All rights reserved.