mpi4py or multiprocessing in Python?
I am writing a machine learning toolkit that runs an algorithm with different settings in parallel (each process runs the algorithm for one setting). I am deciding between mpi4py and Python's built-in multiprocessing.

There are a few pros and cons I am considering.

  1. Ease of use:

    • mpi4py: there seem to be more concepts to learn and a few more tricks needed to make it work well
    • multiprocessing: quite an easy and clean API
  2. Speed:

    • mpi4py: people say it is lower level, so I expect it could be faster than Python multiprocessing?
    • multiprocessing: much slower than mpi4py?
  3. Clean and short code:

    • mpi4py: seems to require more code
    • multiprocessing: preferred, easy-to-use API

The working context is that I aim to run the code mainly on one computer or a GPU server. I am not really targeting multiple machines on a network (which only MPI can do).

Since the main goal is machine learning, the parallelization does not need to be optimal. What I want is a code base that is easy, clean, and quick to maintain, while still exploiting the benefits of parallelization.

Given this background, would multiprocessing be enough? Or is there a very strong reason to use mpi4py?
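For concreteness, the single-machine pattern described above (one process per setting) could look roughly like the sketch below with multiprocessing. The function `run_experiment`, the settings grid, and the placeholder score are hypothetical stand-ins for a real training run, not part of any existing toolkit.

```python
import multiprocessing as mp


def run_experiment(setting):
    """Hypothetical stand-in for one training run; returns (setting, score)."""
    learning_rate, depth = setting
    score = 1.0 / (1.0 + learning_rate * depth)  # placeholder metric, not a real model
    return setting, score


def run_all(settings, workers=None):
    # Pool.map sends each setting to a worker process and keeps the input order.
    with mp.Pool(processes=workers) as pool:
        return pool.map(run_experiment, settings)


if __name__ == "__main__":
    # Illustrative grid of (learning_rate, depth) settings.
    grid = [(lr, d) for lr in (0.01, 0.1) for d in (2, 4)]
    for setting, score in run_all(grid):
        print(setting, round(score, 4))
```

The worker function is defined at module level so it can be pickled and sent to the worker processes, and the `if __name__ == "__main__":` guard keeps the pool from being re-created when child processes import the module.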

Discommodity answered 10/6, 2018 at 19:38 Comment(0)

With mpi4py you can divide the task across multiple processes, but on a single computer with limited performance or few cores the benefit will be limited. However, you might still find it handy during training.

mpi4py is constructed on top of the MPI-1/2 specifications and provides an object oriented interface which closely follows MPI-2 C++ bindings.

MPI for Python provides MPI bindings for the Python language, allowing programmers to exploit multiple processor computing systems. MPI for Python supports convenient, pickle-based communication of generic Python objects as well as fast, near C-speed, direct array data communication of buffer-provider objects.

Pottery answered 20/9, 2019 at 10:49 Comment(1)
This only explains what mpi4py is. As I understand it, the question asks for a comparison between "mpi4py" and "python multiprocessing". Remediless
