Possible to share in-memory data between 2 separate processes?
I have an xmlrpc server using Twisted. The server has a huge amount of data stored in-memory. Is it possible to have a secondary, separate xmlrpc server running which can access the object in-memory in the first server?

So, serverA starts up and creates an object. serverB starts up and can read from the object in serverA.

* EDIT *

The data to be shared is a list of 1 million tuples.

Java answered 12/8, 2009 at 19:33 Comment(0)
141

Without some deep and dark rewriting of the Python core runtime (to allow forcing of an allocator that uses a given segment of shared memory and ensures compatible addresses between disparate processes) there is no way to "share objects in memory" in any general sense. That list will hold a million addresses of tuples, each tuple made up of the addresses of all of its items, and each of these addresses will have been assigned by pymalloc in a way that inevitably varies among processes and spreads all over the heap.

On just about every system except Windows, it's possible to spawn a subprocess that has essentially read-only access to objects in the parent process's space... as long as the parent process doesn't alter those objects, either. That's obtained with a call to os.fork(), which in practice "snapshots" all of the memory space of the current process and starts another simultaneous process on the copy/snapshot. On all modern operating systems, this is actually very fast thanks to a "copy on write" approach: the pages of virtual memory that are not altered by either process after the fork are not really copied (access to the same pages is instead shared); as soon as either process modifies any bit in a previously shared page, poof, that page is copied, and the page table modified, so the modifying process now has its own copy while the other process still sees the original one.

This extremely limited form of sharing can still be a lifesaver in some cases (although it's extremely limited: remember for example that adding a reference to a shared object counts as "altering" that object, due to reference counts, and so will force a page copy!)... except on Windows, of course, where it's not available. With this single exception (which I don't think will cover your use case), sharing of object graphs that include references/pointers to other objects is basically unfeasible -- and just about any set of objects of interest in modern languages (including Python) falls under this classification.
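
For illustration, here is a minimal sketch of that fork-based approach (POSIX only; and note the caveat above: even merely reading the list from the child bumps reference counts and forces some pages to be copied):

import os

big = [(i, i * 2) for i in range(1_000_000)]  # built before the fork

pid = os.fork()
if pid == 0:
    # child: sees the parent's list through copy-on-write pages
    print(big[123456], flush=True)
    os._exit(0)
os.waitpid(pid, 0)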

In extreme (but sufficiently simple) cases one can obtain sharing by renouncing the native memory representation of such object graphs. For example, a list of a million tuples each with sixteen floats could actually be represented as a single block of 128 MB of shared memory -- all the 16M floats in double-precision IEEE representation laid end to end -- with a little shim on top to "make it look like" you're addressing things in the normal way (and, of course, the not-so-little-after-all shim would also have to take care of the extremely hairy inter-process synchronization problems that are certain to arise;-). It only gets hairier and more complicated from there.
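
As a concrete sketch of that flat-block idea (the /dev/shm path and sizes are illustrative and Linux-only, and the hairy synchronization is deliberately left out):

import mmap, os, struct

ROWS, COLS = 1_000_000, 16
ROW_BYTES = COLS * 8                      # 16 IEEE doubles per "tuple"
PATH = "/dev/shm/tuples.bin"              # RAM-backed file on Linux

class FlatTupleList:
    """Shim that makes one flat shared block look like a list of tuples."""
    def __init__(self, path, create=False):
        flags = os.O_RDWR | (os.O_CREAT if create else 0)
        fd = os.open(path, flags, 0o600)
        if create:
            os.ftruncate(fd, ROWS * ROW_BYTES)
        self.buf = mmap.mmap(fd, ROWS * ROW_BYTES)

    def __getitem__(self, i):             # lst[i] -> a 16-float tuple
        return struct.unpack_from("%dd" % COLS, self.buf, i * ROW_BYTES)

    def __setitem__(self, i, values):
        struct.pack_into("%dd" % COLS, self.buf, i * ROW_BYTES, *values)

One process constructs FlatTupleList(PATH, create=True) and fills it; any other process maps the same file and reads the "tuples" with plain indexing.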

Modern approaches to concurrency increasingly disdain shared-everything approaches in favor of shared-nothing ones, where tasks communicate by message passing (even in multi-core systems using threading and shared address spaces, people are being pushed away by the synchronization issues, and by the performance hits the HW incurs in terms of caching, pipeline stalls, etc., when large areas of memory are actively modified by multiple cores at once).

For example, the multiprocessing module in Python's standard library relies mostly on pickling and sending objects back and forth, not on sharing memory (surely not in an R/W way!-).

I realize this is not welcome news to the OP, but if he does need to put multiple processors to work, he'd better think in terms of having anything they must share reside in places where it can be accessed and modified by message passing -- a database, a memcache cluster, a dedicated process that does nothing but keep those data in memory and send and receive them on request, and other such message-passing-centric architectures.
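
As a minimal sketch of that last architecture, using only the standard library (the address and authkey are illustrative): a dedicated process owns the list and serves rows to any other process, related or not, by message passing via multiprocessing.managers.BaseManager.

# server process: owns the data and answers requests for it
from multiprocessing.managers import BaseManager

class Store:
    def __init__(self):
        self.rows = [(i, i * 2) for i in range(1_000_000)]
    def get(self, i):                    # results travel back by pickling
        return self.rows[i]

store = Store()

class DataManager(BaseManager):
    pass

DataManager.register("get_store", callable=lambda: store)
mgr = DataManager(address=("127.0.0.1", 50000), authkey=b"secret")
mgr.get_server().serve_forever()

# client process (an unrelated one is fine):
#   class DataManager(BaseManager): pass
#   DataManager.register("get_store")
#   mgr = DataManager(address=("127.0.0.1", 50000), authkey=b"secret")
#   mgr.connect()
#   print(mgr.get_store().get(123))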

Grandmother answered 12/8, 2009 at 22:20 Comment(8)
What an insightful answer! Thx Alex. But there are scenarios that require sharing data among processes, mainly to maximize access speed while minimizing memory consumption. This post suggests multiprocessing.Array. #10722415Saari
Alex why do you say "(which I don't think will cover your use case)"? Is it only because of reference counting?Estelaestele
@max, not just that, also because of read-only access (which also means "no new references"), and no-Windows constraints.Grandmother
There's no way to disable reference counting for selected python objects? After all, I don't really need them garbage collected if they never get copied to the subprocess.Estelaestele
@max, correct -- in CPython, reference counting is always-on (in other implementations like Jython, you'd be struggling with turning off even more complex forms of garbage collection).Grandmother
@AlexMartelli Say we have to process 5 TB of data: one process is dedicated to loading the data from a memmap into RAM, another process handles the computing. When the data block is 5 GB, the computing time is similar to the I/O time, so loading and processing the data in 5 GB blocks is fastest: less data makes the computing process wait for data, more data makes the I/O process wait for the computing. Say the computing needs about 3 times the data size, i.e. 15 GB. If we use a ring buffer to transfer the data we would need about 20-25 GB in total.Ladle
@AlexMartelli If we use tmpfs, it would certainly exceed the system RAM. If we use a database it is even worse: more data-transfer time, queue time, or RAM consumption. Depending on GC is worse still, because the GC can cause hiccups.Ladle
In many other questions about Python multiprocessing you will find message passing mentioned as a main bottleneck. I guess: avoid Python multiprocessing if possible; otherwise, do some hacking, like finding a way to represent shared objects as a multiprocessing.Array.Zoroastrianism
18
mmap.mmap(0, 65536, 'GlobalSharedMemory')

I think the tag ("GlobalSharedMemory") must be the same for all processes wishing to share the same memory.

http://docs.python.org/library/mmap.html
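
A hedged sketch of how two unrelated processes might use that call (Windows only; as the comments note, the tagname argument is not available on Linux):

import mmap

# process A (run first, keep it alive):
buf = mmap.mmap(0, 65536, "GlobalSharedMemory")
buf.write(b"hello from A\0")

# process B (run while A still holds the mapping):
buf = mmap.mmap(0, 65536, "GlobalSharedMemory")
print(buf.read(64).split(b"\0")[0])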

Ajax answered 12/8, 2009 at 19:36 Comment(5)
This may depend on the OS. It works on Windows; just tried it between two processes.Ajax
Sharing objects, versus memory, is harder. Perhaps the multiprocessing module can offer some insights into cross-process data sharingAjax
I would imagine you'd have to pickle (or similarly encode) anything you wanted to share.Travel
But unpickling will produce a copy of the object in your own memory.... The OP will have to let us know what types of objects are to be shared!Ajax
This ONLY works on Windows; tagname is not available on Linux.Zulemazullo
12

There are a couple[1] of third-party libraries available for low-level shared-memory manipulation in Python:

  • sysv_ipc: for non-POSIX-compliant systems
  • posix_ipc: works in Windows with Cygwin

Both are available via pip.

[1] Another package, shm, is available but deprecated. See this page for a comparison of the libraries.

Example Code for C to Python Communication c/o Martin O'Hanlon:

shmwriter.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(int argc, const char **argv)
{
   int shmid;
   // give your shared memory an id, anything will do
   key_t key = 123456;
   char *shared_memory;

   // Set up shared memory; 12 is the size ("Hello World" plus its NUL)
   if ((shmid = shmget(key, 12, IPC_CREAT | 0666)) < 0)
   {
      printf("Error getting shared memory id");
      exit(1);
   }
   // Attach shared memory
   if ((shared_memory = shmat(shmid, NULL, 0)) == (char *) -1)
   {
      printf("Error attaching shared memory id");
      exit(1);
   }
   // copy "Hello World" (including its terminating NUL) to shared memory
   memcpy(shared_memory, "Hello World", sizeof("Hello World"));
   // sleep so there is enough time to run the reader!
   sleep(10);
   // Detach and remove shared memory (shmdt takes the address, not the id)
   shmdt(shared_memory);
   shmctl(shmid, IPC_RMID, NULL);
   return 0;
}

shmreader.py

import sysv_ipc

# Attach to the existing shared memory segment by its key
memory = sysv_ipc.SharedMemory(123456)

# Read its raw bytes
memory_value = memory.read()

# Find the 'end' of the string and strip
i = memory_value.find(b'\0')
if i != -1:
    memory_value = memory_value[:i]

print(memory_value.decode())
Lilylivered answered 6/2, 2015 at 23:1 Comment(4)
Of course you still have to represent the object in shared memory somehow.Lilylivered
Is there a way to use a name instead of integer key for the shared memory here?Gratia
@Gratia "key must be None, IPC_PRIVATE or an integer > 0 and ≤ KEY_MAX. If the key is None, the module chooses a random unused key." - semanchuk.com/philip/sysv_ipcLilylivered
But you could create a function that deterministically maps strings to integers. As long as an equivalent function is available in both places you need to access the SharedMemory, you can use it to generate your key from a string.Lilylivered
11

You could use multiprocessing.shared_memory, new in Python 3.8.

https://docs.python.org/3.8/library/multiprocessing.shared_memory.html#module-multiprocessing.shared_memory
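
A minimal sketch, assuming both processes agree on the segment name (but see the comments below: on current CPython the resource tracker of an attaching process may unlink the segment when that process exits, which bites independent processes):

from multiprocessing import shared_memory

# process A: create a named segment and write into it
shm = shared_memory.SharedMemory(name="demo_shm", create=True, size=1024)
shm.buf[:5] = b"hello"

# process B: attach to the same segment by name
#   shm = shared_memory.SharedMemory(name="demo_shm")
#   print(bytes(shm.buf[:5]))
#   shm.close()

# the creator cleans up when everyone is done:
#   shm.close(); shm.unlink()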

Hierarchize answered 5/11, 2019 at 7:30 Comment(3)
It's not possible with non-related processes.Anaesthesia
Not working for independent processes. Why so many upvotes?Prevailing
Not working for independent processes. Why so many upvotes? +1Headrail
4

You could write a C library to create and manipulate shared-memory arrays for your specific purpose, and then use ctypes to access them from Python.

Or, put them on the filesystem in /dev/shm (which is tmpfs). You'd save a lot of development effort for very little performance overhead: reads/writes from a tmpfs filesystem are little more than a memcpy.
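
A sketch combining the two suggestions without compiling any C: a ctypes array viewed directly over a /dev/shm-backed mmap (the path and sizes are illustrative):

import ctypes, mmap, os

N = 1_000_000 * 16                            # a million 16-double rows, flattened
SIZE = N * ctypes.sizeof(ctypes.c_double)     # 128 MB
PATH = "/dev/shm/shared_doubles"              # tmpfs: lives in RAM, not on disk

fd = os.open(PATH, os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, SIZE)
buf = mmap.mmap(fd, SIZE)

arr = (ctypes.c_double * N).from_buffer(buf)  # zero-copy C-array view
arr[0] = 3.14                                 # visible to every process mapping PATH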

Predominate answered 13/8, 2009 at 12:6 Comment(0)
3

Simple, actually: you can just use shared memory. This example creates a list of tuples (Python) in C++ and shares it with a Python process, which can then use the list of tuples. To share between two Python processes instead, open the map with ACCESS_WRITE in the sending process and call its write method.

C++ (sender process):

#include <windows.h>
#include <stdio.h>
#include <conio.h>
#include <tchar.h>
#include <iostream>
#include <string>

#define BUF_SIZE 256
TCHAR szName[]=TEXT("Global\\MyFileMappingObject");
TCHAR szMsg[]=TEXT("[(1, 2, 3), ('a', 'b', 'c', 'd', 'e'), (True, False), 'qwerty']");

int _tmain(int argc, _TCHAR* argv[])
{
   HANDLE hMapFile;
   LPCTSTR pBuf;

   hMapFile = CreateFileMapping(
                 INVALID_HANDLE_VALUE,    // use paging file
                 NULL,                    // default security
                 PAGE_READWRITE,          // read/write access
                 0,                       // maximum object size (high-order DWORD)
                 BUF_SIZE,                // maximum object size (low-order DWORD)
                 szName);                 // name of mapping object

   if (hMapFile == NULL)
   {
      _tprintf(TEXT("Could not create file mapping object (%d).\n"),
             GetLastError());
      return 1;
   }
   pBuf = (LPTSTR) MapViewOfFile(hMapFile,   // handle to map object
                        FILE_MAP_ALL_ACCESS, // read/write permission
                        0,
                        0,
                        BUF_SIZE);

   if (pBuf == NULL)
   {
      _tprintf(TEXT("Could not map view of file (%d).\n"),
             GetLastError());

      CloseHandle(hMapFile);
      return 1;
   }

   CopyMemory((PVOID)pBuf, szMsg, (_tcslen(szMsg) * sizeof(TCHAR)));
   _getch();

   UnmapViewOfFile(pBuf);

   CloseHandle(hMapFile);
   return 0;
}

Python (receiver process):

import mmap

# Attach to the mapping by name; assumes the C++ sender was built with
# UNICODE defined, so the buffer holds UTF-16 text
shmem = mmap.mmap(0, 256, "Global\\MyFileMappingObject", mmap.ACCESS_READ)
msg_bytes = shmem.read()
msg_utf16 = msg_bytes.decode("utf-16")
code = msg_utf16.rstrip('\0')
yourTuple = eval(code)
Employment answered 12/5, 2017 at 19:9 Comment(0)
2

You can use the Python multiprocessing module.

http://docs.python.org/library/multiprocessing.html#sharing-state-between-processes
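
For example (a minimal sketch: this shares a block of primitives between a parent and the children it spawns, not between unrelated processes):

from multiprocessing import Process, Array

def worker(a):
    a[0] = 42.0  # written into shared memory, so the parent sees it

if __name__ == "__main__":
    arr = Array("d", 1_000_000)  # a million shared doubles, zero-initialized
    p = Process(target=worker, args=(arr,))
    p.start()
    p.join()
    print(arr[0])  # -> 42.0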

Makassar answered 12/8, 2009 at 20:20 Comment(1)
multiprocessing lets you share arrays of primitives, so if you can stick it in that somehow, you're good.Spragens
2

Why not stick the shared data into a memcache server? Then both servers can access it quite easily.
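
A minimal sketch, assuming a memcached instance on localhost and the third-party pymemcache client (any memcached client would do); note the serialize/deserialize cost raised in the comments below:

import pickle
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

# serverA: publish a (pickled) copy of the data
data = [(i, i * 2) for i in range(1000)]  # kept small: memcached caps items at 1 MB by default
client.set("shared_tuples", pickle.dumps(data))

# serverB: fetch and unpickle its own private copy
copy = pickle.loads(client.get("shared_tuples"))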

Erratic answered 13/8, 2009 at 9:13 Comment(5)
memcache is allowed to throw your data away whenever it wants (e.g. memory slab full), hence the cache part of the name. Don't use it if you care about durability.Censurable
@Censurable redis would work fine though, plus it has native support for some basic objects instead of just stringsLilylivered
Also, don't you pay a serialize/deserialize cost at each access with memcache?Tetrode
@Tetrode I suspect you still pay that cost with shared memory if you want any hope of accessing the data in a predictable and structured way (without a ton of low-level programming).Lilylivered
Depends on your programming language. Python and numpy, for example, allow you to back data with shared memory without serialization.Tetrode
0

If your data is simply tuples, and you're willing to access these either as

  • one (nrows x tuplewidth) np.ndarray, or
  • n 1-d np.ndarrays

then I highly recommend using numpy's wrapper for memmap.

My understanding is:

  • you save your numpy arrays as a flat memmap file that holds the raw array contents
  • each process points an ndarray to the memmap file as its backing data. The documentation link shows how.

This works great for read-only data. If you want read-write access, you'll need to use multiprocessing locks to protect it.

Because memmap uses paging to load the data, it's a blazingly fast way to access large datasets from disk. In fact, I don't think modern OSs have any faster way to load data from disk into memory than this -- no serialization is involved.
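
A minimal sketch (the file name and shape are illustrative):

import numpy as np

# writer: lay a million 16-float rows out as one raw file
a = np.memmap("tuples.dat", dtype="float64", mode="w+", shape=(1_000_000, 16))
a[0] = np.arange(16)
a.flush()

# reader (a separate, unrelated process): map the same file read-only
b = np.memmap("tuples.dat", dtype="float64", mode="r", shape=(1_000_000, 16))
print(b[0])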

Tetrode answered 23/3, 2018 at 17:34 Comment(0)
-1

Why not just use a database for the shared data? You have a multitude of lightweight options where you don't need to worry about the concurrency issues: SQLite, any of the NoSQL/key-value breed of databases, etc.
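
For instance, with the standard library's sqlite3 module, both servers simply open the same database file (the path and schema are illustrative):

import sqlite3

conn = sqlite3.connect("/tmp/shared.db")  # same path in every process
conn.execute("CREATE TABLE IF NOT EXISTS tuples (a REAL, b REAL)")
conn.execute("INSERT INTO tuples VALUES (?, ?)", (1.0, 2.0))
conn.commit()

# any other process:
#   rows = sqlite3.connect("/tmp/shared.db").execute("SELECT * FROM tuples").fetchall()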

Cletis answered 13/8, 2009 at 9:3 Comment(1)
You don't have to worry about concurrency issues with a shared database? Source please ;)Lilylivered
