How to create a memoryview for a non-contiguous memory location?

I have a fragmented structure in memory and I'd like to access it as a contiguous-looking memoryview. Is there an easy way to do this or should I implement my own solution?

For example, consider a file format that consists of records. Each record has a fixed-length header that specifies the length of the record's content. A higher-level logical structure may spread over several records. Implementing the higher-level structure would be easier if it could see its own fragmented memory location as a simple contiguous array of bytes.
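To make the record layout concrete, here is a minimal sketch of walking such a buffer and collecting the payload of each record as a zero-copy memoryview slice. The 4-byte big-endian length header and the name record_payloads are assumptions made for illustration; the question does not specify the actual header format.

```python
import struct

# Hypothetical header: a 4-byte big-endian payload length (an assumption).
HEADER = struct.Struct(">I")


def record_payloads(buf):
    """Return zero-copy memoryview slices, one per record payload in buf."""
    view = memoryview(buf)
    payloads = []
    offset = 0
    while offset < len(view):
        (length,) = HEADER.unpack_from(view, offset)
        start = offset + HEADER.size
        payloads.append(view[start:start + length])  # a window, not a copy
        offset = start + length
    return payloads


# Two records whose payloads together form one logical structure.
data = bytearray(HEADER.pack(5) + b"Hello" + HEADER.pack(6) + b" World")
chunks = record_payloads(data)
print([bytes(c) for c in chunks])  # [b'Hello', b' World']
```

The result is still a list of separate views, which is exactly the fragmentation the question wants to hide behind a single contiguous-looking object.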

Update:

It seems that Python supports this 'segmented' buffer type internally, at least based on this part of the documentation. But this is only the C API.

Update2:

As far as I can see, the referenced C API (the so-called old-style buffers) does what I need, but it is deprecated and unavailable in newer versions of Python (3.x). The new buffer protocol, specified in PEP 3118, offers a new way to represent buffers. This API is more usable in most use cases (including some where the represented buffer is not contiguous in memory), but it does not support this specific one, where a one-dimensional array may be laid out completely freely in memory as multiple chunks of different sizes.
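As an illustration of the kind of non-contiguity the new protocol does cover, the following small sketch takes a strided slice of a memoryview. The variable names are made up for the example. A single stride describes regularly spaced bytes, which is why it cannot express chunks of different sizes at arbitrary offsets.

```python
data = bytearray(b"a1b2c3d4")
view = memoryview(data)

letters = view[::2]          # every second byte, described by one stride
print(letters.contiguous)    # False
print(letters.strides)       # (2,)
print(letters.tobytes())     # b'abcd'

letters[0] = ord("A")        # writes through to the underlying bytearray
print(data)                  # bytearray(b'A1b2c3d4')
```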

Unders answered 13/12, 2013 at 11:23 Comment(5)
First, a file is rarely a memory structure. Second, why and in what way do you need to access low-level memory structures, and even do it with Python? The question seems better suited to C or C++, or other languages with developer-controlled memory allocation and access. – Stallfeed
Traverse a linked list and print each data? – Expert
A file becomes a memory structure if you map it with mmap (available in Python too). Yes, it is possible to traverse a linked list, but the point is that I'd like to access it through a simple interface like memoryview's. I think it is possible, since the documentation mentions that you can use it to access non-contiguous data (e.g. NumPy arrays). – Unders
Easy, no, but it might be possible using so-called PIL-style buffers. – Wrapper
Re: mmap, segments must have memory page granularity. Neat trick though, and yes, that has been done before! – Groundmass

First, I am assuming you are trying to do this in pure Python rather than in a C extension. So I am assuming you have loaded the records you are interested in into a set of Python objects, and your problem is that you want to see a higher-level structure that is spread across these objects, with bits here and there.

So can you not simply load each of the records into a bytearray? You can then slice the bytearrays and join the slices to create a new array that has just the data for the high-level structure you are interested in. You will then have a single bytearray with just that data, and can print it out or manipulate it in any way you want.

So something like:

a = bytearray(b"Hello World")  # put your records into bytearrays like this
b = bytearray(b"Stack Overflow")
complexStructure = a[0:6] + b[0:]  # Slice and join the arrays to form a
                                   # new array with just the data from your
                                   # high-level entity
print(complexStructure)

Of course you will still need to know where within the records your high-level structure is in order to slice the arrays correctly, but you would need to know this anyway.

EDIT:

Note that taking a slice of a list does not copy the objects in the list; it just creates a new list holding references to the same objects, so:

>>> a = [1,2,3]
>>> b = a[1:3]
>>> id(a[1])
140268972083088
>>> id(b[0])
140268972083088
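The session above uses a list of objects. For byte buffers the situation is a little different, so here is a short sketch (variable names are illustrative) contrasting a bytearray slice, which copies the bytes, with a memoryview slice, which is a window onto the original buffer.

```python
record = bytearray(b"Hello World")

copied = record[0:5]               # bytearray slice: a new, independent copy
shared = memoryview(record)[0:5]   # memoryview slice: a window onto record

copied[0] = ord("J")
print(record)   # bytearray(b'Hello World'), the original is unchanged

shared[0] = ord("Y")
print(record)   # bytearray(b'Yello World'), the write went through
```

This is why memoryview slices are a natural building block when changes are meant to reach the original records.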

However, changes to the list b will not change a, because b is a new list. To have changes propagate automatically to the original list, you would need a more complicated object that holds the lists for the original records and hides them, working out which list and which element of that list to read or write whenever a user views or modifies the complex structure. So something like:

class ComplexStructure:
    def __init__(self):
        self.listofrecords = []  # the original records, kept by reference

    def add_records(self, record):
        self.listofrecords.append(record)

    def get_value(self, position):
        listnum, posinlist = ... # formula to figure out which list and where in
                                 # list element of complex structure is
        return self.listofrecords[listnum][posinlist]

    def set_value(self, position, value):
        listnum, posinlist = ... # formula to figure out which list and where in
                                 # list element of complex structure is
        self.listofrecords[listnum][posinlist] = value

Granted this is not the simple way of doing things you were hoping for but it should do what you need.
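One possible way to fill in the lookup formula is to keep the cumulative end offset of each segment and binary-search it. The sketch below is an assumption-laden illustration, not a standard-library API: the class name SegmentedView and its methods are hypothetical, it assumes byte-oriented writable buffers, and it handles only integer indexes (no slices).

```python
import bisect
from itertools import accumulate


class SegmentedView:
    """Present several writable buffers as one logical byte sequence.

    Hypothetical helper for illustration; not a built-in type.
    """

    def __init__(self, segments):
        # Wrap each non-empty segment in a memoryview so no bytes are copied.
        self._segments = [memoryview(s) for s in segments if len(s)]
        # _ends[i] is the logical index one past the last byte of segment i.
        self._ends = list(accumulate(len(s) for s in self._segments))

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def _locate(self, position):
        """Map a logical index to (segment number, offset in that segment)."""
        if position < 0:
            position += len(self)
        if not 0 <= position < len(self):
            raise IndexError("index out of range")
        segnum = bisect.bisect_right(self._ends, position)
        start = self._ends[segnum - 1] if segnum else 0
        return segnum, position - start

    def __getitem__(self, position):
        segnum, offset = self._locate(position)
        return self._segments[segnum][offset]

    def __setitem__(self, position, value):
        segnum, offset = self._locate(position)
        self._segments[segnum][offset] = value

    def tobytes(self):
        """Copy the logical contents out into one contiguous bytes object."""
        return b"".join(s.tobytes() for s in self._segments)


a = bytearray(b"Hello World")
b = bytearray(b"Stack Overflow")
view = SegmentedView([memoryview(a)[0:6], b])

view[6] = ord("s")        # logical index 6 is the first byte of b
print(view.tobytes())     # b'Hello stack Overflow'
print(b)                  # bytearray(b'stack Overflow')
```

Because the segments are memoryview slices, reads and writes reach the original records without copying. The object does not itself implement the buffer protocol, though, so it cannot be handed to APIs that expect a real memoryview.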

Waziristan answered 22/12, 2013 at 13:8 Comment(3)
It's a good solution for certain use cases, yes. The main differences from memoryview are: 1. this requires copying data around 2. when the new data structure is modified, it does not affect the original buffer, the modifications have to be copied back. In an ideal solution, I'd like to see these two properties. – Unders
molnarg: Added an edit to deal with your extra two properties. – Waziristan
Thanks, it seems to be the only solution, since it is probably not doable using the built-in APIs. – Unders
