Fast n-dimensional sparse array in Python / Cython
Asked Answered
M

1

23

I have an application that involves large n-dimensional arrays which are very sparse. scipy.sparse has a useful 'vectorized getting and setting' feature, so that Cython can be used to populate a sparse matrix quickly.

Of course the scipy package can't handle n-dimensions. There are two packages I have found that do n-dimensional sparse arrays in python sparray and ndsparse. However it seems neither has the vectorized getting and setting feature.

So I need either:

  • a python package for n-dimensional arrays with vectorized get and set or
  • a c library for sparse arrays which I can easily access with Cython or
  • some 'roll your own' option which I guess would require a c equivalent to a python dict

For my purpose I think mapping the n-dimension coordinates back to 1 or two dimensions could work. What would be better though is to have a dict equivalent that i can access fast inside a Cython loop. I assume this rules out the python dict.

Wondering if someone could give me an example of how to use the c++ map object from within Cython?

Mcglynn answered 21/11, 2013 at 6:44 Comment(6)
It's questions like this when I disagree with the "asking us to recommend or find a tool, library or favorite off-site resource are off-topic" close reason. The man has obviously done his research, what else can one do?Ap
Sorry I didn't realise this was a problem. I suppose I can argue, I'm not just after a library but any tips on how to do this myself.Mcglynn
The close reason exists because of worthless "developers" who continually post questions that ask nothing but "how do i make zipfile using jquery show me teh codez plz". But in this case, I believe this is a perfectly valid, well-researched question. +1.Ap
@JonathonReinhart: +1 for your comment. I second it.Ungotten
Add another option to your list: transform n-dimensional coordinates to 2-dimensional coordinates, and vice-versa. I suspect this will wreck (or at least diminish) the worth of vectorized getting and setting but you might want to validate that.Phraseology
This kind of question is indeed a problem, because if the rules were followed to the letter, it should be closed.Celibacy
P
5

If you decide to go with the C dict option, you can use the C++ STL's std::map. It's unlikely that you'll find faster or more robust native code that implements a dictionary/map.

cppmap.pyx:

# distutils: language = c++

cdef extern from "<map>" namespace "std":
    cdef cppclass mymap "std::map<int, float>":
        mymap()
        float& operator[] (const int& k)

cdef mymap m = mymap()
cdef int i
cdef float value

for i in range(100):
    value = 3.0 * i**2
    m[i] = value

print m[10]

setup.py:

from distutils.core import setup
from Cython.Build import cythonize
setup(name = "cppmapapp"
  ext_modules = cythonize('*.pyx'))

Command line:

$ python setup.py build
$ cd build/lib.macosx-10.5-x86_64-2.7
$ python -c 'import cppmap'
300.0
Pneumatics answered 27/11, 2013 at 11:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.