Package-specific import hooks in Python
Asked Answered
M

1

16

I'm working on creating a Python module that maps API provided by a different language/framework into Python. Ideally, I would like this to be presented as a single root package that exposes helper methods, and which maps all namespaces in that other framework to Python packages/modules. For the sake of convenience, let's take CLR as an example:

import clr.System.Data
import clr.System.Windows.Forms

Here clr is the magic top-level package which exposes CLR namespaces System.Data and System.Windows.Forms subpackages/submodules (so far as I can see, a package is just a module with child modules/packages; it is still valid to have other kinds of members therein).

I've read PEP-302 and wrote a simple prototype program that achieves a similar effect by installing a custom meta_path hook. The clr module itself is a proper Python module which, when imported, sets __path__ = [] (making it a package, so that import even attempts lookup for submodules at all), and registers the hook. The hook itself intercepts any package load where full name of the package starts with "clr.", dynamically creates the new module using imp.new_module(), registers it in sys.modules, and uses pixie dust and rainbows to fill it with classes and methods from the original API. Here's the code:

clr.py

import sys
import imp

class MyLoader:
    def load_module(self, fullname):
        try:
            return sys.modules[fullname]
        except KeyError:
            pass
        print("--- load ---")
        print(fullname)
        m = imp.new_module(fullname)
        m.__file__ = "clr:" + fullname
        m.__path__ = []
        m.__loader__ = self
        m.speak = lambda: print("I'm " + fullname)
        sys.modules.setdefault(fullname, m)
        return m

class MyFinder:
    def find_module(self, fullname, path = None):
        print("--- find ---")
        print(fullname)
        print(path)
        if fullname.startswith("clr."):
            return MyLoader()            
        return None

print("--- init ---")
__path__ = []
sys.meta_path.append(MyFinder())

test.py

import clr.Foo.Bar.Baz

clr.Foo.speak()
clr.Foo.Bar.speak()
clr.Foo.Bar.Baz.speak()

All in all this seems to work fine. Python guarantees that modules in the chain are imported left to right, so clr is always imported first, and it sets up the hook that allows the remainder of the chain to be imported.

However, I'm wondering if what I'm doing here is overkill. I am, after all, installing a global hook, that will be called for any module import, even though I filter out those that I don't care about. Is there, perhaps, some way to install a hook that will only be called for imports from my particular package, and not others? Or is the above the Right Way to do this kind of thing in Python?

Mourant answered 1/9, 2011 at 9:48 Comment(2)
I'm sure your code is only an example, but just to check -- if you're actually looking to port CLR libraries to Python you should use IronPython. =)Sammysamoan
@katrielalex: indeed, I am quite aware of IronPython. It should be noted though that it doesn't fully solve the problem you mention - namely, it requires you to port your Python code to CLR, rather than use CLR libraries from CPython (e.g. alongside some C libraries). But, yes, CLR in this case is only used as a concrete, easy-to-demonstrate example; the real thing is something else.Mourant
T
5

In general, I think your approach looks fine. I wouldn't worry about it being "global", since the whole point is to specify which paths should be handled by you. Moving this test inside the import logic would just needlessly complicate it, so it's left to the implementer of the hook to decide.

Just one small concern, maybe you could use sys.path_hooks? It appears to be a bit less "powerful" than sys.meta_path

sys.path_hooks is a list of callables, which will be checked in sequence to determine if they can handle a given path item. The callable is called with one argument, the path item. The callable must raise ImportError if it is unable to handle the path item, and return an importer object if it can handle the path item.

Torrens answered 1/9, 2011 at 10:14 Comment(4)
I've looked at __path__ and sys.path, and it seems to be a list of file paths that are used to look up modules. Now, technically, I can set __path__ of my package to whatever I want, so long as I also provide a path hook to handle the same. But it smells like a hack (what if somebody inspects __path__? they are allowed to), and it seems that this doesn't really buy me anything over using meta_path, since the filter is still global. On the other hand, I will now have to make sure that __path__ for all my packages is properly maintained to ensure that imports go through my finderMourant
@Pavel: do you mean path_hooks?Torrens
Yes, the comment referred to path_hooks. From reading the docs, it seems that handler registered with it would be fed whatever is in sys.path (for top-level packages) or the parent package's __path__ (for child packages), so I'd need to fill those with fake values for my dynamically projected modules. I guess I could provide a real path for the clr module, and then pretend that it's a directory with subdirectories for projected modules. But I'm still not sure what that would buy me over the other option.Mourant
@Pavel: I'm also not sure :-) Not familiar with import hooks enough, but from my understanding of their documentation and the PEP, what you're doing here seems to be a good usageTorrens

© 2022 - 2024 — McMap. All rights reserved.