Is there a straightforward way to find all the modules that are part of a Python package? I've found this old discussion, which is not really conclusive, but I'd love to have a definitive answer before I roll out my own solution based on os.listdir().
Yes, you want something based on pkgutil or similar -- this way you can treat all packages alike, regardless of whether they are in eggs or zips or so (where os.listdir won't help).
import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email
package = email
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__):
    print("Found submodule %s (is a package: %s)" % (modname, ispkg))
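To illustrate the point about zips, here is a minimal sketch that builds a throwaway zip archive containing a tiny package and lists its modules with pkgutil.iter_modules. The archive name bundle.zip and the package name zdemo_pkg are made up for this example; nothing here comes from the original answer beyond the pkgutil call.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile


def list_modules_in_zipped_package(workdir):
    """Zip up a tiny package, import it from the archive, list its modules."""
    zip_path = os.path.join(workdir, "bundle.zip")  # hypothetical archive name
    with zipfile.ZipFile(zip_path, "w") as archive:
        # hypothetical package layout, written straight into the archive
        archive.writestr("zdemo_pkg/__init__.py", "")
        archive.writestr("zdemo_pkg/alpha.py", "VALUE = 1\n")
        archive.writestr("zdemo_pkg/beta.py", "VALUE = 2\n")

    # Putting the archive itself on sys.path makes the package importable.
    sys.path.insert(0, zip_path)
    try:
        package = importlib.import_module("zdemo_pkg")
        # package.__path__ points inside the zip; os.listdir would fail on it.
        return sorted(name for _, name, _ in pkgutil.iter_modules(package.__path__))
    finally:
        sys.path.remove(zip_path)


with tempfile.TemporaryDirectory() as workdir:
    zipped_modules = list_modules_in_zipped_package(workdir)

print(zipped_modules)
```

The same loop works unchanged for a package that lives in an ordinary directory, which is the reason to prefer pkgutil over walking the filesystem yourself.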
How to import them too? You can just use __import__ as normal:
import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email
package = email
prefix = package.__name__ + "."
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__, prefix):
    print("Found submodule %s (is a package: %s)" % (modname, ispkg))
    # a non-empty fromlist makes __import__ return the submodule itself
    module = __import__(modname, fromlist="dummy")
    print("Imported", module)
[...] importer returned by pkgutil.iter_modules? Can I use it to import a module instead of using this seemingly "hackish" __import__(modname, fromlist="dummy")? – Calm
[...] m = importer.find_module(modname).load_module(modname) and then m is the module, so for example: m.myfunc() – Finis
[...] _path_). There should be two on either side, for a total of four (i.e. __path__). – Wellread
[...] __path__ attribute." - So we can't get the path from any module, only a package. – Natty
The right tool for this job is pkgutil.walk_packages.
To list all the modules on your system:
import pkgutil
for importer, modname, ispkg in pkgutil.walk_packages(path=None, onerror=lambda x: None):
    print(modname)
Be aware that walk_packages imports all subpackages, but not submodules.
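A small sketch of that side effect, assuming a throwaway package created in a temporary directory (the names wp_demo_pkg, top, sub and leaf are invented for the illustration). After the walk, the subpackage shows up in sys.modules while the plain submodules do not.

```python
import importlib
import os
import pkgutil
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp_root:
    # Hypothetical layout: wp_demo_pkg/{__init__,top}.py and wp_demo_pkg/sub/{__init__,leaf}.py
    sub_dir = os.path.join(tmp_root, "wp_demo_pkg", "sub")
    os.makedirs(sub_dir)
    for path in (
        os.path.join(tmp_root, "wp_demo_pkg", "__init__.py"),
        os.path.join(tmp_root, "wp_demo_pkg", "top.py"),
        os.path.join(sub_dir, "__init__.py"),
        os.path.join(sub_dir, "leaf.py"),
    ):
        with open(path, "w") as handle:
            handle.write("")

    sys.path.insert(0, tmp_root)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("wp_demo_pkg")
        found = sorted(
            info.name
            for info in pkgutil.walk_packages(package.__path__, package.__name__ + ".")
        )
        # Record which of the discovered names ended up imported by the walk.
        imported = {name: name in sys.modules for name in found}
    finally:
        sys.path.remove(tmp_root)

print(found)
print(imported)
```

If a subpackage's __init__.py has expensive or risky import-time code, this is where it will run, which is also why the answer passes an onerror callback.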
If you wish to list all submodules of a certain package then you can use something like this:
import pkgutil
import scipy
package = scipy
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__,
                                                      prefix=package.__name__ + '.',
                                                      onerror=lambda x: None):
    print(modname)
iter_modules only lists the modules which are one-level deep. walk_packages gets all the submodules. In the case of scipy, for example, walk_packages returns
scipy.stats.stats
while iter_modules only returns
scipy.stats
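If scipy is not installed, the same difference can be seen with the stdlib email package. This is only a sketch; the exact set of names depends on your Python version, so no output is shown.

```python
import email
import pkgutil

prefix = email.__name__ + "."

# Direct children only, e.g. email.mime but nothing below it.
one_level = {info.name for info in pkgutil.iter_modules(email.__path__, prefix)}

# Everything, descending into subpackages such as email.mime.
all_levels = {info.name for info in pkgutil.walk_packages(email.__path__, prefix)}

# Names that only the recursive walk discovers.
nested_only = sorted(all_levels - one_level)
print(nested_only)
```

Here email.mime appears in both sets, while email.mime.text is found only by walk_packages.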
The documentation on pkgutil (http://docs.python.org/library/pkgutil.html) does not list all the interesting functions defined in /usr/lib/python2.6/pkgutil.py.
Perhaps this means the functions are not part of the "public" interface and are subject to change.
However, at least as of Python 2.6 (and perhaps earlier versions?) pkgutil comes with a walk_packages method which recursively walks through all the modules available.
walk_packages is now in the documentation: docs.python.org/library/pkgutil.html#pkgutil.walk_packages – Geotectonic
[...] _) before and after path -- that is, use package.__path__ rather than package._path_. It might be easier to try cutting & pasting the code rather than re-typing it. – Lindemann
[...] package is pointing to a package, not a module. Modules are files whereas packages are directories. All packages have the __path__ attribute (... unless someone deleted the attribute for some reason.) – Lindemann
This works for me:
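import types
import nltk  # the package being inspected in this answer

# print every name in the package namespace that is bound to a module object
for key, obj in nltk.__dict__.items():
    if type(obj) is types.ModuleType:
        print(key)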
Thanks to all previous answers, I've just merged them all into one function, which can be easily used to retrieve submodules:
def list_submodules(module) -> list[str]:
    """
    Args:
        module: The module to list submodules from.
    """
    # First, respect the __all__ attribute if it is already defined.
    submodules = getattr(module, "__all__", None)
    if submodules:
        return submodules

    # Then, inspect the module object itself to get the imported submodules.
    # Warning: initially the module object reflects the `__init__.py` file;
    # if that file does not exist, the object may hold only the submodules
    # that code has imported so far, so `inspect` can return an incomplete list.
    import inspect
    submodules = [o[0] for o in inspect.getmembers(module)
                  if inspect.ismodule(o[1])]
    if submodules:
        return submodules

    # Finally, we can just scan for submodules via pkgutil.
    import pkgutil
    # pkgutil invokes `importlib.machinery.all_suffixes()` to determine
    # whether a file is a module, so if you get any unexpected submodules,
    # check that function to confirm.
    # If you want a directory to be retrieved as a submodule, you need to
    # make that explicit by putting an `__init__.py` file in the folder,
    # even for Python 3.
    return [x.name for x in pkgutil.iter_modules(module.__path__)]
Then you can just call it like:
import module
print(list_submodules(module))

import importlib
path = ...
module = importlib.import_module(path)
print(list_submodules(module))
I was looking for a way to reload all submodules that I'm editing live in my package. It is a combination of the answers/comments above, so I've decided to post it here as an answer rather than a comment.
package = yourPackageName  # the already-imported package object
import importlib
import pkgutil
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__, prefix=package.__name__ + '.', onerror=lambda x: None):
    try:
        modulesource = importlib.import_module(modname)
        # the bare reload() builtin exists only in Python 2
        importlib.reload(modulesource)
        print("reloaded: {}".format(modname))
    except Exception as e:
        print('Could not load {} {}'.format(modname, e))
In case you are not only interested in listing module names, but also want to get a reference to the module objects, this answer is for you:
To list modules, use either pkgutil.iter_modules if you need just the direct children of a module, or pkgutil.walk_packages if you need all descendants of a module. Both return ModuleInfo tuples.
To import modules, there are various suggestions in the existing answers, most of which are not great choices:
- __import__ works if you import a top-level module, e.g. __import__('foo'), but __import__('foo.bar') will also return the foo module, not foo.bar! You can work around this restriction, but it is cumbersome.
- MetaPathFinder.find_module has been deprecated since Python 3.4 and was removed in 3.12.
- MetaPathFinder.find_spec replaces find_module. You can use it by accessing the ModuleInfo.module_finder attribute, but it's a bit verbose:
import pkgutil
submodules = [
    module_info.module_finder.find_spec(
        f"{my_module.__name__}.{module_info.name}"
    ).loader.load_module()
    for module_info in pkgutil.iter_modules(my_module.__path__)
]
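For comparison, here is a sketch of the same spec-based route that avoids load_module() (itself deprecated) in favour of importlib.util.module_from_spec plus exec_module. This is a swapped-in variant, not what the answer above uses, and the package spec_demo_pkg with its alpha and beta modules is a made-up fixture.

```python
import importlib
import importlib.util
import os
import pkgutil
import sys
import tempfile


def load_submodules_via_spec(package):
    """Load each direct submodule of `package` from its module spec."""
    loaded = {}
    for info in pkgutil.iter_modules(package.__path__):
        full_name = f"{package.__name__}.{info.name}"
        spec = info.module_finder.find_spec(full_name)
        module = importlib.util.module_from_spec(spec)
        # Register before executing, mirroring what the import system does.
        sys.modules[full_name] = module
        spec.loader.exec_module(module)
        loaded[info.name] = module
    return loaded


with tempfile.TemporaryDirectory() as tmp_root:
    # Hypothetical package used only for this demonstration.
    pkg_dir = os.path.join(tmp_root, "spec_demo_pkg")
    os.makedirs(pkg_dir)
    for filename, source in (
        ("__init__.py", ""),
        ("alpha.py", "VALUE = 1\n"),
        ("beta.py", "VALUE = 2\n"),
    ):
        with open(os.path.join(pkg_dir, filename), "w") as handle:
            handle.write(source)

    sys.path.insert(0, tmp_root)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("spec_demo_pkg")
        submodules = load_submodules_via_spec(package)
    finally:
        sys.path.remove(tmp_root)

print(sorted(submodules))
```

It works, but it is even more verbose than the snippet above, which rather supports the point being made.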
My preferred method is to use importlib.import_module in combination with pkgutil.iter_modules:
import importlib
import pkgutil
from types import ModuleType

def get_submodules(module: ModuleType) -> list[ModuleType]:
    return [
        importlib.import_module(f"{module.__name__}.{module_info.name}")
        for module_info in pkgutil.iter_modules(module.__path__)
    ]
A few notes on this solution:
- you can replace pkgutil.iter_modules with pkgutil.walk_packages if needed
- importlib.import_module returns the module specified by the path, not the module at the root of the path, as __import__ does
- with f"{module.__name__}.{module_info.name}" we make sure that all modules are referenced by an absolute path (modules can be loaded with shorter paths if the parent module has been imported before, but this can cause issues if you want to filter or compare modules)
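The difference between __import__ and importlib.import_module mentioned in the notes can be checked quickly with the stdlib email.mime subpackage (picked here just as a convenient dotted name):

```python
import importlib

# Plain __import__ returns the top-level package for a dotted name.
root = __import__("email.mime")

# A non-empty fromlist is the usual workaround to get the leaf module.
leaf_via_fromlist = __import__("email.mime", fromlist=["dummy"])

# import_module returns the module that was actually named.
leaf = importlib.import_module("email.mime")

print(root.__name__, leaf_via_fromlist.__name__, leaf.__name__)
# prints: email email.mime email.mime
```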
Here's one way, off the top of my head:
>>> import os
>>> filter(lambda i: type(i) == type(os), [getattr(os, j) for j in dir(os)])
[<module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'errno' (built-in)>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'sys' (built-in)>]
It could certainly be cleaned up and improved.
EDIT: Here's a slightly nicer version:
>>> [m[1] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
[<module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'errno' (built-in)>, <module 'sys' (built-in)>]
>>> [m[0] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
['_copy_reg', 'UserDict', 'path', 'errno', 'sys']
NOTE: This will also find modules that might not necessarily be located in a subdirectory of the package, if they're pulled in in its __init__.py file, so it depends on what you mean by "part of" a package.
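To make that caveat concrete, here is a sketch with an invented package ns_demo_pkg whose __init__.py imports os and which also contains a plugin_a.py that nothing imports. Scanning the namespace and scanning with pkgutil then give two different answers.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import types

with tempfile.TemporaryDirectory() as tmp_root:
    # Hypothetical package: __init__.py pulls in os, plugin_a.py is never imported.
    pkg_dir = os.path.join(tmp_root, "ns_demo_pkg")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write("import os\n")
    with open(os.path.join(pkg_dir, "plugin_a.py"), "w") as handle:
        handle.write("")

    sys.path.insert(0, tmp_root)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("ns_demo_pkg")
        # Module objects bound in the package namespace (the approach above).
        in_namespace = sorted(
            name
            for name, obj in vars(package).items()
            if isinstance(obj, types.ModuleType) and not name.startswith("_")
        )
        # Modules that physically belong to the package.
        on_disk = sorted(info.name for info in pkgutil.iter_modules(package.__path__))
    finally:
        sys.path.remove(tmp_root)

print(in_namespace, on_disk)
```

The namespace scan reports os, which is not part of the package, and misses plugin_a, which is. For plugin discovery the pkgutil result is usually the one you want.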
[...] ls (or dir)? – Vivavivace
[...] ls? When I want to know the modules in a package, I use ls -r on the filesystem. Or I unzip the egg and use ls -r. Why is that inadequate? What more is required? – Vivavivace
[...] ls not adequate? Please focus on why -- in this specific question -- the ls is not adequate. I only want clarification on the meaning of this question. – Vivavivace
[...] ls by me in the shell, or actual os.popen("ls").read() or do you really mean os.listdir? – Hassiehassin
[...] ls in the shell is not adequate. The program should discover itself whenever I or some other dev adds a new plugin by saving a new module (say new.py) inside the plugin subpackage. The program will display a list of discovered plugins. – Hassiehassin