TL;DR: What's the cleanest way to keep implementation details out of a module's namespace?
There are a number of similar questions on this topic already, but none seems to have a satisfactory answer relative to modern tools and language features.
I'm designing a Python package, and I'd like to keep each module's public interface clean, exposing only what's intended, keeping implementation details (especially imports) hidden.
Over the years, I've seen a number of techniques:
Don't worry about it. Just document how to use your package and let consumers of it just ignore the implementation details.
This is just horrible, in my opinion. A well-designed interface should be easily discoverable. Having the implementation details publicly visible makes the interface much more confusing. Even as the author of a package, I don't want to use it when it exposes too much, as it makes autocompletion less useful.
Add an underscore to the beginning of all implementation details.
This is a well-understood convention, and most development tools are smart enough to at least sort underscore-prefixed names to the bottom of autocomplete lists. It works fine if you have a small number of names to treat this way, but as the number of names grows, it becomes more and more tedious and ugly.
Take for example this relatively simple list of imports:
import struct
from abc import abstractmethod, ABC
from enum import Enum
from typing import BinaryIO, Dict, Iterator, List, Optional, Type, Union
Applying the underscore technique, this relatively small list of imports becomes this monstrosity:
import struct as _struct
from abc import abstractmethod as _abstractmethod, ABC as _ABC
from enum import Enum as _Enum
from typing import (
BinaryIO as _BinaryIO,
Dict as _Dict,
Iterator as _Iterator,
List as _List,
Optional as _Optional,
Type as _Type,
Union as _Union
)
Now, I know this problem can be partially mitigated by never doing from
imports, and just importing the entire package, and package-qualifying everything. While that does help this situation, and I realize that some people prefer to do this anyway, it doesn't eliminate the problem, and it's not my preference. There are some packages I prefer to import directly, but I usually prefer to import type names and decorators explicitly so that I can use them unqualified.
There's an additional small problem with the underscore prefix. Take the following publicly exposed class:
class Widget(_ABC):
@_abstractmethod
def implement_me(self, input: _List[int]) -> _Dict[str, object]:
...
A consumer of this package implementing his own Widget
implementation will see that he needs to implement the implement_me
method, and it needs to take a _List
and return a _Dict
. Those aren't actual type names, and now the implementation-hiding mechanism has leaked into my public interface. It's not a big problem, but it does contribute to the ugliness of this solution.
Hide the implementation details inside a function.
This one's definitely hacky, and it doesn't play well with most development tools.
Here's an example:
def module():
import struct
from abc import abstractmethod, ABC
from typing import BinaryIO, Dict, List
def fill_list(r: BinaryIO, count: int, lst: List[int]) -> None:
while count > 16:
lst.extend(struct.unpack("<16i", r.read(16 * 4)))
count -= 16
while count > 4:
lst.extend(struct.unpack("<4i", r.read(4 * 4)))
count -= 4
for _ in range(count):
lst.append(struct.unpack("<i", r.read(4))[0])
def parse_ints(r: BinaryIO) -> List[int]:
count = struct.unpack("<i", r.read(4))[0]
rtn: List[int] = []
fill_list(r, count, rtn)
return rtn
class Widget(ABC):
@abstractmethod
def implement_me(self, input: List[int]) -> Dict[str, object]:
...
return (parse_ints, Widget)
parse_ints, Widget = module()
del module
This works, but it's super hacky, and I don't expect it to operate cleanly in all development environments. ptpython
, for example, fails to provide method signature information for the parse_ints
function. Also, the type of Widget
becomes my_package.module.<locals>.Widget
instead of my_package.Widget
, which is weird and confusing to consumers.
Use __all__
.
This is a commonly given solution to this problem: list the "public" members in the global __all__
variable:
import struct
from abc import abstractmethod, ABC
from typing import BinaryIO, Dict, List
__all__ = ["parse_ints", "Widget"]
def fill_list(r: BinaryIO, count: int, lst: List[int]) -> None:
... # You've seen this.
def parse_ints(r: BinaryIO) -> List[int]:
... # This, too.
class Widget(ABC):
... # And this.
This looks nice and clean, but unfortunately, the only thing __all__
affects is what happens when you use wildcard imports from my_package import *
, which most people don't do, anyway.
Convert the module to a subpackage, and expose the public interface in __init__.py
.
This is what I'm currently doing, and it's pretty clean for most cases, but it can get ugly if I'm exposing multiple modules instead of flattening everything:
my_package/
+--__init__.py
+--_widget.py
+--shapes/
+--__init__.py
+--circle/
| +--__init__.py
| +--_circle.py
+--square/
| +--__init__.py
| +--_square.py
+--triangle/
+--__init__.py
+--_triangle.py
Then my __init__.py
files look kind of like this:
# my_package.__init__.py
from my_package._widget.py import parse_ints, Widget
# my_package.shapes.circle.__init__.py
from my_package.shapes.circle._circle.py import Circle, Sphere
# my_package.shapes.square.__init__.py
from my_package.shapes.square._square.py import Square, Cube
# my_package.shapes.triangle.__init__.py
from my_package.shapes.triangle._triangle.py import Triangle, Pyramid
This makes my interface clean, and works well with development tools, but it makes my directory structure pretty messy if my package isn't completely flat.
Is there a better technique?
__init__.py
files within the sub-directories and instead ensure you add the root folder path to the module-lookup path (I think it wassys.path
). Then you could store the python modules within the "shapes" directory and make it a sub-package (and then simply callfrom shapes import *
or whichever import version you prefer) in themy_package
package. – Nanonshapes
directory. I'm trying not to flatten the namespace there. The package will exposemy_package.shapes.circle.Circle
,my_package.shapes.circle.Sphere
,my_package.shapes.square.Square
, etc. Generally, I do prefer to flatten my namespaces, but there are some cases where it makes sense to expose a hierarchy, and I wanted to make sure to cover that case, especially because that's where my technique gets particularly messy. – Cayman