What does from __future__ import absolute_import actually do?
Asked Answered
I

2

209

I have answered a question regarding absolute imports in Python, which I thought I understood based on reading the Python 2.5 changelog and accompanying PEP. However, upon installing Python 2.5 and attempting to craft an example of properly using from __future__ import absolute_import, I realize things are not so clear.

Straight from the changelog linked above, this statement accurately summarized my understanding of the absolute import change:

Let's say you have a package directory like this:

pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py

This defines a package named pkg containing the pkg.main and pkg.string submodules.

Consider the code in the main.py module. What happens if it executes the statement import string? In Python 2.4 and earlier, it will first look in the package's directory to perform a relative import, finds pkg/string.py, imports the contents of that file as the pkg.string module, and that module is bound to the name "string" in the pkg.main module's namespace.

So I created this exact directory structure:

$ ls -R
.:
pkg/

./pkg:
__init__.py  main.py  string.py

__init__.py and string.py are empty. main.py contains the following code:

import string
print string.ascii_uppercase

As expected, running this with Python 2.5 fails with an AttributeError:

$ python2.5 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

However, further along in the 2.5 changelog, we find this (emphasis added):

In Python 2.5, you can switch import's behaviour to absolute imports using a from __future__ import absolute_import directive. This absolute-import behaviour will become the default in a future version (probably Python 2.7). Once absolute imports are the default, import string will always find the standard library's version.

I thus created pkg/main2.py, identical to main.py but with the additional future import directive. It now looks like this:

from __future__ import absolute_import
import string
print string.ascii_uppercase

Running this with Python 2.5, however... fails with an AttributeError:

$ python2.5 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

This pretty flatly contradicts the statement that import string will always find the std-lib version with absolute imports enabled. What's more, despite the warning that absolute imports are scheduled to become the "new default" behavior, I hit this same problem using both Python 2.7, with or without the __future__ directive:

$ python2.7 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

$ python2.7 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

as well as Python 3.5, with or without (assuming the print statement is changed in both files):

$ python3.5 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print(string.ascii_uppercase)
AttributeError: module 'string' has no attribute 'ascii_uppercase'

$ python3.5 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print(string.ascii_uppercase)
AttributeError: module 'string' has no attribute 'ascii_uppercase'

I have tested other variations of this. Instead of string.py, I have created an empty module -- a directory named string containing only an empty __init__.py -- and instead of issuing imports from main.py, I have cd'd to pkg and run imports directly from the REPL. Neither of these variations (nor a combination of them) changed the results above. I cannot reconcile this with what I have read about the __future__ directive and absolute imports.

It seems to me that this is easily explicable by the following (this is from the Python 2 docs but this statement remains unchanged in the same docs for Python 3):

sys.path

(...)

As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first.

So what am I missing? Why does the __future__ statement seemingly not do what it says, and what is the resolution of this contradiction between these two sections of documentation, as well as between described and actual behavior?

Incorporable answered 16/11, 2015 at 20:18 Comment(1)
See also: docs.python.org/2.5/whatsnew/pep-328.htmlInkstand
F
127

The changelog is sloppily worded. from __future__ import absolute_import does not care about whether something is part of the standard library, and import string will not always give you the standard-library module with absolute imports on.

from __future__ import absolute_import means that if you import string, Python will always look for a top-level string module, rather than current_package.string. However, it does not affect the logic Python uses to decide what file is the string module. When you do

python pkg/script.py

pkg/script.py doesn't look like part of a package to Python. Following the normal procedures, the pkg directory is added to the path, and all .py files in the pkg directory look like top-level modules. import string finds pkg/string.py not because it's doing a relative import, but because pkg/string.py appears to be the top-level module string. The fact that this isn't the standard-library string module doesn't come up.

To run the file as part of the pkg package, you could do

python -m pkg.script

In this case, the pkg directory will not be added to the path. However, the current directory will be added to the path.

You can also add some boilerplate to pkg/script.py to make Python treat it as part of the pkg package even when run as a file:

if __name__ == '__main__' and __package__ is None:
    __package__ = 'pkg'

However, this won't affect sys.path. You'll need some additional handling to remove the pkg directory from the path, and if pkg's parent directory isn't on the path, you'll need to stick that on the path too.

Fechner answered 16/11, 2015 at 20:35 Comment(7)
OK, I mean, I get that. That's exactly the behavior my post is documenting. In the face of that, though, two questions: (1.) If "that's not exactly true", why do the docs categorically say it is? and, (2.) How, then, do you import string if you accidentally shadow it, at least without rifling through sys.modules. Isn't this what from __future__ import absolute_import is intended to prevent? What does it do? (PS, I'm not the downvoter.)Incorporable
Aye, that was me (downvote for 'not useful', wasn't for 'wrong'). It's clear from the bottom section the OP understands how sys.path works, and the actual question hasn't been addressed at all. That is, What does from __future__ import absolute_import actually do?Cutright
@Two-BitAlchemist: 1) The changelog is loosely-worded and non-normative. 2) You stop shadowing it. Even rifling through sys.modules won't get you the standard-library string module if you shadowed it with your own top-level module. from __future__ import absolute_import isn't meant to stop top-level modules from shadowing top-level modules; it's supposed to stop package-internal modules from shadowing top-level modules. If you run the file as part of the pkg package, the package's internal files stop showing up as top-level.Fechner
@Two-BitAlchemist: Answer revised. Is this version more helpful?Fechner
@Fechner Yes, thank you. I have upvoted it. You're probably right that this confusion is entirely due to the wording of the changelog. This problem is largely theoretical to me, since I never really used Python 2.5 and I am careful not to create these situations (naming local modules something I might import from the std lib).Incorporable
I tried python -m pkg/main.py, it show that Import by filename is not supported. why?Dolhenty
@storen: Assuming pkg is a package on the import search path, that should be python -m pkg.main. -m needs a module name, not a file path.Fechner
O
55

The difference between absolute and relative imports come into play only when you import a module from a package and that module imports an other submodule from that package. See the difference:

$ mkdir pkg
$ touch pkg/__init__.py
$ touch pkg/string.py
$ echo 'import string;print(string.ascii_uppercase)' > pkg/main1.py
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkg/main1.py", line 1, in <module>
    import string;print(string.ascii_uppercase)
AttributeError: 'module' object has no attribute 'ascii_uppercase'
>>> 
$ echo 'from __future__ import absolute_import;import string;print(string.ascii_uppercase)' > pkg/main2.py
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>> 

In particular:

$ python2 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 1, in <module>
    from __future__ import absolute_import;import string;print(string.ascii_uppercase)
AttributeError: 'module' object has no attribute 'ascii_uppercase'
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>> 
$ python2 -m pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ

Note that python2 pkg/main2.py has a different behaviour then launching python2 and then importing pkg.main2 (which is equivalent to using the -m switch).

If you ever want to run a submodule of a package always use the -m switch which prevents the interpreter for chaining the sys.path list and correctly handles the semantics of the submodule.

Also, I much prefer using explicit relative imports for package submodules since they provide more semantics and better error messages in case of failure.

Obstinate answered 16/11, 2015 at 20:34 Comment(4)
So essentially it only works for a narrow case where you've avoided the "current directory" problem? That seems to be a much weaker implementation than is described by PEP 328 and the 2.5 changelog. Do you believe the documentation is inaccurate?Incorporable
@Two-BitAlchemist Actually what you are doing is the "narrow case". You launch only a single python file to be executed, but this may trigger hundreds of imports. Submodules of a package simply shouldn't be executed, that's all.Obstinate
why python2 pkg/main2.py has a different behaviour then launching python2 and then importing pkg.main2?Dolhenty
@Dolhenty That's because the behaviour with relative imports changes. When you launch pkg/main2.py python (version 2) does not treat pkg as a package. While using python2 -m pkg.main2 or importing it do take into account that pkg is a package.Obstinate

© 2022 - 2024 — McMap. All rights reserved.