Local import statements in Python

I think putting the import statement as close to the fragment that uses it helps readability by making its dependencies more clear. Will Python cache this? Should I care? Is this a bad idea?

def Process():
    import StringIO
    file_handle = StringIO.StringIO('hello world')
    # do more stuff

for i in xrange(10):
    Process()

A little more justification: it's for methods which use arcane bits of the library, but when I refactor the method into another file, I don't realize I missed the external dependency until I get a runtime error.

Carcinoma answered 9/11, 2009 at 4:34 Comment(2)
It would be really nice to have a "with (import StringIO) as moduleName:" syntax. - Payment
@Bean: if that's so that you don't have to type a long or awkward module name, then there's an easy way: import StringIO and then sio = StringIO. Now you can do file_handle = sio.StringIO('hello world') and save those precious five characters. I'd use this sparingly, though, because it can make code harder to read (the assignment is easy to miss; non-standard module names can be distracting). - Agreeable
91

The other answers evince a mild confusion as to how import really works.

This statement:

import foo

is roughly equivalent to this statement:

foo = __import__('foo', globals(), locals(), [], -1)

That is, it creates a variable in the current scope with the same name as the requested module, and assigns it the result of calling __import__() with that module name and a boatload of default arguments.
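To make that concrete, here is a minimal sketch for Python 3, using the standard json module as a stand-in for foo. The variable names are only illustrative, and note that the final argument has to be 0 on Python 3; the -1 shown above is accepted by Python 2 only.

```python
import sys

# Plain statement form: binds the name `json` in the current scope.
import json

# Roughly the same thing spelled out by hand. On Python 3 the last
# argument (level) must be >= 0; 0 means "absolute import".
json_by_hand = __import__('json', globals(), locals(), [], 0)

# Both names refer to the very same module object.
same_object = json_by_hand is json
print(same_object)  # True
```

In everyday code you would just write the import statement; the explicit call is only here to show what the statement does.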

The __import__() function conceptually converts a string ('foo') into a module object. Modules are cached in sys.modules, and that's the first place __import__() looks: if sys.modules has an entry for 'foo', that's what __import__('foo') will return, whatever it is. It really doesn't care about the type. You can see this in action yourself; try running the following code:

import sys
sys.modules['boop'] = (1, 2, 3)
import boop
print boop
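The snippet above is Python 2 (note the print statement). A Python 3 sketch of the same experiment might look like this; boop_demo is a made-up name chosen so it cannot clash with a real module.

```python
import sys

# Plant an arbitrary object in the cache under a made-up module name.
sys.modules['boop_demo'] = (1, 2, 3)

# No file is searched for; the existing cache entry is returned as-is.
import boop_demo

print(boop_demo)  # (1, 2, 3)

# Tidy up so the fake entry does not linger in the cache.
del sys.modules['boop_demo']
```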

Leaving aside stylistic concerns for the moment, having an import statement inside a function works how you'd want. If the module has never been imported before, it gets imported and cached in sys.modules. It then assigns the module to the local variable with that name. It does not not not modify any module-level state. It does possibly modify some global state (adding a new entry to sys.modules).
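A small sketch of that behaviour, assuming Python 3 and using the standard colorsys module purely as an example of something that is not imported at module level here:

```python
import sys

def process():
    # The import binds `colorsys` as a local variable of process() only.
    import colorsys
    return 'colorsys' in locals()

bound_locally = process()
leaked_to_globals = 'colorsys' in globals()
cached = 'colorsys' in sys.modules

print(bound_locally, leaked_to_globals, cached)  # True False True
```

The name exists inside the function, never appears in the enclosing module's namespace, and the module itself ends up in the interpreter-wide sys.modules cache.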

That said, I almost never use import inside a function. If importing the module creates a noticeable slowdown in your program—like it performs a long computation in its static initialization, or it's simply a massive module—and your program rarely actually needs the module for anything, it's perfectly fine to have the import only inside the functions in which it's used. (If this was distasteful, Guido would jump in his time machine and change Python to prevent us from doing it.) But as a rule, I and the general Python community put all our import statements at the top of the module in module scope.

Slipknot answered 11/11, 2009 at 16:31 Comment(3)
it also occasionally saves you from cyclical imports (for example: if you need to import a model in your managers.py file with django and the models.py imports the managers.py file already... which it generally does) - Skewness
Besides the clear answer, I'm happy you used 3 not, as 2 or 4 would be confusing. - Squamous
@Jiaaro: but isn't having circular imports in module space usually a sign of some architectural problems (which you can then ignore with local imports)? - Ragged
14

Style aside, it is true that a module will only be imported once (unless reload is called on it). However, each execution of import Foo still has to implicitly check whether that module is already loaded (by checking sys.modules).
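Here is a rough Python 3 sketch of the "only once, unless reloaded" behaviour. It writes a throwaway module to a temporary directory; the module name once_demo and its MARKER attribute are invented for the illustration.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Throwaway module whose body creates a fresh object every time it runs.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, 'once_demo.py'), 'w') as handle:
    handle.write("MARKER = object()\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import once_demo  # first import: the module body is executed
first_marker = once_demo.MARKER

def use_it_again():
    import once_demo  # later import: just a sys.modules lookup
    return once_demo.MARKER

executed_once = use_it_again() is first_marker
print(executed_once)  # True

# An explicit reload runs the module body again, producing a new MARKER.
reran = importlib.reload(once_demo).MARKER is not first_marker
print(reran)  # True

# Clean up the temporary module.
sys.path.remove(tmp_dir)
del sys.modules['once_demo']
shutil.rmtree(tmp_dir)
```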

Consider also the "disassembly" of two otherwise equal functions where one tries to import a module and the other doesn't:

>>> import dis
>>> def Foo():
...     import random
...     return random.randint(1,100)
... 
>>> dis.dis(Foo)
  2           0 LOAD_CONST               1 (-1)
              3 LOAD_CONST               0 (None)
              6 IMPORT_NAME              0 (random)
              9 STORE_FAST               0 (random)

  3          12 LOAD_FAST                0 (random)
             15 LOAD_ATTR                1 (randint)
             18 LOAD_CONST               2 (1)
             21 LOAD_CONST               3 (100)
             24 CALL_FUNCTION            2
             27 RETURN_VALUE        
>>> def Bar():
...     return random.randint(1,100)
... 
>>> dis.dis(Bar)
  2           0 LOAD_GLOBAL              0 (random)
              3 LOAD_ATTR                1 (randint)
              6 LOAD_CONST               1 (1)
              9 LOAD_CONST               2 (100)
             12 CALL_FUNCTION            2
             15 RETURN_VALUE        

I'm not sure how much further the bytecode gets translated for the virtual machine, but if this were in an important inner loop of your program, you'd certainly want to give some weight to the Bar approach over the Foo approach.

A quick and dirty timeit test does show a modest speed improvement when using Bar:

$ python -m timeit -s "from a import Foo,Bar" -n 200000 "Foo()"
200000 loops, best of 3: 10.3 usec per loop
$ python -m timeit -s "from a import Foo,Bar" -n 200000 "Bar()"
200000 loops, best of 3: 6.45 usec per loop
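If you'd rather reproduce the comparison from inside a script than from the shell, something like the following should work on Python 3. The loop and repeat counts are arbitrary choices of mine, and the absolute numbers will differ from machine to machine, so treat it as a sketch and not a benchmark.

```python
import random
import timeit

def Foo():
    # Local import: a sys.modules lookup plus a local binding on every call.
    import random
    return random.randint(1, 100)

def Bar():
    # Relies on the module-level import above.
    return random.randint(1, 100)

# Take the best of three runs for each function, as timeit's CLI does.
foo_time = min(timeit.repeat(Foo, number=20000, repeat=3))
bar_time = min(timeit.repeat(Bar, number=20000, repeat=3))

print('Foo: %.4f s, Bar: %.4f s' % (foo_time, bar_time))
```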
Erleneerlewine answered 9/11, 2009 at 4:41 Comment(0)
13

Please see PEP 8:

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

Please note that this is purely a stylistic choice, as Python will treat all import statements the same regardless of where they are declared in the source file. Still, I would recommend that you follow common practice, as this will make your code more readable to others.

Gca answered 9/11, 2009 at 4:35 Comment(2)
cool link, but: "Two good reasons to break a particular rule: (1) When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules. ..." - Carcinoma
This is only true of top-level imports. An import inside of a function will not be processed until that function is invoked. While I would generally discourage doing this, it can be useful when you have a dependency that is either very expensive to load or may not be available in all environments. - Irita
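A sketch of the "may not be available in all environments" case, for Python 3. The package name fancy_tables_demo and its render function are entirely hypothetical; the point is only that the import is not attempted until the function is called, so a missing package can be handled at that moment.

```python
def render_report(rows):
    """Return rows as text, using a hypothetical formatter if installed."""
    try:
        # `fancy_tables_demo` is a made-up package, so this import is
        # expected to fail; it is attempted only when the function runs.
        import fancy_tables_demo
    except ImportError:
        # Plain fallback: one comma-separated line per row.
        return '\n'.join(', '.join(str(cell) for cell in row) for row in rows)
    return fancy_tables_demo.render(rows)

# Defining the function did not raise, even though the package is absent.
text = render_report([(1, 'a'), (2, 'b')])
print(text)
```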
9

I've done this, and then wished I hadn't. Ordinarily, if I'm writing a function, and that function needs to use StringIO, I can look at the top of the module, see if it's being imported, and then add it if it's not.

Suppose I don't do this; suppose I add it locally within my function. And then suppose at some point I, or someone else, add a bunch of other functions that use StringIO. That person is going to look at the top of the module and add import StringIO. Now the function contains code that's not only unexpected but redundant.

Also, it violates what I think is a pretty important principle: don't directly modify module-level state from inside a function.

Edit:

Actually, it turns out that all of the above is nonsense.

Importing a module doesn't modify module-level state (it initializes the module being imported, if nothing else has yet, but that's not at all the same thing). Importing a module that you've already imported elsewhere costs you nothing except a lookup to sys.modules and creating a variable in the local scope.

Knowing this, I feel kind of dumb fixing all of the places in my code where I fixed it, but that's my cross to bear.

Razorback answered 9/11, 2009 at 22:3 Comment(3)
i bolded that for you because i hadn't thought of that and it's really obvious in retrospect. - Carcinoma
If by this you mean "import StringIO" in the middle of a function changes anything at all for the module scope where the function was defined... this is wrong. As I explain in my answer, executing an "import" in the middle of a function does not change anything in the module state where that function was defined. And what's wrong with modifying module-level state from inside a function anyway? There's an entire keyword devoted to doing just that thing: "global". - Slipknot
I wish more comments involved someone learning something and admitting their error! - Lehrer
3

When the Python interpreter hits an import statement, it starts reading all the function definitions in the file that is being imported. This explains why sometimes, imports can take a while.

The idea behind doing all the importing at the start IS a stylistic convention as Andrew Hare points out. However, you have to keep in mind that by doing so, you are implicitly making the interpreter check if this file has already been imported after the first time you import it. It also becomes a problem when your code file becomes large and you want to "upgrade" your code to remove or replace certain dependencies. This will require you to search your whole code file to find all the places where you have imported this module.

I would suggest following the convention and keeping the imports at the top of your code file. If you really do want to keep track of dependencies for functions, then I would suggest adding them in the docstring for that function.

Nik answered 9/11, 2009 at 4:45 Comment(1)
"When the python interpreter hits an import statement, it starts reading all the function definitions in the file" -- (1) it happens only the FIRST time that the module is imported during the current run of the Python interpreter (and the imported module is stashed in sys.modules so that only a name lookup is required the next time). (2) "reading function definitions" is NOT what it does; it EXECUTES all module-level code (mostly def and class statements). Other stuff, e.g. more imports or setting up module-level data structures (sometimes from files), can take a while. (3) irrelevant. - Turnstone
1

I can see two cases where you need to import locally:

  1. For testing or temporary use, when you only need something briefly; in that case, put the import at the place of use.

  2. Sometimes, to avoid a cyclic dependency, you need to import inside a function, but that means you have a problem elsewhere.

Otherwise, always put imports at the top, for the sake of efficiency and consistency.
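For the cyclic case, here is a rough Python 3 sketch. It writes two throwaway modules to a temporary directory; the names models_demo and managers_demo are invented to echo the usual models/managers layout, and nothing here is specific to any framework.

```python
import os
import sys
import tempfile
import textwrap

tmp_dir = tempfile.mkdtemp()

def write_module(name, source):
    # Helper for this sketch: save `source` as <name>.py in the temp dir.
    with open(os.path.join(tmp_dir, name + '.py'), 'w') as handle:
        handle.write(textwrap.dedent(source))

write_module('models_demo', '''
    import managers_demo  # top-level import of the other module

    class Model:
        name = 'model'

    def default_manager():
        return managers_demo.Manager()
''')

write_module('managers_demo', '''
    class Manager:
        def model_name(self):
            # Deferred until call time, when models_demo is fully loaded.
            from models_demo import Model
            return Model.name
''')

sys.path.insert(0, tmp_dir)
import models_demo

result = models_demo.default_manager().model_name()
print(result)  # model
```

Had managers_demo imported Model at its top level instead, importing models_demo first would fail, because Model would not exist yet at the moment managers_demo asked for it.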

Pender answered 9/11, 2009 at 5:3 Comment(0)
1

I believe that the strongest use case for local imports is to prevent your program from loading unnecessary modules that aren't being used in its current invocation.

A simple example is a CLI or codebase that has a lot of subcommands, many of which have dramatically different module dependencies.

All these dependencies need to be accessible from the main module, but if the code loaded every module that it might need in every codepath before starting execution, there might be a significant delay and unnecessary memory use.

A local import is performed lazily, only when the code that uses it is executed, so if all the large dependencies in a project are local imports, a trivial program using the codebase can start up in milliseconds and megabytes, not seconds and gigabytes.
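A toy version of that layout, for Python 3. The subcommand names are made up, and the standard statistics module merely stands in for a genuinely heavy dependency.

```python
import sys

def cmd_version(args):
    # Cheap subcommand: needs nothing beyond what is already loaded.
    return 'demo 1.0'

def cmd_stats(args):
    # Stand-in for a heavy dependency, imported only if this command runs.
    import statistics
    return str(statistics.mean(float(a) for a in args))

COMMANDS = {'version': cmd_version, 'stats': cmd_stats}

def main(argv):
    # Dispatch on the first argument; the rest go to the subcommand.
    name, args = argv[0], argv[1:]
    return COMMANDS[name](args)

print(main(['version']))               # demo 1.0
print(main(['stats', '1', '2', '3']))  # 2.0
```

Running only the version command never executes the import inside cmd_stats, so that dependency costs nothing for that invocation.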

Lehrer answered 8/8, 2023 at 9:34 Comment(0)
