Passing arguments with wildcards to a Python script
I

4

19

I want to do something like this:

c:\data\> python myscript.py *.csv

and pass all of the .csv files in the directory to my Python script (such that sys.argv contains ["file1.csv", "file2.csv"], etc.)

But sys.argv just receives ["*.csv"] indicating that the wildcard was not expanded, so this doesn't work.

I feel like there is a simple way to do this, but can't find it on Google. Any ideas?

Ignitron answered 1/1, 2009 at 23:8 Comment(0)
R
26

You can use the glob module. That way you don't depend on the behavior of a particular shell (well, you still depend on the shell not expanding the arguments, but at least on Unix you can get that by escaping or quoting the wildcards :-) ).

from glob import glob
filelist = glob('*.csv')  # or pass a pattern taken from sys.argv, e.g. glob(sys.argv[1])
Richardo answered 1/1, 2009 at 23:15 Comment(6)
It's not "can"; for Windows, it's "must". - Wordage
Well, you can also use os.walk, so it's not strictly a must :P - Richardo
@Vinko Vrsalovic: true. os.walk seems more cumbersome than glob. glob's not required, but it's so perfect a fit for this problem. - Wordage
Just what I was looking for :) - Dongola
Does anyone know why Windows shells don't handle this for you? - Sanctified
@Ryan: Because DOS utilities were built by people who didn't know UNIX and decided to let each utility handle the expansion instead of making the shell smarter. Awful design, but that's just one among many :-) - Richardo
D
17

In Unix, the shell expands wildcards, so programs get the expanded list of filenames. Windows doesn't do this: the shell passes the wildcards directly to the program, which has to expand them itself.

Vinko is right: the glob module does the job:

import glob
import sys

# Expand the pattern given as the first argument (Windows passes it through unexpanded).
for arg in glob.glob(sys.argv[1]):
    print("Arg:", arg)
Diaphoresis answered 1/1, 2009 at 23:19 Comment(2)
+1. Note that this also works perfectly fine if sys.argv[1] is a fully specified filename instead of a wildcard. In that case, glob.glob just returns a list containing that very filename (provided the file exists). - Marguerite
This also works if you forward a parameter with a wildcard in a directory component to your Python script, e.g. script.py c:\path\*\subdir. glob generates a list that I can use later on. Great module! - Azotic
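The behavior described in these two comments can be checked with a small, self-contained sketch. It builds a throwaway directory tree (the names file1.csv, nope.csv, a, b, and subdir are made up for illustration) and collects what glob.glob returns for a fully specified filename, a filename that does not exist, and a wildcard in a directory component.

```python
import glob
import os
import tempfile


def demo_glob_cases():
    """Return glob results for a literal name, a missing name, and a directory wildcard."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: root/file1.csv, root/a/subdir, root/b/subdir
        open(os.path.join(root, "file1.csv"), "w").close()
        for parent in ("a", "b"):
            os.makedirs(os.path.join(root, parent, "subdir"))

        literal = glob.glob(os.path.join(root, "file1.csv"))
        missing = glob.glob(os.path.join(root, "nope.csv"))
        nested = glob.glob(os.path.join(root, "*", "subdir"))

        def relative(paths):
            # Strip the temporary prefix and sort, since glob's order is not guaranteed.
            return sorted(os.path.relpath(p, root) for p in paths)

        return relative(literal), relative(missing), relative(nested)


if __name__ == "__main__":
    for result in demo_glob_cases():
        print(result)
```

Note the middle case: a plain filename that does not exist yields an empty list, not a list containing the name.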
N
2

If multiple wildcard items are passed in (e.g. python myscript.py *.csv *.txt), then glob(sys.argv[1]) alone won't cut it, because it expands only the first argument. You may need something like the following.

import sys
from glob import glob

args = [f for l in sys.argv[1:] for f in glob(l)]

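This works even if some arguments don't contain wildcard characters (e.g. python myscript.py abc.txt *.csv anotherfile.dat), as long as those files exist: glob returns only paths that exist, so a plain filename that doesn't exist is dropped from the list.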
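One caveat with the comprehension: an argument that matches nothing simply disappears, because glob returns an empty list for it. If you would rather keep such arguments (for example an output filename that doesn't exist yet), a variant along these lines might help. The helper name expand_args and the keep-as-is fallback are assumptions made for this sketch, not something glob does for you.

```python
import sys
from glob import glob


def expand_args(args):
    """Expand each argument with glob; keep arguments that match nothing unchanged."""
    expanded = []
    for arg in args:
        # Sort the matches, since glob does not guarantee any particular order.
        matches = sorted(glob(arg))
        # Assumption for this sketch: an argument with no matches is passed
        # through as-is rather than silently dropped.
        expanded.extend(matches if matches else [arg])
    return expanded


if __name__ == "__main__":
    for name in expand_args(sys.argv[1:]):
        print(name)
```

With this fallback the script can report "file not found" for a mistyped name itself, instead of never seeing the argument at all.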

Niedersachsen answered 20/9, 2022 at 13:25 Comment(1)
I've never seen two for...in's within a single list comprehension before. What's happening there? - Parlando
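To answer the question in the comment: multiple for clauses in a list comprehension behave like nested loops, read left to right with the leftmost as the outer loop. A rough equivalent, wrapped in two function names invented for this sketch so the versions can be compared side by side:

```python
from glob import glob


def expand_with_comprehension(patterns):
    # Same shape as the one-liner in the answer.
    return [f for l in patterns for f in glob(l)]


def expand_with_loops(patterns):
    # The same logic written as explicit nested loops.
    result = []
    for l in patterns:        # first "for" clause: the outer loop
        for f in glob(l):     # second "for" clause: nested inside the first
            result.append(f)
    return result
```

So each argument is expanded in turn, and all the resulting filenames are collected into one flat list.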
P
0

If your script is a utility, I suggest defining a function like this in your .bashrc so you can call it from any directory:

myscript() {
   python /path/myscript.py "$@"
}

Bash expands the wildcards before calling the function, so the whole list is passed to your Python script and you can process it like this:

import sys

for _file in sys.argv[1:]:
    print(_file)  # replace with your own processing
Pentagon answered 15/4, 2021 at 9:15 Comment(0)
