Read from File, or STDIN
R

9

85

I've written a command line utility that uses getopt for parsing arguments given on the command line. I would also like a filename to be an optional argument, as it is in other utilities such as grep and cut. So, I would like it to have the following usage:

tool -d character -f integer [filename]

How can I implement the following?

  • if a filename is given, read from the file.
  • if a filename is not given, read from STDIN.
Reincarnation answered 16/11, 2009 at 21:32 Comment(1)
see also unix.stackexchange.com/questions/47098/…Dedicate
S
77

In the simplest terms:

import sys
# parse command line
if file_name_given:
    inf = open(file_name_given)
else:
    inf = sys.stdin

At this point you would use inf to read from the file. Depending on whether a filename was given, this would read from the given file or from stdin.

When you need to close the file, you can do this:

if inf is not sys.stdin:
    inf.close()

However, in most cases it will be harmless to close sys.stdin if you're done with it.
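To tie this back to the getopt-based tool in the question, here is a minimal sketch. The helper name open_input is hypothetical, and it assumes the option string "d:f:" matches the usage line in the question. getopt.getopt returns the leftover positional arguments as its second value, so the first of those, if present, is treated as the filename.

```python
import getopt
import sys


def open_input(argv):
    """Parse `-d character -f integer [filename]` and pick the input stream.

    Hypothetical helper: returns (options dict, file object).
    """
    # "d:f:" means both -d and -f take a value.
    opts, args = getopt.getopt(argv, "d:f:")
    options = dict(opts)
    # Anything getopt did not consume comes back in `args`.
    inf = open(args[0]) if args else sys.stdin
    return options, inf
```

A caller would pass sys.argv[1:] and, when done, close the returned object only if it is not sys.stdin, as shown above.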

Shopkeeper answered 16/11, 2009 at 21:40 Comment(3)
@thefourtheye: Yes, both those functions will read from either a file or from sys.stdin.Shopkeeper
I found another way to solve this problem, I blogged about it here dfourtheye.blogspot.in/2013/05/… and added an answer to this question as well.Toni
@Toni has deleted their answer; you probably don't need to click through to a blog to discover sys.stdin = open(file_name)Tarr
T
101

The fileinput module may do what you want - assuming the non-option arguments are in args then:

import fileinput
for line in fileinput.input(args):
    print(line, end='')  # each line already ends with a newline

If args is empty then fileinput.input() will read from stdin; otherwise it reads from each file in turn, in a similar manner to Perl's while(<>).
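As a rough sketch of how this might be wrapped for reuse, the generator below passes the leftover arguments explicitly. The name read_lines is hypothetical, and args is assumed to be the list of non-option arguments left over after option parsing.

```python
import fileinput


def read_lines(args):
    """Yield lines from each file named in `args`, or from stdin if it is empty.

    `args` is assumed to be the list of leftover (non-option) arguments.
    """
    # Passing a list explicitly keeps fileinput from falling back to
    # sys.argv[1:], which it only does when `files` is None.
    with fileinput.input(files=args) as stream:
        for line in stream:
            yield line
```

Because an empty list still counts as "explicitly passed", the option flags in sys.argv are never mistaken for filenames.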

Turnbuckle answered 16/11, 2009 at 21:35 Comment(5)
This was just as good of an answer, but isn't quite as generalizable. I will remember to use fileinput next time if appropriate.Reincarnation
Right, but if you're using getopt (as the OP is) then you probably just want to pass the leftover args rather than sys.argv[1:] (which is the default).Turnbuckle
fileinput is a strange and annoying API, it forces you to use flagged arguments on the command line.Britisher
@Britisher It is not a fileinput design fault: distinguishing arguments that are the names of input files from other arguments is an issue that is inherent to the problem domain. Fileinput (especially with argparse) simplifies the use of a common pattern for doing this, which you can choose to use or not, but if you have some other way of making the distinction, you can send a slice of sys.argv (or a different array of names altogether) to fileinput.input() - and you do not have to put in a fake sys.argv[0] when you explicitly pass an array.Numerary
If args is an empty sequence, it will read from stdin, yes. If it's None, then it will be as if it weren't supplied; i.e., fileinput.input will do its own parsing of the command line, and treat each token as a filename to open.Leta
W
22

I prefer to use "-" as an indicator that you should read from stdin; it's more explicit:

import sys
with open(sys.argv[1], 'r') if sys.argv[1] != "-" else sys.stdin as f:
    pass # do something here
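One caveat with this construct is that leaving the with block closes whatever object was opened, including sys.stdin. The sketch below illustrates the effect; read_after_with is a hypothetical helper, and io.StringIO is only assumed as a stand-in for sys.stdin so the example can run anywhere.

```python
import io


def read_after_with(stream):
    """Use `stream` in a with block, then try to read from it again.

    Returns the ValueError raised by the second read, or None if it worked.
    """
    with stream as f:
        f.read()
    try:
        # The with block has already closed the stream at this point.
        stream.readline()
        return None
    except ValueError as exc:
        return exc


# io.StringIO plays the role of sys.stdin here.
error = read_after_with(io.StringIO("first line\n"))
```

If the script reads its input only once, this is harmless; it matters only when stdin is needed again later.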
Wellordered answered 25/1, 2015 at 21:0 Comment(2)
Your solution will close sys.stdin, so calls to the input function after the with statement will raise ValueError.Epigynous
@TimofeyBondarev That may be true .. but most frequently the input is only used once in a script. This is a useful construct.Megohm
T
21

I like the general idiom of using a context manager, but the (too) trivial solution ends up closing sys.stdin when you are out of the with statement, which I want to avoid.

Borrowing from this answer, here is a workaround:

import sys
import contextlib

@contextlib.contextmanager
def _smart_open(filename, mode='r'):  # the old 'U' mode flag was removed in Python 3.11
    if filename == '-':
        if mode is None or mode == '' or 'r' in mode:
            fh = sys.stdin
        else:
            fh = sys.stdout
    else:
        fh = open(filename, mode)
    try:
        yield fh
    finally:
        if filename != '-':
            fh.close()
    
if __name__ == '__main__':
    args = sys.argv[1:]
    if args == []:
        args = ['-']
    for filearg in args:
        with _smart_open(filearg) as handle:
            do_stuff(handle)

I suppose you could achieve something similar with os.dup() but the code I cooked up to do that turned out to be more complex and more magical, whereas the above is somewhat clunky but very straightforward.
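For the read-only case, a shorter variant of the same idea might use contextlib.nullcontext, assuming Python 3.7 or later. The helper name open_or_stdin is hypothetical; this is a sketch, not a drop-in replacement for _smart_open, since it does not handle writing to stdout.

```python
import contextlib
import sys


def open_or_stdin(filename, mode="r"):
    """Return a context manager yielding a file, or sys.stdin for "-".

    Hypothetical helper; read-only sketch.
    """
    if filename == "-":
        # nullcontext hands back sys.stdin and does nothing on exit,
        # so the stream stays open after the with block.
        return contextlib.nullcontext(sys.stdin)
    # A regular file object is its own context manager and closes on exit.
    return open(filename, mode)
```

It is used the same way, for example `with open_or_stdin(filearg) as handle:`.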

Tarr answered 23/4, 2015 at 12:53 Comment(2)
Thanks a lot! This is exactly what I was looking for. A very clear and straightforward solution.Drowsy
This is also a useful bit of code to use when argparse.FileType just gets too annoying (which happens fairly often for me).Blastocoel
H
15

To make use of Python's with statement, one can use the following code:

import sys
with open(sys.argv[1], 'r') if len(sys.argv) > 1 else sys.stdin as f:
    # read data using f
    data = f.read()
Hamlin answered 6/11, 2013 at 1:49 Comment(1)
Your solution will close sys.stdin, so calls to the input function after the with statement will raise ValueError.Epigynous
F
13

Not a direct answer but related.

Normally when you write a python script you could use the argparse package. If this is the case you can use:

import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'), default=sys.stdin)

The argparse documentation describes nargs='?' as follows: one argument will be consumed from the command line if possible, and produced as a single item. If no command-line argument is present, the value from default will be produced.

Here default is set to sys.stdin, so if a filename is given, that file is opened and read; if not, input is taken from stdin. Note that infile is a positional argument in the example above.

For more, see: https://docs.python.org/2/library/argparse.html#nargs
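Applied to the usage line from the question (tool -d character -f integer [filename]), a minimal sketch might look like this. The function name build_parser and the dest names delimiter and field are assumptions made for illustration.

```python
import argparse
import sys


def build_parser():
    """Build a parser for: tool -d character -f integer [filename]."""
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("-d", dest="delimiter", required=True)
    parser.add_argument("-f", dest="field", type=int, required=True)
    # Optional positional argument: argparse opens the named file,
    # or hands back sys.stdin when no filename is given.
    parser.add_argument("infile", nargs="?", type=argparse.FileType("r"),
                        default=sys.stdin)
    return parser
```

Calling `build_parser().parse_args()` then gives an args.infile that can be read the same way in both cases.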

Fatty answered 24/5, 2018 at 9:11 Comment(0)
L
8

Switch to argparse (it's also part of the standard library) and use an argparse.FileType with a default value of stdin:

import argparse, sys

p = argparse.ArgumentParser()
p.add_argument('input', nargs='?',
  type=argparse.FileType(), default=sys.stdin)
args = p.parse_args()

print(args.input.readlines())

This will not let you specify encoding and other parameters for stdin, however; if you want to do that you need to make the argument non-optional and let FileType do its thing with stdin when - is given as an argument:

p.add_argument('input', type=argparse.FileType(encoding='UTF-8'))

Take heed that this latter case will not honour binary mode ('b') I/O. If you need only that, you can use the default argument technique above, but extract the binary I/O object, e.g., default=sys.stdout.buffer for stdout. However, this will still break if the user specifies - anyway. (With - stdin/stdout is always wrapped in a TextIOWrapper.)

If you want it to work with -, or have any other arguments you need to provide when opening the file, you can fix the argument if it got wrapped wrong:

p.add_argument('output', type=argparse.FileType('wb'))
args = p.parse_args()
if hasattr(args.output, 'buffer'):
    #   If the argument was '-', FileType('wb') ignores the 'b' when
    #   wrapping stdout. Fix that by grabbing the underlying binary writer.
    args.output = args.output.buffer

(Hat tip to medhat for mentioning add_argument()'s type parameter.)

Lucilelucilia answered 3/9, 2020 at 3:40 Comment(0)
A
3

A KISS solution is:

import sys

# file is assumed to hold the path given on the command line, or "-" for stdin
if file == "-":
    content = sys.stdin.read()
else:
    with open(file) as f:
        content = f.read()
print(content)   # Or whatever you want to do with the content of the file.
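The same logic can be wrapped in a small function so it is easy to reuse. This is only a sketch; the name read_content is hypothetical, and treating "-" as stdin is the convention assumed above.

```python
import sys


def read_content(file):
    """Return all input as one string; "-" means stdin (an assumed convention)."""
    if file == "-":
        # stdin is read but deliberately left open.
        return sys.stdin.read()
    with open(file) as f:
        return f.read()
```

Because sys.stdin is never used in a with statement here, it stays open for any later reads.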
Alloplasm answered 6/8, 2022 at 0:20 Comment(1)
+ if you wanted array of file lines, it's: sys.stdin.readlines(), f.readlines()Trimetallic
C
1

Something like:

if input_from_file:
    f = open(file_name, "rt")
else:
    f = sys.stdin
inL = f.readline()
while inL:
    print(inL.rstrip())
    inL = f.readline()
Carlile answered 16/11, 2009 at 21:54 Comment(0)
