How to avoid buffering in the Python fileinput library

I've seen this question asked here before, but the answers given did not work in my case, and the question was marked as a duplicate.

I dug into the source code (/usr/lib/python3.2/fileinput.py) and saw that readlines(bufsize) is used internally to load a buffer. No shell or other piping shenanigans are involved.

Officer answered 21/2, 2013 at 21:13 Comment(1)
Actually, I think you may want python -u on top of whatever else you need. You want to remove any underlying Python-and/or-stdio buffering on stdin, and also remove any higher-level line-reading buffer, right? - Jolie

What worked for me was simply setting FileInput(bufsize=1). The file.readlines() documentation does state "The optional size argument, if given, is an approximate bound on the total number of bytes in the lines returned." In practice, I get exactly one new line every time rather than having to fill a buffer.

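import fileinput

# bufsize=1 only has an effect on versions where fileinput passes it to readlines().
with fileinput.input(bufsize=1) as f:
    for line in f:
        print("One line in, one line out!")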
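
To see why a small size hint tends to yield one line per call, here is a minimal sketch using an in-memory stream as a stand-in for the real input. The name `sample` is hypothetical, and the sketch shows how `readlines(hint)` behaves on the `io` classes, not how fileinput behaves.

```python
import io

# Hypothetical in-memory input standing in for stdin or a file.
sample = io.StringIO("first\nsecond\nthird\n")

# The hint is an approximate bound: readlines() stops once the lines
# collected so far exceed it, so a hint of 1 returns after one line here.
chunk = sample.readlines(1)
print(chunk)

# The remaining lines are still available to later calls.
rest = sample.readlines()
print(rest)
```

Because the bound is approximate, very short lines (a bare newline, for instance) may come back together with the following line.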
Officer answered 21/2, 2013 at 21:13 Comment(7)
It seems like this is actually guaranteed to work, as long as fileinput uses readlines(self._bufsize). Unfortunately, that in itself isn't documented to be true, but if you only care about CPython 3.2 you can be sure it is, and it seems pretty likely to be safe pretty widely beyond that, so if that's good enough, great. - Jolie
And if you read through IOBase.readlines (pure Python and C implementations), it will call readline, which will call read 1 byte at a time if there's no buffer or peek. So, I think that clinches it, and you should accept your own answer. - Jolie
Also, you might want to file a documentation bug on the fact that fileinput.input doesn't mention what bufsize does at all, and that the language reference should have enough information to guarantee that bufsize=1 (together with unbuffered stdin, when reading from stdin) means unbuffered fileinput. - Jolie
@Jolie Where is the best place to file a Python documentation bug? - Officer
I believe docs bugs go to the same issue tracker (bugs.python.org) as code bugs, although you probably want to read the Python Developer's Guide to make sure I'm remembering right. Also, if you're not sure whether something is a bug, it may be better to bring it up on one of the mailing lists and get wider feedback first. - Jolie
More specifically, the Helping with Documentation section gives the details of which mailing list, how to filter on the issue tracker, etc. - Jolie
From the fileinput docs: "Changed in version 2.7.12: The bufsize parameter is no longer used." - Olindaolinde
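
If `bufsize` has no effect on your Python version, one alternative that does not depend on it is to call `readline()` on the stream yourself. This is a minimal sketch, assuming you only need a single stream such as `sys.stdin`; `iter_lines` is a hypothetical helper, not part of fileinput.

```python
import io
import sys


def iter_lines(stream=None):
    """Yield lines one at a time, as soon as readline() returns each one."""
    if stream is None:
        stream = sys.stdin
    # iter(callable, sentinel) keeps calling stream.readline() until it
    # returns "", which is what text streams return at end of file.
    yield from iter(stream.readline, "")


# Usage with an in-memory stream; with no argument it reads sys.stdin.
demo = list(iter_lines(io.StringIO("one\ntwo\n")))
print(demo)
```

When reading from a pipe, this may still need to be combined with python -u, as suggested in the comment on the question, to avoid buffering below the line-reading layer.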
