How to get the current line number of an open file in Python?

Suppose you open a file and call seek() to move somewhere inside it. How do you find out which line the current position is on?

(I personally solved this with an ad-hoc file class that scans the file and then maps a seek position to its line, but I wanted to see other approaches and to add this question to Stack Overflow, as I could not find the problem anywhere on Google.)
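A minimal sketch of what such a position-to-line index might look like. This is not the asker's actual class; `LineIndex` and `line_at` are hypothetical names, and the sketch assumes the file is opened in binary mode so that offsets are byte counts.

```python
import bisect
import io


class LineIndex:
    """Maps byte offsets to 1-based line numbers (hypothetical helper)."""

    def __init__(self, stream):
        # stream must be opened in binary mode so that offsets are bytes.
        self.starts = []          # starting offset of every line
        offset = 0
        for line in stream:
            self.starts.append(offset)
            offset += len(line)
        self.size = offset        # total number of bytes scanned

    def line_at(self, pos):
        """Return the number of the line that contains byte offset pos."""
        if not 0 <= pos < self.size:
            raise ValueError("offset outside the file")
        # bisect_right counts how many lines start at or before pos.
        return bisect.bisect_right(self.starts, pos)


index = LineIndex(io.BytesIO(b"ab\ncd\nef"))
print(index.line_at(0), index.line_at(2), index.line_at(3), index.line_at(7))
```

The scan costs one pass over the file, after which each lookup is a binary search. With a real file you would build the index once from `open(path, "rb")` and then call `index.line_at(f.tell())` after each seek.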

Undersized answered 27/11, 2009 at 14:58 Comment(4)
I actually posted the class somewhere here on SO... don't know where. – Undersized
If you are seeking to a byte offset, there is no way to know the line # without counting the # of \n characters encountered before that position. As to what the most efficient way is with a file, I'm not sure.... Good luck! – Leslielesly
Yep. Maybe there's a library that provides this. I implemented it myself, as I said, but I'd prefer to delegate the task to an external library if possible. – Undersized
@Stefano - are you looking for #1657799? – Marco

Here's how I would approach the problem, using as much laziness as possible:

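from random import randint
from itertools import takewhile

path = "/etc/passwd"

# Binary mode, so that tell() and len(line) are both measured in bytes.
with open(path, "rb") as f:
    f.seek(randint(10, 250))
    pos = f.tell()

print("pos=%d" % pos)

def countbytes(iterable):
    """Yield the running byte total after each line."""
    total = 0
    for item in iterable:
        total += len(item)
        yield total

# Every line whose end offset is <= pos lies entirely before pos.
with open(path, "rb") as lines:
    print(1 + len(list(takewhile(lambda end: end <= pos, countbytes(lines)))))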

For a slightly less readable but lazier approach that never builds the intermediate list, use enumerate and dropwhile:

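from random import randint
from itertools import dropwhile

path = "/etc/passwd"

# Binary mode, so that tell() and len(line) are both measured in bytes.
with open(path, "rb") as f:
    f.seek(randint(10, 250))
    pos = f.tell()

print("pos=%d" % pos)

def countbytes(iterable):
    """Yield the running byte total after each line."""
    total = 0
    for item in iterable:
        total += len(item)
        yield total

# Skip the lines that end at or before pos; the first one left contains pos.
# next() raises StopIteration if pos is at or past the end of the file.
with open(path, "rb") as lines:
    index, _ = next(dropwhile(lambda x: x[1] <= pos, enumerate(countbytes(lines))))
    print(index + 1)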
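To see how the running-total idea treats offsets at line boundaries, here is a small self-contained sketch on an in-memory file. `line_of_offset` is a hypothetical helper name, not part of the answer above, and it assumes a binary stream.

```python
import io


def line_of_offset(stream, pos):
    """Return the 1-based line containing byte offset pos, or None past EOF."""
    end = 0
    for number, line in enumerate(stream, start=1):
        end += len(line)      # offset just past the end of this line
        if pos < end:
            return number
    return None


data = b"one\ntwo\nthree\n"
for pos in (0, 3, 4, 8, 14):
    print(pos, line_of_offset(io.BytesIO(data), pos))
```

Offset 3 is the newline that terminates line 1, so it still counts as line 1, while offset 4 is the first byte of line 2. Offset 14 is past the last byte, so the helper returns None rather than raising.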
Velour answered 27/11, 2009 at 15:22 Comment(0)

When you use seek(), Python jumps straight to the requested offset without reading the data in between. But to know the current line number, you have to examine every character up to that position. So you might as well abandon seek() in favor of read():

Replace

f = open(filename, "r")
f.seek(55)

with

f = open(filename, "r")
line=f.read(55).count('\n')+1
print(line)

Perhaps you do not wish to use f.read(num) since this may require a lot of memory if num is very large. In that case, you could use a generator like this:

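import itertools

# f is the file opened above; num is the offset you would have passed to seek().
# Reading one character at a time keeps memory use constant.
# Add 1 so the result is a 1-based line number, as in the read() example.
line_number = 1 + sum(f.read(1) == '\n' for _ in itertools.repeat(None, num))
pos = f.tell()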

This leaves the file at the same position as f.seek(num) would, as long as every character is a single byte, with the added benefit of giving you line_number.
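A middle ground between reading everything at once and reading one character at a time is to read in fixed-size chunks. The following is a sketch under assumptions: `seek_and_count` and its `chunk_size` parameter are hypothetical names, and the stream is assumed to be opened in binary mode so that `num` is a byte count.

```python
import io


def seek_and_count(stream, num, chunk_size=64 * 1024):
    """Advance a binary stream by up to num bytes; return the line reached."""
    line_number = 1
    remaining = num
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:          # reached end of file before num bytes
            break
        line_number += chunk.count(b"\n")
        remaining -= len(chunk)
    return line_number


f = io.BytesIO(b"alpha\nbeta\ngamma\n")
print(seek_and_count(f, 8, chunk_size=4), f.tell())
```

Memory use is bounded by `chunk_size`, and the number of read calls drops from `num` to roughly `num / chunk_size`. The function must be called on a stream positioned at the start of the file for the returned line number to be meaningful.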

Backwater answered 27/11, 2009 at 15:32 Comment(0)
