Python read specific lines of text between two strings

I am having trouble getting Python to read specific lines. What I'm working on is something like this:

lines of data not needed
lines of data not needed
lines of data not needed

--------------------------------------
    ***** REPORT 1 *****
--------------------------------------

[key] lines of interest are here
[key] lines of interest are here
[key] lines of interest are here
[key] lines of interest are here
[key] lines of interest are here      #This can also be the EOF

--------------------------------------    
    ***** REPORT 2 *****
--------------------------------------

lines of data not needed
lines of data not needed
lines of data not needed         #Or this will be the EOF

What I've attempted was something such as:

flist = open("filename.txt").readlines()

for line in flist:
  if line.startswith("\t**** Report 1"):
    break
for line in flist:
  if line.startswith("\t**** Report 2"):
    break
  if line.startswith("[key]"):
    #do stuff with data

However, I have a problem when the file ends without an end delimiter, such as when Report 2 is not present. What is a better approach?

Subulate answered 31/7, 2012 at 2:43 Comment(0)

One slight modification which looks like it should cover your problem:

parsing = False
with open("filename.txt") as f:
    for line in f:
        # Match the header text as it appears in the sample file
        if "***** REPORT 1 *****" in line:
            parsing = True
        elif "***** REPORT 2 *****" in line:
            parsing = False
        if parsing:
            pass  # do stuff with the line here

If you want to avoid processing the "***** REPORT 1 *****" line itself, simply put the start condition after the "if parsing" block, i.e.

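parsing = False
with open("filename.txt") as f:
    for line in f:
        # Check the end marker first so the REPORT 2 header is not processed
        if "***** REPORT 2 *****" in line:
            parsing = False
        if parsing:
            pass  # do stuff with the line here
        # Check the start marker last so the REPORT 1 header is skipped
        if "***** REPORT 1 *****" in line:
            parsing = True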
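
As a rough, self-contained sketch of the same flag idea, the loop can be wrapped in a generator and tried on an in-memory sample. The name lines_between and the sample text are made up for illustration, and the continue is just one way to skip the header line. The sample has no REPORT 2 header, so the report runs to the end of the input:

```python
import io

def lines_between(lines, start, end):
    """Yield lines after the one containing `start`, up to the one containing `end` or EOF."""
    parsing = False
    for line in lines:
        if start in line:
            parsing = True
            continue  # skip the start header itself
        if end in line:
            parsing = False
        if parsing:
            yield line

# Hypothetical sample: the end marker is missing, as in the question's problem case
sample = io.StringIO(
    "lines of data not needed\n"
    "--------\n"
    "    ***** REPORT 1 *****\n"
    "--------\n"
    "\n"
    "[key] first\n"
    "[key] second\n"
)

keys = [
    line.rstrip("\n")
    for line in lines_between(sample, "***** REPORT 1 *****", "***** REPORT 2 *****")
    if line.startswith("[key]")
]
print(keys)  # ['[key] first', '[key] second']
```

Because the flag is only cleared when the end marker is actually seen, a missing REPORT 2 section needs no special handling.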
Kreiner answered 31/7, 2012 at 2:51 Comment(3)
I like it :) I will give this a try tomorrow. - Subulate
Or, you could put a continue statement after parsing = True to not parse the '***Report 1****' line as well. - Gander
@Gander: While I agree that continuing with the loop is inefficient... why did you suggest continue instead of break? - Kreiner

Here is a possible alternative using the itertools module.
Although the question requires checking for [key], I'm also adding itertools.islice() to show that it is possible to skip a few lines after the start-reading marker when the user has some prior information about the layout.

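from itertools import dropwhile, islice, takewhile

with open('filename.txt') as fid:
    # Drop everything before the REPORT 1 header
    after_start = dropwhile(lambda x: '***** REPORT 1 *****' not in x, fid)
    # islice(..., 1, None) skips the header line itself
    body = islice(after_start, 1, None)
    # Stop at the REPORT 2 header, or at EOF if it never appears
    for line in takewhile(lambda x: '***** REPORT 2 *****' not in x, body):
        if '[key]' not in line:
            continue
        print(line, end='')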
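
To make the islice() point concrete, here is a small sketch on an in-memory list instead of a file. The helper name report_section, its skip parameter and the sample lines are assumptions made for illustration only. With skip=3 it jumps over the marker line, the dashed rule and the blank line that follow it in the sample layout:

```python
from itertools import dropwhile, islice, takewhile

def report_section(lines, start, end, skip=1):
    """Lines after the `start` marker, up to the `end` marker or the end of input."""
    from_start = dropwhile(lambda x: start not in x, lines)
    body = islice(from_start, skip, None)  # skip=1 drops only the marker line
    return takewhile(lambda x: end not in x, body)

# Hypothetical sample mirroring the layout in the question
text = [
    "lines of data not needed\n",
    "--------\n",
    "    ***** REPORT 1 *****\n",
    "--------\n",
    "\n",
    "[key] a\n",
    "[key] b\n",
    "--------\n",
    "    ***** REPORT 2 *****\n",
    "--------\n",
    "lines of data not needed\n",
]

section = list(report_section(text, "***** REPORT 1 *****", "***** REPORT 2 *****", skip=3))
found = [x.strip() for x in section if "[key]" in x]
print(found)  # ['[key] a', '[key] b']
```

Note that the dashed rule just before the REPORT 2 header is still part of the section, so the [key] check remains useful even when lines are skipped at the start.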
Postliminy answered 29/8, 2019 at 17:21 Comment(0)
