Use Python to remove lines in a file that start with an octothorpe?
E

3

15

This seems like a straightforward question, but I can't seem to pinpoint my problem. I am trying to delete all lines in a file that start with an octothorpe (#), except the first line. Here is the loop I am working with:

for i, line in enumerate(input_file):
    if i > 1:
        if not line.startswith('#'):
            output.write(line)

The above code doesn't seem to work. Does anyone know what my problem is? Thanks!

Expect answered 8/8, 2011 at 17:7 Comment(6)
Can you tell us what it does that isn't right?Consign
I'm going to assume that you want to drop lines whose first non-whitespace character is an octothorpe. See my answer.Orthohydrogen
Am I the only one that didn't know what an octothorpe was?Batholith
+1 for explaining what an octothorpe is. :-)Rabbinical
Another option, use sed: sed '/^#.*/d' old.txt > new.txtTubuliflorous
Thanks for showing the sed implementation. I'm horrible with sed, but it's amazing how little code is needed! I'd love to get better with using sed/awk someday.Expect
C
19

You aren't writing out the first line:

for i, line in enumerate(input_file):
    if i == 0:
        output.write(line)
    else:
        if not line.startswith('#'):
            output.write(line)

Keep in mind also that enumerate (like most things) starts at zero.
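
To see why the test `if i > 1` from the question drops more than intended, here is a small sketch. The sample lines are made up for illustration and stand in for the real input file.

```python
# Hypothetical sample data standing in for the real input file.
sample_lines = ["# header\n", "first data line\n", "# comment\n", "second data line\n"]

# enumerate() numbers items from 0, so the first line has index 0.
indexed = list(enumerate(sample_lines))

# The question's test `i > 1` skips indices 0 AND 1, not just the header.
kept_by_original_test = [
    line for i, line in indexed if i > 1 and not line.startswith("#")
]

print(indexed[0])              # (0, '# header\n')
print(kept_by_original_test)   # ['second data line\n']
```

With this data, both the header and the first data line are lost, which matches the symptom of the first line never being written.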

A little more concisely (and not repeating the output line):

for i, line in enumerate(input_file):
    if i == 0 or not line.startswith('#'):
        output.write(line)
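
For a version that can be tried without any files on disk, the sketch below wraps the concise loop in a function and feeds it in-memory files. The name `filter_comments` and the sample text are assumptions for illustration, not something from the question.

```python
import io


def filter_comments(input_file, output):
    """Copy lines to output, dropping '#' lines except the very first line."""
    for i, line in enumerate(input_file):
        # Index 0 is the first line, which is always kept.
        if i == 0 or not line.startswith("#"):
            output.write(line)


# io.StringIO objects behave like open text files.
source = io.StringIO("# header\ndata 1\n# comment\ndata 2\n")
result = io.StringIO()
filter_comments(source, result)
print(result.getvalue())
```

The same function should work unchanged with objects returned by `open()`, since it only iterates over the input and calls `write` on the output.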
Consign answered 8/8, 2011 at 17:9 Comment(1)
Thanks for your patience. Your answer solved my problem. I greatly appreciate the help and thanks for showing me how to use the or not statement!Expect
J
10

I wouldn't bother with enumerate here. You only need it to decide which line is the first line and which isn't. This is easy enough to handle by writing the first line out and then using a for loop to conditionally write the remaining lines that do not start with a '#'.

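def removeComments(inputFileName, outputFileName):
    # The with statement closes both files, even if an error occurs.
    with open(inputFileName, "r") as infile, open(outputFileName, "w") as outfile:
        # Always copy the first line, even if it starts with '#'.
        outfile.write(infile.readline())

        # Copy the remaining lines, skipping those whose first
        # non-whitespace character is '#'.
        for line in infile:
            if not line.lstrip().startswith("#"):
                outfile.write(line)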

Thanks to twopoint718 for pointing out the advantage of using lstrip.

Jobi answered 8/8, 2011 at 17:29 Comment(0)
O
4

Maybe you want to omit lines from the output where the first non-whitespace character is an octothorpe:

for i, line in enumerate(input_file):
    if i == 0 or not line.lstrip().startswith('#'):
        output.write(line)

(note the call to lstrip)
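
As a quick illustration of what `lstrip` changes, here is a minimal sketch; the example strings are hypothetical.

```python
# Hypothetical line with leading whitespace before the octothorpe.
indented = "    # indented comment\n"

print(indented.startswith("#"))            # False
print(indented.lstrip().startswith("#"))   # True

# A blank line strips down to an empty string, which does not start with '#'.
blank = "\n"
print(blank.lstrip().startswith("#"))      # False
```

So indented comment lines are only caught when `lstrip` is applied first, and blank lines are kept either way.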

Orthohydrogen answered 8/8, 2011 at 17:17 Comment(3)
Thanks for pointing out the use of the lstrip method. In my case, there is never whitespace before the octothorpe, so I think I am safe with just the startswith method.Expect
This doesn't write the first line.Orts
@Orts I assumed that skipping the first line was the intentional and expected behavior: "I am trying to delete all lines in a file that start with an octothorpe (#) except the first line". I'll change mine to mirror Ned's above.Orthohydrogen
