Use Python to remove lines in a file that start with an octothorpe?

Asked 8/8, 2011 at 17:7 Answered 8/8, 2011 at 17:29

This seems like a straightforward question, but I can't seem to pinpoint my problem. I am trying to delete all lines in a file that start with an octothorpe (#), except the first line. Here is the loop I am working with:

for i, line in enumerate(input_file):
    if i > 1:
        if not line.startswith('#'):
            output.write(line)

The above code doesn't seem to work. Does anyone know what my problem is? Thanks!

Expect answered 8/8, 2011 at 17:7 Comment(6)

Can you tell us what it does that isn't right? – Consign 8/8, 2011 at 17:8

I'm going to assume that you want lines whose first non-whitespace character is an octothorpe. See my answer. – Orthohydrogen 8/8, 2011 at 17:18

Am I the only one that didn't know what an octothorpe was? – Batholith 8/8, 2011 at 17:32

+1 for explaining what an octothorpe is. :-) – Rabbinical 8/8, 2011 at 17:35

Another option, use sed: sed '/^#.*/d' old.txt > new.txt – Tubuliflorous 8/8, 2011 at 18:34

Thanks for showing the sed implementation. I'm horrible with sed, but it's amazing how little code is needed! I'd love to get better with using sed/awk someday. – Expect 8/8, 2011 at 18:47

You aren't writing out the first line:

for i, line in enumerate(input_file):
    if i == 0:
        output.write(line)
    else:
        if not line.startswith('#'):
            output.write(line)

Keep in mind also that enumerate (like most things) starts at zero.

A little more concisely (and not repeating the output line):

for i, line in enumerate(input_file):
    if i == 0 or not line.startswith('#'):
        output.write(line)
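
To see why the loop in the question drops lines it should keep, here is a minimal sketch that runs both conditions against an in-memory file. The sample text and the helper name `filter_lines` are made up for illustration, and `io.StringIO` stands in for the real input and output files.

```python
import io

SAMPLE = "# header\nfirst\n# comment\nsecond\n"  # hypothetical file contents


def filter_lines(text, keep):
    """Return the lines of text for which keep(i, line) is true, joined."""
    output = io.StringIO()
    for i, line in enumerate(io.StringIO(text)):
        if keep(i, line):
            output.write(line)
    return output.getvalue()


# Condition from the question: indices 0 and 1 are both skipped.
original = filter_lines(SAMPLE, lambda i, line: i > 1 and not line.startswith('#'))

# Condition from this answer: index 0 is always kept.
fixed = filter_lines(SAMPLE, lambda i, line: i == 0 or not line.startswith('#'))

print(repr(original))
print(repr(fixed))
```

With this sample, the question's condition loses both the header and the line right after it, because `i > 1` is false for the first two indices.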

Consign answered 8/8, 2011 at 17:9 Comment(1)

Thanks for your patience. Your answer solved my problem. I greatly appreciate the help and thanks for showing me how to use the or not statement! – Expect 8/8, 2011 at 17:17

I wouldn't bother with enumerate here. You only need it to decide which line is the first line and which isn't. That is easy enough to handle by writing the first line out and then using a for loop to conditionally write the remaining lines that do not start with a '#'.

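def removeComments(inputFileName, outputFileName):
    # The with statement closes both files, even if an error occurs.
    with open(inputFileName, "r") as infile, open(outputFileName, "w") as outfile:
        # Always keep the first line, even if it starts with '#'.
        outfile.write(infile.readline())

        # Copy the remaining lines, skipping comment lines.
        for line in infile:
            if not line.lstrip().startswith("#"):
                outfile.write(line)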

Thanks to twopoint718 for pointing out the advantage of using lstrip.

Jobi answered 8/8, 2011 at 17:29 Comment(0)

Maybe you want to omit lines from the output where the first non-whitespace character is an octothorpe:

for i, line in enumerate(input_file):
    if i == 0 or not line.lstrip().startswith('#'):
        output.write(line)

(note the call to lstrip)
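
As a rough illustration of what the lstrip call changes, the sketch below compares the two checks on a few made-up lines. The sample strings are assumptions, not taken from the asker's file.

```python
# Hypothetical sample lines: a plain comment, an indented comment,
# and a code line with a trailing comment.
lines = [
    "# plain comment\n",
    "    # indented comment\n",
    "value = 1  # trailing\n",
]

# True where the line starts with '#' in column 0.
strict = [line.startswith('#') for line in lines]

# True where the first non-whitespace character is '#'.
lenient = [line.lstrip().startswith('#') for line in lines]

for line, s, l in zip(lines, strict, lenient):
    print(repr(line), s, l)
```

Only the indented comment is treated differently by the two checks; a line with a trailing comment is kept either way.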

Orthohydrogen answered 8/8, 2011 at 17:17 Comment(3)

Thanks for pointing out the use of the lstrip method. In my case, there is never whitespace before the octothorpe, so I think I am safe with just the startswith method. – Expect 8/8, 2011 at 18:12

This doesn't write the first line. – Orts 8/8, 2011 at 20:31

@Orts I assumed that skipping the first line was the intentional and expected behavior: "I am trying to delete all lines in a file that start with an octothorpe (#) except the first line". I'll change mine to mirror Ned's above. – Orthohydrogen 9/8, 2011 at 0:59
