Delete final line in file with python
D

11

41

How can one delete the very last line of a file with python?

Input File example:

hello
world
foo
bar

Output File example:

hello
world
foo

I've created the following code to find the number of lines in the file - but I do not know how to delete the specific line number.

    try:
        file = open("file")
    except IOError:
        print("Failed to read file.")
    else:
        # only count lines if the file was opened successfully
        countLines = len(file.readlines())
Deberadeberry answered 10/12, 2009 at 0:57 Comment(6)
Are you trying to actually remove the line from the file, on disk? If so, make sure you understand that files don't have "lines" from the filesystem's point of view. Lines are a convention of programmers and programs. What you see as a "line" is a sequence of bytes somewhere in the middle of lots of other bytes. To remove the last "line", you could truncate the file at the byte corresponding to the first character in the line. That's not difficult (you just have to find it), but there's not much point if the files involved are not many megabytes in size. - Gabbard
What if the last line is an empty line? - Viper
Last line is not blank. I remove all blank lines with another python snippet (from google). - Deberadeberry
? The file contains no blank lines? The example above is what you should look at, nothing else. The last line is what I need to remove. Why the condescension? I've almost got it with Strawberry's answer. - Deberadeberry
The file in question is not in memory - it is as is above. - Deberadeberry
There was no condescension in my questions... just puzzlement, and maybe skepticism that you're doing this in a sensible manner. You wrote about the blank line removal. If the file is in memory, it's not a file, it's a list of strings. If you're already using Python on this "file" to remove blank lines, and this is an entirely separate step, then you're processing this data twice, inefficiently. These are all simple facts, but I'll stop now, if you don't like the help. - Gabbard
G
23

You could use the above code and then:-

lines = file.readlines()
lines = lines[:-1]

This would give you an array of lines containing all lines but the last one.
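
Note that the slice only changes the list in memory; to change the file on disk the remaining lines have to be written back. A minimal sketch of that, assuming the file fits comfortably in memory (the function name drop_last_line_in_memory is made up for illustration):

```python
from pathlib import Path


def drop_last_line_in_memory(path):
    """Rewrite the file at `path` without its final line (reads the whole file)."""
    p = Path(path)
    # newline="" disables newline translation, so "\n" or "\r\n" endings are kept as-is
    with p.open("r", newline="") as f:
        lines = f.readlines()
    with p.open("w", newline="") as f:
        f.writelines(lines[:-1])  # everything except the last line
```

Calling drop_last_line_in_memory("file") on the example input should leave hello, world and foo in the file.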

Gewirtz answered 10/12, 2009 at 1:1 Comment(9)
Will this work well for large files? E.g. thousands of lines? - Deberadeberry
It might not work well for files bigger than a megabyte or two. Depends on your definition of "well". It should be perfectly fine for any desktop use for a few thousand lines. - Jumada
Well - Within a second or two. - Deberadeberry
Is there no other way to directly delete a specific line? Or is an array the way to go? - Deberadeberry
Nazarius: There isn't any way to delete a specific line. You can however truncate a file or append to it. Since you want to delete the last line, you can just truncate. - Camarilla
@Deberadeberry One option is to run sed via os.system("sed '$d' file"), since a native binary will generally process big files faster. Truncating the file seems to be the fastest way. Anyway, this question has many useful options :) +1 for this question. - Zanthoxylum
Would this read the complete file from start to end? - Stuffing
@Stuffing Yes, in this example it would read all the lines into an array in memory. - Gewirtz
This doesn't remove the line from the file - it only removes it from the list lines while the file on disk still has it in place. - Rutkowski
P
88

Because I routinely work with many-gigabyte files, looping through as mentioned in the answers didn't work for me. The solution I use:

import os
import sys

with open(sys.argv[1], "r+", encoding = "utf-8") as file:

    # Move the pointer (similar to a cursor in a text editor) to the end of the file
    file.seek(0, os.SEEK_END)

    # Start one character before the end, so the search below skips the very last
    # character in the file - i.e. if the file ends with a newline (an empty final
    # line), we delete that empty line and the penultimate one
    pos = file.tell() - 1

    # Read each character in the file one at a time from the penultimate
    # character going backwards, searching for a newline character
    # If we find a new line, exit the search
    while pos > 0 and file.read(1) != "\n":
        pos -= 1
        file.seek(pos, os.SEEK_SET)

    # So long as we're not at the start of the file, delete all the characters ahead
    # of this position
    if pos > 0:
        file.seek(pos, os.SEEK_SET)
        file.truncate()
Pontiff answered 10/12, 2009 at 0:57 Comment(3)
This is the best answer. Use the "with" statement to save a line :) - Bacteriostasis
I ran into some compatibility issues (using Py3) when using this method on files that were used on both mac and windows, because internally Mac uses a different line terminator than Windows (which uses 2: cr and lf). The solution was to open the file in binary read mode ("rb+"), and search for the binary newline character b"\n". - Rahmann
If you open the file with "a+" instead of "r+", can you skip the file.seek(0, os.SEEK_END)? - Advowson
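
The binary-mode variant suggested in the comment above could look roughly like this. It is a sketch, not the answer's exact code: the name truncate_last_line is made up, it assumes an encoding in which a newline is always the single byte 0x0A (ASCII, UTF-8, Latin-1, but not UTF-16), and unlike the answer it keeps the newline that ends the previous line.

```python
import os


def truncate_last_line(path):
    """Remove the final line of the file at `path` in place, scanning backwards."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return  # empty file: nothing to remove

        # Ignore a newline that terminates the last line itself.
        pos = end - 1
        f.seek(pos)
        if f.read(1) == b"\n":
            pos -= 1

        # Walk back one byte at a time until the newline ending the previous line.
        while pos >= 0:
            f.seek(pos)
            if f.read(1) == b"\n":
                break
            pos -= 1

        # pos is the index of that newline, or -1 if the file had only one line.
        f.truncate(pos + 1)
```

Because only the bytes of the last line are read, the cost does not depend on the total file size. A "\r\n" ending is handled as well, since the scan only looks for the final b"\n" byte.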
D
11

This doesn't use python, but python's the wrong tool for the job if this is the only task you want. You can use the standard *nix utility head, and run

head -n-1 filename > newfile

which will copy all but the last line of filename to newfile.

Deanedeaner answered 10/12, 2009 at 1:13 Comment(4)
I'd like to keep it cross platform - hence the via python in the question. - Deberadeberry
This does not work on Mac OSX: head: illegal line count -- -1 - Auten
Love it, nice and simple. I'm fine with a linux solution. :D - Hubby
I suspect the Python version with seek is less RAM hungry and therefore more appropriate for very large files, whereas head is a nice one-liner, but involves reading and copying almost the complete file. - Wolsey
F
7

Assuming you have to do this in Python and that you have a large enough file that list slicing isn't sufficient, you can do it in a single pass over the file:

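last_line = None
with open("file") as file:
    for line in file:
        if last_line is not None:
            # each line already ends with its newline, so don't add another
            print(last_line, end="") # or write to a file, call a function, etc.
        last_line = line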

Not the most elegant code in the world but it gets the job done.

Basically it buffers each line of the file in the last_line variable; each iteration outputs the previous iteration's line, so the final line is never output.
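
To end up with the changed file on disk rather than printed output, the same one-line buffer can feed a temporary file that then replaces the original. This is only a sketch under some assumptions: the name drop_last_line_streaming is made up, the temporary file is created in the same directory as the original, and it gets the default permissions of tempfile.mkstemp rather than those of the original file.

```python
import os
import tempfile


def drop_last_line_streaming(path):
    """Rewrite `path` without its final line, holding one line in memory at a time."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            previous = None
            for line in src:
                if previous is not None:
                    dst.write(previous)  # emit the line from the previous iteration
                previous = line
        os.replace(tmp_path, path)  # swap the shortened copy into place
    except BaseException:
        os.unlink(tmp_path)
        raise
```

This still reads and copies almost the whole file, so for very large files truncating in place is likely to be cheaper.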

Funda answered 10/12, 2009 at 1:18 Comment(0)
P
5

Here is my solution for Linux users:

import os 
file_path = 'test.txt'
os.system('sed -i "$ d" {0}'.format(file_path))

No need to read and iterate through the file in Python.

Parclose answered 15/11, 2016 at 16:20 Comment(1)
How do you use this to remove the last n lines of a file? - Pallmall
C
3

On systems where file.truncate() works, you could do something like this:

pos = next_pos = 0
with open('file.txt', 'rb') as file:
  for line in file:
    pos = next_pos # position of beginning of this line
    next_pos += len(line) # compute position of beginning of next line
# reopen for writing and cut the file off where the last line begins
with open('file.txt', 'ab') as file:
  file.truncate(pos)

According to my tests, file.tell() doesn't work when reading by line, presumably due to buffering confusing it. That's why this adds up the lengths of the lines to figure out positions. Note that this only works on systems where the line delimiter ends with '\n'.

Camarilla answered 10/12, 2009 at 1:15 Comment(3)
Very dangerous on a platform which uses more than one character for "end of line"... as in Windows. - Gabbard
Good point. (That was actually why I was originally going to use tell(), but it doesn't work.) In this case opening the file in binary mode should work. - Camarilla
I'd also go with truncation, especially for large files. - Erebus
D
1

Here's a more general memory-efficient solution allowing the last 'n' lines to be skipped (like the head command):

import collections, fileinput
def head(filename, lines_to_delete=1):
    queue = collections.deque()
    lines_to_delete = max(0, lines_to_delete)
    # with inplace=True, whatever is printed replaces the file's content
    for line in fileinput.input(filename, inplace=True, backup='.bak'):
        queue.append(line)
        if lines_to_delete == 0:
            print(queue.popleft(), end='')
        else:
            lines_to_delete -= 1
    queue.clear()
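
If rewriting the whole file is too slow, the last n lines can also be dropped by truncation, combining this idea with the seek-based answers. A rough sketch, with made-up names (truncate_last_n_lines, chunk_size) and the assumption that a newline is always the single byte 0x0A in the file's encoding:

```python
import os


def truncate_last_n_lines(path, n=1, chunk_size=4096):
    """Remove the last `n` lines of the file at `path` in place."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if n <= 0 or end == 0:
            return

        # Ignore the newline that terminates the final line, if there is one.
        f.seek(end - 1)
        pos = end - 1 if f.read(1) == b"\n" else end

        remaining = n
        # Read fixed-size chunks backwards, counting newlines from the end.
        while pos > 0:
            start = max(0, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            idx = chunk.rfind(b"\n")
            while idx != -1:
                remaining -= 1
                if remaining == 0:
                    # Keep everything up to and including this newline.
                    f.truncate(start + idx + 1)
                    return
                idx = chunk.rfind(b"\n", 0, idx)
            pos = start

        # Fewer than n newlines were found: every line is removed.
        f.truncate(0)
```

Unlike the fileinput version, this keeps no backup copy, so it is worth trying on a copy of the data first.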
Dentilabial answered 10/12, 2009 at 2:41 Comment(0)
P
1

Inspired by previous posts, I propose this:

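import os

# binary mode: Python 3 text files do not allow relative seeks like seek(-2, SEEK_CUR)
with open('file_name', 'rb+') as f:
  f.seek(0, os.SEEK_END)
  # read(1) advances by one byte, so seeking back by two steps one byte backwards
  while f.tell() and f.read(1) != b'\n':
    f.seek(-2, os.SEEK_CUR)
  f.truncate()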
Proliferation answered 3/1, 2017 at 6:47 Comment(0)
E
0

Though I have not tested it (please, no hate for that) I believe that there's a faster way of doing it. It's more of a C solution, but quite possible in Python. It's not Pythonic, either. It's a theory, I'd say.

First, you need to know the encoding of the file. Set a variable, say CHARsize (why not), to the number of bytes a character uses in that encoding (probably 1 byte for an ASCII file).

Then grab the size of the file, set FILEsize to it.

Assume you have the address of the file (in memory) in FILEadd.

Add FILEsize to FILEadd.

Move backwards (increment by -1*CHARsize), testing each CHARsize bytes for a \n (or whatever newline your system uses). When you reach the first \n, you have the position of the beginning of the last line of the file. Replace \n with \x1a (26, the ASCII for EOF, or whatever that is on your system/with the encoding).

Clean up however you need to (change the filesize, touch the file).

If this works as I suspect it would, you're going to save a lot of time, as you don't need to read through the whole file from the beginning, you read from the end.

Ellata answered 10/12, 2009 at 1:36 Comment(2)
Note that the whole \x1a (aka ^Z aka CTRL-Z aka EOF, which is actually SUB in ASCII) thing is totally last century... very few text files are terminated with an actual SUB character any more, and even those are pretty much limited to Windows/DOS systems. And CP/M I think. - Gabbard
Ah good point - I wasn't sure if it was still in widespread use... can something else be used to salvage this technique? - Ellata
W
0

Here's another way, without slurping the whole file into memory:

p = None
with open("file") as f:
    for line in f:
        line = line.rstrip("\n")
        # print the previous line; the last line is never printed
        if p is not None:
            print(p)
        p = line
Wrinkly answered 10/12, 2009 at 2:32 Comment(0)
G
0

Here is a solution for when you already have a file object open somewhere in your application and you don't want to open it again:

import io

def remove_last_line(file: io.TextIOWrapper):
    """Remove the last line of file without reopening it"""
    # move pointer to the first character
    file.seek(0)

    # store all file lines
    lines = file.readlines() 
    
    # remove its content 
    file.seek(0)
    file.truncate()

    # write all lines but last 
    file.writelines(lines[:-1])
    
    # place pointer at end of the file
    file.seek(0, 2)
Garibaldi answered 17/2 at 21:49 Comment(0)
