How to delete a specific line in a text file using Python?
K

19

221

Let's say I have a text file full of nicknames. How can I delete a specific nickname from this file, using Python?

Kheda answered 17/1, 2011 at 4:38 Comment(1)
Try fileinput as described by @j-f-sebastian here. It seems to allow you to work line-by-line, via a temporary file, all with a simple for syntax.Philbrook
K
281

First, open the file and get all your lines from the file. Then reopen the file in write mode and write your lines back, except for the line you want to delete:

with open("yourfile.txt", "r") as f:
    lines = f.readlines()
with open("yourfile.txt", "w") as f:
    for line in lines:
        if line.strip("\n") != "nickname_to_delete":
            f.write(line)

The comparison uses strip("\n") because readlines() keeps the trailing newline on every line except possibly the last one: if your file doesn't end with a newline character, the very last line won't either.
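
To make the newline point concrete, here is a small sketch. The file name demo_nicknames.txt and its contents are made up for illustration and are not part of the original answer.

```python
import os
import tempfile

# Hypothetical demo file whose last line has no trailing newline.
path = os.path.join(tempfile.mkdtemp(), "demo_nicknames.txt")
with open(path, "w") as f:
    f.write("alice\nbob\ncarol")

with open(path, "r") as f:
    lines = f.readlines()

# Every line keeps its "\n" except the last one.
print(lines)

# Comparing raw lines against "carol\n" would miss the last line;
# stripping the newline first makes the comparison uniform.
kept = [line for line in lines if line.strip("\n") != "carol"]
print(kept)
```

Here the last element of lines is "carol" without a newline, so only the stripped comparison removes it.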

Kumiss answered 17/1, 2011 at 4:44 Comment(7)
why do we have to open and close it twice?Fourhanded
@Ooker: You have to open the file twice (and close it in between) because in the first mode it is "read-only" because you are just reading in the current lines in the file. You then close it and re-open it in "write mode", where the file is writable and you replace the contents of the file sans the line you wanted to remove.Lengthen
Why does Python not allow us to do this in one line?Fourhanded
@Ooker, When you read a line, try to imagine a cursor moving along the line as it's read. Once that line has been read the cursor is now past it. When you try to write into the file you write where the cursor currently is. By re-opening the file you reset the cursor.Gilchrist
I didn't know that it is that complicated. Thank you. If we had a reset-cursor function, would that mean we don't need to close and re-open it?Fourhanded
@Gilchrist Why not just move the cursor? This seems unnecessarily complicated to me.Murchison
This task can be done by opening the file only once, but it needs to be opened with 'r+', and you'd need to call file.seek(0) (to move the cursor to the beginning) and file.truncate() (to discard the existing contents) before proceeding to rewrite it.Cheeseburger
L
145

Solution to this problem with only a single open:

with open("target.txt", "r+") as f:
    d = f.readlines()
    f.seek(0)
    for i in d:
        if i != "line you want to remove...":
            f.write(i)
    f.truncate()

This solution opens the file in read/write mode ("r+"), uses seek(0) to move the file pointer back to the start, then calls truncate() to remove everything after the last write.
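
As a rough illustration of why truncate() matters here, the sketch below runs the same rewrite with and without it. The helper rewrite_without, the file name target_demo.txt, and the sample contents are assumptions made for the demo. Unlike the answer's code, it strips the newline before comparing.

```python
import os
import tempfile

def rewrite_without(path, unwanted, truncate):
    """Rewrite path in place, skipping lines equal to `unwanted`."""
    with open(path, "r+") as f:
        lines = f.readlines()
        f.seek(0)                      # move back to the start before writing
        for line in lines:
            if line.rstrip("\n") != unwanted:
                f.write(line)
        if truncate:
            f.truncate()               # cut off what is left of the old content
    with open(path) as f:
        return f.read()

# Hypothetical demo file.
path = os.path.join(tempfile.mkdtemp(), "target_demo.txt")

with open(path, "w") as f:
    f.write("aaa\nbbb\nccc\n")
without_truncate = rewrite_without(path, "bbb", truncate=False)

with open(path, "w") as f:
    f.write("aaa\nbbb\nccc\n")
with_truncate = rewrite_without(path, "bbb", truncate=True)
```

Without truncate() the tail of the old content survives, so the file ends up with the lines aaa, ccc, ccc. With truncate() it contains only aaa and ccc.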

Leonoreleonsis answered 21/1, 2015 at 0:42 Comment(4)
This worked very well for me, as I also had to use a lock file (fcntl). I couldn't find any way to use fileinput together with fcntl.Lapides
It would be nice to see some side effects of this solution.Bridgettbridgette
I wouldn't do this. If you get an error in the for loop, you'll end up with a partially overwritten file, with duplicate lines or a line half cut off. You might want to f.truncate() right after f.seek(0) instead. That way if you get an error you'll just end up with an incomplete file. But the real solution (if you have the disk space) is to output to a temporary file and then use os.replace() or pathlib.Path(temp_filename).replace(original_filename) to swap it with the original after everything has succeeded.Merca
Could you add i.strip('\n') != "line you want to remove..." as mentioned in the accepted answer? That would solve my problem perfectly, because comparing just i didn't do anything for me.Caracara
B
52

The best and fastest option, rather than storing everything in a list and re-opening the file to write it, is in my opinion to re-write the file elsewhere.

with open("yourfile.txt", "r") as file_input:
    with open("newfile.txt", "w") as output: 
        for line in file_input:
            if line.strip("\n") != "nickname_to_delete":
                output.write(line)

That's it! In one loop and one only you can do the same thing. It will be much faster.

Breadwinner answered 13/11, 2014 at 15:28 Comment(5)
Instead of using normal for loop we can make use of Generator Expression This way program will not load all the lines from file to memory which is not good idea in case of big files. It will only have single line in memory at a time. With generator expression for loop will look like, (output.write(line) for line in input if line!="nickname_to_delete"+"\n")Pammie
@ShriShinde You're not reading the file into memory when looping over the file object either, so this solution works identical to your suggestion.Murchison
You might want to delete the original file and rename the second file to the original file's name, which with Python on a Linux OS would look like this, subprocess.call(['mv', 'newfile.txt', 'yourfile.txt'])Malinger
os.replace (new in python v 3.3) is more cross-platform than a system call to mv.Lissettelissi
I think this is a better solution because it doesn't store the whole file in memory before making changes, which could be an issue with very large files.Rosiorosita
C
40

This is a "fork" from @Lother's answer (should be considered the right answer).

For a file like this:

$ cat file.txt 
1: october rust
2: november rain
3: december snow

This code:

#!/usr/bin/python3.4

with open("file.txt","r+") as f:
    new_f = f.readlines()
    f.seek(0)
    for line in new_f:
        if "snow" not in line:
            f.write(line)
    f.truncate()

Improvements:

  • with open, which removes the need for f.close()
  • a clearer if test for checking whether the string is present in the current line
Civvies answered 25/7, 2017 at 5:46 Comment(2)
Is f.seek(0) required?Vanzant
@Vanzant yes. Otherwise instead of overwriting the file you'll append the file to itself (without the lines you're excluding).Merca
S
10

The issue with reading all the lines in a first pass and making changes (deleting specific lines) in a second pass is that if your files are huge, you will run out of RAM. A better approach is to read the lines one by one and write them into a separate file, leaving out the ones you don't need. I have run this approach with files as big as 12-50 GB, and the RAM usage remains almost constant. Only CPU cycles show processing in progress.
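
A minimal sketch of that streaming approach, combined with the os.replace() swap suggested in the comments on other answers. The function name remove_line_streaming is made up for illustration. The sketch assumes the line is matched by exact equality, and it does not preserve the original file's permissions or ownership.

```python
import os
import tempfile

def remove_line_streaming(path, unwanted):
    """Copy path line by line to a temporary file, skipping `unwanted`,
    then swap the temporary file in for the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path, "r") as src:
            for line in src:                    # one line in memory at a time
                if line.rstrip("\n") != unwanted:
                    dst.write(line)
        os.replace(tmp_path, path)              # swap only after success
    except BaseException:
        os.remove(tmp_path)                     # leave the original untouched
        raise
```

Calling remove_line_streaming("yourfile.txt", "nickname_to_delete") would rewrite the file without that line. If anything fails along the way, the original file is left as it was.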

Serrulate answered 17/1, 2011 at 4:39 Comment(0)
J
6

A simple solution that has not been proposed yet:

with open(file_of_nicknames, "r+") as f:
    lines = f.readlines()           # Get a list of all lines
    f.seek(0)                       # Move back to the beginning of the file

    idx = lines.index("Nickname\n") # Don't forget the '\n'; raises ValueError if absent
    lines.pop(idx)                  # Remove the matching line

    f.truncate()                    # Empty the file first: the new content is
                                    # shorter, so a stale tail would remain
    f.writelines(lines)             # Write the remaining lines back

Inspired by the previous answers.

Jochebed answered 16/3, 2023 at 10:42 Comment(0)
Q
4

If you use Linux, you can try the following approach.
Suppose you have a text file named animal.txt:

$ cat animal.txt  
dog
pig
cat 
monkey         
elephant  

Delete every line containing "dog" (here, the first line):

>>> import subprocess
>>> subprocess.call(['sed','-i','/.*dog.*/d','animal.txt']) 

then

$ cat animal.txt
pig
cat
monkey
elephant
Quantifier answered 27/2, 2016 at 7:11 Comment(3)
This solution isn't OS agnostic, and since OP didn't specify an operating system, there's no reason to post a Linux specific answer imo.Murchison
Anyone who suggests using subprocess for anything that can be done with just python gets a downvote! And +1 to @SteinarLima... I agreeDebarath
The -i option is nonstandard, and works differently on *BSD platforms (including macOS) than on Linux. Python's fileinput module does the same thing transparently, portably, and natively.Stuffing
S
3

I liked the fileinput approach as explained in this answer: Deleting a line from a text file (python)

Say for example I have a file which has empty lines in it and I want to remove empty lines, here's how I solved it:

import fileinput
import sys

# With inplace=True, standard output is redirected into the file being read
for line in fileinput.input('file1.txt', inplace=True):
    if len(line) > 1:
        sys.stdout.write(line)

Note: The empty lines in my case had length 1

Shipentine answered 13/1, 2015 at 10:58 Comment(0)
V
2

You probably already have a correct answer, but here is mine. Instead of using a list to collect the unfiltered data (which is what the readlines() method does), I use two files. One holds the main data, and the second is used for filtering the data when you delete a specific string. Here is the code:

# Copy the main database file into a scratch file
with open('data_base.txt') as main_file:
    data = main_file.read()
with open('filter_base.txt', 'w') as filter_file:
    filter_file.write(data)

# Rewrite the main file from the scratch copy, skipping unwanted lines
with open('filter_base.txt') as filter_file, open('data_base.txt', 'w') as main_file:
    for line in filter_file:
        if 'your data to delete' not in line:    # drop lines containing this string
            main_file.write(line)

Hope you will find this useful! :)

Velites answered 25/11, 2015 at 7:16 Comment(0)
D
2

I think if you read the file into a list, you can then iterate over the list to look for the nickname you want to get rid of. You can do this efficiently without creating additional files, but you'll have to write the result back to the source file.

Here's how I might do this:

import csv  # and other imports you need
nicknames_to_delete = ['Nick', 'Stephen', 'Mark']

I'm assuming nicknames.csv contains data like:

Nick
Maria
James
Chris
Mario
Stephen
Isabella
Ahmed
Julia
Mark
...

Then load the file into the list:

with open("nicknames.csv") as sourceFile:
    nicknames = sourceFile.read().splitlines()

Next, iterate over the list to match your inputs to delete:

for nick in nicknames_to_delete:
    if nick in nicknames:
        nicknames.remove(nick)    # removes the first matching entry
    else:
        print(nick + " is not found in the file")

Lastly, write the result back to file:

with open("nicknames.csv", "w", newline="") as nicknamesFile:   # "w" empties the file first
    nicknamesWriter = csv.writer(nicknamesFile)
    for name in nicknames:
        nicknamesWriter.writerow([name])
Doggo answered 24/4, 2016 at 18:31 Comment(0)
C
1

In general, you can't; you have to write the whole file again (at least from the point of change to the end).

In some specific cases you can do better than this -

if all your data elements are the same length and in no specific order, and you know the offset of the one you want to get rid of, you could copy the last item over the one to be deleted and truncate the file before the last item;

or you could just overwrite the data chunk with a 'this is bad data, skip it' value or keep a 'this item has been deleted' flag in your saved data elements such that you can mark it deleted without otherwise modifying the file.

This is probably overkill for short documents (anything under 100 KB?).
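
A rough sketch of the first special case (fixed-length records, order not important). The function name delete_fixed_record and the 0-based index convention are assumptions made for illustration.

```python
import os

def delete_fixed_record(path, index, record_size):
    """Delete record `index` (0-based) from a file of fixed-size records
    by overwriting it with the last record and truncating the file.
    Record order is not preserved."""
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)          # seek returns the new position
        count = size // record_size
        if size % record_size or not 0 <= index < count:
            raise ValueError("bad record index or file size")
        last_offset = (count - 1) * record_size
        if index != count - 1:
            f.seek(last_offset)
            last = f.read(record_size)         # read the final record
            f.seek(index * record_size)
            f.write(last)                      # overwrite the deleted slot
        f.truncate(last_offset)                # drop the final record
```

Only two records are touched no matter how large the file is, which is the benefit over rewriting everything after the deleted item.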

Curtiscurtiss answered 17/1, 2011 at 5:55 Comment(0)
P
1

I like this method using fileinput and the 'inplace' method:

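import fileinput

# With inplace=True, whatever is printed is written back into the file
for line in fileinput.input(fname, inplace=True):
    if 'UnwantedWord' not in line:
        print(line, end='')    # the line already ends with its own newline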

It's a little less wordy than the other answers and is fast enough for

Pointdevice answered 6/5, 2019 at 1:43 Comment(0)
P
0

Save the file's lines in a list, remove the line you want to delete from the list, and write the remaining lines to a new file:

with open("file_name.txt", "r") as f:
    lines = f.readlines() 
    lines.remove("Line you want to delete\n")
    with open("new_file.txt", "w") as new_f:
        for line in lines:        
            new_f.write(line)
Pulchritudinous answered 19/2, 2017 at 0:55 Comment(2)
When giving an answer it is preferable to give some explanation as to WHY your answer is the one.Mavismavra
If your file doesn't end with a newline, this code won't remove the last line even if it contains a word you want to remove.Merca
B
0

Here's another method to remove one or more lines from a file:

src_file = "zzzz.txt"
idx = 0   # example value: index of the line to remove, counting from 0

with open(src_file, "r") as f:
    contents = f.readlines()

contents.pop(idx)   # remove the line from the list by its index

with open(src_file, "w") as f:
    f.write("".join(contents))
Biforked answered 9/10, 2018 at 13:49 Comment(0)
G
0

You can use the re library

This assumes that you are able to load the full text file into memory. You then define a list of unwanted nicknames and substitute them with an empty string "".

# Delete unwanted nicknames
import re

# Read, then decode for py2 compat.
path_to_file = 'data/nicknames.txt'
with open(path_to_file, 'rb') as f:
    text = f.read().decode(encoding='utf-8')

# Define unwanted nicknames and substitute them
unwanted_nickname_list = ['SourDough']
# re.escape keeps special characters in a nickname from being read as regex syntax
pattern = "|".join(re.escape(name) for name in unwanted_nickname_list)
text = re.sub(pattern, "", text)   # text holds the result; the file is not rewritten here
Garrulity answered 8/8, 2019 at 16:1 Comment(0)
S
-1

If you want to remove a specific line from a file, you can use this short and simple snippet. It removes any line that starts with a given sentence or prefix (symbol).

with open("file_name.txt", "r") as f:
    lines = f.readlines()
with open("new_file.txt", "w") as new_f:
    for line in lines:
        if not line.startswith("write any sentence or symbol to remove line"):
            new_f.write(line)
Senzer answered 12/8, 2020 at 14:34 Comment(1)
The only unique feature relative to existing older answers seems to be the indentation error.Stuffing
V
-1

This is the easiest method I found, and it worked for me:

with open('/content/punch_data.txt') as punch_file:  # open the file in read mode
    for line in punch_file:
        if line.isspace():
            continue       # skip blank lines
        else:
            print(line)    # prints the line; the file itself is not modified
Vandiver answered 10/1 at 1:1 Comment(0)
E
-2

To delete a specific line of a file by its line number:

Replace variables filename and line_to_delete with the name of your file and the line number you want to delete.

filename = 'foo.txt'
line_to_delete = 3
initial_line = 1
file_lines = {}

with open(filename) as f:
    content = f.readlines() 

for line in content:
    file_lines[initial_line] = line.strip()
    initial_line += 1

f = open(filename, "w")
for line_number, line_content in file_lines.items():
    if line_number != line_to_delete:
        f.write('{}\n'.format(line_content))

f.close()
print('Deleted line: {}'.format(line_to_delete))

Example output:

Deleted line: 3
Enroot answered 16/4, 2020 at 18:31 Comment(1)
there is no need for building a dict, just use for nb, line in enumerate(f.readlines())Gujarat
I
-3

Take the contents of the file, split it by newline into a tuple. Then, access your tuple's line number, join your result tuple, and overwrite to the file.
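
One possible reading of those steps, as a sketch. The function name delete_line_number and the 1-based numbering are assumptions, since the answer itself gives no code.

```python
def delete_line_number(path, line_number):
    """Delete the 1-based line `line_number` from the text file at `path`."""
    with open(path, "r") as f:
        lines = tuple(f.read().split("\n"))
    index = line_number - 1
    if not 0 <= index < len(lines):
        raise IndexError("line number out of range")
    # Tuples are immutable, so build a new one without the unwanted entry.
    remaining = lines[:index] + lines[index + 1:]
    with open(path, "w") as f:
        f.write("\n".join(remaining))
```

Note that this reads the whole file into memory, so it only suits files that fit comfortably in RAM.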

Imogen answered 17/1, 2011 at 4:40 Comment(1)
(1) do you mean tuple(f.read().split('\n'))?? (2) "access your tuple's line number" and "join your result tuple" sound rather mysterious; actual Python code might be more understandable.Isopropanol
