How to read a line from a file without getting "\n" appended at the end [duplicate]

8

20

My file is "xml.txt" with following contents:

books.xml 
news.xml
mix.xml

If I use the readline() function, it leaves "\n" at the end of each file name, which causes an error because I want to open the files listed in xml.txt. I wrote this:

fo = open("xml.txt", "r")
for i in range(count.__len__()):  # here count is one of many arrays that I'm using
    file = fo.readline()
    find_root(file)  # here find_root is my own function, not shown here

Error encountered on running this code:

IOError: [Errno 2] No such file or directory: 'books.xml\n'
Tigrinya answered 1/7, 2012 at 7:31 Comment(2)
Don't use count.__len__(), but len(count)! - Hawkinson
Although the question asks specifically about the '\n' character, there is a more general issue of reading a line without the line ending, whatever it may be for the file. Almost all of the answers do not address this. (Daniel F.'s appears to.) - Huckleberry
42

To remove just the newline at the end:

line = line.rstrip('\n')

The reason readline keeps the newline character is so you can distinguish between an empty line (has the newline) and the end of the file (empty string).
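
A minimal sketch of that distinction, using an in-memory io.StringIO in place of a real file (the sample contents and the names `fo` and `names` are made up for illustration):

```python
import io

# In-memory stand-in for a file: two names, a blank line, then a final
# name with no trailing newline.
fo = io.StringIO("books.xml\nnews.xml\n\nmix.xml")

names = []
while True:
    line = fo.readline()
    if line == "":  # empty string: end of file
        break
    # A blank line arrives as "\n", so it is not mistaken for end of file.
    names.append(line.rstrip("\n"))

print(names)  # ['books.xml', 'news.xml', '', 'mix.xml']
```

The blank line survives as an empty list item, while the loop ends only when readline() returns an empty string.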

Lychnis answered 1/7, 2012 at 7:39 Comment(0)
19

From "Best method for reading newline delimited files in Python and discarding the newlines?":

lines = open(filename).read().splitlines()
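
The same idea can be written with a context manager so the file is closed promptly. This is only a sketch: `read_names` is a hypothetical helper, and the temporary file merely stands in for xml.txt.

```python
import os
import tempfile


def read_names(path):
    """Return the lines of the file at path without their line endings."""
    with open(path) as f:
        return f.read().splitlines()


# Build a throwaway file with contents like those in the question.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("books.xml\nnews.xml\nmix.xml\n")

names = read_names(tmp.name)
os.remove(tmp.name)
print(names)  # ['books.xml', 'news.xml', 'mix.xml']
```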
Capstan answered 3/11, 2013 at 8:13 Comment(0)
7

You could use the .rstrip() method of string objects to get a version with trailing whitespace (including newlines) removed.

E.g.:

find_root(file.rstrip())
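
For comparison, a short sketch of how the argument passed to .rstrip() changes what gets removed (the sample strings and variable names are made up):

```python
line = "foo \n"

# No argument: strips all trailing whitespace, including the space.
stripped_all = line.rstrip()
print(repr(stripped_all))      # 'foo'

# "\n" only: the trailing space is kept.
stripped_newline = line.rstrip("\n")
print(repr(stripped_newline))  # 'foo '

# The argument is a set of characters, so "\r\n" strips any trailing
# run of "\r" and "\n" characters, which also covers Windows endings.
stripped_crlf = "foo \r\n".rstrip("\r\n")
print(repr(stripped_crlf))     # 'foo '
```

If file names may legitimately end in spaces, passing "\n" (or "\r\n") explicitly is the safer choice.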
Piscator answered 1/7, 2012 at 7:36 Comment(2)
Can you tell me the syntax? I mean, how and where should I add this? - Tigrinya
This solution will remove all trailing whitespace instead of just the newline. If the line read is 'foo \n', then .rstrip() will return 'foo' whereas 'foo ' is desired as per the problem statement. - Innutrition
3

I timed it just out of curiosity. Below are the results for a very large file.

TL;DR: Reading the whole file and then splitting it seems to be the fastest approach on a large file.

with open(FILENAME, "r") as file:
    lines = file.read().split("\n")

However, if you need to loop through the lines anyway then you probably want:

with open(FILENAME, "r") as file:
    for line in file:
        line = line.rstrip("\n")

Python 3.4.2

import timeit


FILENAME = "mylargefile.csv"
DELIMITER = "\n"


def splitlines_read():
    """Read the file, then split it into lines with the built-in splitlines method.

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = file.read().splitlines()
    return lines
# end splitlines_read

def split_read():
    """Read the file then split the lines.

    This method returns empty strings for blank lines (same as the other methods).
    It may also return one extra element, an empty string, at the end of the list
    (compared to splitlines_read).

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = file.read().split(DELIMITER)
    return lines
# end split_read

def strip_read():
    """Loop through the file and build a new list of lines, removing any trailing "\n" with rstrip.

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = [line.rstrip(DELIMITER) for line in file]
    return lines
# end strip_read

def strip_readlines():
    """Loop through the result of file.readlines() and build a new list of lines, removing any
    trailing "\n" with rstrip. This will probably be slower than strip_read, but might as well test everything.

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = [line.rstrip(DELIMITER) for line in file.readlines()]
    return lines
# end strip_readlines

def compare_times():
    run = 100
    splitlines_t = timeit.timeit(splitlines_read, number=run)
    print("Splitlines Read:", splitlines_t)

    split_t = timeit.timeit(split_read, number=run)
    print("Split Read:", split_t)

    strip_t = timeit.timeit(strip_read, number=run)
    print("Strip Read:", strip_t)

    striplines_t = timeit.timeit(strip_readlines, number=run)
    print("Strip Readlines:", striplines_t)
# end compare_times

def compare_values():
    """Compare the values of the file.

    Note: the split_read comparison fails because its list has an extra empty string at the end.
    That's the only reason why it fails.
    """
    splr = splitlines_read()
    sprl = split_read()
    strr = strip_read()
    strl = strip_readlines()

    print("splitlines_read")
    print(repr(splr[:10]))

    print("split_read", splr == sprl)
    print(repr(sprl[:10]))

    print("strip_read", splr == strr)
    print(repr(strr[:10]))

    print("strip_readlines", splr == strl)
    print(repr(strl[:10]))
# end compare_values

if __name__ == "__main__":
    compare_values()
    compare_times()

Results:

run = 1000
Splitlines Read: 201.02846901328783
Split Read: 137.51448011841822
Strip Read: 156.18040391519133
Strip Readline: 172.12281272950372

run = 100
Splitlines Read: 19.956802833188124
Split Read: 13.657361738959867
Strip Read: 15.731161020969516
Strip Readlines: 17.434831199281092

run = 100
Splitlines Read: 20.01516321280158
Split Read: 13.786344555543899
Strip Read: 16.02410587620824
Strip Readlines: 17.09326775703279

File read then split seems to be the fastest approach on a large file.

Note: read then split("\n") will have an extra empty string at the end of the list.

Note: read then splitlines() splits on more than just "\n", for example "\r\n" as well.
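
A small sketch of those two notes on a hand-built string (the sample text is made up; a file opened in Python 3 text mode with default settings would already have "\r\n" translated to "\n"):

```python
text = "books.xml\nnews.xml\r\nmix.xml\n"

# split("\n") leaves the "\r" in place and adds a trailing empty string
# because the text ends with the delimiter.
by_split = text.split("\n")

# splitlines() recognises "\r\n" as one line ending and adds no extra item.
by_splitlines = text.splitlines()

print(by_split)       # ['books.xml', 'news.xml\r', 'mix.xml', '']
print(by_splitlines)  # ['books.xml', 'news.xml', 'mix.xml']
```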

Jeweller answered 13/3, 2015 at 18:41 Comment(0)
1

It's better style to use a context manager for the file, and len() instead of calling .__len__():

with open("xml.txt", "r") as fo:
    for i in range(len(count)):  # here count is one of many arrays that I'm using
        file = next(fo).rstrip("\n")
        find_root(file)  # here find_root is my own function, not shown here
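
One thing to watch with next(fo): it raises StopIteration if the file has fewer lines than count has items. A sketch that stops at whichever runs out first, using zip; here `count`, the sample data, and the `find_root` stub are stand-ins for the question's own objects:

```python
import io

count = [1, 2, 3, 4]  # stand-in: deliberately longer than the file
fo = io.StringIO("books.xml\nnews.xml\nmix.xml\n")  # stand-in for the open file

opened = []


def find_root(name):
    """Stub for the asker's function; it only records the name it was given."""
    opened.append(name)


# zip stops as soon as either count or the file is exhausted.
for _, line in zip(count, fo):
    find_root(line.rstrip("\n"))

print(opened)  # ['books.xml', 'news.xml', 'mix.xml']
```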
Expectancy answered 1/7, 2012 at 7:45 Comment(2)
You forgot to mention that good Python style also includes not hiding built-ins with your own names, like file... - Emendation
@martineau, Yes, I let that one slide since it's deprecated. - Expectancy
1

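To remove the newline character from the end you could also use something like this:

for line in file:  # file is an already-open file object
    # Caution: [:-1] drops the last character even when it is not a newline,
    # e.g. on a final line that has no trailing "\n".
    print(line[:-1])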
Jumbala answered 20/4, 2013 at 21:9 Comment(0)
1

A use case with @Lars Wirzenius's answer:

with open("list.txt", "r") as list_file:
    for name in list_file:
        name = name.rstrip('\n')    # the trick
        try:
            with open(name):
                print("ok")
        except IOError:
            print("file does not exist")
Ordination answered 28/10, 2015 at 16:21 Comment(0)
0
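# "ur_filename" is the input file; "out_filename" is a placeholder for the output file,
# since the original snippet did not show how fn was opened.
with open("ur_filename", "r") as f, open("out_filename", "w") as fn:
    for t in f:
        if t:
            fn.write(t.rstrip("\n"))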

The "if" condition checks whether the line is non-empty; if it is, the next line strips the "\n" at the end and writes the result to a file. Code tested. ;)

Sathrum answered 22/4, 2015 at 21:9 Comment(0)
