Python file.write creating extra carriage return

I'm writing a series of SQL statements to a file using Python. The template string looks like:

store_insert = '\tinsert stores (storenum, ...) values (\'%s\', ...)'

I'm writing to the file like so:

for line in source:
    line = line.rstrip()
    fields = line.split('\t')
    script.write(store_insert % tuple(fields))
    script.write(os.linesep)

However, in the resulting output, I see \r\r\n at the end of each line, rather than \r\n as I would expect. Why?

Simplicidentate answered 26/10, 2010 at 16:31 Comment(6)
% string formatting is now old; the preferred idiom is str.format =) - Ahner
Did you open the file in text or binary mode? Which OS are you using? - Mandarin
Windows, and I just did an open(file, 'r') - Simplicidentate
Oh, for the output file I did open(outputFile, 'w') - Simplicidentate
OK, I opened the file using open(file, 'wb') instead (for binary mode) and that fixed the problem. Why is Python converting my \r\n to \r\r\n when the file is opened in text mode? - Simplicidentate
I ran into this as well. This is insane. Chalk up another weird issue with Python. - Rickirickie
H
49

When you write to a file opened in text mode, each \n is converted to os.linesep. So when you write os.linesep to a text-mode file on Windows, you write \r\n, and the \n in it gets converted again, resulting in \r\r\n.

See also the docs:

Do not use os.linesep as a line terminator when writing files opened in text mode (the default); use a single '\n' instead, on all platforms.
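
A minimal sketch of the effect, assuming Python 3. It passes newline='\r\n' to open() to mimic Windows text-mode translation on any platform; the helper name raw_bytes_written and the sample statement text are made up for illustration.

```python
import os
import tempfile

WINDOWS_LINESEP = "\r\n"  # the value os.linesep has on Windows


def raw_bytes_written(text):
    # Hypothetical helper: write text through a text-mode file that
    # translates each LF to CRLF (as Windows text mode does), then
    # return the bytes that actually ended up on disk.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline="\r\n") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# Writing the platform separator yourself: the LF inside it is translated again.
doubled = raw_bytes_written("insert ..." + WINDOWS_LINESEP)

# Writing a bare LF: the file layer produces the CRLF for you.
correct = raw_bytes_written("insert ...\n")

print(repr(doubled))  # b'insert ...\r\r\n'
print(repr(correct))  # b'insert ...\r\n'
```

So in the question's loop, replacing script.write(os.linesep) with script.write('\n') should be enough as long as the file stays in text mode.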

Housman answered 26/10, 2010 at 16:57 Comment(2)
+1 well found! This doesn't actually happen for me (Win7), maybe it's a Windows-dependent thing? - Ahner
I'm also using Windows 7, but that explains it. +1 and answer! - Simplicidentate
W
19

With Python 3

open() introduces a new parameter, newline, that lets you specify the string that every \n is translated to when writing.

Passing an empty string, newline='', disables the translation and leaves the newline characters as they are. The parameter is valid in text mode only.

From the documentation

On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
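
As a rough illustration of the newline='' case (Python 3 only; the function name write_lines and the sample lines are hypothetical), the terminator you pass is written exactly as given, whatever the platform:

```python
import os
import tempfile


def write_lines(path, lines, terminator):
    # newline="" switches off newline translation for this text-mode file,
    # so the terminator reaches the disk unchanged.
    with open(path, "w", newline="") as f:
        for line in lines:
            f.write(line + terminator)


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_lines(path, ["insert a", "insert b"], "\r\n")
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print(repr(raw))  # b'insert a\r\ninsert b\r\n'
```

With this setting, writing os.linesep explicitly no longer doubles the carriage return, because nothing is translated a second time.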

Womack answered 20/12, 2016 at 17:25 Comment(2)
For a use case and some elaboration, see here - Scintilla
os.open has no newline param. open does however. os.open has an O_BINARY flag though, which does the same thing. - Barimah
S
7

Text files have different line endings on different operating systems, but it's convenient to work with strings that have a consistent line ending character. Python inherits the convention from C of using '\n' as the universal line ending character and relying on the file read and write functions to do a conversion, if necessary. The read and write functions know to do this if the file was opened in the default text mode. If you add the b character to the mode string when opening the file, this translation is skipped.
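
A small sketch of the binary-mode route, assuming Python 3, where a file opened with 'wb' takes bytes rather than str (the sample statements are placeholders):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # "wb" skips newline translation entirely; every byte is written as given.
    with open(path, "wb") as f:
        for statement in ["insert a", "insert b"]:
            f.write(statement.encode("ascii"))
            f.write(b"\r\n")  # explicit Windows line ending
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print(repr(raw))  # b'insert a\r\ninsert b\r\n'
```

The trade-off is that you now choose the line ending and the encoding yourself instead of letting the file object handle them.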

Sclerite answered 26/10, 2010 at 17:4 Comment(0)
P
1

See the open() doc (for Python 2):

In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.
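
The passage above describes Python 2. As a hedged Python 3 counterpart: text-mode reads use universal newlines by default (newline=None), so no 'U' flag is needed (it was deprecated and later removed). The sample bytes below are invented for illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Write three lines, each with a different line-ending convention.
    with open(path, "wb") as f:
        f.write(b"unix\nold mac\rwindows\r\n")
    # Default text mode: every convention is presented to the program as LF.
    with open(path, "r") as f:
        text = f.read()
        seen = f.newlines  # the newline types encountered while reading
finally:
    os.remove(path)

print(repr(text))  # 'unix\nold mac\nwindows\n'
print(seen)
```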

Portsalut answered 26/10, 2010 at 16:35 Comment(2)
@AndiDog: I think what he is saying is that when he opens the file with open('', 'r') after writing to it, he sees \r\r\n, while he thinks he wrote only '\r\n' (Windows). So I told him that when he opens his file, open() automatically adds \r\n to his data, so '\r\n' + '\r\n' = '\r\r\n', with the '\n' removed. Do you want me to elaborate more? - Portsalut
No, I'm actually using a separate output file opened with open(file, 'w'). Changing to open(file, 'wb') fixed the problem, but I'm not entirely sure I understand why. - Simplicidentate
A
1

Works for me:

>>> import os
>>> import tempfile
>>> tmp = tempfile.TemporaryFile(mode="w+")
>>> store_insert = '\tinsert stores (storenum, ...) values (\'%s\', ...)'
>>> lines = ["foo\t\t"]
>>> for line in lines:
...     line = line.rstrip()
...     fields = line.split("\t")
...     tmp.write(store_insert % tuple(fields))
...     tmp.write(os.linesep)
...
>>> tmp.seek(0)
>>> tmp.read()
"\tinsert stores (storenum, ...) values ('foo', ...)\r\n"

Are you sure this is the code that's running, that os.linesep is what you think it is, etc?

Ahner answered 26/10, 2010 at 16:45 Comment(0)
