Why can't I end a raw string with a backslash? [duplicate]
H

4

11

I am confused here: raw strings convert every \ to \\, but when a \ appears at the end of the literal, it raises an error.

>>> r'so\m\e \te\xt'
'so\\m\\e \\te\\xt'

>>> r'so\m\e \te\xt\'
SyntaxError: EOL while scanning string literal

Update:

This is now covered in Python FAQs as well: Why can’t raw strings (r-strings) end with a backslash?

Hepzi answered 23/6, 2012 at 8:40 Comment(0)
N
10

You still need \ to escape ' or " in raw strings, since otherwise the Python interpreter doesn't know where the string stops. In your example, you're escaping the closing '.

Otherwise:

r'it wouldn\'t be possible to store this string'
r'since it'd produce a syntax error without the escape'

Look at the syntax highlighting to see what I mean.

Nonmetal answered 23/6, 2012 at 8:42 Comment(9)
but the \ inside the string then should also escape the character next to them, instead they simply convert to \\.Hepzi
@AshwiniChaudhary: No, in a raw string, a `\` only escapes a quote character.Nonmetal
yes, SO is not allowing me to write a single \ in code formatting.;) Thanks I got the point.Hepzi
still it's not a single '\' SO converts it to '\\'.Hepzi
Not really sure what you mean...Nonmetal
A backslash in a raw string doesn't escape anything. You cannot produce the string '''""" with a raw string literal. Try it.Blakney
This answer is incorrect. r'\'' produces "\\'" in Python 3, so `\` doesn't actually escape the '. The documentation is unclear in this case; there is no escaping going on, it's just the string literal parsing that gives an error.Obmutescence
@Lennart, @Karl: I stand corrected. That behavior is very odd then. I can produce that string with r"'''"'"""', but that's kinda cheating.Nonmetal
Yeah, it's two literals, and only one of them is raw. :-)Obmutescence
O
9

Raw strings can't end in single backslashes because of how the parser works (there is no actual escaping going on, though). The workaround is to add the backslash as a non-raw string literal afterwards:

>>> print(r'foo\')
  File "<stdin>", line 1
    print(r'foo\')
                 ^
SyntaxError: EOL while scanning string literal
>>> print(r'foo''\\')
foo\

Not pretty, but it works. You can add a plus to make it clearer what is happening, but it's not necessary:

>>> print(r'foo' + '\\')
foo\
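
For anyone who wants to check this without typing into the REPL, here is a minimal sketch. The helper name `compiles` is made up for this example; it just wraps the built-in `compile()` to see whether a piece of source text is accepted.

```python
# Three spellings of the same four-character string: f, o, o, backslash.
adjacent = r'foo' '\\'    # implicit concatenation of a raw and a non-raw literal
explicit = r'foo' + '\\'  # the same thing with an explicit +
plain = 'foo\\'           # no raw literal at all


def compiles(source):
    """Hypothetical helper: True if the compiler accepts `source` as an expression."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False


print(adjacent == explicit == plain)  # True
print(len(adjacent))                  # 4
print(compiles("r'foo\\'"))           # source text is r'foo\'  -> False
print(compiles("r'foo\\\\'"))         # source text is r'foo\\' -> True
```

Note that the last case is accepted, but the resulting string then contains two backslashes, which is why the extra non-raw literal is needed for exactly one.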
Obmutescence answered 23/6, 2012 at 14:13 Comment(0)
B
5

Python strings are processed in two steps:

  1. First the tokenizer looks for the closing quote. It recognizes backslashes when it does this, but doesn't interpret them - it just looks for a sequence of string elements followed by the closing quote mark, where "string elements" are either (a character that's not a backslash, closing quote or a newline - except newlines are allowed in triple-quotes), or (a backslash, followed by any single character).

  2. Then the contents of the string are interpreted (backslash escapes are processed) depending on what kind of string it is. The r flag before a string literal only affects this step.
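
Step 1 can be sketched roughly as follows. This is only a simplified model of the rule described above, not CPython's actual tokenizer: the function name `find_closing_quote` is invented for illustration, and it handles single-line literals only (no triple quotes).

```python
def find_closing_quote(source, start):
    """Return the index of the quote closing the literal opened at `start`, or -1.

    Simplified, hypothetical model of step 1: it never looks at the r prefix.
    """
    quote = source[start]
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == '\\':
            i += 2  # a backslash plus any one character is a single string element
            continue
        if ch == '\n':
            return -1  # an unescaped newline ends a single-quoted literal early
        if ch == quote:
            return i
        i += 1
    return -1  # ran off the end without finding a closing quote


print(find_closing_quote("r'foo\\'", 1))  # source text r'foo\' -> -1
print(find_closing_quote("r'\\''", 1))    # source text r'\''   -> 4
print(find_closing_quote("r'a\\\\'", 1))  # source text r'a\\'  -> 5
```

Because the scan skips the character after every backslash, a literal ending in a single backslash swallows its own closing quote, whether or not it has the r prefix.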

Blakney answered 23/6, 2012 at 12:49 Comment(1)
It seems the Python scanner stores the 'r' as a token, then goes on to scan the string using the default string processing rules, instead of rules where a backslash is treated as an ordinary character. This issue is discussed at https://mcmap.net/q/1014122/-why-does-the-single-backslash-raw-string-in-python-cause-a-syntax-error-duplicate/3259619.Wellworn
E
3

Quote from https://docs.python.org/3.4/reference/lexical_analysis.html#literals:

Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, not as a line continuation.

So in a raw string, backslashes are not treated specially, except when preceding " or '. Therefore, r'\' or r"\" is not a valid string literal, because the closing quote is escaped, which leaves the literal unterminated. In such a case, it makes no difference whether r is present, i.e. r'\' is equivalent to '\' and r"\" is equivalent to "\".
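
A small sketch to check both halves of this: that the backslash stays in the result, and that the four literals are all rejected. The helper `is_valid_literal` is a made-up name for this example and only wraps the built-in `compile()`.

```python
def is_valid_literal(source):
    """Hypothetical helper: True if `source` compiles as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


escaped_quote = r'\''
print(len(escaped_quote))      # 2: the backslash remains in the result
print(escaped_quote == "\\'")  # True

# Source texts r'\', '\', r"\" and "\": none of them is a complete literal.
for source in ("r'\\'", "'\\'", 'r"\\"', '"\\"'):
    print(source, is_valid_literal(source))
```

Every line printed by the loop ends in False, with or without the r prefix.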

Epicurus answered 17/5, 2015 at 3:51 Comment(0)
