How to create raw string from string variable in python?
I

3

14

You create a raw string from a string literal this way:

test_file=open(r'c:\Python27\test.txt','r')

How do you create a raw variable from a string variable, such as

path = 'c:\Python27\test.txt'

test_file=open(rpath,'r')

Because I have a file path:

file_path = "C:\Users\b_zz\Desktop\my_file"

When I do:

data_list = open(os.path.expandvars(file_path),"r").readlines()

I get:

Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    scheduled_data_list = open(os.path.expandvars(file_path),"r").readlines()
IOError: [Errno 22] invalid mode ('r') or filename: 'C:\\Users\x08_zz\\Desktop\\my_file'
Isochromatic answered 6/2, 2014 at 14:23 Comment(4)
why "b_zz" is replaced as "x08_zz" in your error message?Guglielmo
that's what I'd like to knowIsochromatic
`ord('\b')` is 8. Either double the backslashes or prefix the string literal in your code with an `r`.Suhail
Why not just write r"C:\Users\b_zz\Desktop\my_file" in the first place? Or better yet, "C:/Users/b_zz/Desktop/my_file"?Mountain
W
11

There is no such thing as a "raw string" once the string exists in the running process. The "" and r"" notations are only ways of writing a string literal in the source code itself.

That means "\x01" creates a string consisting of one byte, 0x01, but r"\x01" creates a string consisting of 4 bytes: 0x5c, 0x78, 0x30, 0x31 (assuming we're talking about Python 2 and ignoring encodings for now).
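The same point as a minimal Python 3 sketch (the answer describes Python 2 bytes; in Python 3 the units are characters, and the variable names here are only illustrative):

```python
# Both literals produce ordinary str objects; only the source notation differs.
escaped = "\x01"    # the escape is processed: one character, code point 1
raw = r"\x01"       # no escape processing: backslash, 'x', '0', '1'

print(type(escaped) is type(raw))   # True: there is no separate "raw" type
print(len(escaped), len(raw))       # 1 4
print([hex(ord(c)) for c in raw])   # ['0x5c', '0x78', '0x30', '0x31']

# Doubling the backslashes in a normal literal gives the same value as a raw literal.
print("c:\\Python27\\test.txt" == r"c:\Python27\test.txt")  # True
```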

You mentioned in a comment that you're taking the string from the user (GUI or console input works the same here). In that case escape sequences are not processed, so there's nothing you have to do about it. You can check this easily as follows (or with whatever the Windows equivalent is; I only speak *nix):

% cat > test <<EOF                                             
heredoc> \x41
heredoc> EOF
% < test python -c "import sys; print sys.stdin.read()"
\x41
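A rough cross-platform version of the same check in Python 3, assuming `io.StringIO` is an acceptable stand-in for console or GUI input (the names `user_input` and `path` are made up for this sketch):

```python
import io

# Simulate text typed by a user or read from a file.
# The literal is raw only so that this *source code* contains real backslashes.
user_input = io.StringIO(r"C:\Users\b_zz\Desktop\my_file" + "\n")

# Reading at runtime performs no escape processing at all.
path = user_input.readline().rstrip("\n")

print(path)              # C:\Users\b_zz\Desktop\my_file
print("\x08" in path)    # False: no backspace character was created
print(path.count("\\"))  # 4 backslashes, exactly as entered
```

If a backspace character does show up in a path, it was most likely introduced by a non-raw literal somewhere in the source code, not by the input itself.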
Webby answered 6/2, 2014 at 17:27 Comment(0)
C
7

My solution to convert a string to a raw string (it works with these sequences only: '\a', '\b', '\f', '\n', '\r', '\t', '\v'; see the Python documentation for the full list of escape sequences):

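def str_to_raw(s):
    # Map control-character code points back to their two-character escape spellings.
    raw_map = {8:r'\b', 7:r'\a', 12:r'\f', 10:r'\n', 13:r'\r', 9:r'\t', 11:r'\v'}
    # Characters with code points above 32 are passed through without a lookup.
    return r''.join(i if ord(i) > 32 else raw_map.get(ord(i), i) for i in s)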

Demo:

>>> file_path = "C:\Users\b_zz\Desktop\fy_file"
>>> file_path
'C:\\Users\x08_zz\\Desktop\x0cy_file'
>>> str_to_raw(file_path)
'C:\\Users\\b_zz\\Desktop\\fy_file'
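One caveat, sketched here in Python 3 with illustrative variable names: escape processing is lossy, so a function like this can only guess at the original spelling. Different source literals can yield the identical in-memory string, and nothing in the string records which one was written.

```python
# Two different spellings in the source, one and the same string in memory.
a = "C:\x2345"   # \x23 is the hex escape for '#'
b = "C:#45"      # a plain '#'

print(a == b)    # True
print(repr(a))   # 'C:#45'

# The same holds for the control characters the function maps back:
# "\b" and "\x08" are indistinguishable once the literal has been parsed.
print("\b" == "\x08" == chr(8))  # True
```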
Cola answered 6/2, 2014 at 14:35 Comment(13)
But I get the path string from a GUI input. How do I add "r" to the beginning?Isochromatic
What the user is asking is: how can I take an unknown string and make sure the backslash sequences in the path aren't interpreted (a "raw" string rather than an interpreted string)?Hie
In memory there are no raw strings. A raw string is just a helper for source code. If you get the string via (GUI)input everything is OK.Suhail
@Isochromatic How do you get the path string from the user? If you get a \b character in there, I don't think you got what you wanted from them.Hardness
You'll get exactly what the user provided. No string transformation/unescaping happens when you just look at/use a value. The "\b" part is not related to the string itself, but rather it's an artifact of source code parsing.Webby
@Webby For the \b case: I search for character code 8 in the string and return r'\b' in its place. And r'\b' is actually two characters, '\' and 'b', as in a raw string. Test it.Cola
@Cola Yes, because that's what you put in the source. Basically, what you're saying in the first line is: construct a string that contains a 0x08 byte after "C:\Users". If file_path comes from the GUI, this will not happen. Also, str_to_raw is incorrect: even if it were needed, it would fail on "C:\x2345", for example.Webby
@Webby The author of the question says he got this from input, so why shouldn't I trust him? And my solution just works; I don't understand you. It doesn't fail for "C:\0x2345".Cola
@Cola In [2]: str_to_raw("C:\x2345"), Out[2]: 'C:#45'. This is not what you want to get. I don't trust the author, because he doesn't understand the difference between the source and in-memory representations of strings ("How do you create a raw variable from a string variable"). Nothing personal, just trying to explain what really happens. There's a small chance that the GUI really does such a conversion, but that would be a bug in the GUI implementation (and, as shown in this example, it's not fixable in the code receiving the value).Webby
@Webby Maybe a bug, OK. About the failure: it works for me in console Python 2.7: >>> str_to_raw("C:\0x2345") 'C:\x00x2345'Cola
I edited the comment with \x2345 instead of \0x2345, but the original is still incorrect. Do you see how the function appended the x00 part that didn't exist there in the first place?Webby
@Webby But "C:\x2345" is exactly what you got from my function: >>> "C:\x2345" 'C:#45'. It's not a failure but a feature.Cola
We'll have to disagree then :) I think it's a bug if you want to reverse some transformation, but fail to do it for all cases. (which is impossible to do here) Also this function won't help the question author since the issue begins in some other place in the code - he shouldn't just patch the place where the issue is first seen.Webby
K
0

The solution by ndpu works for me.

I could not resist the temptation to enhance it (making it compatible with ancient Python 2 versions and hoping to speed it up):

_dRawMap = {8:r'\b', 7:r'\a', 12:r'\f', 10:r'\n', 13:r'\r', 9:r'\t', 11:r'\v'}

def getRawGotStr(s):
    # Replace each control character listed in _dRawMap with its escaped spelling.
    return r''.join([_dRawMap.get(ord(c), c) for c in s])

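I did a careful time trial, and it turns out that the original code by ndpu is a little faster. The likely reason is its ord(i) > 32 check, which skips the dictionary lookup for most characters; for str.join, a generator expression is generally not faster than a list comprehension.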
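A minimal way to repeat such a time trial in Python 3 with the standard `timeit` module. The function names, the sample string and the repeat count are assumptions made for this sketch, and the timings depend on the interpreter and machine, so none are quoted here:

```python
import timeit

RAW_MAP = {8: r'\b', 7: r'\a', 12: r'\f', 10: r'\n', 13: r'\r', 9: r'\t', 11: r'\v'}

def with_generator(s):
    # Original approach: skip the dictionary lookup for code points above 32.
    return ''.join(c if ord(c) > 32 else RAW_MAP.get(ord(c), c) for c in s)

def with_list(s):
    # Variant from this answer: always look up, and build a list before joining.
    return ''.join([RAW_MAP.get(ord(c), c) for c in s])

# Hypothetical sample: a path containing a backspace and a form feed, repeated.
sample = "C:\\Users\x08_zz\\Desktop\x0cy_file" * 100

# Both variants must agree before their speed is worth comparing.
assert with_generator(sample) == with_list(sample)

for func in (with_generator, with_list):
    seconds = timeit.timeit(lambda: func(sample), number=200)
    print(func.__name__, round(seconds, 4))
```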

Keelby answered 28/1, 2018 at 20:3 Comment(0)
