Make R use C notation when escaping terminals

Asked 5/8, 2012 at 23:3 Answered 6/8, 2012 at 14:40

Not sure I am using the right terminology here, but I need the print or deparse methods to use C notation (e.g. "\x05" instead of "\005") when escaping bytes outside the regular character set.

x <- "This is a \x05 symbol"
print(x)
[1] "This is a \005 symbol"

Is there a native way to accomplish this?

I need this for generating BSON: http://bsonspec.org/#/specification. All of the examples explicitly use \x05 notation.

Carmon answered 5/8, 2012 at 23:3 Comment(1)

FYI, both "\x05" and "\005" are valid C syntax. – Disc 6/8, 2012 at 1:36

Hacking into the internals of print seems like a bad idea. Instead, I think you should do the string escaping yourself and then use cat to print the string without any extra escaping.

You can use encodeString to do the initial escaping, gregexpr to identify octal \0.. escapes, strtoi to convert strings representing octal numbers to those numbers, sprintf to print numbers in hexadecimal, and regmatches to operate on the matched parts. The whole process would look something like this:

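inputString <- "This is a \005 symbol. \x13 is \\x13."
# Escape non-printable characters the way print() would (octal form).
x <- encodeString(inputString)
# Locate three-digit octal escapes such as \005.
m <- gregexpr("\\\\[0-3][0-7][0-7]", x)
# Drop the leading backslash and parse the digits as base 8.
charcodes <- strtoi(substring(regmatches(x, m)[[1]], 2, 4), 8)
# Replace each match with its \xNN equivalent.
regmatches(x, m) <- list(sprintf("\\x%02x", charcodes))
cat(x, "\n")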

Note that this approach will convert octal escapes like \005 to hexadecimal escapes like \x05, but other escape sequences like \t or \a won't be affected by this. You might need more code to deal with those as well, but the above should contain all the ingredients you need.
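If escapes such as \t or \a should come out in hexadecimal as well, one option is to skip encodeString and work on the raw bytes directly. The following is only a sketch of that different technique, not the method shown above: escape_hex is a hypothetical helper name, and treating 0x20 to 0x7e as the "printable" range is an assumption you may want to adjust.

```r
# Hypothetical helper (not part of base R): render every byte outside
# printable ASCII as a C-style \xNN escape, and double literal backslashes.
# Multi-byte characters (e.g. UTF-8) are escaped one byte at a time.
escape_hex <- function(s) {
  bytes <- as.integer(charToRaw(s))            # one integer per byte
  printable <- bytes >= 0x20 & bytes <= 0x7e   # assumed printable range
  out <- sprintf("\\x%02x", bytes)             # start with everything escaped
  # Put printable bytes back as themselves.
  out[printable] <- rawToChar(as.raw(bytes[printable]), multiple = TRUE)
  # A literal backslash becomes two, so the output stays unambiguous.
  out[bytes == 0x5c] <- "\\\\"
  paste(out, collapse = "")
}

cat(escape_hex("This is a \005 symbol.\t\\x13"), "\n", sep = "")
```

Because every byte is handled the same way, tab, bell and other control characters need no special cases. Double quotes are left alone here, since cat does not add surrounding quotes.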

Note that the BSON specification you refer to almost certainly meant raw bytes, so as long as your string contains a character with code 5, which you can write as "\x05" in your input, and you write that string to the desired output in binary mode, it shouldn't matter at all how R prints that string to you. After all, octal \005 and hexadecimal \x05 are just two representations of the same byte you'll write.
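As a minimal sketch of that point, the snippet below checks that the two notations produce the same string and then writes the bytes in binary mode. The temporary file is just a placeholder for illustration; it is not a complete BSON document.

```r
# Both literals denote the same one-byte string.
stopifnot(identical("\x05", "\005"))

# Hypothetical output path, used only for this illustration.
path <- tempfile(fileext = ".bin")

# Open in binary mode and write the string's bytes unchanged.
con <- file(path, open = "wb")
writeBin(charToRaw("This is a \x05 symbol"), con)
close(con)

# Read the bytes back: position 11 holds the control character itself,
# regardless of how print() would display it.
bytes <- readBin(path, what = "raw", n = file.size(path))
stopifnot(bytes[11] == as.raw(5))
```

Passing a raw vector to writeBin avoids the trailing nul byte that writeBin appends when given a character vector.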

Coffle answered 6/8, 2012 at 14:40 Comment(1)

Ugh. I was hoping for a more native solution. But I guess this works. – Carmon 8/8, 2012 at 10:29

Does cat suit your needs? Note, you have to escape the backslash:

> x <- "This is a \\x05 symbol\n"
> cat(x)
This is a \x05 symbol

Stateroom answered 6/8, 2012 at 1:2 Comment(1)

Your string does not contain the actual character. It just contains an escaped backslash. I need this for large text files which contain actual \x05 characters – Carmon 6/8, 2012 at 8:48
