Hacking into the internals of print seems a bad idea. Instead I think you should do the string escaping yourself, and eventually use cat to print the string without any extra escaping.
You can use encodeString to do the initial escaping, gregexpr to identify octal \0.. escapes, strtoi to convert strings representing octal numbers to those numbers, sprintf to print numbers in hexadecimal, and regmatches to operate on the matched parts. The whole process would look something like this:
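# Input with an octal escape, a hex escape, and a literal backslash
inputString <- "This is a \005 symbol. \x13 is \\x13."
# Let R escape the string; control characters come out as octal escapes
x <- encodeString(inputString)
# Locate three-digit octal escapes such as \005
m <- gregexpr("\\\\[0-3][0-7][0-7]", x)
# Drop the leading backslash and parse the digits as base-8 integers
charcodes <- strtoi(substring(regmatches(x, m)[[1]], 2, 4), 8)
# Replace each match with its \xNN form
regmatches(x, m) <- list(sprintf("\\x%02x", charcodes))
# cat prints the result without adding any escaping of its own
cat(x, "\n")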
Note that this approach will convert octal escapes like \005 to hexadecimal escapes like \x05, but other escape sequences like \t or \a won't be affected by this. You might need more code to deal with those as well, but the above should contain all the ingredients you need.
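If you want \t, \a and friends to come out as hexadecimal too, one option is to skip encodeString and work on the bytes directly. The following is only a sketch of that alternative, not the method shown above: hexEscape is a made-up helper name, and I am assuming you want every byte outside printable ASCII written as \xNN and a literal backslash doubled. Because it never runs a regular expression over already-escaped text, it also cannot mistake a literal backslash followed by digits for an octal escape.

```r
# Hypothetical helper (not part of base R): escape a string byte by byte.
# Assumption: every byte outside printable ASCII (32..126) becomes \xNN,
# and a literal backslash is doubled so the output stays unambiguous.
hexEscape <- function(s) {
  bytes <- as.integer(charToRaw(s))
  # One single-byte string per input byte
  chars <- rawToChar(as.raw(bytes), multiple = TRUE)
  nonPrintable <- bytes < 32 | bytes > 126
  chars[nonPrintable] <- sprintf("\\x%02x", bytes[nonPrintable])
  chars[bytes == 92] <- "\\\\"   # 92 is the byte value of a backslash
  paste(chars, collapse = "")
}

cat(hexEscape("This is a \005 symbol.\tTab and \\x13."), "\n")
```

Keep in mind that this works on bytes, so a multibyte UTF-8 character would be written as several \xNN escapes, one per byte.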
Note that the BSON specification you refer to almost certainly meant raw bytes, so as long as your string contains a character with code 5, which you can write as "\x05" in your input, and you write that string to the desired output in binary mode, it shouldn't matter at all how R prints that string to you. After all, octal \005 and hexadecimal \x05 are just two representations of the same byte you'll write.
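To convince yourself of this, you can compare the two spellings and write one of them out in binary mode. This is just a minimal sketch; the temporary file stands in for whatever output you actually write to, and the variable names are mine.

```r
octal <- "\005"
hex   <- "\x05"

# Both literals denote the same one-byte string
identical(octal, hex)
charToRaw(octal)

# Write the byte through a connection opened in binary mode.
# Assumption: a temporary file stands in for the real output.
tmp <- tempfile()
con <- file(tmp, "wb")
writeBin(charToRaw(hex), con)
close(con)

# Read it back as raw bytes to see what actually landed on disk
written <- readBin(tmp, "raw", n = 10)
unlink(tmp)
written
```

Whatever print shows for the string, the file contains a single byte with value 5.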
"\x05" and "\005" are valid C syntax. – Disc