Why do string commands via `R -e ..` require an extra escape on Mac but not on Linux?

This one stumped me. I have a simple shell script that works fine on my Linux (AWS aka CentOS) machine but crashes on my Mac OS X machine. It turned out that escapes (\) in string commands needed an extra escape character (\\) on the Mac.

Could someone enlighten me as to what I am missing here, i.e., what is it about running R scripts on Macs that requires this?

The behavior was *not* observed when calling, say, python3 -c ..

On both machines, I am using bash, specifically /bin/bash

NOTE: The Mac runs a slightly later version of R (3.5.1 vs 3.4.1), but I would be very surprised if that were the culprit. Can anyone confirm?


Simple Example:

R --vanilla -e 'cat(" Hello \n World \n ")'

The above runs fine on a CentOS machine, but on the Mac it requires an additional escape character (\\n instead of \n) to execute correctly (example at bottom).

For reference/comparison, the following python command works identically on both the Mac OS X and CentOS machines I tested.

python3 -c 'print("Hello \n World")'
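As a sanity check on the shell side, here is a minimal sketch that prints an argument exactly as the shell hands it to the called program. The `show_args` helper is hypothetical (it is not part of R or Python) and only illustrates that a single-quoted string arrives with its backslashes intact, which suggests that any platform difference arises after the shell has done its work.

```shell
# Hypothetical helper (an assumption for illustration): print each argument
# on its own line, wrapped in <>, without interpreting backslash escapes.
show_args() {
  printf '<%s>\n' "$@"
}

# The same options and single-quoted string used in the R example.
show_args --vanilla -e 'cat(" Hello \n World \n ")'
```

The last line printed is `<cat(" Hello \n World \n ")>`, with a literal backslash followed by `n`, not a newline.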


For details, here is the output of the two commands on each of the two machines:

1. R --vanilla -e 'cat(" Hello \n World \n ")'
2. R --vanilla -e 'cat(" Hello \\n World \\n ")'

1.

R --vanilla -e 'cat(" Hello \n World \n ")'

## CENTOS: 
> cat(" Hello \n World \n ")
 Hello
 World

## MAC OS X:
> cat(" Hello
+
+ Error: unexpected end of input
Execution halted

2.

R --vanilla -e 'cat(" Hello \\n World \\n ")'

## CENTOS: 
> cat(" Hello \\n World \\n ")
 Hello \n World \n >

## MAC OS X:
> cat(" Hello \n World \n ")
 Hello
 World

For comparison's sake, I'm not seeing the same behavior when running a simple python script.

## Each of these produces identical
##  results on Mac OS X and CentOS

python3 -c 'print("Hello \n World")'
python3 -c 'print("Hello \\n World")'


Machine & Session Info:

  1. Linux Box
> cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2018.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2018.03"
PRETTY_NAME="Amazon Linux AMI 2018.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"

> R --vanilla -e 'sessionInfo()'
R version 3.4.1 (2017-06-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Amazon Linux AMI 2018.03

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.1
  2. Mac OS Box
Mojave 10.14.3

> R --vanilla -e 'sessionInfo()'
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14.3

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.1
  3. Another Mac OS X machine, running 3.4.3, same error
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Delightful answered 18/3, 2019 at 19:5 Comment(15)
@CharlesDuffy: That's why I included the python example. If it were related to the shell, I would have expected the same behavior for both Python & R. I think the issue is R related. But no, not using fish, just bash in both cases. I will edit the question. - Delightful
BTW, I can't reproduce the behavior: with R version 3.5.1 installed via Nixpkgs, I get the same behavior on MacOS that you report for Linux. - Alban
@CharlesDuffy that's very interesting, thank you. I can't update my CentOS machine, but I will try to spin up a new one. - Delightful
...so, looking at the R wrapper shell script, it's pretty ugly in there (as in, rampant BashFAQ #50 violations). I'd need to dig in more to say anything conclusive, but it certainly smells like they could have introduced some bugs. - Alban
...as a pull quote, the R wrapper script executes exec "${R_binary}" ${args} "${@}". That's... well, follow the BashFAQ #50 link in the above comment for a description of why unquoted $args is a practice nobody should ever, ever use. - Alban
One thing you can try is cutting the wrapper out of the picture on both operating systems and seeing if that solves the problem. Does R_HOME=/usr/lib/R /usr/lib/R/bin/exec/R -e 'cat(" Hello \n World \n ")' behave consistently, after you fix up the paths to be correct for each platform? - Alban
On your MacOS systems, by the way, which version of sed is in your PATH: the host's BSD sed, or GNU sed installed with Nix/MacPorts/Homebrew/etc? (Though if that were related to the problem, bypassing the wrapper as suggested above should also avoid it.) - Alban
(...to be very clear, btw, the use of sed to munge command lines in R's wrapper script is an abomination, but that's neither here nor there; it exists in the current implementation.) - Alban
Could the version of Bash be relevant here, i.e., could the escape behavior have changed in some version, with your macOS running an older one? bash --version - Holman
@HenrikB, I describe earlier in the comments (ed: in now-deleted comments, oops) why the version of bash is not relevant: interpretation of single-quoted strings is identical between all POSIX-compliant shells. And bash --version is the wrong way to check anyhow; it tells you which version is first in the PATH, not which version is currently running; thus, on MacOS, it shows any newer interpreter installed with Nix/MacPorts/Homebrew, even if Apple's /bin/bash is currently in use. - Alban
Maybe this issue should be sent to the R-SIG-Mac mailing list. - Cracknel
@CharlesDuffy, thanks. 1. I noticed that you could not reproduce the OP's observations, whereas the OP could do it on two different systems. That's some kind of clue. 2. My point was that there could be a bug/bug fix/change in Bash that reveals itself in one version but not another, which would be useful to rule out. 3. Is there a way to query the version of the running bash? 4. I agree with your concerns about the R script. 5. Running ShellCheck on it (and other scripts such as build, INSTALL, and check) reveals this and other potential problems. - Holman
I had a look at the R script; it uses sed to process command-line arguments. As Charles Duffy mentioned, the macOS (BSD) version of sed may have something to do with this. BSD sed seems to not accept the control sequences \n and \t (see this). This could explain the needed double escapes. However, I may be completely wrong. - Cracknel
@HenrikB, one likely reason I couldn't reproduce the behavior they stated for MacOS is that I'm using a (Nix-provided) GNU sed even on MacOS. Answering your other question: to check the running version of bash, declare -p BASH_VERSION will do. - Alban
@Bhas, I agree. If the OP tried bypassing the wrapper script (and thus bypassing sed) by executing R/bin/exec/R directly, that should be able to rule out that theory (or increase its likelihood, by isolating the wrapper or something it calls as the source of the variance). - Alban
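The two checks discussed in these comments (which bash is actually running, and which sed is first in the PATH) could be sketched roughly as follows. The helper names are hypothetical, and the sed test is only a heuristic resting on an assumption: GNU sed accepts `--version`, while the BSD sed shipped with macOS rejects it. Other implementations may answer either way.

```shell
# Version of the bash that is running this code, or "not bash" in other shells.
# In bash, `declare -p BASH_VERSION` (from the comment above) shows the same variable.
running_bash_version() {
  printf '%s\n' "${BASH_VERSION:-not bash}"
}

# Hypothetical helper: guess the flavor of the first sed found in PATH.
# Assumption: a sed that accepts --version is probably GNU sed.
sed_flavor() {
  if sed --version </dev/null >/dev/null 2>&1; then
    printf 'probably GNU\n'
  else
    printf 'probably BSD or other non-GNU\n'
  fi
}

running_bash_version
command -v sed   # path of the sed that an unqualified `sed` call resolves to
sed_flavor
```

Note that a wrapper script may start with a different interpreter line or PATH than your interactive shell, so these checks are most informative when run in the same environment the wrapper uses.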

Following up on my suggestion that the cause could be the difference between macOS sed and GNU sed, and on Charles Duffy's suggestion in the last comment on the original question, I tried calling the R executable directly on macOS Mojave 10.14.3 with R 3.5.3, using the following shell script:

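# Ask the R wrapper where R is installed, then point at the real binary.
export R_HOME=$(R RHOME)
Rexec=${R_HOME}/bin/exec/R

# Call the binary directly (quoted, in case the path contains spaces).
"$Rexec" --vanilla -e 'cat(" Hello \n World \n ")'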

This gives no error messages and produces the same output as on Linux.

In my opinion, the issue is not a bug in bash but a regrettable difference between macOS (BSD) sed and GNU sed. I don't know how, or whether, this could be corrected in the R wrapper script.
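To illustrate the kind of difference suspected here: per the comments on the question, BSD sed may not accept `\n` as an escape where GNU sed does. A form that should behave the same with both, since POSIX specifies it, is a backslash followed by a literal newline in the replacement text. The sketch below is independent of R's wrapper and is not a proposed patch; `split_words` is a hypothetical name.

```shell
# Replace each space with a newline. The replacement is written as a
# backslash followed by a literal newline, the form POSIX specifies for sed.
split_words() {
  sed 's/ /\
/g'
}

printf 'Hello World\n' | split_words
```

This prints `Hello` and `World` on separate lines without relying on how a particular sed interprets `\n`.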

Cracknel answered 19/3, 2019 at 20:52 Comment(2)
@CharlesDuffy. Thanks for the improvement to my answer. - Cracknel
np; thank you for doing the experiment to confirm. Personally, my fix would be to rewrite the wrapper to not use sed. The easiest way to avoid unquoted expansion is to use arrays, as BashFAQ #50 suggests, but one could probably also get the same effect through in-place manipulation of "$@" in a POSIX-compliant way. Beyond the scope of this question, though; I think you've answered it well. - Alban
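The in-place manipulation of "$@" mentioned above might look roughly like the following sketch. It is not R's actual wrapper: `fake_binary` and `wrapper` are hypothetical stand-ins, and the only point is that arguments forwarded as "$@" are never re-split or re-parsed, so no sed munging is needed.

```shell
# Hypothetical stand-in for the real binary: print each argument it receives.
fake_binary() {
  printf '<%s>\n' "$@"
}

# Hypothetical wrapper: prepend a default option by rewriting "$@" in place,
# then forward every argument quoted.
wrapper() {
  set -- --vanilla "$@"
  fake_binary "$@"
}

wrapper -e 'cat(" Hello \n World \n ")'
```

The expression arrives as a single argument, `<cat(" Hello \n World \n ")>`, with its spaces and backslashes untouched.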
