This is an easy mistake to make.
First, let's define some terms:
- statement: This is a piece of shell code that generally represents a single action for the shell to execute. The action may be a documented shell builtin or keyword command plus arguments, the file name of an external executable plus arguments, a compound command (such as a braced block or subshell), a pipeline of all of the above, or a command list of all of the above. Multiple statements can usually be coded sequentially using statement separators, which differ by shell. For example, the Unix `bash` shell uses semicolons (for foreground execution) or ampersands (for background), while the Windows `cmd` shell uses ampersands (for foreground).
- command: This is a very general term that can refer to any of the above types of command, or to a whole statement, or even to multiple sequential statements. This is the kind of term that requires context to clarify its meaning.
- simple command: This is a command that only executes a shell builtin or external executable. These may occur as their own statements, or they may form part of compound commands, pipelines, or command lists. In the bash shell, variable assignments and redirections can form part of or even the entirety of a simple command.
- command word: In the context of a single simple command, this is the name of the program you want to run. This will either be the documented name of a shell builtin, or it will be the file name of an external executable. This is sometimes described as the first word of the command, or the zeroth argument.
- command arguments: In the context of a single simple command, this is the zero or more (additional) arguments given to the builtin or executable.
- command line: This term carries with it the suggestion that it refers to a single line of shell code. However, it is often used slightly more loosely, to describe any self-contained, often one-off piece of shell code, which may in actuality contain line breaks, and thus technically consists of more than one textual line. The term command is sometimes used as a shorthand for this concept as well, further adding to its ambiguity. Also note that command line is sometimes used as a shorthand for the command-line interface type of user interface, which is never connoted by the unqualified term command.
- system command: This is another general term that requires context to clarify its meaning. It can be considered a synonym for command, except the additional modifier "system" suggests that the execution of the command is being initiated from a programmatic context that exists outside the shell, such as an R session.
The design of the `system2()` function seems to suggest that the authors only intended it to be used to run simple commands. It takes the command word as the first function argument (expected to be a scalar string, meaning a one-element character vector) and the command arguments as the second (also expected to be a character vector, zero or more elements). Here's how the documentation puts it in the description of these two function arguments:
`command`: the system command to be invoked, as a character string.
`args`: a character vector of arguments to `command`.
The above doesn't make it perfectly clear, but the first sentence of the Details section helps:
Unlike `system()`, `command` is always quoted by `shQuote()`, so it must be a single command without arguments.
As you can see, the documentation is a little bit vague in that it throws around the general term command without much clarification. It also uses the vague term system command, which doesn't help matters much either. What it means is that the first function argument `command` is intended to be the command word of a simple command. If you want to pass any command arguments, you must specify them in the second function argument `args`.
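As a minimal sketch of that split, assuming a Unix-alike system where `echo` is available through the shell (this particular call may well not work on Windows, where `echo` is a `cmd` builtin), the command word goes in `command` and everything else goes in `args`:

```r
# Command word in `command`, command arguments in `args`.
# stdout = TRUE captures the output as a character vector
# instead of printing it to the console.
out <- system2("echo", c("hello", "world"), stdout = TRUE)
print(out)
```

The `stdout = TRUE` setting is only there so the result can be inspected from R; it is not required for the call itself.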
In the authors' defense, shell code can be very platform-dependent and inconsistent in implementation and behavior. To use the more precise terms that I've defined in this post would have put the documentation writers at risk of committing errors, at least with respect to some systems which R aspires to support. Vagueness can be a safehouse against risk of outright error.
Note that this differs from the other R system command function, `system()`:
`command`: the system command to be invoked, as a character string.
And in the Details section:
`command` is parsed as a command plus arguments separated by spaces. So if the path to the command (or a single argument such as a file path) contains spaces, it must be quoted e.g. by `shQuote()`. Unix-alikes pass the command line to a shell (normally ‘`/bin/sh`’, and POSIX requires that shell), so `command` can be anything the shell regards as executable, including shell scripts, and it can contain multiple commands separated by `;`.
So for `system()`, the first function argument `command` is a full command line.
So they actually use exactly the same function argument name (`command`) and description ("the system command to be invoked, as a character string."), even though the argument has two completely different meanings between `system()` and `system2()`! Understanding this documentation really does require careful parsing by the reader.
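To make the contrast concrete, here is a small sketch that runs the same invocation through both functions. It assumes a Unix-alike system with `echo` available, and the argument value is a made-up example chosen because it contains consecutive spaces:

```r
# Hypothetical argument containing two consecutive spaces.
arg <- "two  words"

# system(): `command` is a whole command line, so quoting the
# argument for the shell is the caller's job.
cmd_line <- paste("echo", shQuote(arg))
out1 <- system(cmd_line, intern = TRUE)

# system2(): command word and arguments are given separately.
# Only `command` is quoted automatically; the elements of `args`
# are pasted into the command line as-is, so shQuote() is still needed.
out2 <- system2("echo", shQuote(arg), stdout = TRUE)

print(out1)
print(identical(out1, out2))
```

Both calls should hand the shell the same command line, which is why the captured outputs match.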
So, finally, we can understand how to correctly use `system2()` to invoke the desired java command:
word <- 'java';
args <- c('-jar','sample.jar','674');
result <- system2(word,args,stdout='C:/Code/stdout.txt',stderr='C:/Code/stderr.txt');
Just to try to clarify further, it's helpful to experiment with the behavior of these functions by trying some simple test cases. For example (on my Cygwin bash shell):
system('printf %d:%x\\\\n 31 31');
## 31:1f
system2('printf',c('%d:%x\\\\n','31','31'));
## 31:1f
(Note that the quadrupling of backslashes is necessary because they pass through 3 interpolation contexts, namely (1) R string literal interpolation, (2) bash (non-single-quoted) lexical context, and (3) the `printf` command's interpolation of its first command argument. We need `printf` to interpolate the final `\n` ASCII character code.)
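The first of those layers can be inspected from R directly, without running anything. The sketch below prints what the shell actually receives, and then shows one possible alternative (my own variation, not part of the test cases above): single-quoting the format string with `shQuote()` so the shell layer leaves the backslashes alone and only two are needed. It assumes a Unix-alike system with `printf` available.

```r
fmt <- '%d:%x\\\\n'

# Layer 1: R's string literal parser turns each escaped pair into one
# backslash, so the string holds two backslashes, not four.
cat(fmt, "\n", sep = "")   # the text that is handed to the shell
n <- nchar(fmt)            # 8 characters: % d : % x \ \ n
print(n)

# Alternative: single-quote the format for the shell, which removes
# layer 2 and leaves only R's parser and printf itself to satisfy.
out <- system2("printf", c(shQuote('%d:%x\\n'), "31", "31"), stdout = TRUE)
print(out)
```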
Also, it should be noted that, although `system2()` clearly encourages only running simple commands by enforcing separation of the command word and command arguments into separate function arguments, it is very possible to subvert that intention and use shell metacharacters to execute some decidedly non-simple shell code through the `system2()` interface:
system('echo a b; echo c d');
## a b
## c d
system2('echo',c('a','b; echo c d'));
## a b
## c d
This is, of course, highly inadvisable.
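If the arguments come from somewhere you don't fully control, one way to keep metacharacters inert is to pass each element through `shQuote()` first. This is a sketch under the assumption of a Unix-alike system with `echo` available; the input vector is a hypothetical stand-in for untrusted data:

```r
# Hypothetical arguments, one of which contains a shell metacharacter.
untrusted <- c("a", "b; echo c d")

# shQuote() wraps each element in single quotes, so the semicolon is
# passed to echo as literal text instead of ending the statement.
out <- system2("echo", shQuote(untrusted), stdout = TRUE)
print(out)
```

With the quoting in place, only one `echo` runs and the semicolon shows up in its output.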