Should I escape shell arguments in Perl?
When using system() calls in Perl, do you have to escape the shell args, or is that done automatically?

The arguments will be user input, so I want to make sure this isn't exploitable.

Dacy answered 6/3, 2009 at 18:33 Comment(2)
What do you mean, escape shell args? Do you mean putting \'s before any characters like ">" or " " or do you want to include escaping $'s so people can't inject your Perl variables? Or what? Give an example of what you mean.Overvalue
How do I add curly braces inside system calls? system("$jboss_client /subsystem=logging/size-rotating-file-handler=SAMPLE:add\(formatter=\{yyyy\} \)"); I always get output like formatter=yyyy instead of formatter={yyyy}. Can you give me an idea of how to resolve this?Hudnall
R
38

If you use system $cmd, @args rather than system "$cmd @args" (a list rather than a single string), then you do not have to escape the arguments, because no shell is invoked (see system). system {$cmd} $cmd, @args will not invoke a shell either, even if $cmd contains metacharacters and @args is empty (this is documented as part of exec). If the arguments come from user input (or another untrusted source), you will still want to untaint them. See -T in the perlrun docs, and the perlsec docs.
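As a minimal sketch of the list form described above: the helper name run_no_shell and the sample input are made up for illustration, and $^X (the running perl binary) stands in for whatever command you actually need to call. The child exits 0 only if it received the hostile string verbatim as a single argument.

```perl
use strict;
use warnings;

# run_no_shell is a hypothetical helper name, not part of Perl.
# It runs a command without a shell and returns the child's exit code.
sub run_no_shell {
    my ($cmd, @args) = @_;
    # The indirect-object form bypasses the shell even when @args is empty.
    my $rc = system { $cmd } $cmd, @args;
    die "failed to run $cmd: $!" if $rc == -1;
    return $? >> 8;
}

# Made-up untrusted input containing shell metacharacters.
my $user_input = 'hello; echo INJECTED';

# The child compares its first argument against the original string.
my $code = run_no_shell(
    $^X, '-e',
    'exit($ARGV[0] eq "hello; echo INJECTED" ? 0 : 1)',
    $user_input,
);
print "exit code: $code\n";
```

Because no shell ever sees the string, the semicolon has no special meaning. This says nothing about what the called program does with the argument, so untainting is still your job.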

If you need to read the output of the command or send input to it, qx and readpipe have no equivalent list form. Instead, use open my $output, "-|", $cmd, @args or open my $input, "|-", $cmd, @args, although this may not be portable, as it requires a real fork, which means Unix only... I think. Maybe it'll work on Windows with its simulated fork. A better option is something like IPC::Run, which will also handle the case of piping commands to other commands, which neither the multi-arg form of system nor the 4-arg form of open will handle.
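A rough sketch of the list form of piped open for reading output, under the same assumptions as before: capture_no_shell is a hypothetical helper name, and $^X is used as a stand-in command that simply echoes its arguments one per line.

```perl
use strict;
use warnings;

# capture_no_shell is a hypothetical helper name, not part of Perl.
# It runs a command without a shell and returns its output lines.
sub capture_no_shell {
    my ($cmd, @args) = @_;
    # With arguments after the command, open passes the list straight to
    # the program instead of handing a string to the shell.
    open(my $out, '-|', $cmd, @args) or die "cannot run $cmd: $!";
    my @lines = <$out>;
    close($out) or warn "$cmd exited with status $?\n";
    chomp @lines;
    return @lines;
}

# Arguments that a shell would split, expand, or execute.
my @lines = capture_no_shell(
    $^X, '-e', 'print "$_\n" for @ARGV',
    'a b', '$HOME', '`id`',
);
print "[$_]\n" for @lines;
```

Each argument should come back unchanged on its own line. Note that this only holds when at least one argument follows the command; with the command alone, the three-argument form may still go through the shell.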

Rasheedarasher answered 6/3, 2009 at 18:43 Comment(7)
As an addition, system {'cmd'} 'cmd' always bypasses sh even if 'cmd' contains characters that would normally be interpreted by the shell.Burke
You should add that the reason why you don't have to escape shell metacharacters with "system 'cmd', @args" is that no shell is being invoked in this case (since the OP asked whether shell metachars would be escaped "automatically", which is not the case).Alwin
+1, I'm with chaos -- I never heard of the indirect-object syntax before!Flaring
The "-|" and "|-" piped open() modes work fine on Windows when used in the 3-or-more-arguments form -- no simulated fork() is required (for that, anyway). What doesn't work is using them in the 2-arg form to try to communicate with a forked child process.Flaring
I've been testing the "system" and "open my $output" solutions, but neither let me redirect STDERR to a file, which sadly means I can't use the solution in my scenario. I assume that is working as designed, but wanted to share in case that info helps others. #4413844Zelaya
@Dan: See docs for open(). You can dup then redirect STDERR, run your command, then restore STDERR.Rasheedarasher
@schulwitz - he wants to redirect stderr to a file, try it, it works.Rasheedarasher
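The dup-and-restore approach mentioned in the comments above might look roughly like this. It is only a sketch: run_with_stderr_to_file is a hypothetical helper name, and the log file path is whatever the caller chooses.

```perl
use strict;
use warnings;

# run_with_stderr_to_file is a hypothetical helper name, not part of Perl.
# It sends the child's STDERR to $file while still bypassing the shell.
sub run_with_stderr_to_file {
    my ($file, $cmd, @args) = @_;

    # Save a duplicate of the current STDERR so it can be restored later.
    open(my $saved, '>&', \*STDERR) or die "cannot dup STDERR: $!";

    # Point STDERR at the file; the child inherits this file descriptor.
    open(STDERR, '>', $file) or die "cannot redirect STDERR to $file: $!";

    my $rc = system { $cmd } $cmd, @args;

    # Put the original STDERR back for the rest of the program.
    open(STDERR, '>&', $saved) or die "cannot restore STDERR: $!";
    close($saved);

    return $rc;
}

# Example call (the file name is an assumption):
# run_with_stderr_to_file('errors.log', 'some_command', @args);
```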
F
13

On Windows, the situation is a bit nastier. Basically, all Win32 programs receive one long command-line string -- the shell (usually cmd.exe) may do some interpretation first, removing < and > redirections for example, but it does not split it up at word boundaries for the program. Each program must do this parsing itself (if it wishes -- some programs don't bother). In C and C++ programs, routines provided by the runtime libraries supplied with the compiler toolchain will generally perform this parsing step before main() is called.

The problem is, in general, you don't know how a given program will parse its command line. Many programs are compiled with some version of MSVC++, whose quirky parsing rules are described here, but many others are compiled with different compilers that use different conventions.

This is compounded by the fact that cmd.exe has its own quirky parsing rules. The caret (^) is treated as an escape character that quotes the following character, and text inside double quotes is treated as quoted if a list of tricky criteria is met (see cmd /? for the full gory details). If your command contains any strange characters, it's very easy for cmd.exe's idea of which parts of the text are "quoted" and which aren't to get out of sync with your target program's, and all hell breaks loose.

So, the safest approach for escaping arguments on Windows is:

  1. Escape arguments in the manner expected by the command-line parsing logic of the program you're calling. (Hopefully you know what that logic is; if not, try a few examples and guess.)
  2. Join the escaped arguments with spaces.
  3. Prefix every single non-alphanumeric character of the resulting string with ^.
  4. Append any redirections or other shell trickery (e.g. joining commands with &&).
  5. Run the command with system() or backticks.
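
Steps 1 to 3 above could be sketched as follows. This assumes the target program parses its command line the way MSVC-compiled programs commonly do (backslashes are literal unless they precede a double quote); quote_msvc_arg and build_cmd_exe_command are hypothetical helper names, and the code only builds the string without running anything. Arguments containing newlines are not handled.

```perl
use strict;
use warnings;

# Step 1 (assumption: MSVC-style parsing in the target program).
sub quote_msvc_arg {
    my ($arg) = @_;
    # Double any run of backslashes before a double quote, then escape the quote.
    $arg =~ s/(\\*)"/$1$1\\"/g;
    # Double trailing backslashes so they do not escape the closing quote.
    $arg =~ s/(\\+)\z/$1$1/;
    return qq{"$arg"};
}

# Steps 2 and 3: join with spaces, then caret-escape for cmd.exe.
sub build_cmd_exe_command {
    my @args = @_;
    my $line = join ' ', map { quote_msvc_arg($_) } @args;
    # Prefix every non-alphanumeric character with a caret.
    $line =~ s/([^A-Za-z0-9])/^$1/g;
    return $line;
}

print build_cmd_exe_command('prog', 'a b'), "\n";
```

Redirections or && (step 4) would be appended to the returned string unescaped, and the result then passed to system() or backticks (step 5).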
Flaring answered 8/3, 2009 at 7:11 Comment(6)
Interesting information - thank you. It doesn't endear Windows to this Unixophile, but it helps to know what happens behind the scenes. (The ref'd page is a bit quiet about the role of caret! It mentions it, but only by exception. It is not clear how it handles ^\ or ^", for example.)Halm
I agree with Jonathan Leffler. That is (in my opinion) an awful way to handle command-line arguments.Overvalue
I totally agree that it's a terrible situation. Though in fairness, most of the terribleness probably arises from MS's laudable devotion to maintaining backwards compatibility. (To see just how obsessive they are, check out Raymond Chen's excellent blog sometime.)Flaring
@Jonathan: To clarify, two levels of encoding are necessary -- the caret is seen only by cmd.exe, which removes it when it passes the command line to the program being run. The rules on that page describe how a MSVC++-compiled program will parse its cmd line (i.e. the 2nd layer of parsing).Flaring
+1, and thanks for the link from my blog to this answer :) I agree that it is terrible, and you have given good resources on tackling the problem. I wonder if there is a tool to auto-escape this stuff that is already aware of the correct escaping logic. That would make things like this less trial-and-error.Stateroom
@MerlynMorgan-Graham: No problem :) I'm afraid I haven't looked for a tool. If it did exist it would be flaky though, because it would need to guess which runtime library the called program was compiled with (MSVC is probably right 95% of the time)...Flaring
L
2
 sub esc_chars {
     # Backslash-escape shell metacharacters in a copy of the argument.
     # Will change, for example, a!!a to a\!\!a
     my ($str) = @_;
     $str =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;
     return $str;
 }

http://www.slac.stanford.edu/slac/www/resource/how-to-use/cgi-rexx/cgi-esc.html

Legalese answered 19/11, 2011 at 2:53 Comment(1)
Does anyone know where this list of characters comes from? I could not find it in the Perl docs. I'd rather not read the source code of Perl's exec function if that is possible.Tancred
C
1

If you use system "$cmd @args" (a string), then you have to escape the arguments because a shell is invoked.

Fortunately, inside a Perl double-quoted string, only four characters need escaping to stop Perl itself from interpreting them:

"    - double quote
$    - dollar
@    - at symbol
\    - backslash
Canto answered 3/11, 2013 at 12:2 Comment(1)
I think OP was talking about shell interpretation, not avoiding accidental interpolation of scalars and arrays.Auditorium
Z
0

The answers to this question were very useful. In the end I followed @runrig's advice, but then used open3() from the core module IPC::Open3 so I could capture the output from STDERR as well as STDOUT.

For sample code of open3() in use with @runrig's solution, see my related question and answer:
Calling system commands from Perl
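
For readers without access to the linked question, here is a rough sketch of the idea rather than the code from that answer. capture_both is a hypothetical helper name. Reading STDOUT to the end before STDERR can deadlock if the child writes a lot to STDERR, so treat this as suitable for small outputs only.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

# capture_both is a hypothetical helper name, not part of IPC::Open3.
# Returns (\@stdout_lines, \@stderr_lines, $exit_code).
sub capture_both {
    my ($cmd, @args) = @_;
    # open3 needs a pre-made glob for STDERR; otherwise it merges
    # the child's STDERR into STDOUT.
    my $err = gensym;
    my $pid = open3(my $in, my $out, $err, $cmd, @args);
    close $in;                      # nothing to send to the child
    my @stdout = <$out>;
    my @stderr = <$err>;
    waitpid($pid, 0);
    return (\@stdout, \@stderr, $? >> 8);
}

my ($o, $e, $code) = capture_both(
    $^X, '-e', 'print "out\n"; print STDERR "err\n"; exit 3',
);
print "stdout: @$o", "stderr: @$e", "exit: $code\n";
```

As with the list form of system, passing the command and its arguments as a list means no shell is involved.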

Zelaya answered 13/12, 2010 at 17:11 Comment(0)
