What is the preferred cross-platform IPC Perl module?

I want to create a simple IO object that represents a pipe opened to another program, so that I can periodically write to that program's STDIN as my app runs. I want it to be bullet-proof (in that it catches all errors) and cross-platform. The best options I can find are:

open

sub io_read {
    local $SIG{__WARN__} = sub { }; # Silence exec warning.
    # '|-' yields a write handle; whatever we print becomes the child's STDIN.
    open my $pipe, '|-', @_ or die "Cannot exec $_[0]: $!\n";
    return $pipe;
}

Advantages:

  • Cross-platform
  • Simple

Disadvantages

  • No $SIG{PIPE} to catch errors from the piped program
  • Are other errors caught?
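Regarding the SIGPIPE gap: here is a minimal sketch of one way to harden the open-based approach, assuming a localized $SIG{PIPE} handler is acceptable (the io_write name and the handler are my own inventions, not from the question):

```perl
# Sketch: turn SIGPIPE and close() failures into exceptions.
# The io_write helper name is an invention for this example.
sub io_write {
    my @command = @_;
    local $SIG{__WARN__} = sub { };    # Silence exec warnings.
    open my $pipe, '|-', @command or die "Cannot exec $command[0]: $!\n";
    return $pipe;
}

# SIGPIPE arrives if the child has already exited; make it fatal.
local $SIG{PIPE} = sub { die "Child closed the pipe\n" };

my $fh = io_write($^X, '-ne', '1');    # Child just consumes its STDIN.
print {$fh} "hello\n" or die "print failed: $!\n";
close $fh
    or die $! ? "Error closing pipe: $!\n"
              : "Child exited with status $?\n";
```

Checking the return value of close is what catches a child that exited nonzero; the SIGPIPE handler catches writes to a child that is already gone.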

IO::Pipe

sub io_read {
    # writer() forks, execs @_, and returns the write end of the pipe.
    IO::Pipe->writer(@_);
}

Advantages:

  • Simple
  • Returns an IO::Handle object for OO interface
  • Supported by the Perl core.

Disadvantages

  • Still No $SIG{PIPE} to catch errors from the piped program
  • Not supported on Win32 (or, at least, its tests are skipped)
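For completeness, a sketch of the write-to-the-child version per the IO::Pipe synopsis (writer() with a command list forks and execs, leaving the parent holding the write end):

```perl
use IO::Pipe;

# Sketch per the IO::Pipe synopsis: writer(@command) forks, execs the
# command with its STDIN read from the pipe, and leaves us the write end.
my $pipe = IO::Pipe->new;
$pipe->writer($^X, '-ne', '1');    # Child consumes its STDIN and exits.
$pipe->print("hello\n") or die "print failed: $!\n";
$pipe->close or die "close failed: $!\n";
```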

IPC::Run

There is no interface for writing to a file handle in IPC::Run, only appending to a scalar. This seems…weird.

IPC::Run3

No file handle interface here, either. I could use a code reference, which would be called repeatedly to spool data to the child, but looking at the source code, it appears that it actually writes to a temporary file, then opens that file and spools its contents to the piped command's STDIN. Wha?
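For reference, the scalar interface in question looks roughly like this (a sketch, assuming IPC::Run3 is installed; note that run3 blocks until the child exits, so it cannot stream to the child as the parent runs, which is the crux of the complaint):

```perl
use IPC::Run3;

# Sketch: run3 is one-shot, not interactive. All of the child's input
# must be ready up front, and the call blocks until the child exits.
my ($out, $err);
run3 [$^X, '-ne', 'print'], \"hello\n", \$out, \$err;
# $out now holds "hello\n"
```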

IPC::Cmd

Still no file handle interface.


What am I missing here? It seems as if this should be a solved problem, and I'm kind of stunned that it's not. IO::Pipe comes closest to what I want, but the lack of $SIG{PIPE} error handling and the lack of support for Windows is distressing. Where is the piping module that will JDWIM?

Car answered 13/5, 2012 at 6:34 Comment(12)
You're mistaken about IPC::Run. It can handle file handles no problem. It can even do crazy redirections and pseudo-ttys. – Tambac
IPC::Run3 can also handle file handles. Where are you getting your info? – Tambac
use sigtrap for the pipe signals. – Andes
@ikegami – From the docs. I see the file handle stuff now in IPC::Run. Not sure how I missed it before, except that I paid more attention to IPC::Run3, where I can pass in a file handle, but printing to it seemed to be ignored after the command was run. – Car
@daxim – sigtrap looks useful, but I really want a module where I don't have to think about that, where errors are just turned into exceptions, like what IPC::System::Simple does for system and backticks. – Car
Er, let me rephrase. I did notice the file handle support in IPC::Run before, and now I remember the issue: It looks like I can open some other file to be READ from, and that will be spooled to the child's STDIN. What I want is a file handle I can WRITE to, as my program runs, and it will be spooled to the child as I write. This is how the open |- file handle works. – Car
Re "What I want is a file handle I can WRITE to": So use a pipe. Like you said you wanted to do. There's even built-in support for creating those pipes for you. It's right there in the synopsis!!! – Tambac
Note that doing IPC with file handles is really hard if you have more than one file handle (select!!). So while IPC::Run does give you the option of doing that, it should be a last resort. The plus of IPC::Run over the others you mention is that it can hide the pipes from you. – Tambac
Oh, look, a mini language for pipes. I completely missed that. (My eyes kind of glazed over reading the synopsis before.) I've started the module that I've been imagining, but will make a more careful reading of IPC::Run before I continue with it (it is not working quite right, yet, anyway…). – Car
@ikegami: So I see how to make use of the scalar ref buffers, which is a little weird, but nice (I can probably abstract it into an OO interface, since I otherwise would need a bunch of attributes in my class to track the harness and the buffers). I do wish it were specifically line-oriented, though, at least for reading the output. That's why I liked file handles: I can use <$fh> or $fh->getline to iterate over lines… – Car
@Theory, No, you can't use <$fh> aka $fh->getline safely if you have more than one pipe. It can lead to a deadlock. Doing IPC with file handles is really hard if you have more than one file handle (select!!). – Tambac
Yeah. The state of the art here is sadly lacking, I've found. I've decided to abandon the use of IPC for this project. Thanks for the help. Oh, and @ikegami, if you want to leave an answer about IPC::Run (the least bad option?), I'd be happy to accept it. Otherwise I will add a bitchfest comment and accept it. – Car

Thanks to guidance from @ikegami, I have found that the best choice for interactively reading from and writing to another process in Perl is IPC::Run. However, it requires that the program you are reading from and writing to have a known output when it is done writing to its STDOUT, such as a prompt. Here's an example that executes bash, has it run ls -l, and then prints that output:

use v5.14;
use IPC::Run qw(start timeout new_appender new_chunker);

my @command = qw(bash);

# Connect to the other program.
my ($in, @out);
my $ipc = start \@command,
    '<' => new_appender("echo __END__\n"), \$in,
    '>' => new_chunker, sub { push @out, @_ },
    timeout(10) or die "Error: $?\n";

# Send it a command and wait until it has received it.
$in .= "ls -l\n";
$ipc->pump while length $in;

# Wait until our end-of-output string appears.
$ipc->pump until @out && $out[-1] =~ /__END__\n/m;

pop @out;
say @out;

Because it is running as an IPC (I assume), bash does not emit a prompt when it is done writing to its STDOUT. So I use the new_appender() function to have it emit something I can match to find the end of the output (by calling echo __END__). I've also used an anonymous subroutine after a call to new_chunker to collect the output into an array, rather than a scalar (just pass a reference to a scalar to '>' if you want that).

So this works, but it sucks for a whole host of reasons, in my opinion:

  • There is no generally useful way to know that an IPC-controlled program is done printing to its STDOUT. Instead, you have to use a regular expression on its output to search for a string that usually means it's done.
  • If it doesn't emit one, you have to trick it into emitting one (as I have done here—god forbid if I should have a file named __END__, though). If I was controlling a database client, I might have to send something like SELECT 'IM OUTTA HERE';. Different applications would require different new_appender hacks.
  • Writing to the magic $in scalar (and having the output collect itself in @out) feels weird and action-at-a-distance-y. I dislike it.
  • One cannot do line-oriented processing on the scalars as one could if they were file handles. They are therefore less efficient.
  • The ability to use new_chunker to get line-oriented output is nice, if still a bit weird. That regains a bit of the efficiency on reading output from a program, though, assuming it is buffered efficiently by IPC::Run.

I now realize that, although the interface for IPC::Run could potentially be a bit nicer, the weaknesses of the IPC model itself make it tricky to deal with at all. There is no generally useful IPC interface, because one has to know too much about the specifics of the particular program being run to get it to work. This is okay, maybe, if you know exactly how it will react to inputs, can reliably recognize when it is done emitting output, and don't need to worry much about cross-platform compatibility. But that was far from sufficient for my need: a generally useful way to interact with various database command-line clients in a CPAN module that could be distributed to a whole host of operating systems.

In the end, thanks to packaging suggestions in comments on a blog post, I decided to abandon the use of IPC for controlling those clients and to use the DBI instead. It provides an excellent API; it is robust, stable, and mature; and it suffers none of the drawbacks of IPC.
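For illustration, the shape of the DBI alternative looks like this (a sketch assuming DBD::SQLite is installed; the table name and values are made up for the example):

```perl
use DBI;

# Sketch of the DBI approach (assumes DBD::SQLite is installed): no
# child process, no pipes, and errors become exceptions via RaiseError.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE demo (n INTEGER)');
$dbh->do('INSERT INTO demo (n) VALUES (?)', undef, 42);
my ($n) = $dbh->selectrow_array('SELECT n FROM demo');
$dbh->disconnect;
```

The same code talks to Postgres or MySQL by swapping the DSN, which is exactly the cross-platform, cross-client property the IPC approach lacked.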

My recommendation for those who come after me is this:

  • If you just need to execute another program and wait for it to finish, or collect its output when it is done running, use IPC::System::Simple.
  • If you need to interact with another program while your app runs, use an API whenever possible. If that's not possible, use something like IPC::Run, try to make the best of it, and be prepared to give up quite a bit of your time to get it "just right."
Car answered 22/5, 2012 at 5:7 Comment(0)

I've done something similar to this, although it depends on the parent program and what you are trying to pipe. My solution was to spawn a child process (leaving $SIG{PIPE} to function) and write its output to the log, or handle the error in whatever way you see fit. I use POSIX to handle my child process and am able to use all the functionality of the parent. However, if you're trying to have the child communicate back to the parent, then things get difficult. Do you have an example of the main program and what you're trying to pipe?

Paletot answered 13/5, 2012 at 7:1 Comment(1)
Mainly I want to know if the child exits, and have that death propagate to the parent, so I can then handle that situation. The main program is Sqitch, and the child apps will be psql, mysql, sqlite3, and the like.Car

It is possible to redirect standard output and standard error from an (almost) arbitrary number of child processes and read them asynchronously, triggered by select(). The child processes do not have to cooperate, as they do in the accepted answer, although it helps if they use line-buffered or unbuffered I/O.

For POSIX/Un*x you just use the regular combination of pipe(), fork(), POSIX::dup(), exec(), and select(), or one of the higher-level modules mentioned above, such as IPC::Open3 or IPC::Run.

None of that works under Windows because the Perl emulation of fork() and exec() is not sufficient for that task, and select() only works on sockets.

What you have to do for Windows is:

  1. Create a socketpair() (with the socketpair() emulation of Perl).
  2. Enable non-blocking I/O on the read end of the socket pair with $true = 1; ioctl $child_stdout, 0x8004667e, \$true.
  3. dup() STDOUT and save the duplicate descriptor.
  4. dup() the write end of the socketpair replacing STDOUT.
  5. Create the child process with Win32::Process::Create(), not with fork() and exec().
  6. Restore STDOUT.
  7. Wait for the child to write to (and flush) standard output with select(), then sysread() the data from the read end of the socket pair.

The procedure for STDERR is identical. For STDIN you have to swap the read and write ends.

The constant 0x8004667e is called FIONBIO in the Windows header files but not available in Perl.
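The portable half of the recipe (the socket pair of step 1, the non-blocking read end of step 2, and the select()/sysread() loop of step 7) can be sketched like this; here a fork()ed child stands in for Win32::Process::Create, since steps 3 to 6 are Windows-specific:

```perl
use Socket;
use IO::Handle;
use IO::Select;

# Step 1: a socket pair instead of a pipe, so select() works on Windows.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Step 2: non-blocking read end. On Windows this would be the FIONBIO
# ioctl:  my $true = 1; ioctl $reader, 0x8004667e, \$true;
$reader->blocking(0);

# A fork()ed child stands in for Win32::Process::Create (steps 3-6).
defined(my $pid = fork) or die "fork: $!";
if ($pid == 0) {
    close $reader;
    print {$writer} "ready\n";
    close $writer;
    exit 0;
}
close $writer;    # Parent keeps only the read end, so EOF is detectable.

# Step 7: wait for output with select(), then sysread() it.
my $buf = '';
my $sel = IO::Select->new($reader);
while ($sel->can_read(10)) {
    my $n = sysread $reader, $buf, 4096, length $buf;
    last unless $n;    # 0 on EOF, undef on a would-block error.
}
waitpid $pid, 0;
# $buf now contains "ready\n"
```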

I have written a blog post with more in-depth information: http://www.guido-flohr.net/platform-independent-asynchronous-child-process-ipc/. A git repository with working code in Perl and C can be found here: https://github.com/gflohr/platform-independent-ipc.

Tristantristas answered 4/9, 2023 at 18:20 Comment(0)