The code between start and finish runs in a separate process, and the child and parent cannot write to each other's variables (even ones with the same name). Forking creates an independent process with its own memory and data.† To pass data between these processes we need an "Inter-Process Communication" (IPC) mechanism.
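To illustrate, here is a minimal sketch of one such IPC mechanism using only a plain fork and a pipe (no modules): the child writes its result into the pipe and the parent reads it back. This is just to show what the module does for you under the hood, not how Parallel::ForkManager itself works.

    use warnings;
    use strict;

    # Open a pipe before forking; both processes then hold an end of it.
    pipe my $reader, my $writer or die "Can't open pipe: $!";

    my $pid = fork // die "Can't fork: $!";

    if ($pid == 0) {                 # child
        close $reader;
        print $writer "result from child\n";
        close $writer;
        exit 0;
    }

    close $writer;                   # parent
    my $line = <$reader>;
    close $reader;
    waitpid $pid, 0;

    print "Parent received: $line";  # prints "Parent received: result from child"
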
This module does provide a ready and simple way to pass data back from a child to the parent; see Retrieving data structures from child processes in the docs.
You first need to supply to finish a reference to the data structure that the child wants to return. In your case you want to return the scalar $commandoutput[0], so do
$fork->finish(0, \$commandoutput[0]);
This reference is then available in the callback as the last (sixth) parameter, the one your code left out. So in the callback you need
my %ret_data;  # to store data from different child processes

$pm->run_on_finish(
    sub {
        my ($pid, $exit, $ident, $signal, $core, $dataref) = @_;
        $ret_data{$pid} = $dataref;
    }
);
Here $dataref is \$commandoutput[0], which is stored in %ret_data as the value for the key which is the process id. So after the foreach completes you can find all data in %ret_data
foreach my $pid (keys %ret_data) {
    say "Data from $pid => ${$ret_data{$pid}}";
}
Here we dereference $ret_data{$pid} as a scalar reference, since your code returns that.
Note that the data is passed by serializing it and writing it out to files, and that can be slow if a lot of data is being returned, or often.
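If that file overhead matters, the constructor's optional second argument lets you choose the directory those temporary files go in, so you can point it at a fast filesystem. The path /dev/shm below is an assumption (a RAM-backed tmpfs on many Linux systems); substitute whatever fast, writable directory your system provides.

    use warnings;
    use strict;
    use Parallel::ForkManager;

    # Serialized child data is written under the given directory;
    # /dev/shm is an assumption, any writable directory works.
    my $pm = Parallel::ForkManager->new(4, '/dev/shm');
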
Here is a full example, where each child returns an array reference by passing it to finish, which is then retrieved in the callback. For a different example see this post.
use warnings;
use strict;
use feature 'say';

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);

my %ret_data;

$pm->run_on_finish( sub {
    my ($pid, $exit, $ident, $signal, $core, $dataref) = @_;
    $ret_data{$pid} = $dataref;
});

foreach my $i (1..8) {
    $pm->start and next;
    my $ref = run_job($i);
    $pm->finish(0, $ref);
}
$pm->wait_all_children;

foreach my $pid (keys %ret_data) {
    say "$pid returned: @{$ret_data{$pid}}";
}

sub run_job {
    my ($i) = @_;
    return [ 1..$i ];  # make up return data: arrayref with list 1..$i
}
Prints
15037 returned: 1 2 3 4 5 6 7
15031 returned: 1 2
15033 returned: 1 2 3 4
15036 returned: 1 2 3 4 5 6
15035 returned: 1 2 3 4 5
15038 returned: 1 2 3 4 5 6 7 8
15032 returned: 1 2 3
15030 returned: 1
† On modern systems, as little data as possible is copied when a new process is forked, for performance reasons. So variables that a child "inherits" by forking aren't actually copies, and thus the child does in fact read the parent's variables that existed when it was forked.
However, any data that the child writes in memory is inaccessible to the parent (and what the parent writes after forking is unknown to the child). If that data is written to a variable "inherited" from the parent at forking, then a copy is made so that the child's new data is independent.
There are certainly subtleties and complexities in how data is managed, with apparently a number of pointers maintained even as data changes in the child. I'd guess that this is mostly to simplify data management, and to reduce copying; there appears to be far finer granularity in data management than at a "variable" level.
But these are implementation details and in general child and parent can't poke at each other's data.
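This isolation is easy to demonstrate with a plain fork (a small sketch, no modules assumed): the child can read the inherited variable, but its write never reaches the parent.

    use warnings;
    use strict;

    my $x = 'set by parent';

    my $pid = fork // die "Can't fork: $!";

    if ($pid == 0) {
        print "child sees: $x\n";    # reads the inherited value
        $x = 'set by child';         # triggers a copy; parent never sees this
        exit 0;
    }

    waitpid $pid, 0;
    print "parent still has: $x\n";  # prints "parent still has: set by parent"
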