How can I hook into Perl's print?
Asked Answered
F

5

21

Here's a scenario. You have a large amount of legacy scripts, all using a common library. Said scripts use the 'print' statement for diagnostic output. No changes are allowed to the scripts - they range far and wide, have their approvals, and have long since left the fruitful valleys of oversight and control.

Now a new need has arrived: logging must now be added to the library. This must be done automatically and transparently, without users of the standard library needing to change their scripts. Common library methods can simply have logging calls added to them; that's the easy part. The hard part lies in the fact that diagnostic output from these scripts were always displayed using the 'print' statement. This diagnostic output must be stored, but just as importantly, processed.

As an example of this processing, the library should only record the printed lines that contain the words 'warning', 'error', 'notice', or 'attention'. The below Extremely Trivial and Contrived Example Code (tm) would record some of said output:

sub CheckPrintOutput
{
    my @output = @_; # args passed to print eventually find their way here.
    foreach my $value (@output) {
         Log->log($value) if $value =~ /warning|error|notice|attention/i;
    }
}

(I'd like to avoid such issues as 'what should actually be logged', 'print shouldn't be used for diagnostics', 'perl sucks', or 'this example has the flaws x y and z'...this is greatly simplified for brevity and clarity. )

The basic problem comes down to capturing and processing data passed to print (or any perl builtin, along those lines of reasoning). Is it possible? Is there any way to do it cleanly? Are there any logging modules that have hooks to let you do it? Or is it something that should be avoided like the plague, and I should give up on ever capturing and processing the printed output?

Additional: This must run cross-platform - windows and *nix alike. The process of running the scripts must remain the same, as must the output from the script.

Additional additional: An interesting suggestion made in the comments of codelogic's answer:

You can subclass http://perldoc.perl.org/IO/Handle.html and create your own file handle which will do the logging work. – Kamil Kisiel

This might do it, with two caveats:

1) I'd need a way to export this functionality to anyone who uses the common library. It would have to apply automatically to STDOUT and probably STDERR too.

2) the IO::Handle documentation says that you can't subclass it, and my attempts so far have been fruitless. Is there anything special needed to make sublclassing IO::Handle work? The standard 'use base 'IO::Handle' and then overriding the new/print methods seem to do nothing.

Final edit: Looks like IO::Handle is a dead end, but Tie::Handle may do it. Thanks for all the suggestions; they're all really good. I'm going to give the Tie::Handle route a try. If it causes problems I'll be back!

Addendum: Note that after working with this a bit, I found that Tie::Handle will work, if you don't do anything tricky. If you use any of the features of IO::Handle with your tied STDOUT or STDERR, it's basically a crapshoot to get them working reliably - I could not find a way to get the autoflush method of IO::Handle to work on my tied handle. If I enabled autoflush before I tied the handle it would work. If that works for you, the Tie::Handle route may be acceptable.

Forecourt answered 22/12, 2008 at 23:6 Comment(4)
So what are you allowed to change? Command lines? Parameter files? For example, let's say I said "Yeah you can hook print", what's the scope of what you'd be able to do?Grummet
I can change anything in the common library. The user must not need to run his scripts any differently, or have to pass anything new in on the command line. The final stream of data on STDOUT and STDERR must be the same as before.Forecourt
what happens to the original output? can you tail -f it and process from there?Dimer
The original output is processed by other programs. They expect it to remain the same as before. Also, as mentioned below in another comment, we don't want to change the environment, either - so masking 'perl' with another program that does the log processing would be problematic.Forecourt
G
25

There are a number of built-ins that you can override (see perlsub). However, print is one of the built-ins that doesn't work this way. The difficulties of overriding print are detailed at this perlmonk's thread.

However, you can

  1. Create a package
  2. Tie a handle
  3. Select this handle.

Now, a couple of people have given the basic framework, but it works out kind of like this:

package IO::Override;
use base qw<Tie::Handle>;
use Symbol qw<geniosym>;

sub TIEHANDLE { return bless geniosym, __PACKAGE__ }

sub PRINT { 
    shift;
    # You can do pretty much anything you want here. 
    # And it's printing to what was STDOUT at the start.
    # 
    print $OLD_STDOUT join( '', 'NOTICE: ', @_ );
}

tie *PRINTOUT, 'IO::Override';
our $OLD_STDOUT = select( *PRINTOUT );

You can override printf in the same manner:

sub PRINTF { 
    shift;
    # You can do pretty much anything you want here. 
    # And it's printing to what was STDOUT at the start.
    # 
    my $format = shift;
    print $OLD_STDOUT join( '', 'NOTICE: ', sprintf( $format, @_ ));
}

See Tie::Handle for what all you can override of STDOUT's behavior.

Grummet answered 23/12, 2008 at 4:57 Comment(6)
Oh, this is looking good. Does this change print's behavior if they use print to write to file handles, or just to STDOUT?Forecourt
Looking at it more. I see - it's tied to a single file handle at a time, right? I will be giving this a whirl shortly and report back.Forecourt
The "file handle" is just a means. Once we've got the flow directed to where we can specify behavior, we can write out to as many real handles as we please, filter it as we want--even make database calls, if that's what we wanted to do.Grummet
Fantastic, this is exactly what I was hoping for : only affecting STDOUT (and probably STDERR too), transparent to the outside. Thanks!Forecourt
What about re-assigning the STDOUT glob instead of select()'ing it? We've stored it in the module...for all intents and purposes our new tied handle should be STDOUT. It seems kind-of out there, but not beyond workability. Do you know of any major problems with doing something like that?Forecourt
"We've stored it in the module" it == STDOUT.Forecourt
B
9

You can use Perl's select to redirect STDOUT.

open my $fh, ">log.txt";
print "test1\n";
my $current_fh = select $fh;
print "test2\n";
select $current_fh;
print "test3\n";

The file handle could be anything, even a pipe to another process that post processes your log messages.

PerlIO::tee in the PerlIO::Util module seems to allows you to 'tee' the output of a file handle to multiple destinations (e.g. log processor and STDOUT).

Boutwell answered 22/12, 2008 at 23:29 Comment(9)
select is all well and good, but we need to process that data, too. Is there a file handle type that gives hooks to evaluate and do something with the data passed to it?Forecourt
Also, we cannot modify the current behavior of these scripts, so the output must remain on STDOUT as well.Forecourt
You can open a filehandle as a pipe to another process. Do the logging stuff in there, then you can print to stdout from inside it.Emendate
You can have the piped process redirect its STDIN to STDOUT in addition to processing it, that way current behavior will not be affected. I'm not aware of any modules that abstract this functionality.Boutwell
Interesting idea... would that be with something like fork()? There's another requirement too...this must run across platforms: windows and *nix alike. As I understand it, fork doesn't really work in windows. Or are you talking about a different piping mechanism?Forecourt
A pipe is not a fork. It creates a new process and provides you with a handle to its STDIN. Read more about it @ troubleshooters.com/codecorn/littperl/perlfile.htm#PipingBoutwell
You can subclass perldoc.perl.org/IO/Handle.html and create your own file handle which will do the logging work.Apple
re: codelogic-I meant would you create the pipe and then start the other process with fork. If not fork, what would you do to start the process? qx? re: Kamil: Interesting. This would then require that they include that subclass of IO::Handle, right? Could it be re-exported automatically?Forecourt
@Robert: No, you don't have to manually fork. Perl takes care of creating the other process and establishing the connection between the file handle and the other process' STDIN. It's non-blocking.Boutwell
F
7

Lots of choices. Use select() to change the filehandle that print defaults to. Or tie STDOUT. Or reopen it. Or apply an IO layer to it.

Freidafreight answered 23/12, 2008 at 1:57 Comment(0)
G
3

This isn't the answer to your issue but you should be able to adopt the logic for your own use. If not, maybe someone else will find it useful.

Catching malformed headers before they happen...

package PsychicSTDOUT;
use strict;

my $c = 0;
my $malformed_header = 0;
open(TRUE_STDOUT, '>', '/dev/stdout');
tie *STDOUT, __PACKAGE__, (*STDOUT);

sub TIEHANDLE {
    my $class = shift;
    my $handles = [@_];
    bless $handles, $class;
    return $handles;
}

sub PRINT {
    my $class = shift;
    if (!$c++ && @_[0] !~ /^content-type/i) {
        my (undef, $file, $line) = caller;
        print STDERR "Missing content-type in $file at line $line!!\n";
        $malformed_header = 1;
    }
    return 0 if ($malformed_header);
    return print TRUE_STDOUT @_;
}
1;

usage:

use PsychicSTDOUT;
print "content-type: text/html\n\n"; #try commenting out this line
print "<html>\n";
print "</html>\n";
Gilt answered 8/6, 2011 at 15:51 Comment(0)
D
-1

You could run the script from a wrapper script that captures the original script's stdout and writes the output somewhere sensible.

Dedradedric answered 22/12, 2008 at 23:37 Comment(4)
Unfortunately, this violates the 'script must run the same as before' requirement. The user must not have to do anything different to get this logging information.Forecourt
True enough from a general implementation standpoint; a wrapper script is one possible solution. However, asking the users to run another script just to run their script not a solution that will work in this scenario - the command line, perl distro, and console output must remain the same.Forecourt
And it must ONLY affect scripts that use this particular library. If I changed what the shell thought 'perl' was, it would affect other scripts that have no need for the common library. That too, is also not an acceptable solution for this application.Forecourt
But depending on yours, it might be. :)Forecourt

© 2022 - 2024 — McMap. All rights reserved.