PHP: trying to get fgets() to trigger both on CRLF, CR and LF

I'm reading streams in PHP, using proc_open and fgets($stdout), trying to get every line as it comes in.

Many Linux programs (package managers, wget, rsync) use a lone CR (carriage return) for lines that periodically update "in place", such as download progress. I'd like to catch these updates (as separate lines) as soon as they happen.

At the moment, fgets($stdout) keeps reading until it sees an LF, so when progress is slow (a big file, for example) it blocks until the transfer is completely done and then returns all the updates as one long string, CRs included.

I've tried setting the "mac" option to detect CRs as line endings:

ini_set('auto_detect_line_endings',true); 

But that doesn't seem to work.

Now, stream_get_line would let me set CR as the line break, but it accepts only a single delimiter string, so it's not a "catch-all" solution that treats CRLF, CR and LF alike.
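(For illustration, here's a minimal sketch of that limitation, with php://memory standing in for the real pipe: "\r" splits the CR updates, but a plain LF-terminated line is never split.)

```php
<?php
// Sketch: stream_get_line() accepts only ONE delimiter string, so using
// "\r" catches CR-terminated updates but never matches a plain LF.
// php://memory is a stand-in for the real STDOUT pipe.
$fp = fopen('php://memory', 'r+');
fwrite($fp, "10%\r20%\r30%\ndone\n");
rewind($fp);

$a = stream_get_line($fp, 4096, "\r"); // "10%"
$b = stream_get_line($fp, 4096, "\r"); // "20%"
$c = stream_get_line($fp, 4096, "\r"); // the rest: "30%\ndone\n" - the LF is never treated as a break
fclose($fp);
```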

I could of course read the whole line, split it with PHP's string functions and replace every kind of line break with LF, but it's a stream, and I want PHP to get an indication of progress while the command is still running.

So my question:

How can I read from the STDOUT pipe (from proc_open) until a LF or CR happens, without having to wait until the whole line is in?

Thanks in advance!

Solution:

I used Fleshgrinder's filter class to replace \r with \n in the stream (see accepted answer), and replaced fgets() with fgetc() for more "real-time" access to the contents of STDOUT:

$stdout = $proc->pipe(1);
stream_filter_register("EOL", "EOLStreamFilter");
stream_filter_append($stdout, "EOL"); 

while (($o = fgetc($stdout)) !== false) {
    $out .= $o;           // buffer the characters into a line, until \n
    if ($o == "\n") {     // can now easily wrap the $out lines in JSON
        echo $out;
        $out = '';
    }
}
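For completeness, a self-contained variant of the same loop (an untested sketch: it skips the filter entirely and treats CR and LF alike directly in the fgetc() loop; printf is only a stand-in for a real long-running command like wget):

```php
<?php
// Sketch: read a child's STDOUT byte by byte and emit a "line" whenever
// either CR or LF is seen, so no stream filter is needed.
// printf stands in for a real program such as wget or rsync.
$descriptors = [1 => ['pipe', 'w']];
$proc = proc_open("printf '10%%\\r20%%\\rdone\\n'", $descriptors, $pipes);

$lines = [];
$buf = '';
while (($c = fgetc($pipes[1])) !== false) {
    if ($c === "\r" || $c === "\n") {
        if ($buf !== '') {     // skip the empty line a CRLF pair would create
            $lines[] = $buf;
            $buf = '';
        }
    } else {
        $buf .= $c;
    }
}
if ($buf !== '') { $lines[] = $buf; }

fclose($pipes[1]);
proc_close($proc);
// $lines === ['10%', '20%', 'done']
```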
Scevo answered 12/1, 2015 at 17:50 Comment(2)
Are you sure that this isn't simply I/O caching? Many filesystems don't write to disk on a byte-by-byte basis, but cache the writes until a suitable number of blocks is available to be physically written in one go; likewise in many cases with buffered output. – Sprung
But if I run something like wget server.com/file on the CLI, it does output the updates directly, like how it goes from the screen-wide 1%[==> ] line to 2%[==> ] as soon as it happens. Or is reading from the STDOUT pipe in PHP fundamentally different from what the terminal shows? – Scevo

Use a stream filter to normalize your new line characters before consuming the stream. I created the following code that should do the trick based on the example from PHP’s manual page on stream_filter_register.

Code is untested!

<?php

// https://php.net/php-user-filter
final class EOLStreamFilter extends php_user_filter {

    public function filter($in, $out, &$consumed, $closing): int // return type declared for PHP 8.1+ compatibility
    {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $bucket->data = str_replace([ "\r\n", "\r" ], "\n", $bucket->data);
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }

}

stream_filter_register("EOL", "EOLStreamFilter");

// Open stream …

stream_filter_append($yourStreamHandle, "EOL");

// Perform your work with normalized EOLs …

EDIT: The comment Mark Baker posted on your question is right. Most Linux systems line-buffer STDOUT, and it is possible that Apple's do the same. On the other hand, most STDERR streams are unbuffered. You could try redirecting the output of the program to another pipe (e.g. STDERR or any other) and see if you have more luck with that.
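A minimal sketch of that redirection idea (the braces-and-echo command is just a stand-in for a real program): merging the child's STDERR into the STDOUT pipe with 2>&1 lets the existing read loop see unbuffered STDERR output immediately.

```php
<?php
// Sketch: merge the child's STDERR into the STDOUT pipe at the shell level.
// Many tools (wget included) write progress to STDERR, which is typically
// unbuffered. The echo commands are only stand-ins for a real program.
$proc = proc_open(
    '{ echo to-stdout; echo to-stderr 1>&2; } 2>&1',
    [1 => ['pipe', 'w']],
    $pipes
);
$merged = stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($proc);
// $merged now contains both lines, read from the single STDOUT pipe
```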

Raman answered 12/1, 2015 at 18:17 Comment(4)
Just tested your class extension, and it works for replacing CRs with LFs! Great trick to know :) Line buffering of STDOUT still seems to be a problem, so I'm going to try piping it elsewhere to see what happens. But at least the output now shows up with line breaks in HTML <pre> tags (I'm using AJAX progress events with PHP buffering turned off), so the filtering works. N.B. Weirdly enough, the function is spelled "stream_bucket_make_writeable", not "stream_bucket_make_writable", in contrast to language style guides and the other PHP functions that use the word "writable". – Scevo
Good to know, let me know if redirecting helps with buffering. XD Typical for PHP! – Raman
Found the solution! After some testing, it seems STDOUT buffering is not an issue; it spits output directly into PHP. The filter alone is only a partial solution: it replaces \r, but still only after the whole line has been read. Then I encountered the fgetc() function, which reads the stream byte by byte instead of line by line. Your filter class still wonderfully sanitizes the stream, converting \r into \n. This way I can implement the line buffering myself: just append characters to a string until a newline appears, wrap the line in JSON (for metadata) and echo! I'll add the solution to my question after some sleep. – Scevo
And thanks a lot, I learned so much from your comment, it really pushed me in the right direction! – Scevo

© 2022 - 2024 — McMap. All rights reserved.