What is a bucket brigade?

I would really love to implement a php_user_filter::filter(). But for that I need to know what a bucket brigade is. It seems to be a resource that I can operate on with the stream_bucket_* functions. But the documentation is not really helpful. The best I could find are the examples in stream_filter_register().

I'm especially curious what stream_bucket_new() and stream_bucket_make_writeable() can do.


Update: It seems that PHP is exposing an internal data structure of Apache.

Lapillus answered 24/11, 2014 at 11:0 Comment(4)
I am going to guess they are just chunks of octets. Or handles to chunks of octets. - Dennet
Any update on this? I find the lack of documentation disturbing. I wasn't able to find a good explanation of what's really going on. Most articles/tutorials merely scratch the surface and are mostly a list of steps without looking behind the curtain. Examples: etutorials.org/Server+Administration/upgrading+php+5/… & codediesel.com/php/creating-custom-stream-filters - Concerted
useful? Here comes the bucket brigade... - Chapter
Finally I was able to create a living example for such a php_user_filter: TokenBucketFilter. - Lapillus

Ah, welcome to the least documented parts of the PHP manual! [I opened a bug report about it; maybe this answer will be helpful for documenting it: https://bugs.php.net/bug.php?id=69966]

The bucket brigade

To start with your initial question: "bucket brigade" is just the name of the resource type, userfilter.bucket brigade.

You are passed two different brigades as the first and second parameters of php_user_filter::filter(). The first brigade holds the input buckets you read from; the second brigade is initially empty, and you write to it.

Regarding your update about the data structure… It's basically just a doubly linked list of strings. But it may well be that the name was borrowed from Apache ;-)

stream_bucket_prepend() / stream_bucket_append()

stream_bucket_prepend(resource $brigade, stdClass $bucket): void
stream_bucket_append(resource $brigade, stdClass $bucket): void

The expected $brigade is the output brigade, i.e. the second parameter of php_user_filter::filter().

The $bucket is a stdClass object as returned by stream_bucket_make_writeable() or stream_bucket_new().

These two functions just prepend or append the passed bucket to the brigade.

stream_bucket_new()

To demystify this function, first look at its function signature:

stream_bucket_new(resource $stream, string $buffer): stdClass

First argument is the $stream you're writing this bucket to. Second is the $buffer this new bucket will contain.

[I'd like to note here that the $stream parameter actually is not very significant; it's just used to check whether we need to allocate memory persistently so that it survives across requests. I suspect that you can make PHP nicely segfault by passing a persistent stream in here when operating on a non-persistent filter...]

This creates a userfilter.bucket resource, which is assigned to a property named bucket on a (stdClass) object. That object also has two other properties, data and datalen, which contain the buffer and the buffer size of this bucket.

It returns a stdClass object which you can pass to stream_bucket_prepend() and stream_bucket_append().
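
As a minimal sketch of the object described above, the following creates a bucket outside of any filter and inspects its data and datalen properties. Using a php://memory stream as the $stream argument is an assumption made purely for illustration; inside a real filter you would pass $this->stream.

```php
<?php
// Any open stream will do for this illustration; php://memory is an assumption.
$stream = fopen("php://memory", "w+");

// Create a bucket holding a five-byte buffer.
$bucket = stream_bucket_new($stream, "hello");

var_dump($bucket->data);    // the buffer
var_dump($bucket->datalen); // the buffer size in bytes
```

The returned object is what stream_bucket_append() and stream_bucket_prepend() expect as their second argument.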

stream_bucket_make_writeable()

stream_bucket_make_writeable(resource $brigade): stdClass|null

It shifts the first bucket off the $brigade and returns it. If the $brigade is empty, it returns null.

Further notes

When php_user_filter::filter() is called, the $stream property on the object filter() is called on will be set to the stream we're currently working on. That's also the stream you need to pass to stream_bucket_new() when calling it. (The $stream property will be unset again after the call. You can't reuse it in e.g. php_user_filter::onClose()).

Also note that although the bucket carries a datalen property, you do not need to update it when you change the data property before passing the bucket to stream_bucket_prepend() or stream_bucket_append().
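
To illustrate the datalen point, here is a hedged sketch of a filter that changes the length of each bucket's data without touching datalen. The filter name crlf_sketch, the class crlf_sketch_filter and the php://memory stream are made up for this example and do not come from the PHP manual.

```php
<?php
// Hypothetical filter: converts "\n" to "\r\n", so the data grows.
class crlf_sketch_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            // Count the input bytes before the data is modified.
            $consumed += $bucket->datalen;
            // Only data is changed; datalen is deliberately left alone.
            $bucket->data = str_replace("\n", "\r\n", $bucket->data);
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("crlf_sketch", "crlf_sketch_filter");

// Attach the filter to the write side of an in-memory stream (an assumption).
$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "crlf_sketch", STREAM_FILTER_WRITE);
fwrite($fp, "a\nb\n");

// Read back what actually reached the stream.
rewind($fp);
$result = stream_get_contents($fp);
var_dump($result);
```

The full, longer string reaches the stream even though datalen was never updated.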

The implementation expects you to read all the data from the $in brigade before returning (it will throw a warning otherwise).

There is another case of the documentation lying to us: in php_user_filter::onCreate(), the $stream property is not set. It will only be set during the filter() method call.

Generally, don't use filters with non-blocking streams. I tried that once and it went horribly wrong … And it's not likely that's ever going to be fixed...

Sum up (examples)

Let's start with the simplest case: writing back what we got in.

class simple_filter extends php_user_filter {
    function filter($in, $out, &$consumed, $closing) {
        // Pull each bucket off the input brigade until it is empty
        while ($bucket = stream_bucket_make_writeable($in)) {
            // Report how many bytes of input we have handled
            $consumed += $bucket->datalen;
            // Pass the bucket through unchanged
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("simple", "simple_filter");

All that happens here is taking buckets from the $in bucket brigade and putting them into the $out bucket brigade.

Okay, now try to manipulate our input.

class reverse_filter extends php_user_filter {
    function filter($in, $out, &$consumed, $closing) {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            // Reverse the contents of this bucket
            $bucket->data = strrev($bucket->data);
            // Put the bucket at the front of the output brigade
            stream_bucket_prepend($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("reverse", "reverse_filter");

Now we have registered the reverse filter, which reverses your string (each write is reversed on its own here; write order is still preserved). So here we manipulate the bucket data and prepend the bucket to the output brigade.
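
For completeness, here is a hedged sketch of how such a registered filter might be attached to a stream and exercised. The names reverse_sketch and reverse_sketch_filter, and the use of a php://memory stream, are assumptions chosen to keep the example self-contained.

```php
<?php
// Hypothetical copy of the reverse filter under a different name.
class reverse_sketch_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            $bucket->data = strrev($bucket->data);
            stream_bucket_prepend($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("reverse_sketch", "reverse_sketch_filter");

// Attach the filter to the write side of an in-memory stream.
$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "reverse_sketch", STREAM_FILTER_WRITE);

// Two separate writes: each one is reversed on its own.
fwrite($fp, "abc");
fwrite($fp, "def");

rewind($fp);
$result = stream_get_contents($fp);
var_dump($result);
```

Each fwrite() call passes through the filter separately, which is why the two writes keep their relative order.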

Now, what's the use case for stream_bucket_new()? Usually you can just append to $bucket->data; you can even concatenate all the data into the first bucket. But when flushing, the input brigade may be empty while you still want to send a last bucket; that is when you need it.

class append_filter extends php_user_filter {
    public $stream;

    function filter($in, $out, &$consumed, $closing) {
        // Pass all incoming buckets through unchanged
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        // always append a terminating \n
        if ($closing) {
            // The input brigade may be empty here, so create a fresh bucket
            $bucket = stream_bucket_new($this->stream, "\n");
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("append", "append_filter");

With that (and the existing documentation about php_user_filter class), one should be able to do all sorts of magic userland stream filtering by combining all these powerful possibilities into even stronger code.

Prophylaxis answered 30/6, 2015 at 7:56 Comment(10)
I really appreciate the effort you have put into this answer, but to me it still does not explain what a bucket brigade exactly is and why there's a need for this approach in dealing with (filtering) streams. OP's link about Apache brigades reads convincing, but I was wondering if you could confirm that that is indeed why this "bucket brigading" approach is used. - Wonacott
As said, the bucket brigade is a resource. That resource is under the hood an ordered list of buckets/strings (which you can pull from / add to via these functions). I can't confirm whether that is why it was used; you'd have to ask the author of the stream filter implementation, Wez Furlong. I can only guess, so I have no more idea than you. Also, I wonder why you want to know exactly that? I'm not sure how that helps you with implementing filters... - Prophylaxis
Well, going by the link about Apache brigades, it appears as though one can't/shouldn't write more data to a bucket than its initial length and one should create a new bucket if more data is needed to be put into the output stream than was inside the input stream. If that is the whole point of this "bucket brigading" approach, then that is rather valuable information, no? - Wonacott
I haven't looked that closely at Apache bucket brigades; at least the implementation (lxr.php.net/xref/PHP_5_6/ext/standard/user_filters.c#470) directly copies the full manipulated string we have back into the bucket. I think I've indirectly hinted in my answer that it's possible ["you even can concatenate all the data into the first bucket"]. It's maybe the point of bucket brigading when talking about PHP extensions, which can then work more efficiently if they don't have to do additional allocations; but php_user_filter is just directly exposing the internal API in a safe way. - Prophylaxis
@DecentDabbler I'd like to add that the internal API very much looks like what's described on that Apache site. Even some function names etc. are highly similar. But these functions are all just exposed to extensions and not to userland PHP. The php_user_filter very much is a (safe) rip-off of the internal implementation on top of the internal implementation. So, yes, I think ultimately it's safe to say that it was derived from Apache brigades. - Prophylaxis
Thank you for the great answer! I found that in PHP 7 the property append_filter::$stream is not set and the example is not working. In PHP 5.3 it does work. Any idea on how to do this with PHP 7? - Trifle
@Trifle That's a bug in PHP. Can you please report it to bugs.php.net? I'll look at fixing it (missing IS_INDIRECT handling). - Prophylaxis
Could someone please explain a bit about the parameter $consumed? What do I need to set it to if I put more or less data in the $out bucket than taken from the $in bucket? In my tests with PHP 5.3 I always got the same (correct) result, no matter what I set $consumed to, or whether I set it at all. - Trifle
@Trifle TBH I am not sure there, sorry :-/ Maybe ask a new question here on SO, perhaps someone knows the answer. - Prophylaxis
This should definitely go into the official documentation of the PHP manual. Thanks! - Selfpossessed

I thought I would contribute some background info.

First, the terms buckets and brigades. Turns out there is a thing called a bucket brigade... sort of a tag team effort for fighting fires... where you have a chain of people who stand still but pass buckets of water to the person next to them, producing a constant flow of buckets full of water.

Also, as pointed out above, PHP's adoption of buckets and brigades perhaps comes from Apache's [Buckets and Brigades](http://www.apachetutor.org/dev/brigades), where a great explanation of the methodology and reasoning is given.

But essentially the idea is: if you need to modify some content before it is sent, doing it mid-stream has many benefits, especially when you model your streams using buckets and brigades.

Bascomb answered 7/7, 2020 at 3:55 Comment(0)
