How do I apply a limit to the number of bytes read by futures::Stream::concat2?
Asked Answered
N

1

2

An answer to How do I read the entire body of a Tokio-based Hyper request? suggests:

you may wish to establish some kind of cap on the number of bytes read [when using futures::Stream::concat2]

How can I actually achieve this? For example, here's some code that mimics a malicious user who is sending my service an infinite amount of data:

extern crate futures; // 0.1.25

use futures::{prelude::*, stream};

fn some_bytes() -> impl Stream<Item = Vec<u8>, Error = ()> {
    stream::repeat(b"0123456789ABCDEF".to_vec())
}

fn limited() -> impl Future<Item = Vec<u8>, Error = ()> {
    some_bytes().concat2()
}

fn main() {
    let v = limited().wait().unwrap();
    println!("{}", v.len());
}
Nupercaine answered 4/11, 2018 at 15:46 Comment(0)
N
5

One solution is to create a stream combinator that ends the stream once some threshold of bytes has passed. Here's one possible implementation:

struct TakeBytes<S> {
    inner: S,
    seen: usize,
    limit: usize,
}

impl<S> Stream for TakeBytes<S>
where
    S: Stream<Item = Vec<u8>>,
{
    type Item = Vec<u8>;
    type Error = S::Error;

    fn poll(&mut self) -> Poll<Option<Self::Item>, Self::Error> {
        if self.seen >= self.limit {
            return Ok(Async::Ready(None)); // Stream is over
        }

        let inner = self.inner.poll();
        if let Ok(Async::Ready(Some(ref v))) = inner {
            self.seen += v.len();
        }
        inner
    }
}

trait TakeBytesExt: Sized {
    fn take_bytes(self, limit: usize) -> TakeBytes<Self>;
}

impl<S> TakeBytesExt for S
where
    S: Stream<Item = Vec<u8>>,
{
    fn take_bytes(self, limit: usize) -> TakeBytes<Self> {
        TakeBytes {
            inner: self,
            limit,
            seen: 0,
        }
    }
}

This can then be chained onto the stream before concat2:

fn limited() -> impl Future<Item = Vec<u8>, Error = ()> {
    some_bytes().take_bytes(999).concat2()
}

This implementation has caveats:

  • it only works for Vec<u8>. You can introduce generics to make it more broadly applicable, of course.
  • it allows for more bytes than the limit to come in, it just stops the stream after that point. Those types of decisions are application-dependent.

Another thing to keep in mind is that you want to attempt to tackle this problem as low as you can — if the source of the data has already allocated a gigabyte of memory, placing a limit won't help as much.

Nupercaine answered 4/11, 2018 at 15:46 Comment(3)
Thank you, @Shepmaster! Can you show how to use this with a hyper::Body? When I call body.some_bytes(), I get "no method named some_bytes found for type hyper::Body in the current scope"Timberland
@PetrusTheron some_bytes represents the act of getting a Stream in some manner. hyper's Body is already a Stream.Nupercaine
Then how do I limit the amount of bytes that the client can POST to prevent memory overruns?Timberland

© 2022 - 2024 — McMap. All rights reserved.