WebAPI StreamContent vs PushStreamContent

I'm implementing an MVC4 + WebAPI version of the BluImp jQuery File Upload. All works well with my initial attempt, but I'm trying to ensure the best use of memory whilst downloading very large files (~2GB).

I've read Filip Woj's article on PushStreamContent and implemented it as best I can (removing the async parts - perhaps this is the problem?). When I run tests and watch Task Manager I'm not seeing much difference in memory usage, and I'm trying to understand the difference between how the two responses are handled.

Here's my StreamContent version:

private HttpResponseMessage DownloadContentNonChunked()
{
    var filename = HttpContext.Current.Request["f"];
    var filePath = _storageRoot + filename;
    if (File.Exists(filePath))
    {
        HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
        response.Content = new StreamContent(new FileStream(filePath, FileMode.Open, FileAccess.Read));
        response.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        response.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
        {
            FileName = filename
        };
        return response;
    }
    return ControllerContext.Request.CreateErrorResponse(HttpStatusCode.NotFound, "");
}

And here's my PushStreamContent version:

public class FileDownloadStream
{
    private readonly string _filename;

    public FileDownloadStream(string filePath)
    {
        _filename = filePath;
    }

    public void WriteToStream(Stream outputStream, HttpContent content, TransportContext context)
    {
        try
        {
            var buffer = new byte[4096];

            using (var video = File.Open(_filename, FileMode.Open, FileAccess.Read))
            {
                int bytesRead;

                // Read until end of file. Casting video.Length to int would overflow
                // for files larger than int.MaxValue bytes (just under 2 GB).
                while ((bytesRead = video.Read(buffer, 0, buffer.Length)) > 0)
                {
                    outputStream.Write(buffer, 0, bytesRead);
                }
            }
        }
        catch (HttpException)
        {
            // Typically raised when the client disconnects mid-download; nothing more to send.
        }
        finally
        {
            // Closing the stream signals that writing is complete.
            outputStream.Close();
        }
    }
}

private HttpResponseMessage DownloadContentChunked()
{
    var filename = HttpContext.Current.Request["f"];
    var filePath = _storageRoot + filename;
    if (File.Exists(filePath))
    {
        var fileDownload = new FileDownloadStream(filePath);
        var response = Request.CreateResponse();
        response.Content = new PushStreamContent(fileDownload.WriteToStream, new MediaTypeHeaderValue("application/octet-stream"));
        response.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
        {
            FileName = filename
        };
        return response;
    }
    return ControllerContext.Request.CreateErrorResponse(HttpStatusCode.NotFound, "");
}

My question is: why am I not seeing much difference in memory usage between the two approaches? Additionally, I've downloaded the PDB for the StreamContent type and can see references to buffer sizes and so forth (see below), so I'd like to know exactly what PushStreamContent is doing above and beyond StreamContent. I've checked the type info on MSDN, but the articles were a little light on explanation!

namespace System.Net.Http
{
  /// <summary>
  /// Provides HTTP content based on a stream.
  /// </summary>
  [__DynamicallyInvokable]
  public class StreamContent : HttpContent
  {
    private Stream content;
    private int bufferSize;
    private bool contentConsumed;
    private long start;
    private const int defaultBufferSize = 4096;

    /// <summary>
    /// Creates a new instance of the <see cref="T:System.Net.Http.StreamContent"/> class.
    /// </summary>
    /// <param name="content">The content used to initialize the <see cref="T:System.Net.Http.StreamContent"/>.</param>
    [__DynamicallyInvokable]
    [TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
    public StreamContent(Stream content)
      : this(content, 4096)
    {
    }
Beedon answered 23/4, 2013 at 11:54 Comment(5)
I think these are similar approaches, just with different distribution: "pure" .NET and Web API. - Dominicadominical
They do seem similar indeed. That's really why I'm asking, I guess: with PushStreamContent being so much more verbose, and with them doing a similar thing, I'm left wondering which I should use as best practice in this scenario! - Beedon
From my tests I'm seeing that StreamContent is loading into memory first, then flushing it. I would advise not to use that for big files. Your FileDownloadStream is good. - Islamize
I wonder if StreamContent changed behavior. I think old versions of WebAPI would fully buffer it, forcing people to use PushStreamContent to work around that, while modern ones or, perhaps, when hosted under OWIN simply do not buffer either, making them both equivalent. - Mingy
An international caveat worth mentioning: the locale set with System.Globalization.CultureInfo.CurrentCulture is supposed to flow between execution contexts for >= .NET 4.6, but in this case it does not and needs to be set again from within the lambda. - Glennglenna

Regarding the memory usage of the two approaches: Web API doesn't buffer the response for either StreamContent or PushStreamContent. The following snippet is from WebHostBufferPolicySelector in the Web API source code.

    /// <summary>
    /// Determines whether the host should buffer the <see cref="HttpResponseMessage"/> entity body.
    /// </summary>
    /// <param name="response">The <see cref="HttpResponseMessage"/>response for which to determine
    /// whether host output buffering should be used for the response entity body.</param>
    /// <returns><c>true</c> if buffering should be used; otherwise a streamed response should be used.</returns>
    public virtual bool UseBufferedOutputStream(HttpResponseMessage response)
    {
        if (response == null)
        {
            throw Error.ArgumentNull("response");
        }

        // Any HttpContent that knows its length is presumably already buffered internally.
        HttpContent content = response.Content;
        if (content != null)
        {
            long? contentLength = content.Headers.ContentLength;
            if (contentLength.HasValue && contentLength.Value >= 0)
            {
                return false;
            }

            // Content length is null or -1 (meaning not known).  
            // Buffer any HttpContent except StreamContent and PushStreamContent
            return !(content is StreamContent || content is PushStreamContent);
        }

        return false;
    }
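
The length check in the selector above can be observed from a small console program. This is only a minimal sketch, assuming references to System.Net.Http and the Web API client library (System.Net.Http.Formatting, which provides PushStreamContent); the class name ContentLengthDemo and the 1 KB payload are made up for illustration.

```csharp
using System;
using System.IO;
using System.Net.Http;

// Hypothetical demo class, not part of Web API.
static class ContentLengthDemo
{
    static void Main()
    {
        var payload = new byte[1024];

        // StreamContent over a seekable stream can compute its own length,
        // so the selector returns false at the ContentLength check.
        var pull = new StreamContent(new MemoryStream(payload));
        Console.WriteLine(pull.Headers.ContentLength.HasValue);

        // PushStreamContent cannot know how much the callback will write,
        // so it reports no length and falls through to the type check.
        var push = new PushStreamContent((stream, content, context) =>
        {
            stream.Write(payload, 0, payload.Length);
            stream.Close(); // signals that writing is finished
        });
        Console.WriteLine(push.Headers.ContentLength.HasValue);
    }
}
```

Run as-is, this should print True for the StreamContent and False for the PushStreamContent. Either way the selector returns false, so neither response is buffered by the host.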

Also, PushStreamContent is for scenarios where you need to 'push' data to the stream, whereas StreamContent 'pulls' data from the stream. So, for your current scenario of downloading files, using StreamContent should be fine.
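
If a push-style download is still wanted, the callback does not need its async parts removed. The following is a hedged sketch, assuming the Web API 2 client library, whose PushStreamContent has a constructor taking a Func<Stream, HttpContent, TransportContext, Task>. AsyncPushSketch and CreateFileContent are hypothetical names, and Main only drains the content into a MemoryStream so the sketch can run outside a web host.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
static class AsyncPushSketch
{
    static HttpContent CreateFileContent(string filePath)
    {
        return new PushStreamContent(async (outputStream, content, context) =>
        {
            try
            {
                // Open the file for asynchronous I/O and copy it in chunks.
                using (var file = new FileStream(filePath, FileMode.Open, FileAccess.Read,
                                                 FileShare.Read, 4096, useAsync: true))
                {
                    await file.CopyToAsync(outputStream);
                }
            }
            finally
            {
                // Closing the stream signals that writing is complete.
                outputStream.Close();
            }
        }, new MediaTypeHeaderValue("application/octet-stream"));
    }

    static void Main()
    {
        // Exercise the content against a temporary 10,000-byte file.
        var path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[10000]);

        using (var sink = new MemoryStream())
        {
            CreateFileContent(path).CopyToAsync(sink).Wait();
            Console.WriteLine(sink.Length);
        }

        File.Delete(path);
    }
}
```

In a controller, the result of CreateFileContent would be assigned to response.Content in place of the synchronous WriteToStream callback.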

Examples below:

// Here when the response is being written out the data is pulled from the file to the destination(network) stream
response.Content = new StreamContent(File.OpenRead(filePath));

// Here we create a push stream content so that we can use XDocument.Save to push data to the destination(network) stream
XDocument xDoc = XDocument.Load("Sample.xml", LoadOptions.None);
PushStreamContent xDocContent = new PushStreamContent(
    (stream, content, context) =>
    {
        // After save we close the stream to signal that we are done writing.
        xDoc.Save(stream);
        stream.Close();
    },
    "application/xml");
Heterolecithal answered 23/4, 2013 at 14:28 Comment(11)
I think I follow - in what scenario would you use PushStreamContent then? - Beedon
So if I want to "stream" content in the "streaming video" sense of the word, use PushStreamContent, but if I want to let someone "download a file", then use StreamContent? Are there any other considerations? - Beedon
I read about the buffering out-of-memory issue here strathweb.com/2012/09/… but when I implemented it I could no longer access request.files. Are you able to elaborate on why, with a link to some documentation perhaps? Additionally, what would implementing this approach gain over not implementing it if nothing is buffered for StreamContent anyway? Sorry, I'm digressing now, but I'm genuinely interested! - Beedon
Right, the "streaming video" scenario is one where you do not know the total content length upfront and you are writing to the destination stream as you receive the video feed from somewhere. This is why a PushStreamContent response is sent with chunked transfer encoding: the content length is not known upfront. - Heterolecithal
I briefly looked at the link to Filip's article that you have in your comment here. It is for upload scenarios. By default Web API buffers incoming requests, so if you are uploading huge files you could make the request stream unbuffered. The buffered/unbuffered decision can be made conditionally, based on the incoming request, as Filip mentions in his blog. - Heterolecithal
Ahhhh, thank you, I now understand what's going on and why (sorry for my stupidity in getting confused by the upload scenario in Filip's article). Really very much appreciate your time! - Beedon
For anyone else using Owin and wondering why their responses are being buffered no matter what they do: there is a buffering-related bug in 5.0.0 that is fixed in 5.1.0-RC1. Thought I'd share; if I'd known five hours ago I might still have some hair left. update-package microsoft.aspnet.webapi.owin -includeprerelease is your friend. - Protege
@bUKaneer: cases which would use PushStreamContent include all functions/objects which write out to a stream. Most serializers write out to a stream. Many people write to a memory stream (which is not terrible, given most serializers are not async and therefore block the thread). Another case may be a GZip stream, although I can't recall the interface for that - it likely requires a target stream to write to. "Streaming video" is an application-level answer. I hope my answer is more useful. Besides, streaming video is probably a wrong answer - UDP should be used for that, not HTTP/TCP. - Ison
IMHO, thomaslevesque.com/tag/pushstreamcontent describes really well the difference between "pull" and "push". - Managua
Thorough and clear! Do you know how this translates to AspNet Core 2.0? It seems Controller has a writable Response.Body stream. We could simply write to that instead of PushStreamContent's stream, but what I don't understand is when the headers are sent. Are they sent as soon as anything starts writing to Response.Body? Must be, to avoid buffering, right? But then what if the controller method changes the headers after some code has already written to the stream? - Improvised
The blog linked by Dejan has moved to https://thomaslevesque.com/2013/11/30/uploading-data-with-httpclient-using-a-push-model/ - Kraus
