Reading "chunked" response with HttpWebResponse

I'm having trouble reading a "chunked" response when using a StreamReader to read the stream returned by GetResponseStream() of an HttpWebResponse:

// response is an HttpWebResponse
StreamReader reader = new StreamReader(response.GetResponseStream());
string output = reader.ReadToEnd(); // throws exception...

When the reader.ReadToEnd() method is called I'm getting the following System.IO.IOException: Unable to read data from the transport connection: The connection was closed.

The above code works just fine when the server returns a "non-chunked" response.

The only way I've been able to get it to work is to use HTTP/1.0 for the initial request (instead of HTTP/1.1, the default) but this seems like a lame work-around.
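
For reference, the HTTP/1.0 workaround amounts to roughly the following. This is only a sketch: the helper name `Http10.Create` and the URL in the usage line are placeholders, not part of my actual code.

```csharp
using System;
using System.Net;

static class Http10
{
    // Creates a request that is sent as HTTP/1.0 instead of the default HTTP/1.1.
    // HTTP/1.0 has no chunked transfer encoding, so the server cannot chunk the reply.
    public static HttpWebRequest Create(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.ProtocolVersion = HttpVersion.Version10;
        return request;
    }
}
```

Usage would look like `var request = Http10.Create("http://example.com/");` followed by the usual `GetResponse()` call.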

Any ideas?


@Chuck

Your solution works pretty well. It still throws the same IOException on the last Read(), but after inspecting the contents of the StringBuilder it looks like all the data has been received. So perhaps I just need to wrap the Read() in a try-catch and swallow the "error".

Abrams answered 19/8, 2008 at 21:28 Comment(2)
To read a chunked response, you need to follow en.wikipedia.org/wiki/Chunked_transfer_encoding - Batholith
I'm seeing this behavior with .NET 4.6 connecting to the PowerDNS 3.4.5 HTTP REST API. The workarounds don't help. If I swallow the exception, I lose part of the response. - Knawel
2

I haven't tried this with a "chunked" response, but would something like this work?

StringBuilder sb = new StringBuilder();
byte[] buf = new byte[8192];
Stream resStream = response.GetResponseStream();
string tmpString = null;
int count = 0;
do
{
    // Read up to one buffer of raw bytes; 0 means end of stream.
    count = resStream.Read(buf, 0, buf.Length);
    if (count != 0)
    {
        tmpString = Encoding.ASCII.GetString(buf, 0, count);
        sb.Append(tmpString);
    }
} while (count > 0);
Ermaermanno answered 19/8, 2008 at 23:54 Comment(2)
This is dangerous for multibyte encodings (i.e. not ASCII) because there is no guarantee that the reads will be aligned to char boundaries. - Mescal
@Chuck You can't just use ASCII; you need to figure out which encoding is actually being used, e.g. by means of Content-Type, and then use that for "GetString". - Perfumer
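
A minimal sketch of how the two concerns in the comments above could be addressed, assuming the caller already knows the encoding (for instance from the Content-Type header). The class and method names (`StreamText.ReadAll`) are made up for illustration. It uses `System.Text.Decoder`, which keeps an incomplete multibyte sequence between calls, so a character split across two Read() calls is still decoded correctly.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamText
{
    // Reads the stream to its end and decodes the bytes with the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        var sb = new StringBuilder();
        // A stateful decoder buffers partial multibyte sequences across calls.
        Decoder decoder = encoding.GetDecoder();
        var buf = new byte[8192];
        // Worst-case number of chars that one buffer of bytes can produce.
        var chars = new char[encoding.GetMaxCharCount(buf.Length)];
        int count;
        while ((count = stream.Read(buf, 0, buf.Length)) > 0)
        {
            int n = decoder.GetChars(buf, 0, count, chars, 0);
            sb.Append(chars, 0, n);
        }
        return sb.ToString();
    }
}
```

With the response from the question, a call would look like `StreamText.ReadAll(response.GetResponseStream(), Encoding.UTF8)`, where UTF-8 is itself an assumption about the server.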
1

I am working on a similar problem. The .NET HttpWebRequest and HttpWebResponse classes handle cookies and redirects automatically, but they do not handle chunked content in the response body automatically.

This is perhaps because chunked content may contain more than simple data (e.g. chunk extensions and trailing headers).

Simply reading the stream and ignoring the EOF exception will not work, because the stream contains more than the desired content. The stream consists of chunks, and each chunk begins by declaring its size. If the stream is simply read from beginning to end, the resulting data will contain the chunk metadata (and if the content is gzipped, it will fail the CRC check when decompressing).

To solve the problem, it is necessary to parse the stream manually: remove the chunk size from each chunk (as well as the CR LF delimiters), detect the final chunk, and keep only the chunk data. There is likely a library out there that does this, but I have not found it yet.
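
The manual parsing described above could be sketched roughly as follows. This assumes the stream being read still contains the raw chunk framing (for example, when reading directly from a socket). The names `ChunkedBody.Decode` and `ReadLine` are invented for this example, and chunk extensions and trailer headers are skipped rather than interpreted.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class ChunkedBody
{
    // Decodes a raw chunked body (RFC 2616, section 3.6.1) into its payload bytes.
    public static byte[] Decode(Stream raw)
    {
        var payload = new MemoryStream();
        while (true)
        {
            // Chunk header: hex size, optionally followed by ";extensions".
            string line = ReadLine(raw);
            int semi = line.IndexOf(';');
            string hex = (semi >= 0 ? line.Substring(0, semi) : line).Trim();
            // Throws FormatException if the size line is missing or malformed.
            int size = int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            if (size == 0) break; // the zero-length chunk marks the end

            var buf = new byte[size];
            int read = 0;
            while (read < size)
            {
                int n = raw.Read(buf, read, size - read);
                if (n == 0) throw new EndOfStreamException("Chunk truncated.");
                read += n;
            }
            payload.Write(buf, 0, size);
            ReadLine(raw); // consume the CR LF that terminates the chunk data
        }
        // Skip optional trailer headers up to the final empty line (or end of stream).
        while (ReadLine(raw).Length > 0) { }
        return payload.ToArray();
    }

    // Reads one ASCII line and returns it without the CR LF line ending.
    private static string ReadLine(Stream s)
    {
        var sb = new StringBuilder();
        int b;
        while ((b = s.ReadByte()) != -1)
        {
            if (b == '\n') break;
            if (b != '\r') sb.Append((char)b);
        }
        return sb.ToString();
    }
}
```

The decoded bytes can then be decompressed or converted to text with whatever encoding the response declares.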

Useful resources:

http://en.wikipedia.org/wiki/Chunked_transfer_encoding
https://www.rfc-editor.org/rfc/rfc2616#section-3.6.1

Republic answered 19/3, 2013 at 10:50 Comment(0)
0

After trying a lot of snippets from StackOverflow and Google, I ultimately found this to work best (assuming you know the data is a UTF-8 string; if not, you can just keep the byte array and process it appropriately):

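// Decode the body as UTF-8; ReadToEnd() reads to the end of the response.
using (var responseStream = response.GetResponseStream())
using (var reader = new StreamReader(responseStream, Encoding.UTF8))
{
    // Keep the raw bytes if you need them; decode with the same encoding.
    byte[] data = Encoding.UTF8.GetBytes(reader.ReadToEnd());
    return Encoding.UTF8.GetString(data);
}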

I found other variations work most of the time, but occasionally truncate the data. I got this snippet from:

https://social.msdn.microsoft.com/Forums/en-US/4f28d99d-9794-434b-8b78-7f9245c099c4/problems-with-httpwebrequest-and-transferencoding-chunked?forum=ncl

Allege answered 22/7, 2019 at 16:11 Comment(0)
0

Funnily enough, while playing with the request headers I removed "Accept-Encoding: gzip,deflate", and the server in my use case then answered in plain ASCII instead of with chunked, encoded snippets. It may be worth trying a request without "Accept-Encoding: gzip,deflate". The idea came from reading the section on compression in the wiki article mentioned above.
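
A rough sketch of what that could look like with HttpWebRequest. The helper name `PlainRequest.Create` is hypothetical, and whether the server then stops chunking depends entirely on the server.

```csharp
using System;
using System.Net;

static class PlainRequest
{
    // Builds a request that does not advertise gzip/deflate support.
    public static HttpWebRequest Create(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Drop the header in case it was added manually elsewhere.
        request.Headers.Remove("Accept-Encoding");
        // With automatic decompression enabled the framework may add the header itself.
        request.AutomaticDecompression = DecompressionMethods.None;
        return request;
    }
}
```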

Rabaul answered 16/12, 2021 at 9:29 Comment(0)
-1

I've had the same problem (which is how I ended up here :-). I eventually tracked it down to the fact that the chunked stream wasn't valid: the final zero-length chunk was missing. I came up with the following code, which handles both valid and invalid chunked streams.

using (StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    StringBuilder sb = new StringBuilder();

    try
    {
        while (!sr.EndOfStream)
        {
            sb.Append((char)sr.Read());
        }
    }
    catch (System.IO.IOException)
    {
        // Swallowed on purpose: the final zero-length chunk may be missing.
    }

    string content = sb.ToString();
}
Overbite answered 9/12, 2008 at 11:26 Comment(1)
Casting bytes to char is dangerous because it completely ignores multibyte encodings. - Mescal
