Receiving Chunked HTTP Data With Winsock

I'm having trouble reading some chunked HTTP response data using Winsock. I send a request fine and get the following back:

HTTP/1.1 200 OK
Server: LMAX/1.0
Content-Type: text/xml; charset=utf-8
Transfer-Encoding: chunked
Date: Mon, 29 Aug 2011 16:22:19 GMT

That much arrives through Winsock's recv. After that, however, it just hangs. I have the receiving code running in an infinite loop, but nothing more is ever picked up.

I think it's a C++ issue, but it could also be related to the fact that I am pushing the connection through stunnel to wrap it in HTTPS. I have a test application using some libs in C# which works perfectly through stunnel, so I'm confused as to why my C++ loop is not receiving the chunked data after the initial recv.

This is the loop in question. It runs after the chunked OK response above has been received:

while(true)
{
    recvBuf= (char*)calloc(DEFAULT_BUFLEN, sizeof(char)); 
    iRes = recv(ConnectSocket, recvBuf, DEFAULT_BUFLEN, 0);
    cout << WSAGetLastError() << endl;
    cout << "Recv: " << recvBuf << endl;
    if (iRes==SOCKET_ERROR)
    {
        cout << recvBuf << endl;
        err = WSAGetLastError();
        wprintf(L"WSARecv failed with error: %d\n", err);
        break;
    }     

}

Any ideas?

Crucifer answered 29/8, 2011 at 16:27 Comment(4)
I'd suggest you change your code to not allocate in the loop; otherwise you're leaking memory, one DEFAULT_BUFLEN at a time. Also, what's the stop condition for the loop? And is it possible that you are consuming the data before reaching this recv? - Cistaceous
Yeah, I understand that it's leaking memory, but I'm not too fussed about that right now. I can easily switch it to memset. I do a print after every receive, which would indicate that the data is never arriving. - Crucifer
It might help if you post the code before this bit, to see if you are accidentally consuming the data. Also, note that recvBuf is never modified if recv returns an error, so printing it is pointless. - Cistaceous
cout << recvBuf is dangerous; if recv fills the buffer completely, it's not going to be NUL-terminated. - Germander
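
The points raised in these comments (reuse one buffer, stop when the peer closes, never treat the buffer as a C string) can be sketched roughly as follows. This is only an illustration: RecvFn and receiveAll are hypothetical names, and RecvFn is a stand-in for a call such as recv(ConnectSocket, buf, len, 0) so the loop can be tried without a socket.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for recv(): fills buf with up to len bytes and
// returns the byte count, 0 when the peer closed, or a negative value on error.
using RecvFn = std::function<int(char* buf, int len)>;

// Reads until the peer closes the connection. Returns false on error.
bool receiveAll(const RecvFn& recvFn, std::string& out)
{
    char buf[512];                        // one buffer, reused on every pass
    for (;;)
    {
        int n = recvFn(buf, static_cast<int>(sizeof buf));
        if (n < 0)
            return false;                 // error: buf was not filled, do not print it
        if (n == 0)
            return true;                  // orderly shutdown: this is the stop condition
        // Append exactly n bytes; the buffer is never assumed to be NUL-terminated.
        out.append(buf, static_cast<std::size_t>(n));
    }
}
```

With Winsock, the callable could be a lambda that forwards to recv on the connected socket. Note that this loop only ends when the server closes the connection, which a keep-alive HTTP/1.1 server may not do, so it does not by itself solve the chunked-body problem.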

You need to change your reading code. You cannot read chunked data by repeatedly calling recv into a fixed-length buffer and printing whatever arrives, as you are trying to do. The data is sent in variable-length chunks, where each chunk has a header line that specifies the length of the chunk in bytes (in hexadecimal), and the final chunk has a length of 0. You need to read the chunk headers in order to process the chunks properly. Please read RFC 2616 Section 3.6.1 (the same material now lives in RFC 9112 Section 7.1). Your logic needs to be more like the following pseudo-code:

send request;

status = recv() a line of text until CRLF;
parse status as needed;
response-code = extract response-code from status;
response-version = extract response-version from status;

do
{
    line = recv() a line of text until CRLF;
    if (line is blank)
        break;
    store line in headers list;
}
while (true);

parse headers list as needed;

if ((response-code is not in [1xx, 204, 304]) and (request was not "HEAD"))
{
    if (Transfer-Encoding header is present and not "identity")
    {
        do
        {
            line = recv a line of text until CRLF;
            length = extract length from line;
            extensions = extract extensions from line;
            process extensions as needed; // optional
            if (length == 0)
                break;
            recv() length number of bytes into destination buffer;
            recv() and discard bytes until CRLF;
        }
        while (true);

        do
        {
            line = recv a line of text until CRLF;
            if (line is blank)
                break;
            store line in headers list as needed;
        }
        while (true);

        re-parse headers list as needed;
    }
    else if (Content-Length header is present)
    {
        recv() Content-Length number of bytes into destination buffer;
    }
    else if (Content-Type header starts with "multipart/")
    {
        boundary = extract boundary from Content-Type's "boundary" attribute;
        recv() data into destination buffer until MIME termination boundary is reached;
    }
    else
    {
        recv() data into destination buffer until disconnected;
    }
}

if (not disconnected)
{
    if (response-version is "HTTP/1.1")
    {
        if (Connection header is "close")
            close connection;
    }
    else
    {
        if (Connection header is not "keep-alive")
            close connection;
    }
}

check response-code for errors;
process destination buffer, per info in headers list;
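
The chunked branch of the pseudo-code above could look roughly like this in C++. It is a sketch under stated assumptions, not a definitive implementation: Reader, RecvFn and readChunkedBody are hypothetical names, RecvFn stands in for a recv() call on your socket, and trailer lines are read and discarded rather than stored in a headers list.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for recv(): returns bytes read, 0 on close, <0 on error.
using RecvFn = std::function<int(char* buf, int len)>;

// Buffers received bytes so that lines and fixed-size blocks can be pulled out.
class Reader
{
public:
    explicit Reader(RecvFn fn) : recv_(std::move(fn)) {}

    // Reads up to and including CRLF; returns the line without the CRLF.
    bool readLine(std::string& line)
    {
        std::size_t pos;
        while ((pos = pending_.find("\r\n")) == std::string::npos)
            if (!fill()) return false;
        line = pending_.substr(0, pos);
        pending_.erase(0, pos + 2);
        return true;
    }

    // Appends exactly count bytes to out.
    bool readBytes(std::size_t count, std::string& out)
    {
        while (pending_.size() < count)
            if (!fill()) return false;
        out.append(pending_, 0, count);
        pending_.erase(0, count);
        return true;
    }

private:
    bool fill()
    {
        char buf[512];
        int n = recv_(buf, static_cast<int>(sizeof buf));
        if (n <= 0) return false;                     // error or premature close
        pending_.append(buf, static_cast<std::size_t>(n));
        return true;
    }

    RecvFn recv_;
    std::string pending_;
};

// Decodes a chunked body into body. Returns false on malformed or truncated input.
bool readChunkedBody(Reader& in, std::string& body)
{
    std::string line;
    for (;;)
    {
        if (!in.readLine(line)) return false;
        line = line.substr(0, line.find(';'));        // drop chunk extensions
        char* end = nullptr;
        unsigned long length = std::strtoul(line.c_str(), &end, 16);
        if (end == line.c_str()) return false;        // no hex digits at all
        if (length == 0) break;                       // last-chunk marker
        if (!in.readBytes(length, body)) return false;
        if (!in.readLine(line) || !line.empty()) return false; // CRLF after the data
    }
    while (in.readLine(line))                         // trailers end at a blank line
        if (line.empty()) return true;
    return false;
}
```

The same Reader can serve the status line and header loops, since both are just repeated readLine calls. std::strtoul is lenient here (it accepts a sign or a 0x prefix), so a stricter parser may be preferable in production code.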
Oversexed answered 29/8, 2011 at 18:39 Comment(0)

Indeed, recv does not hand you the data chunk by chunk; it is the content that is chunked. Sketch for yourself what any buffer you receive might look like. You do not receive one chunk at a time. Sometimes a buffer holds the tail of the previous chunk, then the line giving the size of the next chunk, then some of that chunk's data. Another time you receive just a bit of chunk data. Another time you get a bit of chunk data plus part of the next size line, and so on. Plan for the worst-case scenarios; this isn't easy. Read this: http://www.jmarshall.com/easy/http/
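
One way to cope with these arbitrary splits is to keep the parsing state between receives and consume input byte by byte, so it no longer matters where a buffer boundary falls. The following is a minimal sketch of that idea, not the code described below; ChunkedDecoder and its members are hypothetical names, trailers are skipped rather than stored, and there is no overflow check on the chunk size.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Push-style decoder: feed() accepts whatever fragment recv() happened to return.
class ChunkedDecoder
{
public:
    // Consumes one received fragment; returns false on malformed input.
    bool feed(const char* data, std::size_t len)
    {
        for (std::size_t i = 0; i < len; ++i)
        {
            char c = data[i];
            switch (state_)
            {
            case SizeLine:                        // hex size, optional ";ext", CRLF
                if (c == '\n')
                {
                    if (!sawDigit_) return false;
                    state_ = (remaining_ == 0) ? Trailer : Data;
                    sawDigit_ = inExt_ = false;
                    lineLen_ = 0;
                }
                else if (c == ';') inExt_ = true;
                else if (c != '\r' && !inExt_)
                {
                    int v = hexValue(c);
                    if (v < 0) return false;
                    remaining_ = remaining_ * 16 + static_cast<std::size_t>(v);
                    sawDigit_ = true;
                }
                break;
            case Data:                            // copy as much of the chunk as is here
            {
                std::size_t n = std::min(remaining_, len - i);
                body_.append(data + i, n);
                remaining_ -= n;
                i += n - 1;                       // the loop's ++i skips the last byte
                if (remaining_ == 0) state_ = DataEnd;
                break;
            }
            case DataEnd:                         // CRLF that follows the chunk data
                if (c == '\n') state_ = SizeLine;
                else if (c != '\r') return false;
                break;
            case Trailer:                         // trailer lines until a blank one
                if (c == '\n')
                {
                    if (lineLen_ == 0) state_ = Done;
                    lineLen_ = 0;
                }
                else if (c != '\r') ++lineLen_;
                break;
            case Done:
                return false;                     // bytes after the end of the body
            }
        }
        return true;
    }

    bool done() const { return state_ == Done; }
    const std::string& body() const { return body_; }

private:
    enum State { SizeLine, Data, DataEnd, Trailer, Done };

    static int hexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    State state_ = SizeLine;
    std::size_t remaining_ = 0;   // data bytes left in the current chunk
    std::size_t lineLen_ = 0;     // non-CR characters on the current trailer line
    bool sawDigit_ = false;
    bool inExt_ = false;
    std::string body_;
};
```

The receive loop then becomes: call recv, pass the bytes to feed(), and stop once done() returns true or feed() reports an error.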

Before you can use the following piece of code, receive all the headers up to the empty line. nContentStart is the offset in the buffer where the content starts. The code uses some in-house classes I cannot share, but you should get the idea ;) As far as I have tested, it works as expected and does not leak memory, although since this isn't easy I cannot be completely sure!

    if (bChunked)
    {
        int nOffset = nContentStart;
        int nChunkLen = 0;
        int nCopyLen;

        while (true)
        {
            if (nOffset >= nDataLen)
                {pData->SetSize(0); Close(); ASSERTRETURN(false);}

            // copy data of previous chunk to caller's buffer

            if (nChunkLen > 0)
            {
                nCopyLen = min(nChunkLen, nDataLen - nOffset);
                n = pData->GetSize();
                pData->SetSize(n + nCopyLen);
                memcpy(pData->GetPtr() + n, buf.GetPtr() + nOffset, nCopyLen);
                nChunkLen -= nCopyLen;
                ASSERT(nChunkLen >= 0);

                nOffset += nCopyLen;
                if (nChunkLen == 0)
                    nOffset += strlen(lpszLineBreak);
                ASSERT(nOffset <= nDataLen);
            }

            // when previous chunk is copied completely, process new chunk

            if (nChunkLen == 0 && nOffset < nDataLen)
            {
                // chunk length is specified on first line

                p1 = buf.GetPtr() + nOffset;
                p2 = strstr(p1, lpszLineBreak);

                while (!p2) // if we can't find the line break receive more data until we do
                {
                    buf.SetSize(nDataLen + RECEIVE_BUFFER_SIZE + 1);
                    nReceived = m_socket.Receive((BYTE*)buf.GetPtr() + nDataLen, RECEIVE_BUFFER_SIZE);

                    if (nReceived == -1)
                        {pData->SetSize(0); Close(); ASSERTRETURN(false);} // connection error
                    if (nReceived == 0)
                        {pData->SetSize(0); Close(); ASSERTRETURN(false);} // all data already received but did not find line break

                    nDataLen += nReceived;
                    buf[nDataLen] = 0;

                    p1 = buf.GetPtr() + nOffset; // address of buffer likely changed
                    p2 = strstr(p1, lpszLineBreak);
                }

                *p2 = 0;
                p2 += strlen(lpszLineBreak);

                p3 = strchr(p1, ';');
                if (p3)
                    *p3 = 0;

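                unsigned int nParsedLen = 0; // %X expects a pointer to unsigned int
                if (sscanf(p1, "%X", &nParsedLen) != 1)
                    {pData->SetSize(0); Close(); ASSERTRETURN(false);}

                if (nParsedLen > 0x7FFFFFFF) // would not fit in the int nChunkLen
                    {pData->SetSize(0); Close(); ASSERTRETURN(false);}
                nChunkLen = (int)nParsedLen;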

                if (nChunkLen == 0)
                    break; // last chunk received

                // copy the following chunk data to caller's buffer

                nCopyLen = min(nChunkLen, buf.GetPtr() + nDataLen - p2);
                n = pData->GetSize();
                pData->SetSize(n + nCopyLen);
                memcpy(pData->GetPtr() + n, p2, nCopyLen);
                nChunkLen -= nCopyLen;
                ASSERT(nChunkLen >= 0);

                nOffset = (p2 - buf.GetPtr()) + nCopyLen;
                if (nChunkLen == 0)
                    nOffset += strlen(lpszLineBreak);

                if (nChunkLen == 0 && nOffset < nDataLen)
                    continue; // a new chunk starts in this buffer at nOffset, no need to receive more data
            }

            // receive more data

            buf.SetSize(RECEIVE_BUFFER_SIZE + 1);
            nDataLen = m_socket.Receive((BYTE*)buf.GetPtr(), RECEIVE_BUFFER_SIZE);
            if (nDataLen == -1)
                {pData->SetSize(0); Close(); ASSERTRETURN(false);}
            if (nDataLen == 0)
                {pData->SetSize(0); Close(); ASSERTRETURN(false);}
            buf[nDataLen] = 0;

            nOffset = 0;
        }

        // TODO: receive optional footers and add them to m_headers
    }
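
The size-line handling in the middle of that loop (cut at the first ';', then read a hexadecimal number) can also be isolated into a small helper that is easy to test on its own. A possible sketch, where parseChunkSizeLine is a hypothetical name and the line is assumed to have had its CRLF removed already:

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

// Parses a chunk-size line such as "1A3F;name=value".
// Returns false if the size part is empty, not pure hex, or out of range.
bool parseChunkSizeLine(const std::string& line, unsigned long& size)
{
    // Everything from the first ';' onward is a chunk extension; ignore it.
    std::string hex = line.substr(0, line.find(';'));
    if (hex.empty())
        return false;

    // Reject signs, whitespace and "0x" prefixes that strtoul would accept.
    for (char c : hex)
        if (!std::isxdigit(static_cast<unsigned char>(c)))
            return false;

    errno = 0;
    size = std::strtoul(hex.c_str(), nullptr, 16);
    return errno != ERANGE;                // ERANGE means the value overflowed
}
```

A result of 0 marks the last chunk, after which only the optional footers and the closing blank line remain.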
Resign answered 30/8, 2011 at 14:1 Comment(0)
