C# Socket.Receive message length

I'm currently in the process of developing a C# Socket server that can accept multiple connections from multiple client computers. The objective of the server is to allow clients to "subscribe" and "un-subscribe" from server events.

So far I've taken a jolly good look over here: http://msdn.microsoft.com/en-us/library/5w7b7x5f(v=VS.100).aspx and http://msdn.microsoft.com/en-us/library/fx6588te.aspx for ideas.

All the messages I send are encrypted, so I take the string message that I wish to send, convert it into a byte[] and then encrypt the data before prepending the message length to the data and sending it out over the connection.

One thing that strikes me as an issue is this: on the receiving end it seems possible that Socket.EndReceive() (or the associated callback) could return when only half of the message has been received. Is there an easy way to ensure each message is received "complete" and only one message at a time?

EDIT: For example, I take it .NET / Windows sockets do not "wrap" the messages to ensure that a single message sent with Socket.Send() is received in one Socket.Receive() call? Or do they?

My implementation so far:

private void StartListening()
{
    IPHostEntry ipHostInfo = Dns.GetHostEntry(Dns.GetHostName());
    IPEndPoint localEP = new IPEndPoint(ipHostInfo.AddressList[0], Constants.PortNumber);

    Socket listener = new Socket(localEP.Address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
    listener.Bind(localEP);
    listener.Listen(10);

    while (true)
    {
        // Reset the event.
        this.listenAllDone.Reset();

        // Begin waiting for a connection
        listener.BeginAccept(new AsyncCallback(this.AcceptCallback), listener);

        // Wait for the event.
        this.listenAllDone.WaitOne();
    }
}

private void AcceptCallback(IAsyncResult ar)
{
    // Get the socket that handles the client request.
    Socket listener = (Socket) ar.AsyncState;
    Socket handler = listener.EndAccept(ar);

    // Signal the main thread to continue.
    this.listenAllDone.Set();

    // Accept the incoming connection and save a reference to the new Socket in the client data.
    CClient client = new CClient();
    client.Socket = handler;

    lock (this.clientList)
    {
        this.clientList.Add(client);
    }

    while (true)
    {
        this.readAllDone.Reset();

        // Begin waiting on data from the client.
        handler.BeginReceive(client.DataBuffer, 0, client.DataBuffer.Length, 0, new AsyncCallback(this.ReadCallback), client);

        this.readAllDone.WaitOne();
    }
}

private void ReadCallback(IAsyncResult asyn)
{
    CClient theClient = (CClient)asyn.AsyncState;

    // End the receive and get the number of bytes read.
    int iRx = theClient.Socket.EndReceive(asyn);
    if (iRx != 0)
    {
        // Data was read from the socket.
        // So save the data 
        byte[] recievedMsg = new byte[iRx];
        Array.Copy(theClient.DataBuffer, recievedMsg, iRx);

        this.readAllDone.Set();

        // Decode the message recieved and act accordingly.
        theClient.DecodeAndProcessMessage(recievedMsg);

        // Go back to waiting for data.
        this.WaitForData(theClient);
    }         
}
Gasconade answered 23/3, 2011 at 16:28 Comment(0)

Yes, a single receive may return only part of a message. It can be even worse on the sending side: a single send may transmit only part of the message. You usually see this under bad network conditions or heavy network load.

To be clear: at the network level, TCP guarantees that your data arrives in the order it was sent, but it does not guarantee that it arrives in the same chunks you sent. There are many reasons for this: software (take a look at Nagle's algorithm, for example), hardware (different routers along the route), and the OS implementation. So in general you should never assume how much of the data has already been sent or received.

Sorry for the long introduction. Some advice:

  1. Try to use the relatively "new" API for high-performance socket servers; see the samples in Networking Samples for .NET v4.0.

  2. Do not assume you always send a full packet. Socket.EndSend() returns the number of bytes actually scheduled to send, which can be as few as 1-2 bytes under heavy network load. So you have to resend the rest of the buffer when required.

    There is a warning on MSDN:

    There is no guarantee that the data you send will appear on the network immediately. To increase network efficiency, the underlying system may delay transmission until a significant amount of outgoing data is collected. A successful completion of the BeginSend method means that the underlying system has had room to buffer your data for a network send.

  3. Do not assume you always receive a full packet. Accumulate received data in some kind of buffer and parse it once it holds enough data.

  4. Usually, for binary protocols, I add a field indicating how much data is incoming, a message-type field (or you can use a fixed length per message type, which is generally not good, e.g. because of versioning problems), a version field (where applicable), and a CRC field at the end of the message.

  5. Not required reading, a bit old, and it applies directly to Winsock, but it may be worth studying: Winsock Programmer's FAQ

  6. Take a look at Protocol Buffers; they are worth learning: http://code.google.com/p/protobuf-csharp-port/, http://code.google.com/p/protobuf-net/
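
Points 2 to 4 can be sketched roughly as below. This is only a minimal illustration, not a definitive implementation: the names `Framing`, `MessageAssembler` and `MaxMessageSize`, the 4-byte big-endian length prefix, and the delegate-based `SendAll` are all assumptions made for the example, and the message-type, version and CRC fields are left out for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class Framing
{
    // Prefix the payload with its length as a 4-byte big-endian integer.
    public static byte[] Frame(byte[] payload)
    {
        var framed = new byte[4 + payload.Length];
        framed[0] = (byte)(payload.Length >> 24);
        framed[1] = (byte)(payload.Length >> 16);
        framed[2] = (byte)(payload.Length >> 8);
        framed[3] = (byte)payload.Length;
        Buffer.BlockCopy(payload, 0, framed, 4, payload.Length);
        return framed;
    }

    // Keep calling send until every byte has been handed over.
    // send(buffer, offset, count) returns how many bytes it accepted.
    public static void SendAll(Func<byte[], int, int, int> send, byte[] data)
    {
        int sent = 0;
        while (sent < data.Length)
        {
            int n = send(data, sent, data.Length - sent);
            if (n <= 0) throw new InvalidOperationException("Connection closed while sending.");
            sent += n;
        }
    }
}

// Collects received chunks and hands out complete messages only.
public sealed class MessageAssembler
{
    private const int MaxMessageSize = 1 << 20; // assumed sanity limit
    private readonly List<byte> _pending = new List<byte>();

    // Call this from the receive callback with the bytes just read.
    public void Append(byte[] chunk, int count)
    {
        for (int i = 0; i < count; i++) _pending.Add(chunk[i]);
    }

    // Returns true and the payload once a whole message is buffered.
    public bool TryReadMessage(out byte[] payload)
    {
        payload = null;
        if (_pending.Count < 4) return false; // header incomplete
        int length = (_pending[0] << 24) | (_pending[1] << 16) | (_pending[2] << 8) | _pending[3];
        if (length < 0 || length > MaxMessageSize)
            throw new InvalidOperationException("Invalid message length.");
        if (_pending.Count < 4 + length) return false; // body incomplete
        payload = _pending.GetRange(4, length).ToArray();
        _pending.RemoveRange(0, 4 + length);
        return true;
    }
}
```

With a real socket, the sender could call something like `Framing.SendAll((b, o, c) => socket.Send(b, o, c, SocketFlags.None), Framing.Frame(encrypted))`, and the receive callback would call `Append` with the bytes just read and then loop on `TryReadMessage` until it returns false, since one receive can also contain several messages.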

Hope it helps.

P.S. Sadly, the MSDN sample you refer to in the question effectively ruins the async paradigm, as stated in other answers.

Brahmin answered 24/3, 2011 at 15:53 Comment(7)
Great answer, I think you've covered all the bases nicely there! Thanks a lot! ;) – Gasconade
@Gasconade I have the exact same problem :-( How did you solve it? Have you implemented the above mentioned point 4? (seems to me to be the most "reasonable" solution...) – Viral
@Viral Yes I fully implemented point 4, adding the length of the msg to the front of each msg and then buffering on the server end until a "whole" msg had been received. – Gasconade
Nice answer, however I disagree with this part "add CRC-field to end of message" because TCP ensures valid data is transmitted over the network. – Anelace
@Javid, thank you for the comment. It is a common misconception that TCP guarantees data always arrives valid. Even if you've read the RFC, keep in mind that in practice packets can pass through a broken router (with memory issues or similar), or a particular TCP implementation (e.g. for embedded systems) can skip checks, and so on. Hopefully you never meet corrupted packets, but I have seen more than enough in my practice, especially under heavy load. So it's up to you, but I suggest always adding a CRC. – Brahmin
@NickMartyshchenko: Thank you for the explanation. Actually, I don't have much experience, so I guess you're right if you have already experienced corrupted packets with TCP. – Anelace
@Javid, yes, I ran into it when our distributed sensor network went live. It took around a month to track down why we got corrupted data from time to time (one of the routers had memory issues). Also, I found that in practice you should check everything to make it robust. Do not rely just on the expected properties. – Brahmin

Your code is very wrong. Looping like that defeats the purpose of asynchronous programming. Async I/O exists so that threads are not blocked and can continue doing other work. By looping and waiting like that, you are blocking the thread.

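// Per-connection state: the socket plus its own receive buffer.
// (ClientState and the 4096-byte buffer size are illustrative, not from the question.)
class ClientState
{
    public Socket Socket;
    public byte[] Buffer = new byte[4096];
}

void StartListening()
{
    _listener.BeginAccept(OnAccept, null);
}

void OnAccept(IAsyncResult res)
{
    var clientSocket = _listener.EndAccept(res);

    // Begin accepting the next connection right away.
    _listener.BeginAccept(OnAccept, null);

    var state = new ClientState { Socket = clientSocket };
    clientSocket.BeginReceive(state.Buffer, 0, state.Buffer.Length, SocketFlags.None, OnReceive, state);
}

void OnReceive(IAsyncResult res)
{
    var state = (ClientState)res.AsyncState;

    var bytesRead = state.Socket.EndReceive(res);
    if (bytesRead == 0)
    {
        // Zero bytes means the remote side closed the connection.
        state.Socket.Close();
        return;
    }

    // Handle the first bytesRead bytes of state.Buffer here,
    // before the buffer is reused by the next receive.

    state.Socket.BeginReceive(state.Buffer, 0, state.Buffer.Length, SocketFlags.None, OnReceive, state);
}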

Note that I've removed all error handling to make the code cleaner. That code does not block any thread and is therefore much more efficient. I would break the code up into two classes: the server handling code and the client handling code. That makes it easier to maintain and extend.

The next thing to understand is that TCP is a stream protocol. It does not guarantee that a message arrives in one Receive. Therefore you must know either how large a message is or when it ends.

The first solution is to prefix each message with a header which you parse first, and then continue reading until you have the complete body/message.
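
A rough sketch of that header-first approach, written synchronously against a `Stream` to keep it short (a `NetworkStream` wrapping the socket would fit). The 4-byte big-endian length header and the names `LengthPrefixedReader`, `ReadFully` and `ReadMessage` are assumptions for illustration only:

```csharp
using System;
using System.IO;

public static class LengthPrefixedReader
{
    // Read exactly count bytes, looping because Read may return fewer.
    public static byte[] ReadFully(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int n = stream.Read(buffer, offset, count - offset);
            if (n == 0) throw new EndOfStreamException("Connection closed mid-message.");
            offset += n;
        }
        return buffer;
    }

    // Parse the 4-byte big-endian header first, then read the body.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = ReadFully(stream, 4);
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0) throw new InvalidDataException("Negative message length.");
        return ReadFully(stream, length);
    }
}
```

The same idea carries over to the async callbacks above: keep track of how many header or body bytes are still missing and issue another receive until that count reaches zero.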

The second solution is to put a control character at the end of each message and continue reading until that character is read. Keep in mind that you must escape that character wherever it can occur inside the actual message.
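
As an illustration of the delimiter approach with escaping, here is a small byte-stuffing sketch. The terminator `0x0A`, the escape byte `0x1B`, the XOR-with-`0x20` escaping rule, and the names `DelimitedFraming` and `DelimitedDecoder` are all assumptions chosen for the example; any scheme works as long as the raw terminator can never appear inside an encoded message.

```csharp
using System;
using System.Collections.Generic;

public static class DelimitedFraming
{
    public const byte End = 0x0A; // message terminator (assumed)
    public const byte Esc = 0x1B; // escape marker (assumed)

    // Escape End/Esc inside the payload, then append the terminator.
    public static byte[] Encode(byte[] payload)
    {
        var output = new List<byte>(payload.Length + 1);
        foreach (byte b in payload)
        {
            if (b == End || b == Esc)
            {
                output.Add(Esc);
                output.Add((byte)(b ^ 0x20));
            }
            else
            {
                output.Add(b);
            }
        }
        output.Add(End);
        return output.ToArray();
    }
}

// Keeps state between receives, so a message may span several chunks.
public sealed class DelimitedDecoder
{
    private readonly List<byte> _current = new List<byte>();
    private bool _escaped;

    // Feed the bytes just received; returns every message completed by them.
    public List<byte[]> Feed(byte[] chunk, int count)
    {
        var messages = new List<byte[]>();
        for (int i = 0; i < count; i++)
        {
            byte b = chunk[i];
            if (_escaped)
            {
                _current.Add((byte)(b ^ 0x20)); // undo the escaping
                _escaped = false;
            }
            else if (b == DelimitedFraming.Esc)
            {
                _escaped = true;
            }
            else if (b == DelimitedFraming.End)
            {
                messages.Add(_current.ToArray());
                _current.Clear();
            }
            else
            {
                _current.Add(b);
            }
        }
        return messages;
    }
}
```

Because the decoder remembers both the partial message and the escape flag, it does not matter where the network splits the data, even in the middle of an escape sequence.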

Amadis answered 24/3, 2011 at 15:38 Comment(1)
Strange that the code at MSDN uses a loop for the Asynchronous Server Socket Example. Could it be because they are demonstrating it from a console application? – Blister

You need to either send fixed-length messages or include the length of the message in a header. Try to have something that allows you to clearly identify the start of a packet.

Candide answered 23/3, 2011 at 16:33 Comment(5)
I can't really keep the messages to a fixed length as I have simple messages like a heartbeat poll which are tiny, compared to a complex event notification that might be the best part of 500 bytes, say. I do send all messages with the length as the first item, however I don't append an "end" character at the end in the encoded message (there is one in the un-encoded message). – Gasconade
Also, what with using AES encryption to "garble" the message, what can I use to end/start a single message, as anything could occur in encrypted data!? – Gasconade
Well, this is a really common problem. Just try a somewhat long string like "HEADER" and "TAIL"; it's very unlikely that it will turn up in the encrypted data just when you're searching for the header or the tail, and if it does, just consider it a corrupted message and wait for a resend. You should only delimit the un-encoded message; the encoded data must be just data inside the un-encoded message. – Candide
Take a look at my "EDIT:" to the original message, as this is what a fellow colleague programmer is saying happens. Although it seems possible, it's not what I'm experiencing. – Gasconade
Also, while I understand what you're suggesting, I find it incredible that not a single example I can find on the MSDN (or web generally) takes account that the whole "sent" message may not be received in one "receive" call. Do you have any examples? – Gasconade
