About recv and the read buffer - C Berkeley Sockets

4

10

I am using berkeley sockets and TCP (SOCK_STREAM sockets).

The process is:

  1. I connect to a remote address.
  2. I send a message to it.
  3. I receive a message from it.

Imagine I am using the following buffer:

char recv_buffer[3000];
recv(socket, recv_buffer, 3000, 0);

Questions are:

  • How can I know whether the read buffer is empty after the first call to recv? If it's not empty I would have to call recv again, but if I do that when it's empty the call would block for a long time.
  • How can I know how many bytes I have read into recv_buffer? I can't use strlen because the message I receive can contain null bytes.

Thanks.

Wallsend answered 6/12, 2010 at 1:37 Comment(0)
12

How can I know whether the read buffer is empty after the first call to recv? If it's not empty I would have to call recv again, but if I do that when it's empty the call would block for a long time.

You can use the select or poll system calls along with your socket descriptor to tell if there is data waiting to be read from the socket.

However, usually there should be an agreed-upon protocol that both sender and receiver follow, so that both parties know how much data is to be transferred. For example, perhaps the sender first sends a 2-byte integer indicating the number of bytes it will send. The receiver then first reads this 2-byte integer, so that it knows how many more bytes to read from the socket.

Regardless, as Tony pointed out below, a robust application should combine length information in the header with polling the socket for additional data before each call to recv (or with a non-blocking socket). This prevents your application from blocking when, for example, you know from the header that 100 bytes remain to be read, but the peer fails to send them for whatever reason (perhaps the peer computer was unexpectedly shut off), which would otherwise leave your recv call blocked.

How can I know how many bytes I have read into recv_buffer? I can't use strlen because the message I receive can contain null bytes.

The recv system call will return the number of bytes read, or -1 if an error occurred.

From the man page for recv(2):

[recv] returns the number of bytes received, or -1 if an error occurred. The return value will be 0 when the peer has performed an orderly shutdown.
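A minimal sketch of checking all three outcomes of recv follows. recv_once is a hypothetical wrapper name used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Call recv once and report what happened.
   Returns the number of bytes placed in buf, 0 if the peer closed,
   or -1 on error (errno is set by recv). */
static ssize_t recv_once(int fd, char *buf, size_t cap)
{
    assert(buf != NULL);
    ssize_t n = recv(fd, buf, cap, 0);
    if (n > 0) {
        /* Exactly n bytes of buf are valid; they may include '\0' bytes,
           so use n rather than strlen(buf). */
        printf("received %zd bytes\n", n);
    } else if (n == 0) {
        printf("peer closed the connection\n");
    } else {
        perror("recv");
    }
    return n;
}
```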

Lacewing answered 6/12, 2010 at 1:41 Comment(3)
What's the relevance of the read(2) manual page to recv(2)? They say similar things, but quoting the relevant page would be better. - Prepossess
@Jonathan, when the descriptor type is a socket, read is the same as recv, except recv allows an extra flags parameter. But I edited my answer to use recv to avoid the confusion. - Lacewing
Just a nitpick re a subtle, presumably unintended implication: "select/poll / however message-length in header" falsely suggests that such headers resolve the blocking issue, whereas select/poll, non-blocking sockets or threads should be used in combination with a message-length header or sentinel data. - Furbelow
2

How can I know if after calling recv first time the read buffer is empty or not?

Even the first time (after accepting a client), recv can block, or fail if the client connection has been lost. You have three options:

  • use select or poll (BSD sockets) or some OS-specific equivalent, which can tell you whether there is data available on specific socket descriptors (as well as exception conditions, and whether there is buffer space to write more output)
  • set the socket to be non-blocking, so that recv returns only whatever is immediately available (possibly nothing)
  • create a thread that you can afford to have block in recv, knowing that other threads will carry on with the work you need to continue

How can I know how many bytes I have read into recv_buffer? I can't use strlen because the message I receive can contain null bytes.

recv() returns the number of bytes read, or -1 on error.

Note that TCP is a byte stream protocol, which means that you're only guaranteed to be able to read and write bytes from it in the correct order, but the message boundaries are not guaranteed to be preserved. So, even if the sender has made a large single write to their socket, it can be fragmented en route and arrive in several smaller blocks, or several smaller send()/write()s can be consolidated and retrieved by one recv()/read().

For that reason, make sure you loop calling recv until you either get all the data you need (i.e. a complete logical message you can process) or an error. You should also be prepared to receive part or all of subsequent sends from your client in the same call (unless you have a protocol where each side only sends after getting a complete message from the other, or you use headers with message lengths). Note that doing one recv for the message header (with length) and then another for the body can result in many more calls to recv(), with a potential adverse effect on performance.
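One possible way to avoid a separate recv for each header and body is to read into a larger buffer and peel complete messages off the front. The sketch below assumes the same kind of 2-byte big-endian length prefix as an example protocol; struct rx_buf, rx_fill and rx_pop are hypothetical names, and payloads are assumed to fit in the 4096-byte buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Accumulates raw bytes from the stream until whole frames are present. */
struct rx_buf {
    unsigned char data[4096];
    size_t used;
};

/* One recv into the free space. Returns recv's result unchanged. */
static ssize_t rx_fill(struct rx_buf *rb, int fd)
{
    assert(rb != NULL && rb->used < sizeof rb->data);
    ssize_t n = recv(fd, rb->data + rb->used, sizeof rb->data - rb->used, 0);
    if (n > 0)
        rb->used += (size_t)n;
    return n;
}

/* If a complete frame sits at the front, copy its payload to out, remove
   the frame, and return the payload length. Returns -1 if the frame is
   still incomplete or its payload does not fit in cap. */
static long rx_pop(struct rx_buf *rb, unsigned char *out, size_t cap)
{
    if (rb->used < 2)
        return -1;                              /* header not complete */
    size_t len = ((size_t)rb->data[0] << 8) | rb->data[1];
    if (rb->used < 2 + len || len > cap)
        return -1;                              /* body not complete */
    memcpy(out, rb->data + 2, len);
    /* Shift any bytes belonging to later messages to the front. */
    memmove(rb->data, rb->data + 2 + len, rb->used - 2 - len);
    rb->used -= 2 + len;
    return (long)len;
}
```

A typical loop would call rx_pop until it returns -1, then call rx_fill once and repeat, so one recv can deliver several consolidated messages and a fragmented message is simply completed by a later recv.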

These reliability issues are often ignored. They manifest less often on a single host or on a fast, reliable LAN with fewer routers and switches involved, and with fewer or non-concurrent messages. The same code may then break under load or over more complex networks.

Furbelow answered 6/12, 2010 at 2:13 Comment(0)
0
  1. If the recv() returns fewer than 3000 bytes, then you can assume that the read buffer was empty. If it returns 3000 bytes in your 3000 byte buffer, then you'd better know whether to continue. Most protocols include some variation on TLV - type, length, value. Each message contains an indicator of the type of message, some length (possibly implied by the type if the length is fixed), and the value. If, on reading through the data you did receive, you find that the last unit is incomplete, you can assume there is more to be read. You can also make the socket into a non-blocking socket; then the recv() will fail with EAGAIN or EWOULDBLOCK if there is no data ready for reading.

  2. The recv() function returns the number of bytes read.

Prepossess answered 6/12, 2010 at 1:44 Comment(2)
Not correct. You can assume that the receive buffer was emptied by that read, but you can't assume that data hasn't subsequently arrived in it by the time you are next ready to call recv(). - Materialism
@EJP: the 'wrong' is a very strong statement - it seems obvious to me that you can't tell whether data has arrived since your recv() call, but maybe that level of basic, obvious statement does need to be pointed out. - Prepossess
0

ioctl() with the FIONREAD option tells you how much data can currently be read without blocking.
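A rough sketch of the FIONREAD query is shown below. bytes_available and recv_available are hypothetical helper names; the example assumes FIONREAD is declared by <sys/ioctl.h>, which holds on Linux and the BSDs, while some other systems may need a different header such as <sys/filio.h>.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Number of bytes that can currently be read without blocking,
   or -1 if the ioctl fails. */
static int bytes_available(int fd)
{
    int n = 0;
    assert(fd >= 0);
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}

/* Read whatever is already queued (up to cap) without waiting.
   Returns bytes read, 0 if nothing is queued, -1 on error.
   Note: 0 here does not distinguish "no data yet" from "peer closed". */
static ssize_t recv_available(int fd, void *buf, size_t cap)
{
    int avail = bytes_available(fd);
    if (avail < 0)
        return -1;
    if (avail == 0)
        return 0;
    size_t want = (size_t)avail < cap ? (size_t)avail : cap;
    return recv(fd, buf, want, 0);
}
```

The count is only a snapshot: more data can arrive between the ioctl and the recv, so it is a hint for sizing a read rather than a statement that the message is complete.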

Materialism answered 13/2, 2011 at 3:4 Comment(2)
The ioctl() function is not really a part of the POSIX standard, though it shows up as an obsolescent interface in the STREAMS part of the Single UNIX Specification (see ioctl()). It is in fact available on most UNIX-derived platforms, but it is rather platform specific. - Prepossess
@Jonathan Leffler: agreed (not that the OP mentioned POSIX). FIONREAD or its variants are widely enough supported that Java can provide available() on sockets on all its platforms. - Materialism
