How large should my recv buffer be when calling recv in the socket library

6

160

I have a few questions about the socket library in C. Here is a snippet of code I'll refer to in my questions.

char recv_buffer[3000];
recv(socket, recv_buffer, 3000, 0);
  1. How do I decide how big to make recv_buffer? I'm using 3000, but it's arbitrary.
  2. What happens if recv() receives a packet bigger than my buffer?
  3. How can I know whether I have received the entire message without calling recv() again and having it wait forever when there is nothing left to receive?
  4. Is there a way to make a buffer that doesn't have a fixed amount of space, so that I can keep adding to it without fear of running out of space? Maybe using strcat to concatenate the latest recv() response to the buffer?

I know it's a lot of questions in one, but I would greatly appreciate any responses.

Dyadic answered 19/5, 2010 at 0:19 Comment(0)
269

The answers to these questions vary depending on whether you are using a stream socket (SOCK_STREAM) or a datagram socket (SOCK_DGRAM) - within TCP/IP, the former corresponds to TCP and the latter to UDP.

How do you know how big to make the buffer passed to recv()?

  • SOCK_STREAM: It doesn't really matter too much. If your protocol is a transactional / interactive one just pick a size that can hold the largest individual message / command you would reasonably expect (3000 is likely fine). If your protocol is transferring bulk data, then larger buffers can be more efficient - a good rule of thumb is around the same as the kernel receive buffer size of the socket (often something around 256kB).

  • SOCK_DGRAM: Use a buffer large enough to hold the biggest packet that your application-level protocol ever sends. If you're using UDP, then in general your application-level protocol shouldn't be sending packets larger than about 1400 bytes, because they'll certainly need to be fragmented and reassembled.
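
If you want to follow the "about the size of the kernel receive buffer" rule of thumb for bulk transfers, you can ask the socket what that size is. This is only a minimal sketch; `kernel_rcvbuf_size` is a made-up helper name, and the value reported is platform-dependent.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Returns the kernel receive buffer size of `fd` in bytes, or -1 on error.
 * `kernel_rcvbuf_size` is a hypothetical helper, not a library function. */
int kernel_rcvbuf_size(int fd)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) == -1)
        return -1;
    return size;
}
```

You could then malloc() a buffer of that size once and reuse it for every recv() call on the connection.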

What happens if recv gets a packet larger than the buffer?

  • SOCK_STREAM: The question doesn't really make sense as put, because stream sockets don't have a concept of packets - they're just a continuous stream of bytes. If there are more bytes available to read than your buffer has room for, then they'll be queued by the OS and available for your next call to recv.

  • SOCK_DGRAM: The excess bytes are discarded.
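
The difference is easy to see on a local socket pair. The sketch below is my own illustration (the function name `small_buffer_demo` is hypothetical, and it assumes a platform that has socketpair() and MSG_DONTWAIT, such as Linux): it sends one 10-byte message, reads it with a 4-byte buffer, and then tries a second, non-blocking recv().

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends the 10-byte message "0123456789" over a local socket pair of the
 * given type (SOCK_STREAM or SOCK_DGRAM), reads it with a 4-byte buffer,
 * then calls recv() again with MSG_DONTWAIT so it cannot block.
 * Stores both return values; returns 0 on success, -1 on setup failure. */
int small_buffer_demo(int type, ssize_t *first, ssize_t *second)
{
    int sv[2];
    char buf[4];

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;
    if (send(sv[0], "0123456789", 10, 0) != 10) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    *first = recv(sv[1], buf, sizeof buf, 0);
    *second = recv(sv[1], buf, sizeof buf, MSG_DONTWAIT);
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

With SOCK_STREAM both calls should return 4, because the unread bytes stay queued. With SOCK_DGRAM the first call returns 4 and the second returns -1, because the other 6 bytes of the datagram were thrown away and nothing else is waiting.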

How can I know if I have received the entire message?

  • SOCK_STREAM: You need to build some way of determining the end-of-message into your application-level protocol. Commonly this is either a length prefix (starting each message with the length of the message) or an end-of-message delimiter (which might just be a newline in a text-based protocol, for example). A third, lesser-used, option is to mandate a fixed size for each message. Combinations of these options are also possible - for example, a fixed-size header that includes a length value.

  • SOCK_DGRAM: A single recv call always returns a single datagram.
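
As an illustration of the length-prefix option for stream sockets, here is a rough sketch. It assumes a made-up wire format (a 4-byte big-endian length followed by the payload); `recv_all` and `recv_message` are hypothetical helper names, and any recv() failure, including EINTR, is simply treated as an error.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Reads exactly `len` bytes, looping because recv() may return fewer.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Reads one message framed as a 4-byte big-endian length plus payload
 * (an assumed format for this example). Returns a malloc'd buffer that the
 * caller frees and stores the payload length in *out_len. Returns NULL on
 * error or if the announced length exceeds `max_len`. */
char *recv_message(int fd, uint32_t max_len, uint32_t *out_len)
{
    uint32_t net_len, len;
    char *msg;

    if (recv_all(fd, &net_len, sizeof net_len) == -1)
        return NULL;
    len = ntohl(net_len);
    if (len > max_len)
        return NULL;
    msg = malloc(len ? len : 1);
    if (msg == NULL)
        return NULL;
    if (recv_all(fd, msg, len) == -1) {
        free(msg);
        return NULL;
    }
    *out_len = len;
    return msg;
}
```

The `max_len` check is there so that a corrupt or hostile length field cannot make the receiver allocate an arbitrary amount of memory.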

Is there a way I can make a buffer not have a fixed amount of space, so that I can keep adding to it without fear of running out of space?

No. However, you can try to resize the buffer using realloc() (if it was originally allocated with malloc() or calloc(), that is).
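
A realloc()-based receive loop for a delimiter-terminated protocol might look roughly like the sketch below. It assumes the "\r\n\r\n" terminator mentioned in the comments; `find_terminator` and `recv_until_blank_line` are hypothetical names, and the starting capacity of 512 is arbitrary. The search works on a byte count rather than relying on a nul-terminator, so it is safe on raw recv() data, and it also copes with the terminator arriving split across two recv() calls.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns the offset just past the first "\r\n\r\n" in buf[0..len),
 * or 0 if the terminator is not there yet. */
size_t find_terminator(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}

/* Receives until "\r\n\r\n" has been seen, doubling the buffer with
 * realloc() whenever it fills up. Returns a malloc'd, nul-terminated buffer
 * (caller frees) and stores the byte count in *out_len. Returns NULL on
 * error, on EOF before the terminator, or once the capacity reaches max_len. */
char *recv_until_blank_line(int fd, size_t max_len, size_t *out_len)
{
    size_t cap = 512, used = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    for (;;) {
        ssize_t n;

        if (used + 1 == cap) {          /* full: one byte is kept for '\0' */
            char *tmp;

            if (cap >= max_len)
                break;
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL)
                break;
            buf = tmp;
        }
        n = recv(fd, buf + used, cap - used - 1, 0);
        if (n <= 0)
            break;
        used += (size_t)n;
        if (find_terminator(buf, used) != 0) {
            buf[used] = '\0';
            *out_len = used;
            return buf;
        }
    }
    free(buf);
    return NULL;
}
```

Two simplifications to be aware of: the buffer is rescanned from the start after every recv(), and any bytes that arrive after the terminator in the same recv() stay in the returned buffer, so a real protocol handler would need to carry them over to the next message.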

Parthenogenesis answered 19/5, 2010 at 0:53 Comment(14)
I have "\r\n\r\n" at the end of each message in the protocol I'm using. I call recv in a do-while loop, placing the message at the beginning of recv_buffer, and my loop condition is while (!strstr(recv_buffer, "\r\n\r\n")). My question is: is it possible for one recv to get "\r\n" and the next recv to get the other "\r\n", so that the terminator is never found? - Dyadic
Yes, it is. You can solve that problem by looping around if you don't have a complete message and stuffing the bytes from the next recv into the buffer following the partial message. You shouldn't use strstr() on the raw buffer filled by recv() - there's no guarantee that it contains a nul-terminator, so it might cause strstr() to crash. - Parthenogenesis
In the case of UDP, there is nothing wrong with sending UDP packets above 1400 bytes. Fragmentation is perfectly legal and a fundamental part of the IP protocol (even in IPv6, although there the initial sender must always perform the fragmentation). For UDP you are always safe if you use a buffer of 64 KB, since no IP packet (v4 or v6) can be above 64 KB in size (not even when fragmented), and this even includes the headers IIRC, so the data will always be below 64 KB. - Pavid
@Parthenogenesis do you need to empty the buffer on each call to recv()? I've seen code that loops, collects data, and loops again to collect more. But if the buffer ever gets full, don't you need to empty it to avoid a memory violation from writing past the memory allocated for the buffer? - Disfeature
@Alex_Nabu: You don't need to empty it as long as there's some space remaining in it, and you don't tell recv() to write more bytes than there is space remaining. - Parthenogenesis
@Parthenogenesis I don't understand. Then what happens if the aggregate data you are receiving is larger than the buffer? What happens if I set my buffer to 1024 and fill up all 1024 bytes, but there is still more data for recv() to give me? What happens to the buffer? Does it overwrite the data already in it, or what? - Disfeature
@Alex_Nabu: You pass recv() a pointer to a buffer and a length, and it will write up to that many bytes to the buffer. If you have a buffer of length 1024, there are more than 1024 bytes to receive, and you call recv(fd, buffer, 1024, flags), then it will write 1024 bytes to your buffer and return the value 1024. The remaining bytes will be available to read the next time you call recv(). - Parthenogenesis
Hello, I am using a SOCK_STREAM Unix socket. If the remote peer sends 3 bytes, say 'abc', and recv only receives 2 of them, and then the remote peer sends another 3 bytes, say 'def', and I recv 2 bytes again, will I get 'de' rather than 'cd'? What happens on the kernel side, and where is the 'c' now? - Whisler
@hylepo: No, that's not how it works. If the peer sends "abc" but recv() returns 2, then the "c" will remain in the socket buffer to be returned by the next recv(). If the peer then sends "def", the next recv() can return "cdef". - Parthenogenesis
@Parthenogenesis Help me understand what would happen if I receive a SOCK_STREAM transfer larger than the max value of /proc/sys/net/ipv4/tcp_rmem. Assuming the min, default and max values of the Linux receive buffer are 1, 2 and 4, and the sending side sends a TCP stream of 10 bytes, will the receiving side drop the remaining 6 bytes? - Serialize
@Viren: The receiving side will not advertise a receive window larger than the buffer space it has remaining, so the sending side will not send more than that. - Parthenogenesis
@Parthenogenesis can you help me understand this then: https://mcmap.net/q/152148/-how-does-a-linux-socket-buffer-overflow (the first answer, with 8 upvotes). According to that answer, the kernel will drop the message. - Serialize
@Viren: That question is referring to a multicast socket which means it is UDP (SOCK_DGRAM), not TCP. There is no built-in flow control for UDP. - Parthenogenesis
@Parthenogenesis Very well. Thanks for your patience with me. By the way, I found this article about the TCP receive window; attaching it in case it is helpful to someone else: [blog.performancevision.com/tcp-receive-windows](http://… - Serialize
18

For streaming protocols such as TCP, you can pretty much set your buffer to any size. That said, common values that are powers of 2 such as 4096 or 8192 are recommended.

If there is more data than your buffer can hold, it will simply stay queued in the kernel for your next call to recv.

Yes, you can keep growing your buffer. To recv into the middle of the buffer, starting at offset idx, you would do:

recv(socket, recv_buffer + idx, recv_buffer_size - idx, 0);
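
Put into a loop, the offset trick could look something like this sketch (`recv_fill` is a hypothetical helper name). It keeps appending at offset idx and never asks recv() for more bytes than the space that is left, so it cannot write past the end of the buffer.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Appends incoming bytes to `buf` until it is full or the peer closes the
 * connection. Returns the total number of bytes stored, or -1 if recv()
 * reports an error. */
ssize_t recv_fill(int fd, char *buf, size_t buf_size)
{
    size_t idx = 0;

    while (idx < buf_size) {
        ssize_t n = recv(fd, buf + idx, buf_size - idx, 0);

        if (n == -1)
            return -1;
        if (n == 0)             /* orderly shutdown by the peer */
            break;
        idx += (size_t)n;
    }
    return (ssize_t)idx;
}
```

Anything that did not fit stays queued in the kernel, so a later call with a fresh (or enlarged) buffer picks up exactly where this one stopped.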
Noonan answered 19/5, 2010 at 0:28 Comment(2)
A power of two can be more efficient in multiple ways, and is strongly suggested. - Smiga
Elaborating on @theatrus, a notable efficiency is that, for non-negative x, the modulo operator can be replaced by a bitwise AND with a mask (e.g. x % 1024 == x & 1023), and integer division can be replaced by a right shift (e.g. x / 1024 == x / 2^10 == x >> 10). - Edholm
15

If you have a SOCK_STREAM socket, recv just gets "up to the first 3000 bytes" from the stream. There is no clear guidance on how big to make the buffer: the only time you know how big a stream is, is when it's all done;-).

If you have a SOCK_DGRAM socket and the datagram is larger than the buffer, recv fills the buffer with the first part of the datagram and the rest is discarded. On POSIX systems recv still returns the number of bytes it stored; on Winsock it returns SOCKET_ERROR with the error WSAEMSGSIZE. Either way, if the protocol is UDP, the rest of the datagram is lost -- part of why UDP is called an unreliable protocol (I know that there are reliable datagram protocols but they aren't very popular -- I couldn't name one in the TCP/IP family, despite knowing the latter pretty well;-).

To grow a buffer dynamically, allocate it initially with malloc and use realloc as needed. But that won't help you with recv from a UDP source, alas.

Brei answered 19/5, 2010 at 0:28 Comment(1)
As UDP always returns at most one UDP packet (even if multiple are in the socket buffer) and no UDP packet can be above 64 KB (an IP packet may be at most 64 KB, even when fragmented), using a 64 KB buffer is absolutely safe and guarantees that you never lose any data during a recv on a UDP socket. - Pavid
11

For a SOCK_STREAM socket, the buffer size does not really matter, because you are just pulling some of the waiting bytes and can retrieve more in the next call. Just pick whatever buffer size you can afford.

For a SOCK_DGRAM socket, you will get the part of the waiting message that fits, and the rest will be discarded. On Linux you can get the size of the waiting datagram with the following ioctl:

#include <sys/ioctl.h>
int size;
ioctl(sockfd, FIONREAD, &size);

Alternatively you can use MSG_PEEK and MSG_TRUNC flags of the recv() call to obtain the waiting datagram size.

ssize_t size = recv(sockfd, buf, len, MSG_PEEK | MSG_TRUNC);

MSG_PEEK leaves the waiting message in the receive queue instead of consuming it. MSG_TRUNC, when passed as a recv() flag on Linux, makes recv() return the real length of the datagram even if it is longer than the buffer, so a small buf is enough for this probe.

Then you can just malloc(size) the real buffer and recv() the datagram.
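
Put together, the peek-then-allocate approach could be sketched as follows. This is Linux-specific, since it relies on MSG_TRUNC being accepted as an input flag to recv(), and both function names are hypothetical.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Linux-specific: returns the size of the next pending datagram without
 * removing it from the queue, or -1 on error. Blocks until one arrives. */
ssize_t next_datagram_size(int fd)
{
    char probe;

    return recv(fd, &probe, sizeof probe, MSG_PEEK | MSG_TRUNC);
}

/* Receives the next datagram into an exactly sized malloc'd buffer.
 * The caller frees it; *out_len receives the datagram length.
 * Returns NULL on error. */
char *recv_datagram_exact(int fd, size_t *out_len)
{
    ssize_t size = next_datagram_size(fd);
    ssize_t n;
    char *buf;

    if (size == -1)
        return NULL;
    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return NULL;
    n = recv(fd, buf, (size_t)size, 0);   /* this call consumes the datagram */
    if (n == -1) {
        free(buf);
        return NULL;
    }
    *out_len = (size_t)n;
    return buf;
}
```

The cost is two system calls per datagram, so this mostly makes sense when datagrams vary a lot in size and a fixed worst-case buffer would be wasteful.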

Palla answered 9/8, 2016 at 18:32 Comment(9)
MSG_PEEK|MSG_TRUNC makes no sense. - Tokenism
You want MSG_PEEK to peek (not receive) the waiting message, to obtain its size (recv returns the real, not truncated size) and you need MSG_TRUNC to not overflow your current buffer. Once you get the size you allocate the correct buffer and receive (not peek, not truncate) the waiting message. - Palla
@Alex Martelli says 64KB is the max size of a UDP packet, so if we malloc() a buffer of 64KB then MSG_TRUNC is unnecessary? - Tripp
The IP protocol supports fragmentation, so the datagram may be larger than a single packet - it will be fragmented and transmitted in multiple packets. Also, SOCK_DGRAM is not only UDP. - Palla
Note that for UDP, ioctl(FIONREAD) returns the total bytes of all queued datagrams, not the byte size of the next datagram. - Brickbat
@RemyLebeau See man7.org/linux/man-pages/man7/udp.7.html: "FIONREAD (SIOCINQ) Gets a pointer to an integer as argument. Returns the size of the next pending datagram in the integer in bytes, or 0 when no datagram is pending." - Palla
@Palla On Linux, yes. But not on most other platforms. The behavior is platform-dependent. - Brickbat
@RemyLebeau Thanks for the clarification. :-) Anyway, allocating a buffer of the returned size will be enough to fit the datagram, right? - Palla
@Palla Yes, though you might end up over-allocating, that's all. - Brickbat
1

There is no absolute answer to your question, because the details are implementation-specific. I am assuming you are communicating over UDP, because the incoming buffer size does not cause problems for TCP communication.

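According to RFC 768, the UDP length field (header included) is 16 bits, so a datagram can be between 8 and 65,535 bytes long. Over IPv4 the 20-byte IP header must also fit into a 65,535-byte packet, which leaves at most 65,535 - 20 - 8 = 65,507 bytes of payload. So the fail-proof size for an incoming buffer is 65,507 bytes (~64 KB).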
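
Spelled out as constants, the arithmetic behind that number is the 65,535-byte IPv4 packet limit minus a 20-byte IPv4 header (no options) minus the 8-byte UDP header. The names below are made up for this sketch, and the figure applies to UDP over IPv4 only.

```c
#include <sys/socket.h>
#include <sys/types.h>

enum {
    MAX_IPV4_PACKET = 65535, /* 16-bit total-length field of the IPv4 header */
    IPV4_HEADER_LEN = 20,    /* IPv4 header without options */
    UDP_HEADER_LEN  = 8,
    UDP_MAX_PAYLOAD = MAX_IPV4_PACKET - IPV4_HEADER_LEN - UDP_HEADER_LEN
};

/* Receives one datagram. `buf` must point to at least UDP_MAX_PAYLOAD bytes,
 * which no UDP-over-IPv4 payload can exceed, so nothing is truncated. */
ssize_t recv_any_udp(int fd, char *buf)
{
    return recv(fd, buf, UDP_MAX_PAYLOAD, 0);
}
```

A buffer this large is best allocated once (statically or with malloc()) rather than placed on the stack of a deeply nested function.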

However, not all large packets can be properly routed by network devices; refer to these existing discussions for more information:

What is the optimal size of a UDP packet for maximum throughput?
What is the largest Safe UDP Packet Size on the Internet

Newness answered 19/5, 2010 at 1:22 Comment(0)
-4

16 KB is about right; if you're using gigabit Ethernet with jumbo frames, each packet could be about 9 KB in size.

Shel answered 19/5, 2010 at 0:46 Comment(1)
TCP sockets are streams, which means a recv may return data accumulated from multiple packets, so the packet size is totally irrelevant for TCP. In the case of UDP, each recv call returns at most a single UDP packet, so here the packet size is relevant, but the correct maximum is about 64 KB, as a UDP packet may (and often will) be fragmented if required. However, no IP packet can be above 64 KB, not even with fragmentation, so recv on a UDP socket can return at most 64 KB (and whatever is not returned is discarded for the current packet!) - Pavid
