Jon Skeet's answer unfortunately leaves out a big part of the picture: the sizes of the send and receive buffers, and the bandwidth-delay product (BDP) of the pipe you're writing to.
If you are trying to send data over a large pipe using a single socket, and you want TCP to fill that pipe, you need to use send and receive buffer sizes that are each at least as large as the bandwidth-delay product of the pipe. Otherwise, TCP will not fill the pipe, because it won't keep enough 'bytes in flight' at all times.
TCP handles packet loss for you, which means it has to buffer the data you give it until it can confirm that data has been received correctly by the other side (via a TCP ACK). No buffer is infinite, so there has to be a limit somewhere. That limit is up to you: you can choose whatever you want, but you need to make sure it is large enough to cover the connection's BDP.
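For concreteness, here's roughly what asking for bigger buffers looks like (a quick sketch using Python's standard socket module; the 2.5 MB constant is just the example BDP worked out further down, and the OS is free to adjust whatever you request):

```python
import socket

# Hypothetical target: the 2.5 MB BDP of the example 1 gbit/sec, 20 ms RTT
# pipe discussed below.
BDP_BYTES = 2_500_000

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Ask for send/receive buffers big enough to cover the BDP. Setting
# SO_RCVBUF before connect() matters on some stacks, because the TCP
# window scale is negotiated during the handshake.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP_BYTES)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP_BYTES)

# The OS may round, double (Linux does), or clamp these values, so read
# them back to see what you actually got.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```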
Consider a TCP socket with a send buffer of exactly 1 byte, trying to send data over a connection with a bitrate of 1 gbit/sec and a one-way latency of 1 ms.
- You give the TCP socket your first byte.
- The socket blocks any further write calls (the send buffer is full).
- TCP sends the one byte. A gigabit ethernet adapter puts a byte on the wire in 8 ns, so the transmission time is negligible.
- 1 millisecond later, the receiver gets the 1 byte.
- 1 millisecond later, you get an ack back from the receiver.
- TCP removes the first byte from the send buffer, because it has confirmed the receiver correctly got the byte.
- The blocked write call unblocks, because the send buffer has room again.
- You give the TCP socket your second byte...
- and so on.
How fast is this connection getting data across? It takes 2 milliseconds to send 1 byte, so this connection manages 500 bytes/sec == 4 kbit/sec.
Yikes.
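If you want to sanity-check that arithmetic, it's a couple of lines (nothing socket-specific here, just the numbers from the thought experiment above):

```python
bytes_in_flight = 1      # the absurd 1-byte send buffer
rtt_seconds = 0.002      # 1 ms to the receiver + 1 ms for the ACK to come back

throughput = bytes_in_flight / rtt_seconds
print(throughput)        # 500.0 bytes/sec
print(throughput * 8)    # 4000.0 bits/sec == 4 kbit/sec
```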
Now consider a connection that has a speed of 1 gigabit/sec and a one-way latency of 10 milliseconds, on average. The round-trip time (i.e., the time that elapses between your socket sending a packet and receiving the ACK for that packet, at which point it knows it can send more data) is usually twice the one-way latency.
So if you have a 1 gigabit/sec connection and an RTT of 20 milliseconds, then that pipe has 1 gigabit/sec * 20 milliseconds == 2.5 megabytes of data in flight at all times if it's being utilized completely.
If your TCP send buffer is anything less than 2.5 megabytes, then that one socket will never fully utilize the pipe - you'll never get a gigabit/sec of performance out of your socket.
If your application uses many sockets, then the aggregate size of all the TCP send buffers must be at least 2.5 MB in order to fully utilize this hypothetical 1 gigabit / 20 ms RTT pipe. For instance, if you use 8192-byte buffers, you need 306 simultaneous TCP sockets to fill that pipe.
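Here's that arithmetic spelled out (again just a sketch; the 8192-byte buffer size is only the example figure above):

```python
import math

bandwidth_bits_per_sec = 1_000_000_000   # 1 gigabit/sec
rtt_seconds = 0.020                      # 20 ms round trip

bdp_bytes = bandwidth_bits_per_sec * rtt_seconds / 8
print(bdp_bytes)                         # 2500000.0 bytes == 2.5 MB

# With 8192-byte send buffers, how many sockets does it take to keep
# that much data in flight at once?
print(math.ceil(bdp_bytes / 8192))       # 306
```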
Edit for questions:
Calculating the BDP is just multiplying the bandwidth by the round-trip delay and paying attention to units.
So if you have a 1 gigabit/sec connection and a round-trip time of 20 msec, you're multiplying bits/sec * seconds, so the seconds cancel out and you're left with bits. Convert to bytes and you have your buffer size.
- 1 gbit/sec * 20 msec == 1 gbit/sec * 0.020 sec == 0.020 gbit
- 0.020 gbit == 20 Mbit.
- 20 Mbit * 1 byte / 8 bits == 20/8 MBytes == 2.5 MBytes.
And thus, our TCP buffer needs to be set to 2.5 MB to saturate this made-up pipe.
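If it's useful, that unit bookkeeping fits in a tiny helper (a sketch; the function name and parameters are my own, not any standard API):

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Bandwidth-delay product: bits/sec * sec leaves bits; divide by 8 for bytes."""
    return bandwidth_bits_per_sec * rtt_seconds / 8

# The made-up pipe from above: 1 gbit/sec bandwidth, 20 ms RTT.
print(bdp_bytes(1e9, 0.020))   # 2500000.0 bytes == 2.5 MBytes
```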