What would cause UDP packets to be dropped when being sent to localhost?

I'm sending very large (64000 byte) datagrams. I realize that the MTU is much smaller than 64000 bytes (a typical value is around 1500 bytes, from my reading), but I would expect one of two things to happen: either no datagrams would make it through (everything greater than 1500 bytes would be silently dropped or would cause an error/exception to be thrown), or each 64000 byte datagram would be fragmented into about 43 messages of roughly 1500 bytes each and transmitted transparently.

Over a long run (2000+ 64000 byte datagrams), about 1% of the datagrams get dropped, which seems abnormally high even for a LAN. I might expect this over a network, where datagrams can arrive out of order, get dropped, or get filtered. However, I did not expect this when running on localhost.

What is causing the inability to send/receive data locally? I realize UDP is unreliable, but I didn't expect it to be so unreliable on localhost. I'm wondering if it's just a timing issue since both the sending and receiving components are on the same machine.

For completeness, I've included the code to send/receive datagrams.

Sending:

DatagramSocket socket = new DatagramSocket(senderPort);

int valueToSend = 0;

while (valueToSend < valuesToSend || valuesToSend == -1) {
    byte[] intBytes = intToBytes(valueToSend);

    // Pad the payload to exactly bufferSize bytes: a 4-byte sequence
    // number followed by (bufferSize - 4) zero bytes.
    byte[] buffer = new byte[bufferSize - 4];
    byte[] bytesToSend = concatAll(intBytes, buffer);

    System.out.println("Sending " + valueToSend + " as " + bytesToSend.length + " bytes");

    DatagramPacket packet = new DatagramPacket(bytesToSend,
                        bufferSize, receiverAddress, receiverPort);

    socket.send(packet);

    Thread.sleep(delay);

    valueToSend++;
}

Receiving:

DatagramSocket socket = new DatagramSocket(receiverPort);

while (true) {
    DatagramPacket packet = new DatagramPacket(
            new byte[bufferSize], bufferSize);

    System.out.println("Waiting for datagram...");
    socket.receive(packet);

    int receivedValue = bytesToInt(packet.getData(), 0);

    System.out.println("Received: " + receivedValue
            + ". Expected: " + expectedValue);

    if (receivedValue == expectedValue) {
        receivedDatagrams++;
        totalDatagrams++;
    }
    else {
        // A gap in the sequence numbers means every datagram in the
        // gap was lost; the current datagram itself still arrived.
        // (Out-of-order arrivals would also land here; they are rare
        // on loopback.)
        droppedDatagrams += receivedValue - expectedValue;
        receivedDatagrams++;
        totalDatagrams += receivedValue - expectedValue + 1;
    }

    expectedValue = receivedValue + 1;
    System.out.println("Expected Datagrams: " + totalDatagrams);
    System.out.println("Received Datagrams: " + receivedDatagrams);
    System.out.println("Dropped Datagrams: " + droppedDatagrams);
    System.out.println("Received: "
            + ((double) receivedDatagrams / totalDatagrams));
    System.out.println("Dropped: "
            + ((double) droppedDatagrams / totalDatagrams));
    System.out.println();
}
Enticement answered 1/11, 2011 at 15:12 Comment(4)
Could you be reaching an internal OS buffer limit? The OS will only keep so much data before dropping packets.Postmortem
I've run into this exact same scenario and thought the same thing, that I shouldn't really ever lose packets over localhost. Unfortunately, it does happen. You can easily create the situation by creating a simple UDP broadcaster that rapidly broadcasts a 512 byte message a thousand times. Create a simple UDP client that will receive the messages and check the counts... you will most certainly lose messages.Technics
@Randy With my most recent experimentation, it appears to only be because of the buffer size. Increasing the buffer size eliminated the problem entirely, at least under the conditions that I'm running.Enticement
@ThomasOwens Thanks for your response. I ran a few tests and to my amazement, I was able to send over 100,000 512 byte messages without a single lost packet over localhost by increasing the buffer size to 4 megabytes (1024 * 4096 bytes)! Here's a snippet of the UDP message receiver: UdpClient listener = (UdpClient)ar.AsyncState; listener.Client.ReceiveBufferSize = 1024 * 4096; Byte[] receiveBytes = listener.EndReceive(ar, ref ipEndpoint); string receiveString = Encoding.ASCII.GetString(receiveBytes);Technics

Overview

What is causing the inability to send/receive data locally?

Mostly buffer space. Imagine sending a constant 10MB/second while only able to consume 5MB/second. The operating system and network stack can't keep up, so packets are dropped. (This differs from TCP, which provides flow control and re-transmission to handle such a situation.)

Even when data is consumed without overflowing buffers, there might be small time slices where data cannot be consumed, so the system will drop packets. (Such as during garbage collection, or when the OS task switches to a higher-priority process momentarily, and so forth.)

This applies to every device along the network path. On a non-local network, Ethernet switches, routers, hubs, and other hardware will also drop packets when their queues are full. Sending a 10MB/s stream through a 100MB/s Ethernet switch while someone else tries to cram 100MB/s through the same physical line will cause dropped packets.

Increase both the application's socket buffer size and the operating system's socket buffer limits.
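
In Java, for example, the per-socket buffers can be requested with setSendBufferSize and setReceiveBufferSize. The operating system may silently clamp the request to its configured maximum (such as net.core.rmem_max on Linux), so read the value back to verify. A minimal sketch (the port number is an arbitrary placeholder):

import java.net.DatagramSocket;

DatagramSocket socket = new DatagramSocket(4445); // placeholder port

// Request 4 MB buffers; the kernel may silently clamp the request
// to its configured maximum (net.core.rmem_max / wmem_max on Linux).
socket.setReceiveBufferSize(4 * 1024 * 1024);
socket.setSendBufferSize(4 * 1024 * 1024);

// Read back the sizes the operating system actually granted.
System.out.println("Receive buffer: " + socket.getReceiveBufferSize());
System.out.println("Send buffer: " + socket.getSendBufferSize());

On Linux, requests above net.core.rmem_max are silently reduced, which is why the kernel settings below matter.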

Linux

The default socket buffer size is typically 128k or less, which leaves very little room for pausing the data processing.

sysctl

Use sysctl to increase the transmit (write memory [wmem]) and receive (read memory [rmem]) buffers:

  • net.core.wmem_max
  • net.core.wmem_default
  • net.core.rmem_max
  • net.core.rmem_default

For example, to bump the value to 8 megabytes:

sysctl -w net.core.rmem_max=8388608

To make the setting persist across reboots, update /etc/sysctl.conf as well (and reload it with sysctl -p), such as:

net.core.rmem_max=8388608

An in-depth article on tuning the network stack dives into far more detail, touching on multiple levels of how packets are received and processed in Linux, from the kernel's network driver through ring buffers all the way to C's recv call. The article describes additional settings and files to monitor when diagnosing network issues.

Before making any of the following tweaks, be sure to understand how they affect the network stack. There is a real possibility of rendering your network unusable. Choose numbers appropriate for your system, network configuration, and expected traffic load:

  • net.core.rmem_max=8388608
  • net.core.rmem_default=8388608
  • net.core.wmem_max=8388608
  • net.core.wmem_default=8388608
  • net.ipv4.udp_mem='262144 327680 434274'
  • net.ipv4.udp_rmem_min=16384
  • net.ipv4.udp_wmem_min=16384
  • net.core.netdev_budget=600
  • net.ipv4.ip_early_demux=0
  • net.core.netdev_max_backlog=3000
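
To confirm from inside an application that these limits actually took effect, note that each sysctl above is exposed as a file under /proc/sys, with dots becoming slashes. A small sketch, assuming a Linux /proc layout and Java 11+:

import java.nio.file.Files;
import java.nio.file.Paths;

// Each sysctl maps to a /proc/sys path: net.core.rmem_max
// becomes /proc/sys/net/core/rmem_max.
String rmemMax = Files.readString(
        Paths.get("/proc/sys/net/core/rmem_max")).trim();
System.out.println("net.core.rmem_max = " + rmemMax + " bytes");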

ethtool

Additionally, ethtool is useful to query or change network settings. For example, if ${DEVICE} is eth0 (use ip address or ifconfig to determine your network device name), then it may be possible to increase the RX and TX ring buffers (check the current and maximum sizes first with ethtool -g ${DEVICE}) using:

  • ethtool -G ${DEVICE} rx 4096
  • ethtool -G ${DEVICE} tx 4096

iptables

By default, iptables tracks connection state for every packet, which consumes CPU time, albeit minimal. For example, you can disable connection tracking for UDP packets on port 6004 using:

iptables -t raw -I PREROUTING 1 -p udp --dport 6004 -j NOTRACK
iptables -I INPUT 1 -p udp --dport 6004 -j ACCEPT

Your particular port and protocol will vary.

Monitoring

Several files contain information about what is happening to network packets at various stages of sending and receiving. In the following list ${IRQ} is the interrupt request number and ${DEVICE} is the network device:

  • /proc/cpuinfo - shows number of CPUs available (helpful for IRQ-balancing)
  • /proc/irq/${IRQ}/smp-affinity - shows IRQ affinity
  • /proc/net/dev - contains general packet statistics
  • /sys/class/net/${DEVICE}/queues/QUEUE/rps_cpus - relates to Receive Packet Steering (RPS)
  • /proc/softirqs - shows how softirqs (such as NET_RX and NET_TX) are distributed across CPUs
  • /proc/net/softnet_stat - for packet statistics, such as drops, time squeezes, CPU collisions, etc.
  • /proc/sys/net/core/flow_limit_cpu_bitmap - enables per-CPU flow limits (can help prevent large flows from starving small flows)
  • /proc/net/snmp - protocol-level counters, including UDP datagram and buffer-error statistics
  • /proc/net/udp - per-socket UDP state, including a drops column
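
As an example of putting these files to use, the two Udp: lines in /proc/net/snmp (a header row of field names followed by a row of counters) include receive-buffer error counts on recent kernels, which correspond directly to the kind of drops discussed here. A quick way to dump them, assuming Linux:

import java.nio.file.Files;
import java.nio.file.Paths;

// Print the UDP counter rows; on recent kernels the fields include
// InDatagrams, InErrors, RcvbufErrors, and SndbufErrors.
for (String line : Files.readAllLines(Paths.get("/proc/net/snmp"))) {
    if (line.startsWith("Udp:")) {
        System.out.println(line);
    }
}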

Summary

Buffer space is the most likely culprit for dropped packets. There are numerous buffers strewn throughout the network stack, each having its own impact on sending and receiving packets. Network drivers, operating systems, kernel settings, and other factors can affect packet drops. There is no silver bullet.

Ronrona answered 1/11, 2011 at 15:37 Comment(2)
I believe Windows does not impose restrictions on the socket buffer size, so I'd try setting it to perhaps 2-5MB. (Note: sending datagrams that large is really not optimal on an Ethernet network; if just one of the fragments is lost, you lose the entire datagram.)Ronrona
Yeah - I'm aware that losing one fragment = losing the datagram. I raised that concern, but it was deemed acceptable for this particular application. If we lose a datagram, we can cope with it. And my searches turn up the same thing about Windows. This appears to mitigate (or at least reduce) the problem. I was even able to remove the delay between packet transmissions and only have a negligible loss of UDP datagrams (I sent almost 30000 datagrams before 1 was lost, thanks to the buffer sizes).Enticement

UDP packet scheduling may be handled by multiple threads at the OS level. That would explain why you can receive them out of order even on 127.0.0.1.

Unexampled answered 13/11, 2012 at 2:38 Comment(0)

Your expectations, as expressed in your question and in numerous comments to other answers, are wrong. All the following can happen even in the absence of routers and cables.

  1. If you send a packet to any receiver and there is no room in its socket receive buffer, it will get dropped (see the sketch after this list).

  2. If you send a UDP datagram larger than the path MTU it will get fragmented into smaller packets, which are subject to (1).

  3. If not all the fragments of a datagram arrive, the datagram never gets delivered.

  4. The TCP/IP stack has no obligation to deliver packets or UDP datagrams in order.
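
Point 1 is easy to reproduce on loopback: blast datagrams at a bound socket that isn't reading, then count what survived in its receive buffer. A self-contained sketch; the exact number received depends on your OS's default buffer size:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

public class LoopbackDropDemo {
    public static void main(String[] args) throws Exception {
        final int count = 10000;
        byte[] payload = new byte[512];

        DatagramSocket receiver = new DatagramSocket(0); // any free port
        DatagramSocket sender = new DatagramSocket();
        InetAddress loopback = InetAddress.getLoopbackAddress();

        // Send everything before the receiver reads anything, so the
        // receive buffer is the only place datagrams can wait.
        for (int i = 0; i < count; i++) {
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));
        }

        // Drain whatever fit in the receive buffer; the rest was dropped.
        receiver.setSoTimeout(100);
        int received = 0;
        try {
            while (true) {
                receiver.receive(new DatagramPacket(new byte[512], 512));
                received++;
            }
        } catch (SocketTimeoutException drained) {
        }

        System.out.println("Received " + received + " of " + count);
        sender.close();
        receiver.close();
    }
}

With default buffer sizes you will typically see only a fraction of the datagrams arrive; raising the receive buffer size, as described in the accepted answer, raises that fraction accordingly.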

Blim answered 1/11, 2011 at 22:37 Comment(1)
I knew that 2-4 were true, although my problem was caused by 1. I just happened to run into a case where the default buffer sizes were causing problems, managed to produce a small code sample that demonstrated it, and got a solution.Enticement

UDP packets are not guaranteed to reach their destination, whereas TCP guarantees delivery!

Onyx answered 1/11, 2011 at 15:23 Comment(9)
If I'm sending to 127.0.0.1, I'm not going over any kind of network. I would expect little to no loss under such conditions.Enticement
TCP and UDP are two different ways to send and receive packets. The network does not affect which approach you take. Please see this SO explanation and the skullbox link. #48403Onyx
I know all about the differences between TCP and UDP. My point is that if I send to 127.0.0.1, there should be 0 loss of data, regardless of the protocol used, unless some data loss is explicitly introduced on purpose. UDP data loss comes from the fact that it's a message-oriented protocol with no concept of packet sequence. Neither applies when running locally.Enticement
How can you say that it does not apply locally?Onyx
Because of the reasons why UDP packets get dropped. On the localhost loopback, no factors exist that would cause a packet to be dropped or to arrive out of order (in fact, only on networks with multiple paths between the two nodes can packets arrive out of order), unless you use a tool such as ipfw to drop packets or delay the arrival of specific packets.Enticement
I don't know that I can say for sure that even on localhost there exists no reason for packets to arrive out of order; I think even on localhost it would send them over your local network. Interesting... have you tried running this program with all wireless and wired connections disabled?Onyx
Physics says there's no reason for packets to arrive out of order. On any network, local or not, that has only a single path between the nodes, a packet A sent before a packet B will always arrive at the destination first. In addition, sending to 127.0.0.1 should not (and under everything that I've ever used, does not) rely on any network. To prove this, simply unplug all network cables and turn off all wireless radios - you can still connect to 127.0.0.1/localhost and send data.Enticement
Will try that. I think at the end of the day it depends on whether the receiving end has too much congestion to process the data, which is by nature what that protocol permits.Onyx
Actually, TCP packets are not 'guaranteed' to arrive; the only guarantee you can count on is that if they do arrive, they will be delivered in order to the application.Integrated

I don't know what makes you expect a dropped-packet rate of less than 1% for UDP.

That being said, based on RFC 1122 (see section 3.3.2), every host must be able to accept an IP datagram of at least 576 bytes, so that is the largest UDP datagram that can safely be assumed to get through in one piece. Larger UDP datagrams may be transmitted, but they will likely be split into multiple IP packets to be reassembled at the receiving endpoint.

I would imagine that a big contributor to the high drop rate you're seeing is that if any one IP packet belonging to a large UDP datagram is lost, the whole UDP datagram is lost. And you're counting UDP datagrams, not IP packets.
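
To put rough numbers on that amplification: at a 1500-byte MTU, a 64000-byte datagram becomes about 44 IP fragments (roughly 1480 payload bytes each after the 20-byte IP header), and losing any single fragment loses the whole datagram. A back-of-the-envelope sketch, where the 0.02% per-fragment loss rate is purely an illustrative assumption:

// ~1480 payload bytes fit in each 1500-byte fragment.
int fragments = (int) Math.ceil(64000.0 / 1480.0);   // 44
double perFragmentLoss = 0.0002;                     // assumed 0.02%

// The datagram survives only if every fragment survives.
double datagramLoss = 1.0 - Math.pow(1.0 - perFragmentLoss, fragments);

System.out.printf("Fragments per datagram: %d%n", fragments);
System.out.printf("Datagram loss rate: %.2f%%%n", datagramLoss * 100);
// Prints about 0.88% -- a tiny per-fragment loss rate turns into a
// roughly 1% datagram loss rate after fragmentation.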

Mast answered 1/11, 2011 at 15:35 Comment(1)
I would expect dropped packets...over a network. However, I'm sending to localhost. I expect no, or very few, IP packets to be dropped.Enticement
