how to find which packets got dropped
H

3

12

I'm getting thousands of dropped packets on a Broadcom network card:

eth1      Link encap:Ethernet  HWaddr 01:27:B0:14:DA:FE
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2746252626 errors:0 dropped:1151734 overruns:0 frame:0
          TX packets:4109502155 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:427998700000 (408171.3 Mb)  TX bytes:3530782240047 (3367216.3 Mb)
          Interrupt:40 Memory:d8000000-d8012700

Here is the installed version:

filename:       /lib/modules/2.6.27.54-0.2-default/kernel/drivers/net/bnx2.ko
version:        1.8.0
license:        GPL
description:    Broadcom NetXtreme II BCM5706/5708/5709 Driver

The packets get dropped in bursts of 500 to 5000 packets, several times an hour. The server (running Postgres) is running fine; the drops are just annoying.

After trying lots of different things, I'm asking: how can I find out where the packets came from and why they were dropped?

Helvetic answered 24/1, 2012 at 13:50 Comment(0)
E
4

(For the benefit of those that come to this via a search) I've seen the same problem (also with a bnx2 module, IIRC).

You might try turning off the irqbalance service. In my case, it completely stopped the problem.

Please also note that not so long ago, there were plenty of updates (RHEL 6) for irqbalance. Firmware updates should also be checked for both the main system and the Ethernet board(s).

We were seeing this only on a very large subnet with a very large amount of broadcast/multicast activity. We weren't seeing this on the same equipment on a less noisy -- but still very active -- part of the network.

Increasing the NIC's ring buffer size (ethtool -G) may also help. I know there were some sysctl alterations on that busy network...
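
To judge whether a change such as stopping irqbalance or resizing the ring buffer actually helps, it can be useful to log when the drop counter jumps. The following is a minimal sketch, not something I ran on the original system: it polls the standard sysfs counter `/sys/class/net/<iface>/statistics/rx_dropped`, and the function names, the 10 second interval and the threshold of 100 are my own assumptions.

```python
import time
from pathlib import Path


def read_rx_dropped(iface):
    """Read the kernel's rx_dropped counter for one interface from sysfs."""
    path = Path(f"/sys/class/net/{iface}/statistics/rx_dropped")
    return int(path.read_text())


def find_bursts(samples, threshold=100):
    """Return (timestamp, increase) for every interval in which the counter
    grew by at least `threshold`. `samples` is a list of (timestamp, counter)
    pairs in chronological order."""
    bursts = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        delta = cur - prev
        if delta >= threshold:
            bursts.append((ts, delta))
    return bursts


def watch(iface="eth1", interval=10, threshold=100):
    """Poll until interrupted and print a line whenever a burst is seen."""
    prev = read_rx_dropped(iface)
    while True:
        time.sleep(interval)
        cur = read_rx_dropped(iface)
        if cur - prev >= threshold:
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} {iface}: +{cur - prev} dropped")
        prev = cur
```

Calling `watch("eth1")` and comparing the timestamps with cron jobs, backups or broadcast storms on the subnet may narrow down what triggers a burst.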

Edva answered 28/3, 2014 at 10:28 Comment(0)
S
13

A dropped packet means that the buffer that is used to store the packet for forwarding/processing is full. The act of looking into the packet's data for information implies that you have the data to look at in the first place (which you don't, because there was no room to store it).

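A nice way around this, so you can see which traffic is being dropped, is to look through a capture of your traffic for signs of loss on the TCP connections to your server. TCP has no explicit retransmission request: when a segment is missing, for whatever reason, your server keeps acknowledging the last byte it received in order (duplicate ACKs, with SACK blocks if negotiated) and the sender retransmits. The duplicate ACKs and the retransmissions that follow will give you the conversation context that you're looking for.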
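
As a rough illustration of what to look for in such a dump, the sketch below counts repeated pure ACKs per connection. It assumes you have already exported the segments leaving the server as `(src, dst, ack, payload_len)` tuples; that tuple layout, the function name and the threshold of three duplicates are hypothetical choices for this example, and it deliberately ignores window updates and keep-alives, which can also repeat an ACK number.

```python
from collections import Counter


def duplicate_acks(packets, min_dups=3):
    """Map (src, dst, ack) to the number of duplicate ACKs seen for it.

    `packets` is an iterable of (src, dst, ack, payload_len) tuples for
    segments leaving the server. Only segments without payload are counted;
    the first ACK for a given number is the original, the rest are duplicates.
    """
    seen = Counter()
    for src, dst, ack, payload_len in packets:
        if payload_len == 0:
            seen[(src, dst, ack)] += 1
    return {key: n - 1 for key, n in seen.items() if n - 1 >= min_dups}
```

Connections that show up in the result are the ones where the server was most likely waiting for a segment that never arrived, so their peers are reasonable candidates for the source of the dropped packets.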

I'd actually suggest taking a look at the switch/router that your server is connected to. It will be able to give you a nice idea of the loss and throughput over the interface to your server, letting you diagnose, for example, if your card is too slow for the wire.

EDIT

This blog post cites a tool called dropwatch, which may give you some clues as well.

Sphery answered 25/1, 2012 at 20:47 Comment(3)
Worth pointing out that there are no builds available for Ubuntu at the time of writing this - dropwatch is maintained and provided for Fedora/Red Hat. Twin
And even then, I've never found it to be terribly useful (or at least usable). Edva
If you really need dropwatch, you can find the rpm for your architecture and convert it to a deb file with alien. Meissen
I
9

You may have run into the issue described at https://www.novell.com/support/kb/doc.php?id=7007165.

quote:

Beginning with kernel 2.6.37, it has been changed the meaning of dropped packet count. Before, dropped packets was most likely due to an error. Now, the rx_dropped counter shows statistics for dropped frames because of:

Softnet backlog full -- (Measured from /proc/net/softnet_stat)

Bad / Unintended VLAN tags

Unknown / Unregistered protocols

IPv6 frames when the server is not configured for IPv6

If any frames meet those conditions, they are dropped before the protocol stack and the rx_dropped counter is incremented.
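
To check the first cause in that list, the softnet counters can be read directly. This is a small sketch of my own, not taken from the Novell article: it relies on `/proc/net/softnet_stat` having one row per CPU with hexadecimal columns, of which the first three are packets processed, packets dropped because the backlog was full, and time squeezes. The function name and the dictionary keys are made up for this example, and the row index only matches the CPU number when all CPUs are online.

```python
def parse_softnet_stat(text):
    """Parse the contents of /proc/net/softnet_stat.

    Returns one dict per row with the first three counters converted
    from hexadecimal to integers. Rows with fewer than three columns
    are skipped.
    """
    rows = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) < 3:
            continue
        rows.append({
            "row": index,
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "time_squeeze": int(fields[2], 16),
        })
    return rows
```

For example, `parse_softnet_stat(open("/proc/net/softnet_stat").read())` gives the current values. If the `dropped` column stays at zero while rx_dropped keeps rising, the backlog is probably not the cause and the other items in the list (VLAN tags, unknown protocols, IPv6) are worth a look.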

Infatuated answered 8/6, 2015 at 7:48 Comment(0)