Embedded Linux on Zynq 7000, dropping almost all UDP packets

I am working with the Xilinx distribution of Linux on a Zynq 7000 board. This has two ARM processors, some L2 cache, a DRAM interface, and a large amount of FPGA fabric. Our appliance collects data being processed by the FPGA and then sends it over the gigabit network to other systems.

One of the services we need to support on this appliance is SNMP, which relies on UDP datagrams, and although SNMP does have TCP support, we cannot force the clients to use that.

What I am finding is that this system is losing almost all SNMP requests.

It is important to note that neither the network nor the CPUs are overloaded. The data rate isn't particularly high, and the CPUs usually sit at around 30% load. We're using the SNMP++ and Agent++ libraries for SNMP, so the agent is our own code rather than a system daemon that might be misbehaving. However, if we stop the processing and network activity, SNMP requests are not lost. SNMP is handled in its own thread, and we've kept requests rare and spread out, so there should never be more than one request buffered at a time. With the low CPU load, there should be no problem context-switching to the receiving process to service the request.

Since it's not a CPU or ethernet bandwidth problem, my best guess is that the problem lies in the Linux kernel. Despite the low network load, I'm guessing that there are limited network stack buffers being overfilled, and this is why it's dropping UDP datagrams.

When googling this, I find examples of how to use netstat to report lost packets, but that doesn't seem to work on this system, because there is no "-s" option. How can I monitor these packet drops? How can I diagnose the cause? How can I tune kernel parameters to minimize this loss?

Thanks!

Elveraelves answered 22/9, 2016 at 16:49 Comment(3)
Elveraelves: It's possible that the SNMP responses are being lost on the requesting end, but that's just a plain old x86 machine running Linux with lots of CPU and memory.
Senate: It'd be great to bisect the problem and determine how far the packets get before they are lost. You can use a tool like Wireshark to make sure the requests are reaching the Zynq board. I'd then recommend tcpdump on the board to see whether the UDP packets are visible to the kernel. You can install other Linux utilities through the board's flash. Also, UDP doesn't guarantee delivery (I'm not sure whether SNMP has its own retry logic on top).
Elveraelves: OK, I've personally never used Wireshark, but my colleagues have, so I'll work with them on that. As for SNMP, what it has is a timeout. In my case, I'm trying to change a state variable and finding that it doesn't get set, so I spawn a thread that retries 10 times. Frequently all 10 attempts are lost, which is distressing.

Wireshark or tcpdump is a good approach. You may also want to look at the settings in /proc/sys/net/ipv4/ or try an older kernel (3.x instead of 4.x). We had an issue with TCP connections on the Zynq with the 4.4 kernel, but that one showed up in the system logs (a warning about SYN cookies and possible flooding).
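
If netstat on the board has no "-s" option, the same UDP counters can usually be read straight from /proc/net/snmp. Below is a minimal Python sketch of that idea, assuming Python is available on the target (that is an assumption; the same parsing is easy to redo in C or shell). The helper names parse_udp_counters and read_sysctl are made up for illustration. A RcvbufErrors count that keeps rising would suggest datagrams are being dropped because the socket receive buffer is full, in which case the net.core.rmem_default and net.core.rmem_max settings (under /proc/sys/net/core/, next to the ipv4 directory mentioned above) are worth a look. Whether that is the cause here is not confirmed.

```python
from pathlib import Path


def parse_udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp text as a dict of ints.

    The file holds pairs of lines per protocol: a header line with the
    counter names and a second line with the values, both prefixed "Udp:".
    """
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0], rows[1]
    return dict(zip(names, (int(v) for v in values)))


def read_sysctl(name):
    """Read a sysctl such as 'net.core.rmem_max' via /proc/sys (hypothetical helper)."""
    path = Path("/proc/sys") / name.replace(".", "/")
    return path.read_text().strip() if path.exists() else None


if __name__ == "__main__":
    snmp = Path("/proc/net/snmp")
    if snmp.exists():
        counters = parse_udp_counters(snmp.read_text())
        # InErrors and RcvbufErrors are the ones to watch while traffic is running.
        for key in ("InDatagrams", "NoPorts", "InErrors", "RcvbufErrors"):
            print(key, counters.get(key))
    for name in ("net.core.rmem_default", "net.core.rmem_max"):
        print(name, read_sysctl(name))
```

Running it once before and once after a burst of SNMP requests and comparing the numbers shows whether the kernel received the datagrams at all (InDatagrams), had no listener for them (NoPorts), or discarded them (InErrors, RcvbufErrors).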

Julenejulep answered 30/9, 2016 at 18:19
