Packet capture in RDMA?
L

6

10

Is there any utility like tcpdump in Linux for capturing the traffic going over an RDMA channel (Infiniband/RoCE/iWARP)?

Ledford answered 26/9, 2012 at 18:0 Comment(4)
How did you finally solve this problem?Bilbrey
ibdump worked for me, as suggested by @kliteyn. What kind of packets are you looking for? I was doing RDMA_WRITE_WITH_IMMEDIATE and I could see all the packets.Ledford
I just want to verify whether any RDMA packets are going out to the network. But when I use ibdump, I capture only very few packets, for example just 2, even though I send a lot of data. I also wonder what the packets captured by ibdump mean: are they only for connection setup, without the data that was sent?Bilbrey
I have another question. The number of packets captured by ibdump keeps increasing after I finish sending data. What does that mean? Is ibdump slow to respond, unlike tcpdump, which shows packets in real time?Bilbrey
P
10

Old thread, but still:

As Roland pointed out, sniffing RDMA traffic is tricky, because once the endpoints have completed the initial handshake, traffic goes through the network card (HCA) directly to memory. The only way to sniff this traffic without putting a dedicated HW sniffer on the wire is to have vendor-specific hooks in the network card, and a SW tool that uses these hooks.

If you have Mellanox HCAs, you can use the "ibdump" tool. This tool is also part of the Mellanox OFED package.
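A minimal sketch of an ibdump invocation, assuming a Mellanox HCA named mlx5_0 with traffic on port 1 (both are placeholders; `ibv_devinfo` or `ibstat` shows the real names). The helper only prints the command line so it can be reviewed before being run as root. The `-d`, `-i` and `-w` options are as I recall them from the ibdump help text, so check `ibdump --help` on your version.

```shell
#!/bin/sh
# Print (do not run) an ibdump command line for a device, port and output file.
# The defaults below are hypothetical; substitute your own device and port.
ibdump_cmd() {
    dev="${1:-mlx5_0}"        # HCA device name, as listed by ibv_devinfo
    port="${2:-1}"            # physical port number on that HCA
    out="${3:-sniffer.pcap}"  # capture file, can be opened in Wireshark
    printf 'ibdump -d %s -i %s -w %s\n' "$dev" "$port" "$out"
}

ibdump_cmd mlx5_0 1 rdma.pcap
```

Once the printed line looks right, it can be run with root privileges, and the resulting file opened in Wireshark.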

If you have another vendor's HW, you need to check with that vendor. You won't find any open-source packet sniffer that works for all RDMA-capable devices, sorry.

Puentes answered 19/11, 2012 at 15:22 Comment(3)
I think your answer fits best. I have learned that each vendor has to make a utility available for packet capture on their HCA. I am currently only dealing with Mellanox HCAs and you are right, "ibdump" is the answer for this. I have tried it now and it does the capture. However, I have found that it logs only the RDMA operation headers and not the payload itself. I don't know if that's the default behaviour or if I need to upgrade my packages. But in essence, "ibdump" works and it is what I was looking for when I asked the question. Thank you!Ledford
@Puentes But why does ibdump capture so few packets? I send a lot of packets, but it captures only a few.Bilbrey
@Bilbrey The ibdump readme lists some limitations - such as dropping packets during bursts. And, in contrast to tcpdump, it doesn't report how many packets it dropped during capturing. Your ibdump might be overwhelmed by the number of packets to be captured.Conceal
S
4

In general, no. One of the main characteristics of RDMA is that all the network processing is done on the adapter, without involving the CPU at all. Typically work requests are queued up directly from userspace to the adapter, without any system call. So there's nowhere for a sniffer to hook in to get traffic.

With that said, for the Ethernet-based protocols, iWARP and IBoE (aka RoCE), you can hook up a system in the middle of a connection and set it up to do forwarding in software (e.g. the Linux bridge module) and then run tcpdump or wireshark to capture the RDMA traffic that passes through this system. Wireshark even has dissectors for iWARP and IBoE.
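The bridge setup described above might look roughly like the following. This is a sketch, assuming a middle box with two NICs named eth1 and eth2 (hypothetical names) cabled between the two RDMA endpoints. The function prints the iproute2 and tcpdump commands rather than executing them. The capture filter relies on RoCE v1 using EtherType 0x8915 and RoCE v2 using UDP destination port 4791; iWARP runs over TCP without a single fixed port, so it would need a filter matching your own connection.

```shell
#!/bin/sh
# Print the commands for a software bridge "in the middle" plus a capture on it.
# br0, eth1 and eth2 are placeholder interface names.
bridge_sniffer_cmds() {
    br="${1:-br0}"; a="${2:-eth1}"; b="${3:-eth2}"
    echo "ip link add name $br type bridge"
    echo "ip link set dev $a master $br"
    echo "ip link set dev $b master $br"
    echo "ip link set dev $a up"
    echo "ip link set dev $b up"
    echo "ip link set dev $br up"
    # RoCE v1 = EtherType 0x8915, RoCE v2 = UDP destination port 4791
    echo "tcpdump -i $br -s 0 -w bridge.pcap 'ether proto 0x8915 or udp dst port 4791'"
}

bridge_sniffer_cmds br0 eth1 eth2
```

Expect lower throughput than on a direct link, since every frame now crosses the middle box's CPU.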

For native InfiniBand it is theoretically possible to build something similar (set up an adapter to capture and forward traffic) but as far as I know, no one has done even the needed firmware or driver work to do basic packet sniffing.

Segalman answered 27/9, 2012 at 7:9 Comment(1)
Thank you Roland for your input! I'll explore using the Linux bridge to sniff the traffic. I understand that packets are queued up directly from userspace and that's why there is no place to intercept. I am using ib_post_send() from the kernel to queue the work requests, so I thought there could be some place inside the implementation to know that a packet was sent to the other node. I don't know if this is possible without firmware support, maybe when you get an event on the CQ? The main reason for this question is that when I do not see the data at the receiver, we need a way to tell which RNIC is at fault, the sender or the receiver.Ledford
R
3

Chelsio's T4 device supports a packet trace feature allowing it to replicate ingress/egress offload packets to one of the device's NIC queues. Then you can use tcpdump or whatever on that ethX interface to see the RDMA or TOE packets.

Redhanded answered 27/9, 2012 at 14:17 Comment(1)
Thank you Steve! I will check with my hardware vendor (Mellanox) if they support something similar.Ledford
C
1

As of writing this answer, it is possible to sniff RDMA traffic using tcpdump with a recent Linux kernel, or by installing Mellanox OFED (Nvidia) on older versions.

HOW-TO DUMP RDMA TRAFFIC USING THE INBOX TCPDUMP TOOL (CONNECTX-4 AND ABOVE)

After installing the Mellanox OFED (if needed) you can generate a pcap file and analyze it later by opening the pcap file in Wireshark.

tcpdump -i mlx5_1 -s 65535 -w rdma_traffic.pcap

Make sure to use one of the available mlx5_X interfaces.
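One way to find the available device names is to look under /sys/class/infiniband, where the kernel RDMA stack registers its devices. The sketch below lists them and prints a matching tcpdump line for each; nothing is captured until you run a printed command yourself. The helper names and the per-device file names are made up for illustration.

```shell
#!/bin/sh
# List RDMA devices known to the kernel and print a capture command for each.
list_rdma_devices() {
    dir="${1:-/sys/class/infiniband}"   # sysfs directory of the RDMA core
    for d in "$dir"/*; do
        [ -e "$d" ] || continue         # glob did not match: no devices found
        basename "$d"
    done
}

capture_cmd() {
    # $1 = RDMA device (e.g. mlx5_1), $2 = output file (placeholder name)
    printf 'tcpdump -i %s -s 65535 -w %s\n' "$1" "${2:-rdma_traffic.pcap}"
}

for dev in $(list_rdma_devices); do
    capture_cmd "$dev" "$dev.pcap"
done
```

If the loop prints nothing, no RDMA device is registered on the host. `tcpdump -D` should list the same devices when libpcap was built with RDMA sniffing support, which is an assumption about your build.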

Conker answered 25/8, 2023 at 21:14 Comment(0)
E
0

Wireshark can do this, but the problem is that you need an observing server. If you enable the mirror feature, you should be able to receive the RoCE packets at the observer.

Epperson answered 19/10, 2017 at 11:8 Comment(0)
C
0

A sure way to capture such traffic is to duplicate it into dedicated capture ports. Those ports might be additional ethernet/IB ports (of additional adapters) in your development machine or they may be located in an additional capture machine.

There are basically 2 ways how to duplicate the traffic:

  1. Configure port mirroring in your switch. Support for port mirroring is pretty common in managed Ethernet switches, even in cheap ones. This feature is also available in some Mellanox Infiniband switches. You can configure the switch to mirror both directions of a port into another one, although this oversubscribes the capture port if the mirrored port receives and sends at line speed at the same time (full-duplex). In such a situation some frames can't be forwarded to the capture port and are dropped. To avoid this limitation, mirror each direction into a separate capture port.

  2. Connect your network cable to a TAP (test access point) device that duplicates or splits the signal. With optical networking those TAPs are often constructed in a completely passive way and thus don't add much complexity and are relatively cheap to produce. You need one capture port for each fiber, i.e. you always occupy 2 capture ports if you want to capture both directions. TAP devices are available for the fibers and connectors commonly used in Ethernet networks. If your Infiniband hardware uses the same fibers and connectors, you should be able to use the same TAP devices there as well, at least the passive ones.

Once the mirrored/tapped traffic arrives at your capture port(s), you can use standard capture tools such as tcpdump.
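With one capture port per direction, each port yields its own capture file. A sketch for the Ethernet case, assuming the two capture ports show up as eth2 and eth3 (hypothetical names): run one tcpdump per port, then merge the two files by timestamp with Wireshark's mergecap tool. The function only prints the commands.

```shell
#!/bin/sh
# Print commands to capture each direction on its own port and merge the files.
# eth2, eth3 and the file names are placeholders.
dual_capture_cmds() {
    rx="${1:-eth2}"; tx="${2:-eth3}"
    echo "tcpdump -i $rx -s 0 -w $rx.pcap"   # direction 1, run in one terminal
    echo "tcpdump -i $tx -s 0 -w $tx.pcap"   # direction 2, run in another
    # mergecap orders the packets of both files by their timestamps
    echo "mergecap -w both.pcap $rx.pcap $tx.pcap"
}

dual_capture_cmds eth2 eth3
```

Because both ports are timestamped by the same host, the merged file should show requests and responses in a plausible order, though ordering between nearly simultaneous packets is not guaranteed.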

For Infiniband there is ibdump. However, depending on the Infiniband software you are using (open-source OFED vs. the proprietary Mellanox OFED) and the host channel adapter (HCA), you might be able to use tcpdump to capture Infiniband traffic as well.

Conceal answered 22/1, 2021 at 21:45 Comment(0)
