"zero copy networking" vs "kernel bypass"?
A

4

49

What is the difference between "zero-copy networking" and "kernel bypass"? Are they two phrases meaning the same thing, or different? Is kernel bypass a technique used within "zero copy networking" and this is the relationship?

Ashram answered 20/8, 2013 at 19:21 Comment(3)
@alk Google "the moon is made out of cheese" and you will find pages describing how the moon is made out of cheese.... My point? Too much out there isn't accurate- I trust the general consensus on SO more.Ashram
The Art of Googling is to pose the right question. And The Art of Research/Investigation is to not trust only one source of information.Orland
Within the topic of zero-copy, I think we need to define the domain of where it's taking place. I'm guessing you're asking about zero-copy from application (user space) to hardware (bi-directionally) and not solely within the layers of the kernel's network stackFar
C
41

What is the difference between "zero-copy networking" and "kernel bypass"? Are they two phrases meaning the same thing, or different? Is kernel bypass a technique used within "zero copy networking" and this is the relationship?

TL;DR - They are different concepts, but it is quite likely that a kernel bypass API or framework also supports zero copy.


User Bypass

This mode of communication should also be considered. DMA-to-DMA transactions that do not involve the CPU at all may be possible. The idea is to use splice() or similar functions to avoid user space entirely. Note that with splice(), the entire data stream does not need to bypass user space: headers can be read in user space and the data streamed directly to disk. The most common downside is that splice() doesn't do checksum offloading.
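A minimal sketch of the splice() idea, assuming Linux and glibc. The helper name splice_copy is made up for illustration; it moves bytes from one descriptor to another through a pipe so the payload never lands in a user-space buffer. A real server would also handle EAGAIN, partial transfers on sockets, and so on.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: move up to `len` bytes from in_fd to out_fd via an
 * intermediate pipe. The data stays in kernel buffers the whole time.
 * Returns the number of bytes moved, or -1 on error. */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (len > 0) {
        /* Source -> pipe: at least one side of splice() must be a pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0) { total = -1; break; }
        if (n == 0) break;                 /* end of input */
        len -= (size_t)n;

        /* Pipe -> destination: drain everything that was just queued. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n, SPLICE_F_MOVE);
            if (m <= 0) { total = -1; goto out; }
            n -= m;
            total += m;
        }
    }
out:
    close(p[0]);
    close(p[1]);
    return total;
}
```

Typical use would be something like splice_copy(socket_fd, file_fd, content_length) after the headers have been read normally with recv().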

Zero copy

The zero copy concept is only that the network buffers are fixed in place and are not moved around. In many cases, this is not really beneficial. Most modern network hardware supports scatter-gather, also known as buffer descriptors, etc. The idea is that the network hardware understands physical pointers. The buffer descriptor typically consists of:

  1. Data pointer
  2. Length
  3. Next buffer descriptor

The benefit is that the network headers do not need to exist side-by-side, and the IP, TCP, and application headers can reside physically separate from the application data.

If a controller doesn't support this, then the TCP/IP headers must precede the user data so that they can be filled in before sending to the network controller.
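To make the descriptor idea concrete, here is a user-space analogy, not actual driver code. The struct buf_desc layout and the send_chain helper are invented for illustration; a real NIC descriptor holds DMA (bus) addresses and hardware-specific flags. writev() plays the role of the hardware gathering separate header and payload buffers into one outgoing stream.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_SEGS 8

/* Hypothetical descriptor mirroring the three fields listed above. */
struct buf_desc {
    const void *data;              /* 1. data pointer */
    size_t len;                    /* 2. length */
    const struct buf_desc *next;   /* 3. next buffer descriptor */
};

/* Walk the chain and hand all segments to the kernel in one call.
 * Header and payload stay in their own buffers; nothing is first
 * assembled into a contiguous user-space buffer. */
ssize_t send_chain(int fd, const struct buf_desc *head)
{
    struct iovec iov[MAX_SEGS];
    int n = 0;

    for (const struct buf_desc *d = head; d != NULL; d = d->next) {
        assert(n < MAX_SEGS);
        iov[n].iov_base = (void *)d->data;
        iov[n].iov_len = d->len;
        n++;
    }
    return writev(fd, iov, n);
}
```

With a header descriptor whose next field points at a payload descriptor, send_chain(fd, &header) emits the header bytes followed by the payload bytes.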

Zero copy also implies some kernel-user MMU setup so that pages are shared.

Kernel Bypass

Of course, you can bypass the kernel. This is what pcap and other sniffer software has been doing for some time. However, pcap does not prevent the normal kernel processing; the concept is merely similar to what a kernel bypass framework would allow. That is, packets are delivered directly to user space, where the header processing happens.

However, it is difficult to see a case where user space will have a definite win unless it is tied to the particular hardware. Some network controllers may have scatter gather supported in the controller and others may not.

There are various incarnations of kernel interfaces to accomplish kernel bypass. A difficulty is what happens with the received data and how the data for transmission is produced. Often this involves other devices, so there are many solutions.


To put this together...

Are they two phrases meaning the same thing, or different?

They are different, as the above hopefully explains.

Is kernel bypass a technique used within "zero copy networking" and this is the relationship?

It is the opposite. Kernel bypass can use zero copy and most likely will support it as the buffers are completely under control of the application. Also, there is no memory sharing between the kernel and user space (meaning no need for MMU shared pages and whatever cache/TLB effects that may cause). So if you are using kernel bypass, it will often be advantageous to support zero copy; so the things may seem the same at first.

If scatter-gather DMA is available (as on most modern controllers), either user space or the kernel can use it. Zero copy is not as useful in this case.


Caseate answered 20/8, 2013 at 23:26 Comment(9)
I don't know of Linux support for DMA-to-DMA transactions; but I do know that some hardware supports it. The idea is an Ethernet controller and a disk controller can transfer data directly to each other. This sounds more promising than doing things in user space.Caseate
Also see https://mcmap.net/q/356633/-kernel-bypass-for-udp-and-tcp-on-linux-what-does-it-involve/632951 for more information regarding NIC bypassing.Napoleonnapoleonic
@Napoleonnapoleonic Your link on Linux kernel bypass and performance nicely explains the different manifestations of what people think kernel bypass might be. RDMA is like my 'DMA-to-DMA' explanation. For embedded systems, there may be a dedicated 'DMA controller' that can patch to most system peripherals (NIC, disk, display, audio, etc.) to transfer data without a CPU.Caseate
See Splice and pipes in Linux for more details on User bypass.Caseate
Linux USB3 zero copy - this is implemented by mmap() in user space; for instance libusb libusb_dev_mem_alloc(). So the driver will place the buffer directly in application addressable memory.... this is for USB which is good for USB vision, USB drives, etc.... but the driver concepts are the same for a NIC.Caseate
"Of course, you can bypass the kernel. This is what pcap and other sniffer software has been doing for some time." They "bypass the kernel" in the sense that the kernel fills in a memory-mapped buffer with captured packets and the userland code reads from that buffer; all that's "bypassed" is copying from kernel-space buffers to user-space buffers.Larimore
If you're capturing TCP packets that are sent to a connected endpoint on your machine, that will not, in and of itself, prevent TCP/IP processing by the kernel. All it means is that copies of the raw packets will be delivered to the capture mechanism (PF_PACKET socket, on Linux) without those copies of the packets passing through the TCP/IP stack. And PF_PACKET sockets don't get packets directly DMAed into their buffers. I.e., pcap does NOT do direct DMAing of packets into user buffers when using PF_PACKET sockets. (Trust me on this, I'm a libpcap core developer.)Larimore
Yes, an application could use pcap to receive incoming IP packets and do its own IP, TCP, and UDP processing to let an application have its own Internet protocol stack, bypassing the kernel's stack - as long as it can keep those IP packets out of the hands of the kernel's IP input code path - but that's not what a lot of (most?) software using libpcap does. Mentioning pcap in this context doesn't really add anything.Larimore
An example of 'kernel bypass' is various malware using this mechanism to avoid simple 'open port' analysis. This is equivalent to port knocking. So the 'out-of-band' listening to non-existent addresses/ports kicks off communications by another means. I still think it is worthwhile to mention here. I agree that legitimate, normal communications would be cumbersome to implement with PF_PACKET. But it is technically possible.Caseate
T
25

Zero-copy networking

You're doing zero-copy networking when you never copy the data between user space and kernel space (I mean the memory spaces). For example, in C:

    recv(fd, buffer, BUFFER_SIZE, 0);

By default the data are copied:

  1. The kernel gets the data from the network stack
  2. The kernel copies this data to the buffer, which is in the user-space.

With a zero-copy method, the data are not copied and reach user space directly from the network stack.
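For contrast, here is a small sketch of the transmit direction on Linux, which is the easier one to show in a few lines. The function names are made up. The first helper takes the ordinary path (file data is copied into a user buffer, then copied back into the kernel by send()); the second uses sendfile(), which asks the kernel to feed the socket from the file without the payload passing through a user-space buffer. This is only one of several ways to avoid the copy, and it does not cover the receive side described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ordinary path: kernel -> user buffer (pread), user buffer -> kernel (send).
 * Handles at most sizeof(buf) bytes to keep the sketch short. */
ssize_t send_with_copy(int sock, int file_fd, size_t len)
{
    char buf[4096];
    assert(len <= sizeof buf);
    ssize_t n = pread(file_fd, buf, len, 0);
    if (n <= 0)
        return n;
    return send(sock, buf, (size_t)n, 0);
}

/* sendfile() path: the payload never enters a user-space buffer. */
ssize_t send_without_user_copy(int sock, int file_fd, size_t len)
{
    off_t off = 0;   /* start of file; the file's own position is untouched */
    return sendfile(sock, file_fd, &off, len);
}
```

Both helpers put the same bytes on the socket; the difference is only in how many times the payload crosses the user/kernel boundary.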

Kernel Bypass

Kernel bypass is when you manage the network stack and the hardware yourself, in user space. It is hard, but you can gain a lot of performance (there is zero copy, since all the data are in user space). This link could be interesting if you want more information.

Triparted answered 20/8, 2013 at 19:32 Comment(7)
Are you saying that zero-copy networking is done first and then "kernel bypass" is the next stage, to process the zero-copied data, in the user space?Ashram
Actually a "kernel bypass" is a way to have zero-copy.Triparted
@nouney: ... if you do it well! ;-)Orland
The reason I ask is because I read somewhere that a patch was released in to the Linux kernel to implement zero copy. So does this mean there are no kernel bypass opportunities for networking anymore? Linux already does it by default...??Ashram
You can do zero-copy without a kernel bypass ;) Zero-copy depends on the OS actually, since it requires some kernel privileges. The kernel will still manage the network for you. With a kernel bypass, you have to manage all the network stuff yourself.Triparted
@Triparted Ok so say I want to get the network data directly from the driver to my user application- would I have to edit the code of the driver (or kernel) to achieve this?Ashram
@Ashram You don't have to but you could. There's more than one way to do zero-copy. Read this long but good paper on the zero-copy method with linux.Triparted
K
8

ZERO-COPY:

Normally, when transmitting and receiving packets, all packet data must be copied from user-space buffers to kernel-space buffers for transmitting, and vice versa for receiving. A zero-copy driver avoids this by having user space and the driver share packet buffer memory directly.

Instead of having the transmit and receive descriptors point to buffers in kernel space, which would later require a copy, a region of memory is allocated in user space and mapped to a given region of physical memory, so that it is shared between the kernel and user space. Each descriptor's buffer pointer is then set to its corresponding place in the newly allocated memory.
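A toy model of that layout, purely as a sketch. Everything here (struct rx_desc, struct rx_ring, the slot sizes, fake_device_fill) is hypothetical, and an anonymous mmap() stands in for the region a real driver would map for DMA. The point is only that descriptors refer to places inside one shared region, so the application reads packets in place.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#define SLOT_SIZE 2048
#define NUM_SLOTS 4

/* Hypothetical descriptor: where in the shared region a packet lives. */
struct rx_desc {
    size_t offset;
    size_t len;
};

struct rx_ring {
    unsigned char *region;            /* shared packet memory */
    struct rx_desc desc[NUM_SLOTS];
};

/* Allocate the shared region and point each descriptor at its own slot. */
int ring_init(struct rx_ring *r)
{
    void *mem = mmap(NULL, (size_t)SLOT_SIZE * NUM_SLOTS, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    r->region = mem;
    for (int i = 0; i < NUM_SLOTS; i++) {
        r->desc[i].offset = (size_t)i * SLOT_SIZE;
        r->desc[i].len = 0;
    }
    return 0;
}

/* Stand-in for the device: in a real driver the NIC would DMA the frame
 * into the slot; memcpy here only simulates that arrival. */
void fake_device_fill(struct rx_ring *r, int slot, const void *frame, size_t len)
{
    assert(slot >= 0 && slot < NUM_SLOTS && len <= SLOT_SIZE);
    memcpy(r->region + r->desc[slot].offset, frame, len);
    r->desc[slot].len = len;
}

/* Application side: returns a pointer into the shared region, no copy. */
const unsigned char *ring_peek(const struct rx_ring *r, int slot, size_t *len)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    *len = r->desc[slot].len;
    return r->region + r->desc[slot].offset;
}
```

The pointer returned by ring_peek() is the same memory the "device" wrote to, which is the whole idea; a real implementation also needs ownership flags and memory barriers so both sides agree on when a slot is ready.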

Karissakarita answered 6/8, 2017 at 14:11 Comment(0)
B
5

Other examples of kernel bypass and zero copy are DPDK and RDMA. When an application uses DPDK, it bypasses the kernel TCP/IP stack. The application creates the Ethernet frames and the NIC grabs those frames with DMA directly from user-space memory, so it is zero copy because there is no copy from user space to kernel space. Applications can do similar things with RDMA: the application writes to queue pairs that the NIC directly accesses and transmits. The RDMA verbs interface (libibverbs in user space) is also used inside the kernel, so when iSER uses RDMA it is not kernel bypass, but it is zero copy.

http://dpdk.org/

https://www.openfabrics.org/index.php/openfabrics-software.html

Brumfield answered 19/1, 2017 at 17:23 Comment(2)
It is true there is 'zero copying' with a scatter-gather NIC. Earlier NICs would only transmit a complete contiguous buffer as a packet. In this case, you need an API to the network stack that ensures all data is put together in contiguous memory before sending to the NIC. Zero copy is a software technique that means no copying even if the NIC does not support scatter-gather. At this point in time, that may seem archaic, but changing the definition of things will really confuse the issue even more. vger.kernel.org/~davem/skb_data.htmlCaseate
Nouney also references a good paper. mirlabs.org/ijcisim/regular_papers_2011/Paper2.pdf For certain, scatter-gather hardware is NOT the full story on zero copy.Caseate
