Zero-copy with and without Scatter/Gather operations

Asked 19/3, 2012 at 12:23 Answered 19/3, 2012 at 20:30

Solved c linux network-programming linux-kernel zero-copy

I just read an article that explains the zero-copy mechanism.

It discusses the difference between zero-copy with and without scatter/gather (SG) support.

For a NIC without SG support, the data is copied as follows:

[figure: data copies for a NIC without SG support]

For a NIC with SG support, the data is copied as follows:

[figure: data copies for a NIC with SG support]

In short, zero-copy with SG support can eliminate one CPU copy.

My question is: why would the data in the kernel buffer be scattered?

Emmanuelemmeline answered 19/3, 2012 at 12:23 Comment(0)

Because the Linux kernel's mapping / memory allocation facilities by default will create virtually-contiguous but possibly physically-disjoint memory regions.
That means the read from the filesystem which sendfile() does internally goes to a buffer in kernel virtual memory, which the DMA code has to "transmogrify" (for lack of a better word) into something that the network card's DMA engine can grok.

Since DMA (often but not always) uses physical addresses, that means you either duplicate the data buffer (into a specially allocated, physically contiguous region of memory, your socket buffer above), or else transfer it one physical page at a time.

If your DMA engine, on the other hand, is capable of aggregating multiple physically disjoint memory regions into a single data transfer (that's called "scatter-gather"), then instead of copying the buffer you can simply pass a list of physical addresses (pointing to physically contiguous sub-segments of the kernel buffer, that's your aggregate descriptors above), and you no longer need to start a separate DMA transfer for each physical page. This is usually faster, but whether it can be done depends on the capabilities of the DMA engine.
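
To make the "list of physical addresses" concrete, here is a minimal user-space sketch of how such a list could be built. It is not kernel code: `struct sg_entry`, `build_sg_list` and the 4096-byte `PAGE_SZ` are hypothetical names and values chosen for illustration, and the physical addresses are simply passed in as numbers. The idea is only that physically adjacent pages collapse into one descriptor, while disjoint pages each start a new one.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size for this sketch */

/* One scatter-gather descriptor: a physically contiguous run of bytes.
 * (Hypothetical layout; real DMA engines define their own.) */
struct sg_entry {
    uint64_t phys_addr;
    uint32_t length;
};

/*
 * Walk the physical addresses of the pages backing a virtually contiguous
 * buffer and merge physically adjacent pages into a single descriptor.
 * Returns the number of descriptors written, or 0 if npages is 0 or the
 * output array is too small.
 */
size_t build_sg_list(const uint64_t *page_phys, size_t npages,
                     struct sg_entry *out, size_t max_entries)
{
    size_t n = 0;

    for (size_t i = 0; i < npages; i++) {
        if (n > 0 &&
            out[n - 1].phys_addr + out[n - 1].length == page_phys[i]) {
            /* This page directly follows the previous run: extend it. */
            out[n - 1].length += PAGE_SZ;
        } else {
            /* Physically disjoint page: start a new descriptor. */
            if (n == max_entries)
                return 0;
            out[n].phys_addr = page_phys[i];
            out[n].length = PAGE_SZ;
            n++;
        }
    }
    return n;
}
```

With six pages of which the first two and the next three happen to be adjacent, this yields three descriptors instead of six separate per-page transfers or one full copy.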

Kindred answered 19/3, 2012 at 13:50 Comment(1)

My answer here should be seen "in the context of its age"; the strongest reason why network cards in particular require, and have for decades supported, scatter-gather is that network packets are commonly "segmented" (that's the term the Linux kernel uses) - Ethernet headers, IP headers, TCP headers and payload are all constructed independently but "chained". A network card can consume this chain and send a single packet from it. Again, the result is "zero copy" - the driver doesn't need to accumulate all parts of a network packet into a bounce buffer before sending it. – Kindred 3/9 at 23:10

Re: My question is that why data in kernel buffer could be scattered?

Because it already is scattered. The data queue in front of a TCP socket is not divided into the datagrams that will go out onto the network interface. Scatter/gather allows you to keep the data where it is, rather than copying it into a flat buffer that is acceptable to the hardware.

With the gather feature, you can give the network card a datagram which is broken into pieces at different addresses in memory, which can be references to the original socket buffers. The card will read it from those locations and send it as a single unit.

Without gather (hardware requires simple, linear buffers) a datagram has to be prepared as a contiguously allocated byte string, and all the data which belongs to it has to be memcpy-d into place from the buffers that are queued for transmission on the socket.
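
As a rough user-space analogy (not the driver-level mechanism itself), the same contrast shows up between `writev()` and a manual `memcpy` into a staging buffer. The function names `send_gathered` and `send_flattened` below are made up for this sketch. Note that `writev()` still copies the data into the kernel; it only illustrates handing over several pieces in one call versus flattening them first.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Gather: describe two separate buffers and let the callee read them
 * from where they already are, emitting them as one unit. */
ssize_t send_gathered(int fd, const void *hdr, size_t hdr_len,
                      const void *payload, size_t payload_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;      /* writev() does not modify it */
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len = payload_len;
    return writev(fd, iov, 2);
}

/* No gather: everything must first be copied into one contiguous
 * staging buffer, which costs an extra pass over the data. */
ssize_t send_flattened(int fd, const void *hdr, size_t hdr_len,
                       const void *payload, size_t payload_len,
                       void *staging, size_t staging_len)
{
    if (hdr_len + payload_len > staging_len)
        return -1;
    memcpy(staging, hdr, hdr_len);
    memcpy((char *)staging + hdr_len, payload, payload_len);
    return write(fd, staging, hdr_len + payload_len);
}
```

Both functions put the same bytes on the descriptor; the difference is only whether the caller had to build the flat copy.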

Optics answered 19/3, 2012 at 17:6 Comment(0)

Because when you write to a socket, the headers of the packet are assembled in a different place from your user data, so to coalesce them into a network packet, the device needs "gather" capability, at least to fetch the headers and the data.

Also, to avoid the CPU having to read the data (and thus fill its cache with useless stuff it's never going to need again), the network card also needs to generate its own IP and TCP checksums (I'm assuming TCP here, because 99% of your bulk data transfers are going to be TCP). This is OK, because nowadays they all can.
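
To see why a software checksum defeats the purpose, here is a small sketch of the 16-bit one's-complement Internet checksum (the algorithm described in RFC 1071, which TCP and IP use). The function name `inet_checksum` is made up for this example, and it ignores the TCP pseudo-header. The point is that the loop has to load every payload byte into the CPU, which is exactly the work that checksum offload moves to the card.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, then inverted.
 * Every byte of `data` passes through the CPU here. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1) {
        /* Odd trailing byte is padded with a zero byte on the right. */
        sum += (uint32_t)data[0] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```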

What I'm not sure about is how this all interacts with TCP_CORK.

Most protocols tend to have their own headers, so a hypothetical protocol looks like:

Client: Send request
Server: Send some metadata; send the file data

So we tend to have a server application assembling some headers in memory, issuing a write(), followed by a sendfile()-like operation. I suppose the headers still get copied into a kernel buffer in this case.
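
A minimal sketch of that pattern on Linux, assuming a connected TCP socket and an open file descriptor; the function name `send_header_then_file` is made up, and error handling is reduced to returning -1 (which, in this sketch, can leave the socket corked). `TCP_CORK` asks the kernel to hold back partial frames until the option is cleared, so the header written with `write()` and the file data sent with `sendfile()` can share packets. It does not change the fact that the header is copied from user space.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send an application header followed by file contents on a TCP socket. */
int send_header_then_file(int sock, const void *hdr, size_t hdr_len,
                          int file_fd, size_t file_len)
{
    int on = 1, off = 0;
    const char *p = hdr;
    off_t offset = 0;

    /* Cork: hold back partial frames until we uncork. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof on) < 0)
        return -1;

    /* The header lives in user memory, so this write() copies it. */
    while (hdr_len > 0) {
        ssize_t n = write(sock, p, hdr_len);
        if (n < 0)
            return -1;
        p += n;
        hdr_len -= (size_t)n;
    }

    /* The file data is handed over inside the kernel by sendfile(). */
    while (file_len > 0) {
        ssize_t n = sendfile(sock, file_fd, &offset, file_len);
        if (n <= 0)
            return -1;
        file_len -= (size_t)n;
    }

    /* Uncork: flush whatever is still queued. */
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &off, sizeof off);
}
```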

Sp answered 19/3, 2012 at 20:30 Comment(0)
