Streaming DMA in a PCIe Linux kernel driver

I'm working on an FPGA driver for the Linux kernel. The code seems to work fine on x86, but on x86_64 I've run into some problems. I implemented streaming DMA, and it goes like this:

get_user_pages(...);
for (...) {
    sg_set_page();
}
pci_map_sg();

But pci_map_sg returned addresses like 0xbd285800, which are not aligned to PAGE_SIZE, so I can't send the full first page, because the PCIe specification says:

"Requests must not specify an Address/Length combination which causes a Memory Space access to cross a 4-KB boundary."

Is there any way to get aligned addresses, or did I just miss something important?

Source code of DMA.

Particularism answered 21/2, 2012 at 16:49 Comment(5)
Can you include code from your real source? There's not enough there to spot the bug.Withdrew
Yeah, of course. Attached to original post.Particularism
@soh: Any plans to release it to the public? I was looking around for an open driver and could not find a good one. Being too lazy to write my own, I'd be more than glad to contribute and help with testing.Romeliaromelle
@Vlad Lazarenko: If you are still interested, I guess I can share the code after the project ends.Particularism
Hi. Please check that your DMA mask is set properly for 64-bit addressing with dma_set_mask.Behling

The first possibility that comes to mind is that the incoming user buffer does not start on a page boundary. If your start address is 0x800 bytes into a page, then the offset on your first sg_set_page call will be 0x800. This will produce a DMA address ending in 0x800. This is normal, not a bug.
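To make the offset arithmetic concrete, here is a minimal user-space sketch. It assumes 4 KB pages, and the helper names (page_offset, first_page_len) are made up for illustration; they are not kernel APIs. The two values it computes correspond to the offset and len arguments you would pass to the first sg_set_page call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KB pages, as on x86 and x86_64 */

/* The mask trick below only works for power-of-two page sizes. */
static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Hypothetical helper: offset of a user address within its page. */
static uint32_t page_offset(uintptr_t user_addr)
{
    return (uint32_t)(user_addr & (SKETCH_PAGE_SIZE - 1));
}

/* Hypothetical helper: number of buffer bytes that fall in the first page. */
static uint32_t first_page_len(uintptr_t user_addr, size_t len)
{
    size_t room = SKETCH_PAGE_SIZE - page_offset(user_addr);

    return (uint32_t)(len < room ? len : room);
}
```

For a buffer starting at 0x12345800, page_offset gives 0x800 and first_page_len gives at most 0x800, so the first scatterlist entry covers only the second half of its page, and the mapped address carries the same 0x800 offset.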

As pci_map_sg coalesces pages, this first segment may be larger than one page. The important thing is that pci_map_sg produces contiguous blocks of DMA addressable memory, but it does not produce a list of low-level PCIe transactions. On x64 you are more likely to get a large region, because most x64 platforms have an IOMMU.

Many devices I deal with have DMA engines that allow me to specify a logical transfer length of several megabytes. Normally the DMA implementation in the PCIe endpoint is responsible for starting a new PCIe transaction at each 4 KB boundary, and the programmer can ignore that constraint. If resources in the FPGA are too limited to handle that, you can consider writing driver code to convert the Linux list of memory blocks into a (much longer) list of PCIe transactions.
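If you go the driver route, the splitting could look roughly like the sketch below. It is plain C so it can be tried outside the kernel; struct pcie_chunk and split_segment are hypothetical names, and the descriptor layout is an assumption rather than anything your FPGA core requires. It only enforces the 4 KB boundary rule; a real device may also need chunks limited further, for example by its maximum payload or read request size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCIE_BOUNDARY 4096u  /* 4 KB boundary from the PCIe rule quoted in the question */

/* Hypothetical descriptor: one entry per request the device will issue. */
struct pcie_chunk {
    uint64_t addr;  /* bus address of the chunk */
    uint32_t len;   /* length in bytes, never crossing a 4 KB boundary */
};

/*
 * Split one DMA segment [addr, addr + len) into chunks that do not cross a
 * 4 KB boundary. Stores at most max_chunks entries in out and returns the
 * number of chunks the segment needs, which may be larger than max_chunks.
 */
static size_t split_segment(uint64_t addr, uint64_t len,
                            struct pcie_chunk *out, size_t max_chunks)
{
    size_t n = 0;

    while (len > 0) {
        /* Bytes left before the next 4 KB boundary. */
        uint64_t room = PCIE_BOUNDARY - (addr & (PCIE_BOUNDARY - 1));
        uint64_t take = len < room ? len : room;

        assert(take > 0 && take <= PCIE_BOUNDARY);
        if (n < max_chunks) {
            out[n].addr = addr;
            out[n].len = (uint32_t)take;
        }
        n++;
        addr += take;
        len -= take;
    }
    return n;
}
```

In the driver you would call something like this once per mapped scatterlist entry, passing sg_dma_address(sg) and sg_dma_len(sg) as addr and len. With the address from the question, a segment at 0xbd285800 of length 0x2000 splits into three chunks of 0x800, 0x1000 and 0x800 bytes.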

Withdrew answered 22/2, 2012 at 9:6 Comment(1)
Thanks a lot. The user buffer was OK, but the FPGA PCIe core does not handle long buffers.Particularism
