Peer-to-Peer CUDA transfers

I have heard about peer-to-peer memory transfers and read a little about them, but I could not work out how much faster they are than standard PCI-E bus transfers.

I have a CUDA application that uses more than one GPU, and I might be interested in P2P transfers. My question is: how fast is it compared to PCI-E? Can I use it frequently to have two devices communicate with each other?

Micromho answered 17/7, 2013 at 18:25 Comment(3)
P2P is just marketing speak for saying that CUDA devices can now transfer data between each other over PCI-E. The speeds will be what you expect from your PCI-E bus. More interestingly, there is also something called "peer access", which lets you launch a kernel that can read and write data from multiple devices. - Tercel
This is interesting. Can you point me to something that describes this "peer access"? Also, make this an answer; it is sufficient for me and I'll accept it! - Micromho
Added an answer with links. - Tercel

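A CUDA "peer" refers to another GPU that is capable of accessing data from the current GPU. GPUs of compute capability 2.0 and greater support this feature, but not every pair of GPUs in a system qualifies (it also depends on the platform and the PCI-E topology), so check each pair with cudaDeviceCanAccessPeer.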
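As a rough illustration of the capability check, the sketch below lists every ordered pair of GPUs and reports whether the runtime says one can access the other's memory. It assumes only the CUDA runtime API; the output wording is made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA devices found.\n");
        return 0;
    }

    // Peer capability is a property of an ordered pair of devices,
    // so query both directions for every pair.
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int can_access = 0;
            if (cudaDeviceCanAccessPeer(&can_access, a, b) != cudaSuccess) {
                std::printf("GPU %d -> GPU %d: query failed\n", a, b);
                continue;
            }
            std::printf("GPU %d -> GPU %d: peer access %s\n", a, b,
                        can_access ? "supported" : "not supported");
        }
    }
    return 0;
}
```

Compile with nvcc and run it on the target machine; a pair reported as "not supported" can still exchange data with cudaMemcpy, just not through direct peer access.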

Peer-to-peer memory copies use cudaMemcpy to copy memory over PCI-E, as shown below.

cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);

Note that dst and src can be on different devices.
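To get a feel for how fast such a copy is on a particular machine, it can simply be timed. The following is a minimal sketch, not a rigorous benchmark: it assumes at least two GPUs, uses the explicit cudaMemcpyPeer call instead of cudaMemcpy, and picks an arbitrary 256 MiB buffer. CHECK and gigabytes_per_second are helper names invented for this example.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                 \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            std::fprintf(stderr, "%s failed: %s\n", #call,          \
                         cudaGetErrorString(err_));                 \
            std::exit(EXIT_FAILURE);                                \
        }                                                           \
    } while (0)

// Hypothetical helper: convert bytes moved and elapsed milliseconds to GB/s.
double gigabytes_per_second(double bytes, double milliseconds) {
    return (bytes / 1e9) / (milliseconds / 1e3);
}

// Wait until all pending work on devices 0 and 1 has finished.
static void sync_both() {
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceSynchronize());
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    const size_t bytes = size_t(256) << 20;  // 256 MiB, an arbitrary size
    void* src = nullptr;
    void* dst = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));  // source buffer lives on GPU 0
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));  // destination buffer lives on GPU 1

    // Warm-up copy so one-time setup costs are not included in the timing.
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    sync_both();

    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    sync_both();
    auto t1 = std::chrono::steady_clock::now();

    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::printf("GPU 0 -> GPU 1: %.2f ms, %.2f GB/s\n", ms,
                gigabytes_per_second(static_cast<double>(bytes), ms));

    CHECK(cudaFree(dst));
    CHECK(cudaFree(src));
    return 0;
}
```

The figure it prints depends on the PCI-E generation, the lane count, and how the two GPUs are connected, so it is best compared against a host-to-device copy measured on the same machine.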

cudaDeviceEnablePeerAccess lets a kernel running on one device read and write memory that resides on another device. These memory accesses still go over PCI-E and are subject to the same bottlenecks.
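A minimal sketch of peer access from a kernel might look like the following. It assumes GPU 0 can access GPU 1 (which the code checks first); the kernel name scale_from_peer and the buffer names are invented for this example, and most error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Runs on GPU 0, but `in` points to memory allocated on GPU 1.
__global__ void scale_from_peer(const float* in, float* out, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = factor * in[i];
}

int main() {
    int count = 0, can_access = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) count = 0;
    if (count >= 2) cudaDeviceCanAccessPeer(&can_access, 0, 1);
    if (!can_access) {
        std::printf("Peer access from GPU 0 to GPU 1 is not available.\n");
        return 0;
    }

    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> host(n, 1.0f);

    // Input buffer on GPU 1, filled from the host.
    float* peer_in = nullptr;
    cudaSetDevice(1);
    cudaMalloc(&peer_in, bytes);
    cudaMemcpy(peer_in, host.data(), bytes, cudaMemcpyHostToDevice);

    // Output buffer on GPU 0; allow GPU 0 to dereference GPU 1 pointers.
    float* local_out = nullptr;
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // peer device 1; flags must be 0
    cudaMalloc(&local_out, bytes);

    // The kernel executes on GPU 0 and reads its input over the bus from GPU 1.
    scale_from_peer<<<(n + 255) / 256, 256>>>(peer_in, local_out, n, 2.0f);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("Kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(host.data(), local_out, bytes, cudaMemcpyDeviceToHost);
    std::printf("out[0] = %f\n", host[0]);

    cudaFree(local_out);
    cudaSetDevice(1);
    cudaFree(peer_in);
    return 0;
}
```

Because every read of `in` crosses the bus, a kernel like this is usually far slower than one working on local memory; it is most useful when only a small part of the remote data is touched.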

A good example of this is simpleP2P from the CUDA samples.

Tercel answered 17/7, 2013 at 19:58 Comment(0)
