Peer-to-Peer CUDA transfers


Asked 17/7, 2013 at 18:25 Answered 17/7, 2013 at 19:58

I heard about peer-to-peer memory transfers and read a little about them, but I could not really understand how much faster they are compared to standard PCI-E bus transfers.

I have a CUDA application that uses more than one GPU, and I might be interested in P2P transfers. My question is: how fast are they compared to PCI-E? Can I use them often to have two devices communicate with each other?

Micromho answered 17/7, 2013 at 18:25 Comment(3)

P2P is just marketing speak for saying that CUDA devices can now transfer data between each other over PCI-E. The speeds will be what you expect from your PCI-E bus. More interestingly, there is also something called "peer access", which lets you launch a kernel that can read / write data from multiple devices. – Tercel 17/7, 2013 at 19:22

This is interesting. Can you point me to something that describes this "peer access"? Also: make this an answer; it is sufficient for me and I'll accept it! – Micromho 17/7, 2013 at 19:32

Added an answer with links. – Tercel 17/7, 2013 at 20:15

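A CUDA "peer" is another GPU that can access the memory of the current GPU. GPUs with compute capability 2.0 and greater support this feature, although whether a particular pair of devices can actually act as peers also depends on the system (for example, on how the two GPUs are connected over PCI-E). cudaDeviceCanAccessPeer reports this for a given pair of devices.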
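As a minimal sketch of how to find out which GPUs in a machine can be peers, the program below loops over every ordered pair of devices and asks the runtime with cudaDeviceCanAccessPeer. The loop structure and the printed messages are only illustrative; the result depends entirely on the hardware it runs on.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA devices found\n");
        return 1;
    }

    // Peer capability is a property of an ordered pair of devices,
    // so query every (a, b) combination separately.
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int canAccess = 0;
            cudaError_t err = cudaDeviceCanAccessPeer(&canAccess, a, b);
            if (err != cudaSuccess) {
                std::printf("Query %d -> %d failed: %s\n", a, b,
                            cudaGetErrorString(err));
                continue;
            }
            std::printf("GPU %d -> GPU %d: peer access %s\n", a, b,
                        canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}
```

If a pair reports "not supported", copies between those two GPUs still work, but the runtime has to stage them through host memory.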

Peer-to-peer memory copies use cudaMemcpy to copy memory between devices over PCI-E, as shown below.

cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);

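Note that dst and src can be on different devices. This relies on unified virtual addressing, which lets the runtime work out which device each pointer belongs to; cudaMemcpyPeer is the explicit alternative that takes the two device IDs as arguments.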
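To get a feel for the actual transfer rate on a given machine, a rough measurement along the following lines may help. It is only a sketch: the device IDs 0 and 1, the 64 MiB buffer size, and the CHECK macro are assumptions made for illustration, and a single timed copy gives a coarse number rather than a careful benchmark.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper macro: report the failing call and exit main().
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t e_ = (call);                                       \
        if (e_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(e_));                      \
            return 1;                                                  \
        }                                                              \
    } while (0)

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB, arbitrary test size

    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("This sketch needs at least two GPUs\n");
        return 0;
    }

    // Allocate the source on device 0 and the destination on device 1.
    void *src = nullptr, *dst = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));

    // Time one device-to-device copy with events on device 0.
    CHECK(cudaSetDevice(0));
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    std::printf("Copied %zu bytes in %.3f ms (%.2f GB/s)\n", bytes, ms,
                (bytes / 1e9) / (ms / 1e3));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(src));
    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(dst));
    return 0;
}
```

cudaMemcpyPeer works whether or not peer access has been enabled between the two devices, so running this before and after enabling it is one way to compare a direct copy with one staged through the host.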

cudaDeviceEnablePeerAccess lets a kernel running on the current device read and write memory that was allocated on a peer device. These memory accesses still go over PCI-E and are subject to the same bottlenecks.
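A minimal sketch of peer access from inside a kernel could look like the following. The kernel name scaleFromPeer, the buffer names, and the choice of devices 0 and 1 are hypothetical; the point is only that a kernel launched on device 0 dereferences a pointer that was allocated on device 1.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Runs on device 0, reads from memory that lives on device 1.
__global__ void scaleFromPeer(const float* peerIn, float* localOut, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) localOut[i] = 2.0f * peerIn[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    int count = 0, canAccess = 0;
    cudaGetDeviceCount(&count);
    if (count >= 2) cudaDeviceCanAccessPeer(&canAccess, 0, 1);
    if (!canAccess) {
        std::printf("Device 0 cannot access device 1 as a peer\n");
        return 0;
    }

    // Input buffer on device 1.
    float* peerIn = nullptr;
    cudaSetDevice(1);
    cudaMalloc(&peerIn, bytes);
    cudaMemset(peerIn, 0, bytes);
    cudaDeviceSynchronize();  // finish the memset before device 0 reads

    // Output buffer on device 0, plus access to device 1's memory.
    float* localOut = nullptr;
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // second argument (flags) must be 0
    cudaMalloc(&localOut, bytes);

    // The kernel executes on device 0; every read of peerIn crosses the bus.
    scaleFromPeer<<<(n + 255) / 256, 256>>>(peerIn, localOut, n);
    cudaError_t err = cudaDeviceSynchronize();
    std::printf("Kernel with peer reads: %s\n", cudaGetErrorString(err));

    cudaFree(localOut);
    cudaSetDevice(1);
    cudaFree(peerIn);
    return 0;
}
```

Peer access is one-directional: the call above lets device 0 reach device 1's memory, and a matching call made while device 1 is current would be needed for the other direction.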

A good example of this is simpleP2P from the CUDA samples.

Tercel answered 17/7, 2013 at 19:58 Comment(0)
