I have heard about peer-to-peer (P2P) memory transfers and read a bit about them, but I could not work out how much faster they are compared to standard PCI-E bus transfers.
I have a CUDA application that uses more than one GPU, and I might be interested in P2P transfers. My question is: how fast is P2P compared to PCI-E? Can I use it frequently to have two devices communicate with each other?
p2p is just marketing speak for saying that CUDA devices can now transfer data between each other directly over PCI-E. The speeds will be what you would expect from your PCI-E bus. More interestingly, there is also something called "peer access", which lets you launch a kernel that can read and write data residing on multiple devices. – Tercel
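To put a number on "how fast" for a particular machine, one option is simply to time a device-to-device copy. The following is a minimal sketch, not a definitive benchmark: it assumes at least two GPUs with device IDs 0 and 1, and the `CHECK` macro, the `addOne` kernel, and the 256 MiB buffer size are illustrative choices that are not part of the original post. It times `cudaMemcpyPeer` with CUDA events and, if the driver reports that peer access is possible, also launches a kernel on device 0 that writes directly to memory allocated on device 1.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the CUDA error and bail out of main().
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t e = (call);                                         \
        if (e != cudaSuccess) {                                         \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,          \
                    cudaGetErrorString(e));                             \
            return 1;                                                   \
        }                                                               \
    } while (0)

// Illustrative kernel: launched on device 0, but the pointer it
// dereferences was allocated on device 1 (this is "peer access").
__global__ void addOne(float *peerData, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) peerData[i] += 1.0f;
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) { printf("Need at least two GPUs\n"); return 0; }

    // Ask whether device 0 can directly address device 1's memory.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 0, 1));
    printf("Device 0 -> 1 peer access possible: %s\n", canAccess ? "yes" : "no");

    const size_t n = 64 << 20;               // 64M floats = 256 MiB (arbitrary)
    const size_t bytes = n * sizeof(float);
    float *d0 = nullptr, *d1 = nullptr;
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&d1, bytes));
    CHECK(cudaMemset(d1, 0, bytes));
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&d0, bytes));
    if (canAccess) CHECK(cudaDeviceEnablePeerAccess(1, 0)); // flags must be 0

    // Time one copy from device 1 to device 0. Without peer access the
    // copy still works, but the runtime stages it through host memory.
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start));
    CHECK(cudaMemcpyPeer(d0, 0, d1, 1, bytes));
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("Copied %zu bytes in %.3f ms (%.2f GB/s)\n",
           bytes, ms, bytes / (ms * 1e6));

    // With peer access enabled, a kernel on device 0 can touch d1 directly.
    if (canAccess) {
        addOne<<<(unsigned)((n + 255) / 256), 256>>>(d1, n);
        CHECK(cudaGetLastError());
        CHECK(cudaDeviceSynchronize());
    }

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d0));
    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(d1));
    return 0;
}
```

Built with `nvcc`, this prints the measured bandwidth for one copy; a single timed transfer is noisy, so averaging several runs would give a more trustworthy figure to compare against your PCI-E link's nominal rate.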