InfiniBand explained

3

19

Can anybody explain what InfiniBand is? What are the key differences compared with Ethernet, and how do these differences allow it to be faster than Ethernet?

The official description from Mellanox says:

Introduce InfiniBand, a switch-based serial I/O interconnect architecture operating at...

What does it mean that InfiniBand is a switch-based interconnect? I found this description, but it does not explain what happens if several inputs want to write to a single output. How is the collision resolved?

It is also said that InfiniBand has end-to-end flow control. Does it mean that there is no (need) for any other (in-between) flow control? Why?

Gerardogeratology answered 25/10, 2017 at 13:16 Comment(0)
28

The key difference between Ethernet and InfiniBand, and the one that makes InfiniBand faster, is RDMA (Remote Direct Memory Access). DMA (in networking) is an operation in which the NIC (Network Interface Controller) accesses memory directly, without involving the CPU. RDMA is the same idea, but the direct memory access is performed on behalf of a remote machine.

More differences:

  1. Communication is done between QPs (Queue Pairs) instead of channels.
  2. Data flows between user space and the HW directly, instead of going through the kernel stack.

A basic RDMA flow between a requestor and a responder would consist of:

  1. Handshake - exchange details between requestor and responder (mainly allocated memory addresses and access keys).
  2. Create a READ/WRITE/ATOMIC request on the requestor side.
  3. Send the request to the responder.
  4. Directly access the memory on the responder side.
  5. If READ/ATOMIC - send the data read from responder's memory back to the requestor.
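The steps above can be sketched as a toy model in plain Python. This is not real verbs code: the class names (`MemoryRegion`, `ResponderNic`), the addresses, the key value and the little-endian 8-byte counter are all assumptions made for illustration. The point it tries to show is that, after the handshake, the requestor only needs an address and an access key, and the responder's "NIC" serves the request without any responder-side application code running.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRegion:
    """A registered buffer on the responder, guarded by a remote access key."""
    base_addr: int
    rkey: int
    data: bytearray


@dataclass
class ResponderNic:
    """Toy stand-in for the responder's NIC (hypothetical, not a real API)."""
    regions: dict = field(default_factory=dict)  # rkey -> MemoryRegion

    def register(self, region):
        self.regions[region.rkey] = region
        # Step 1 (handshake): these are the details sent to the requestor.
        return region.base_addr, region.rkey, len(region.data)

    def _locate(self, addr, rkey, length):
        # Check the access key and the bounds before touching memory.
        region = self.regions.get(rkey)
        if region is None:
            raise PermissionError("unknown access key")
        start = addr - region.base_addr
        if start < 0 or start + length > len(region.data):
            raise ValueError("access outside the registered region")
        return region, start

    def rdma_write(self, addr, rkey, payload):
        region, start = self._locate(addr, rkey, len(payload))
        region.data[start:start + len(payload)] = payload

    def rdma_read(self, addr, rkey, length):
        region, start = self._locate(addr, rkey, length)
        return bytes(region.data[start:start + length])

    def atomic_fetch_add(self, addr, rkey, delta):
        # Returns the old value, as a fetch-and-add does.
        region, start = self._locate(addr, rkey, 8)
        old = int.from_bytes(region.data[start:start + 8], "little")
        new = (old + delta) % 2**64
        region.data[start:start + 8] = new.to_bytes(8, "little")
        return old


def demo():
    nic = ResponderNic()
    addr, rkey, _size = nic.register(MemoryRegion(0x1000, 0xBEEF, bytearray(16)))
    nic.rdma_write(addr, rkey, b"hello")            # steps 2-4: WRITE
    text = nic.rdma_read(addr, rkey, 5)             # steps 2-5: READ
    old = nic.atomic_fetch_add(addr + 8, rkey, 3)   # steps 2-5: ATOMIC
    counter = int.from_bytes(nic.rdma_read(addr + 8, rkey, 8), "little")
    return text, old, counter
```

Calling `demo()` writes five bytes, reads them back, and bumps a counter stored at offset 8 of the same region. A request with the wrong key or an out-of-range address is rejected before any memory is touched.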

Main benefits:

  1. No CPU access on the responder side - throughput is limited only by the HW (NIC & PCI).
  2. No SW runs on the responder side - this allows much lower latency (~10 times lower than typical TCP/UDP latency).
  3. Supports "polling mode" for completions on the requestor side, meaning the SW knows immediately once the HW has finished transmitting. This allows lower latency and higher throughput, at the expense of high CPU utilization.
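The trade-off in the last point can be illustrated with a small simulation. Everything here is hypothetical (`CompletionQueue`, `FakeHardware` and `busy_poll` are names invented for this sketch, and "ticks" stand in for time passing). It only shows the shape of polling mode: the software keeps asking a non-blocking queue for a completion, so it notices the completion on the very next check, but it burns CPU on every empty poll.

```python
from collections import deque


class CompletionQueue:
    """Toy completion queue that the simulated hardware appends to."""

    def __init__(self):
        self._entries = deque()

    def push(self, work_request_id):
        self._entries.append(work_request_id)

    def poll(self):
        """Non-blocking: return one completion id, or None if the queue is empty."""
        return self._entries.popleft() if self._entries else None


class FakeHardware:
    """Finishes one work request after a fixed number of ticks (an assumption)."""

    def __init__(self, cq, work_request_id, ticks_needed):
        self.cq = cq
        self.work_request_id = work_request_id
        self.ticks_left = ticks_needed

    def tick(self):
        if self.ticks_left > 0:
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.cq.push(self.work_request_id)


def busy_poll(cq, hw, max_spins=1_000_000):
    """Spin on the queue; return (completion id, number of empty polls)."""
    empty_polls = 0
    for _ in range(max_spins):
        hw.tick()  # stands in for time passing while the CPU spins
        done = cq.poll()
        if done is not None:
            return done, empty_polls
        empty_polls += 1
    raise TimeoutError("no completion seen")
```

The count of empty polls is the CPU cost: an interrupt-driven design would sleep during that time instead, and would pay for it with wake-up latency.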

For more information, please refer to the InfiniBand spec (sorry, it is very long).

Related traffic protocols:

  • RoCE (RDMA over Converged Ethernet), which implements RDMA over an Ethernet fabric by wrapping InfiniBand transport packets in an Ethernet (L2) header and, in RoCE v2, IP and UDP (L3/L4) headers as well.

  • IPoIB (IP over InfiniBand), which implements regular networking (through the kernel stack) over an InfiniBand fabric by wrapping L3/L4 packets with InfiniBand headers.
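As a rough picture of what "wrapping" means in the two protocols above, the sketch below lists the headers from outermost to innermost. It is deliberately simplified: trailers such as CRCs and some smaller headers are left out, and the `LAYERINGS` table and `describe` function are names made up for this example.

```python
# Headers listed from the outermost (first on the wire) to the payload.
# LRH = InfiniBand link-level header, GRH = InfiniBand global route header,
# BTH = InfiniBand base transport header.
LAYERINGS = {
    "infiniband": ["IB LRH", "IB BTH", "payload"],
    "roce_v1": ["Ethernet", "IB GRH", "IB BTH", "payload"],
    "roce_v2": ["Ethernet", "IP", "UDP", "IB BTH", "payload"],
    "ipoib": ["IB LRH", "IB BTH", "IP", "TCP or UDP", "payload"],
}

# UDP destination port used by RoCE v2.
ROCE_V2_UDP_PORT = 4791


def describe(protocol):
    """Return the simplified header stack for a protocol as one string."""
    return " | ".join(LAYERINGS[protocol])


def carries_ib_transport(protocol):
    """True if RDMA semantics (the IB transport header) ride inside other headers."""
    stack = LAYERINGS[protocol]
    return "IB BTH" in stack and stack[0] != "IB LRH"
```

Read this way, RoCE keeps the InfiniBand transport and swaps the layers underneath it, while IPoIB keeps the InfiniBand layers underneath and carries ordinary IP traffic on top.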

Hope this helps.

Gaynellegayner answered 20/11, 2017 at 21:30 Comment(2)
With 'NIC' you mean the network interface controller? - Acescent
@OkLetsdothis NIC==network interface card - Maize
7

To learn the basics of InfiniBand, I suggest visiting the Mellanox Academy website and, after registering, taking the InfiniBand Essentials or InfiniBand Fundamentals course (in the Technologies section).

In my opinion, "switch-based architecture" means that switches are part of the fabric (see the picture below, where I have shown the switch as a blue shape).

[Image: InfiniBand fabric diagram, with the switch shown as a blue shape]

End-to-end flow control, aka message-level flow control, is a feature (capability) of reliable connections. A responder can use it to optimize the use of its receive resources. Essentially, a requester cannot send a request message unless it has the appropriate credits to do so. Please refer to the InfiniBand specification for details.
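The credit idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the mechanism as the specification defines it: one credit equals one free receive buffer, and the `Responder`, `Requester`, `try_send` and `on_ack` names are invented for the example.

```python
class Responder:
    """Owns a fixed number of receive buffers (the resource being protected)."""

    def __init__(self, receive_buffers):
        self.free_buffers = receive_buffers
        self.inbox = []

    def receive(self, message):
        if self.free_buffers == 0:
            raise RuntimeError("requester sent without a credit")
        self.free_buffers -= 1
        self.inbox.append(message)

    def consume_one(self):
        """The application handles a message, which frees one receive buffer."""
        message = self.inbox.pop(0)
        self.free_buffers += 1
        return message


class Requester:
    """Sends only while it holds credits advertised by the responder."""

    def __init__(self, responder):
        self.responder = responder
        # Assumption: initial credits are learned when the connection is set up.
        self.credits = responder.free_buffers

    def try_send(self, message):
        if self.credits == 0:
            return False  # must wait for the responder to advertise credits
        self.credits -= 1
        self.responder.receive(message)
        return True

    def on_ack(self, credits_advertised):
        """An acknowledgement carries the responder's current credit count."""
        self.credits = credits_advertised


def demo():
    responder = Responder(receive_buffers=2)
    requester = Requester(responder)
    results = [requester.try_send(m) for m in ("a", "b", "c")]
    responder.consume_one()                    # frees one buffer
    requester.on_ack(responder.free_buffers)   # credit comes back
    results.append(requester.try_send("c"))
    return results
```

With two receive buffers, the third send is refused until the responder has consumed a message and advertised the freed buffer, so the responder is never handed a message it has nowhere to put.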

Disk answered 17/11, 2017 at 9:24 Comment(0)
2

Technical Information

It is also said that InfiniBand has end-to-end flow control.

Intra-fabric traffic flow is controlled via a daemon called the Subnet Manager (often just called an "SM"). A well-known open source implementation (opensm) currently supports 9 different routing algorithms (Min Hop, UPDN, DNUP, Fat Tree, Torus-2QoS, etc.). Many pages could be written about these algorithms and their different approaches to flow control.
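To give a feel for what a routing algorithm such as Min Hop has to produce, here is a minimal sketch that computes, for one destination, the next hop from every node along a shortest path. It is not opensm's implementation, and a real subnet manager does considerably more. The `min_hop_table` function and the small example fabric are assumptions made for illustration.

```python
from collections import deque


def min_hop_table(links, destination):
    """Return {node: next hop towards destination} using breadth-first search.

    `links` maps each node name to the set of its directly connected neighbors.
    """
    next_hop = {destination: None}
    queue = deque([destination])
    while queue:
        node = queue.popleft()
        # Sorting only makes tie-breaking deterministic in this toy example.
        for neighbor in sorted(links[node]):
            if neighbor not in next_hop:
                next_hop[neighbor] = node  # first visit is a shortest path
                queue.append(neighbor)
    return next_hop


# A hypothetical fabric: three hosts attached to two connected switches.
FABRIC = {
    "hostA": {"sw1"},
    "hostB": {"sw2"},
    "hostC": {"sw2"},
    "sw1": {"hostA", "sw2"},
    "sw2": {"hostB", "hostC", "sw1"},
}
```

For example, `min_hop_table(FABRIC, "hostB")` tells `sw1` to forward towards `sw2`, and `sw2` to deliver to `hostB`. Running it once per destination gives each switch a full forwarding table.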

Does it mean that there is no (need) for any other (in-between) flow control? Why?

Inter-fabric traffic flow typically requires a protocol that can route to and from InfiniBand networks and other network types. LNet is an example of a protocol that can do that.

General Information

Can anybody explain what InfiniBand is?

This question is very broad, so I will attempt to add some more general information as a complement to the existing answers.

Future Roadmap

There are currently multiple generations of InfiniBand (QDR, FDR, EDR), with HDR hopefully coming out at some point in 2018 or 2019. Yes, this may become dated quickly, so refer to the roadmap for current information. The upcoming generations are called NDR and XDR, but they do not even have tentative dates on the current roadmap.

Key Organizations

Important organizations include the InfiniBand Trade Association (IBTA) and the OpenFabrics Alliance (OFA). Refer to their websites for plenty of good InfiniBand information.

Olwen answered 21/7, 2018 at 2:26 Comment(1)
btw, appreciate your nice edit of my useless-use-of-cat answer; i have accepted your edit :-) - Janeanjaneczka
