Conference Paper

Reliable Multicast Protocol Using Infiniband Remote Direct Memory Access

2018 International Conference on Computer Applications in Industry and Engineering
Shah Mansoor, J. Michael Meeha

ABSTRACT


Algorithms for scientific computing on supercomputer clusters make extensive use of one-to-many and many-to-many messaging between nodes. Many of these codes are implemented using the standard MPI_BCAST (broadcast) library function. Unfortunately, the implementation of these operations in MPI relies on reliable point-to-point level-three transmissions rather than true hardware-enabled level-two multicast messaging. This is because hardware-enabled multicast is not available on all systems and, in addition, is datagram in nature, that is, it offers no delivery guarantee of its own. The semantics of MPI_BCAST must guarantee completion of delivery to all recipients before the sending process continues, although the way MPI_BCAST is implemented raises an issue in this regard. We define and implement a reliable multicast protocol that uses the hardware-enabled multicast remote direct memory access (RDMA) capabilities of Infiniband connections commonly found in high-end computing clusters. Our protocol uses cumulative acknowledgments propagated up binomial trees. We present benchmark speedup results against a standard MPI_BCAST implementation on a cluster equipped with an Infiniband switch. We demonstrate that our approach provides faster multicast messaging and guarantees delivery to all recipients before the transmitting process continues.
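The abstract does not spell out how acknowledgments travel up the binomial tree, so the following is only a minimal sketch of one plausible reading, not the authors' protocol. It assumes ranks 0..size-1 with the sender at rank 0, a parent found by clearing the lowest set bit of a rank, and a per-node "highest sequence number received without gaps" value. The names `parent`, `children`, `cumulative_ack`, and `received` are hypothetical and chosen for illustration.

```python
from typing import List, Optional


def parent(rank: int) -> Optional[int]:
    """Parent of `rank` in a binomial tree rooted at rank 0 (assumed layout).

    Clearing the lowest set bit of the rank gives the parent; the root has none.
    """
    if rank == 0:
        return None
    return rank & (rank - 1)


def children(rank: int, size: int) -> List[int]:
    """Children of `rank`: set one bit below the lowest set bit of `rank`."""
    lowest = rank & -rank  # 0 for the root, which may use every bit
    kids = []
    bit = 1
    while rank + bit < size and (rank == 0 or bit < lowest):
        kids.append(rank | bit)
        bit <<= 1
    return kids


def cumulative_ack(rank: int, size: int, received: List[int]) -> int:
    """Acknowledgment that `rank` would send to its parent (illustrative).

    `received[r]` is the highest sequence number node r holds with no gaps.
    A node acknowledges the minimum over itself and its whole subtree, so a
    single number per edge summarizes every receiver below it.
    """
    ack = received[rank]
    for child in children(rank, size):
        ack = min(ack, cumulative_ack(child, size, received))
    return ack


if __name__ == "__main__":
    size = 8
    received = [5, 5, 3, 5, 5, 4, 5, 5]  # made-up example state
    print("root children:", children(0, size))
    print("ack at root:", cumulative_ack(0, size, received))
```

Under these assumptions the sender hears from only about log2(size) children rather than from every receiver, and it can treat every message up to the acknowledged sequence number as delivered to all nodes before it continues.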

CAINE 2018



ISBN: 978-1-943436-04-0
PUBLISHER: ACEE
CHIEF EDITOR: Debnath
CONFERENCE VENUE: San Diego, California, USA
CONTACT DETAILS: Debnath