What is RDMA (Remote Direct Memory Access)

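RDMA (Remote Direct Memory Access) is a high-performance networking technology that lets one computer read or write another computer's memory directly over a network. Once a connection is set up, the network adapter (NIC) moves the data itself, so the data path bypasses the operating system kernel and needs little or no CPU work on either the sender or the receiver.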

What RDMA Does

Traditionally, when data is transferred between systems:

    It’s copied from user space to kernel space.

    Passed through the network stack.

    Received in the kernel and copied again to user space.

With RDMA, data can be read/written directly from the memory of one machine to another, bypassing the kernel and reducing CPU usage and latency.
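The difference between the two paths can be sketched as a small simulation in C. This is only a model under stated assumptions: the names (transfer_stats, traditional_send, rdma_style_write) are hypothetical, the "kernel buffers" are plain arrays, and memcpy stands in for the NIC's DMA engine. It counts how many times the CPU has to copy the payload on each path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 256

/* Hypothetical bookkeeping: copies of the payload performed by the CPU. */
struct transfer_stats {
    size_t cpu_copies;   /* number of CPU-driven copies */
    size_t bytes_copied; /* total bytes moved by those copies */
};

/* A copy the CPU performs (user space <-> kernel space). */
static void cpu_copy(struct transfer_stats *s, void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    s->cpu_copies++;
    s->bytes_copied += n;
}

/* Stand-in for NIC DMA: data moves without the CPU touching it. */
static void nic_dma(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Traditional path: user -> sender kernel buffer -> wire ->
 * receiver kernel buffer -> user. Returns 0 on success, -1 if too large. */
int traditional_send(const void *user_src, void *user_dst, size_t n,
                     struct transfer_stats *s)
{
    char sender_kernel[MAX_PAYLOAD];
    char receiver_kernel[MAX_PAYLOAD];

    assert(user_src && user_dst && s);
    if (n > MAX_PAYLOAD)
        return -1;
    cpu_copy(s, sender_kernel, user_src, n);    /* copy 1: user -> kernel */
    nic_dma(receiver_kernel, sender_kernel, n); /* across the network */
    cpu_copy(s, user_dst, receiver_kernel, n);  /* copy 2: kernel -> user */
    return 0;
}

/* RDMA-style path: the NIC moves data straight between user buffers,
 * so the stats are left untouched. */
int rdma_style_write(const void *user_src, void *user_dst, size_t n,
                     struct transfer_stats *s)
{
    assert(user_src && user_dst && s);
    nic_dma(user_dst, user_src, n);
    return 0;
}
```

In this model the traditional path reports two CPU copies per message and the RDMA-style path reports none, which is the "zero copy" property described in this post.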

How RDMA Works

    For one-sided operations (RDMA read and write), the CPU on the remote side is not interrupted or otherwise involved.

    No context switches or system calls during data transfer.

    Uses zero-copy principles.

    Memory regions are pre-registered with the NIC.

    The NIC (RDMA-capable) directly reads/writes from/to memory.
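The registration step can be illustrated with a small simulation in C. This is a sketch, not the libibverbs API: sim_mr, sim_register, and sim_rdma_write are hypothetical names. In real code, registration is done with ibv_reg_mr, which pins the memory and returns keys; the remote peer must present the matching remote key (rkey) and stay inside the registered range. The model below mimics only those checks, with memcpy standing in for the NIC's DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a registered memory region. */
struct sim_mr {
    uint8_t *base;            /* start of the registered buffer */
    size_t   length;          /* size of the registered buffer */
    uint32_t rkey;            /* key a remote peer must present */
    int      allow_remote_write;
};

/* "Register" a buffer: record its bounds, key, and access rights. */
struct sim_mr sim_register(uint8_t *base, size_t length, uint32_t rkey,
                           int allow_remote_write)
{
    struct sim_mr mr;

    assert(base != NULL);
    mr.base = base;
    mr.length = length;
    mr.rkey = rkey;
    mr.allow_remote_write = allow_remote_write;
    return mr;
}

/* Simulated one-sided write as the NIC would validate it.
 * Returns 0 on success, -1 if the key, permission, or bounds check fails. */
int sim_rdma_write(struct sim_mr *mr, uint32_t rkey, size_t offset,
                   const void *src, size_t len)
{
    assert(mr != NULL && src != NULL);
    if (rkey != mr->rkey)
        return -1;                        /* wrong key */
    if (!mr->allow_remote_write)
        return -1;                        /* region not writable remotely */
    if (offset > mr->length || len > mr->length - offset)
        return -1;                        /* outside the registered range */
    memcpy(mr->base + offset, src, len);  /* stands in for NIC DMA */
    return 0;
}
```

The point of the sketch is that access control happens against state set up in advance, so no kernel call is needed on the target for each transfer.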

RDMA Protocols

    InfiniBand

        High-performance computing (HPC) interconnect standard.

        Low-latency, high-bandwidth.

    RoCE (RDMA over Converged Ethernet)

        Runs RDMA over Ethernet; RoCE v2 is carried in UDP/IP, so it can be routed.

        Usually deployed on lossless Ethernet (DCB - Data Center Bridging).

    iWARP

        RDMA over the standard TCP/IP stack.

        Works on ordinary routed IP networks without lossless Ethernet, typically at somewhat higher latency.


RDMA Advantages

Benefit                Description

Ultra-low latency      No kernel involvement or context switches.
Zero copy              No intermediate memory buffers or CPU copying.
High throughput        NIC handles data transfer directly.
Low CPU utilization    Frees CPU for application-level processing.

RDMA vs Traditional Networking

Feature            Traditional Networking    RDMA

CPU Involvement    High                      Low
Memory Copy        Multiple copies           Zero copy
System Calls       Yes                       No (once set up)
Latency            Higher                    Ultra-low
Performance        Good                      Excellent


Hardware Requirements

To use RDMA, you typically need:

    RDMA-capable NICs (e.g., NVIDIA (formerly Mellanox), Intel, Broadcom)

    Lossless network support (especially for RoCE)

    RDMA drivers and libraries: rdma-core, libibverbs, libmlx5

    For programming: the Verbs API, RDMA CM, or a higher-level library such as Libfabric. Storage stacks such as NVMe-oF and DAOS use RDMA underneath.

Common Use Cases

    High-speed datacenter communication

    Distributed databases (e.g., Oracle RAC)

    HPC clusters

    Storage systems (e.g., NVMe over Fabrics)

    Machine learning training clusters

    Real-time data replication


RDMA in Code (Simplified Concept)

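Here’s a simplified view of an RDMA write vs a traditional send:

// Traditional: a system call; the kernel copies the data into socket buffers

send(socket, buffer, length, 0);

// RDMA (librdmacm): queues a write on an established connection (id). The NIC
// moves the data from a registered buffer (mr) to remote_addr, authorized by rkey.

rdma_post_write(id, NULL, local_addr, length, mr, IBV_SEND_SIGNALED, remote_addr, rkey);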
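Posting a write only queues a request; the application learns that it finished by polling a completion queue. With libibverbs those two steps are ibv_post_send and ibv_poll_cq. The sketch below is a self-contained simulation of that post-then-poll pattern, not the real API: sim_qp, sim_post_write, sim_nic_process, and sim_poll_cq are hypothetical names, and sim_nic_process stands in for work the NIC hardware would do on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_DEPTH 4

struct sim_wr { uint64_t wr_id; const void *src; void *dst; size_t len; };
struct sim_wc { uint64_t wr_id; int status; };   /* status 0 = success */

/* Hypothetical queue pair: a send queue plus its completion queue. */
struct sim_qp {
    struct sim_wr send_queue[QUEUE_DEPTH];
    size_t sq_head, sq_count;
    struct sim_wc comp_queue[QUEUE_DEPTH];
    size_t cq_head, cq_count;
};

/* Queue a write request. Returns 0, or -1 if the send queue is full. */
int sim_post_write(struct sim_qp *qp, uint64_t wr_id,
                   const void *src, void *dst, size_t len)
{
    assert(qp && src && dst);
    if (qp->sq_count == QUEUE_DEPTH)
        return -1;
    size_t slot = (qp->sq_head + qp->sq_count) % QUEUE_DEPTH;
    qp->send_queue[slot] = (struct sim_wr){ wr_id, src, dst, len };
    qp->sq_count++;
    return 0;
}

/* Stand-in for the NIC: drain the send queue, move the data,
 * and append one completion per finished request. */
void sim_nic_process(struct sim_qp *qp)
{
    assert(qp);
    while (qp->sq_count > 0 && qp->cq_count < QUEUE_DEPTH) {
        struct sim_wr *wr = &qp->send_queue[qp->sq_head];
        memcpy(wr->dst, wr->src, wr->len);
        size_t slot = (qp->cq_head + qp->cq_count) % QUEUE_DEPTH;
        qp->comp_queue[slot] = (struct sim_wc){ wr->wr_id, 0 };
        qp->cq_count++;
        qp->sq_head = (qp->sq_head + 1) % QUEUE_DEPTH;
        qp->sq_count--;
    }
}

/* Non-blocking poll: copies up to max completions into out and
 * returns how many were found (0 if none are ready). */
int sim_poll_cq(struct sim_qp *qp, struct sim_wc *out, int max)
{
    int n = 0;

    assert(qp && out);
    while (n < max && qp->cq_count > 0) {
        out[n++] = qp->comp_queue[qp->cq_head];
        qp->cq_head = (qp->cq_head + 1) % QUEUE_DEPTH;
        qp->cq_count--;
    }
    return n;
}
```

A caller would post a request, keep doing other work, and poll until a completion with the matching wr_id appears. Both the post and the poll operate on queues in user-space memory, which is why no system call is needed per transfer once setup is done.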


Brian Wilson (GT1) 7-7-25
