What is Remote Direct Memory Access (RDMA)?
Remote Direct Memory Access, or RDMA for short, enables data to be moved quickly and efficiently from the main memory of one computer to the main memory of another. It is essential when dealing with large data sets or complex workloads, such as those found in machine learning. Let’s get into the details of RDMA and how it works.
What is Remote Direct Memory Access (RDMA)?
Remote Direct Memory Access (RDMA) is a technology that allows two networked computers to exchange data in their main memory without involving the processor, cache, or operating system of either computer. Like local Direct Memory Access (DMA), RDMA frees up system resources, improving throughput and performance. This results in faster data transfer rates and lower latency between RDMA-enabled systems. RDMA benefits both networking and storage applications.
The concept of zero-copy networking is central to RDMA. It enables direct data reading from the main memory of one computer and writing to the main memory of another. This bypasses the kernel networking stack in both computers, enhancing network performance.
Consequently, communications between the two systems are completed much faster than in comparable non-RDMA networked systems.
Remote Direct Memory Access has proven useful in applications that require fast, large-scale parallel high-performance computing (HPC) clusters and data center networks. It benefits big data analytics, supercomputing environments, and machine learning tasks that require low latency and high transfer rates. Additionally, RDMA is used between nodes in compute clusters and for latency-sensitive database workloads.
Overall, RDMA serves as a powerful tool for enhancing data transfer performance in specialized applications. To better understand how RDMA achieves these improvements, it is important to look at the specific steps and mechanisms involved in its operation.
How does RDMA work?
Remote Direct Memory Access unlocks higher transfer speeds by circumventing the operating system’s data buffers. Data can then be transferred directly from the network adapter to the application memory and vice versa.
In a traditional network data path, data must pass through the kernel networking stack of both the sender’s and the receiver’s operating systems, from the TCP and IPv4/IPv6 layers down to the device driver. RDMA, on the other hand, bypasses the operating system kernel and allows the client system to copy data from the memory of the storage server directly into its own. This direct execution of I/O transactions between the network adapter and application memory avoids unnecessary data copies and frees up the CPU. The result is higher throughput and lower latency, making remote file storage perform similarly to directly attached block storage. Because networked data transfers consume fewer CPU cycles, more resources remain available for performance-demanding applications.
Benefits of Remote Direct Memory Access
- Speeds up data exchange between computers’ memories, making transfers quicker.
- Takes data transfer tasks off the CPU, freeing it up for other jobs.
- Cuts down the time data takes to travel between computers, making responses faster.
- Makes better use of network resources, reducing congestion and optimizing bandwidth.
- Efficiently handles data transfers over large networks, making it great for big computing setups.
Network Protocols That Support RDMA
- RDMA over Converged Ethernet. RoCE is a network protocol that enables Remote Direct Memory Access over an Ethernet network. RoCE v1 runs directly over the Ethernet link layer, while RoCE v2 encapsulates RDMA traffic in UDP/IP packets so that it can be routed across subnets.
- Internet Wide Area RDMA Protocol. iWARP uses Transmission Control Protocol (TCP) or Stream Control Transmission Protocol (SCTP) to transmit data. It was developed by the Internet Engineering Task Force (IETF) to allow applications on one server to directly read from or write to applications running on another server without support from the operating system on either side.
- InfiniBand. RDMA is the standard transport for high-speed InfiniBand network connections. This RDMA network protocol is commonly used for inter-system communication and first became popular in high-performance computing environments. Because InfiniBand can quickly connect large computer clusters, it has spread to additional use cases, such as big data environments, databases, highly virtualized settings, and web applications with large resource requirements.
Conclusion
Remote Direct Memory Access is a technology that is hugely beneficial to media workflows. Not only does it increase the connection throughput by a significant margin, but it also brings lower latency compared to TCP. By bypassing the operating system networking stack and allowing the network adapter direct access to application memory, RDMA significantly reduces CPU utilization on both the host and the server side. This is of great value for media use cases as the freed-up CPU resources can be used by the applications, for example, to reduce the time it takes to render a timeline. RDMA is also a great match with flash-based storage as it helps to solve the transmission bottleneck by allowing more efficient utilization of the performance offered by the storage.
RDMA can bring clear performance benefits to your storage environment and allow for workflows that would otherwise not be possible.