Abstract: Data centers are being built around the world to meet the exponentially increasing demand for cloud computing. The same demand is driving network speeds from 10Gb/s to 100Gb/s or higher and end-to-end latency from milliseconds down to microseconds. The traditional software-based TCP/IP protocol stack, however, cannot keep up with the increasing network speed. Consequently, RDMA (Remote Direct Memory Access), first introduced in HPC, is now experiencing a renaissance in Ethernet-based data centers, at a much larger scale. In this talk, we will discuss the safety and performance challenges that we have encountered in making RDMA deployable and manageable at scale, including RDMA transport livelock, PFC deadlock, PFC pause frame storms, and many others. We will also cover the solutions that we have devised to address these challenges, the lessons we have learned, and several future research directions. Finally, we will explore the new role that RDMA will play in a more heterogeneous system and networking infrastructure for the new era of machine learning.
Chuanxiong Guo is a Principal Researcher at Microsoft Research Redmond. Before that, he was a Principal Software Engineering Manager at Microsoft Azure, and before joining Microsoft Azure he was a Senior Researcher at Microsoft Research Asia. His areas of interest include large-scale networked systems, data center networking, cloud computing, and systems availability and troubleshooting. He was among the first to start data center networking research and has won several best paper awards for his work in that area. Several of his visions, including DCN virtualization, DCN monitoring, and ServerSwitch, have had both academic and industrial impact. Many of the systems that he has designed and implemented, including CloudBrain, RDMA/RoCEv2, Pingmesh, NetBouncer, and DiffServ for data centers, are widely deployed in Microsoft production data centers, run on millions of servers, and are indispensable infrastructure services and technologies for both Azure and Bing.
Network data-plane programmability is here to stay. It is going to bring about a fundamental shift in the way people build, operate, and manage their networks because network owners can now define -- even repeatedly -- exactly how packets are to be processed in their networking devices all the way down to the wire. Two key technologies enable this fundamental shift in networking: i) PISA (Protocol-Independent Switch Architecture), a novel machine architecture for fully-programmable high-performance packet processors, and ii) P4, a domain-specific declarative language to dictate the packet-processing logic in a target-independent fashion.
In this talk, I’ll first explain what PISA is, how it works, what design principles it is built on, and why it has become possible now. Then I’ll introduce a few killer applications of data-plane programmability and show how one can develop similar applications in P4 on their own and with ease. In particular, I’ll demonstrate how easy and yet powerful network monitoring and analysis tasks become when one can program their own data plane. I’ll also explain how one can offload more elaborate upper-layer networking applications, such as middlebox functions, down to the programmable network devices.
I predict that the next few years will see a multitude of PISA incarnations, and that people will come to realize that PISA is simply a new type of machine that offers an enormous amount of I/O capacity along with predictable performance and some computing capabilities. This realization will ultimately open the gate to jointly engineering a network and the distributed applications running on top of it; I’ll conclude my talk by showing a glimpse of such an attempt.
Changhoon (Chang) Kim is Director of Architecture at Barefoot Networks and is actively working for the P4 Language Consortium (P4.org). Before getting involved with P4.org and Barefoot, he worked at Windows Azure, Microsoft's cloud-service division, and led engineering and research projects on the architecture, performance, and management of datacenter networks. Chang is interested in programmable networks, network monitoring and diagnostics, network verification, self-programming/configuring networks, and debugging and diagnosis of large-scale distributed systems. Chang received his Ph.D. from Princeton University. Many of his research contributions – including VL2, Tiny Packet Programs, Seawall, EyeQ, Ananta, and SEATTLE – have been adopted in large production networks. Chang was the recipient of the Microsoft Rockstar Award 2013, an annual recognition for the strongest individual networking contributions Microsoft-wide.