6th Asia-Pacific Workshop on Networking (APNet 2022)

2022, Fuzhou, China

APNet SIGCOMM (China) Talks

Abstract: The network infrastructure in reality consists of network devices with heterogeneous device models, due to competition among multiple device vendors and the coexistence of legacy and latest devices. We define the process of introducing heterogeneous network devices (e.g., legacy devices and devices from a new vendor) into a centrally controlled, existing SDN network as Software-defined Network Assimilation (SNA). Current SNA approaches are painstaking for network operations (NetOps) teams, because much expert effort is required to bridge the gap between the heterogeneous configuration models of the devices and the unified data model in the SDN controller. In this talk, I will present our recent effort to help NetOps accelerate this process. Our solution, NAssim, features a unified parser framework to parse diverse device user manuals into preliminary configuration models; a rigorous validator that confirms the correctness of the models via formal syntax analysis, model hierarchy validation, and empirical data validation; and a deep-learning-based mapping algorithm that uses state-of-the-art natural language processing techniques to produce human-comprehensible recommended mappings between the validated configuration model and the one in the SDN controller. In all, NAssim aims to liberate NetOps from the most tedious tasks by learning directly from devices' manuals to produce data models that are comprehensible to both the SDN controller and human experts.
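As a toy illustration of the mapping step (NAssim's actual matcher is a deep neural model; the lexical similarity measure and the field names below are our own stand-ins), a recommender can rank controller-model leaves against each vendor-model leaf:

```python
from difflib import SequenceMatcher

def recommend_mapping(vendor_leaves, controller_leaves, threshold=0.5):
    """For each vendor-model leaf, recommend the most similar controller-model
    leaf. Lexical similarity stands in for NAssim's neural matcher here."""
    mapping = {}
    for v in vendor_leaves:
        best, best_score = None, threshold
        for c in controller_leaves:
            s = SequenceMatcher(None, v.lower(), c.lower()).ratio()
            if s > best_score:
                best, best_score = c, s
        mapping[v] = best  # None if nothing clears the threshold
    return mapping

# Hypothetical leaf names from a vendor manual vs. the controller's data model.
vendor = ["vlan-id", "intf-mtu", "descr"]
controller = ["vlan_id", "mtu", "description"]
print(recommend_mapping(vendor, controller))
```

The real system validates candidate mappings against model hierarchy and empirical data before recommending them, which a purely lexical matcher cannot do.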

Speaker Bio: Huangxun Chen is a researcher at Huawei Hong Kong Research Centre, currently working on network configuration management and its intersections with machine learning. Dr. Chen received her Ph.D. in Computer Science and Engineering from the Hong Kong University of Science and Technology (HKUST) in 2020 and her B.S. degree from Shanghai Jiao Tong University (SJTU) in 2015. This work has been accepted to SIGCOMM'22 and will be released to the academic community.

Jinyang Li

Ph.D. candidate at the Institute of Computing Technology (ICT), Chinese Academy of Sciences

Paper Title:

LiveNet: A Low-Latency Video Transport Network for Large-Scale Live Streaming

Abstract: Low-latency live streaming imposes stringent latency requirements on video transport networks. The de facto hierarchical CDN structure falls short of meeting these requirements due to its pre-defined overlay paths for individual sessions. In this talk, I will present LiveNet, which builds on a flat CDN overlay with a centralized controller for global optimization. The global routing computation considers the overlay network status to compute the ‘shortest’ paths for pairs of nodes, while the path assignment decides the best paths for individual viewing sessions. LiveNet also adopts a novel transmission architecture that incorporates fast- and slow-path transmission with fine-grained control of video frames. Performance results from three years of operation at Alibaba demonstrate the effectiveness of LiveNet in improving CDN performance and QoE metrics. Specifically, compared with the hierarchical CDN deployment, LiveNet halves the CDN delay, ensures that 98% of views experience no stalls, and ensures that 95% start playback within 1 second.
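The global routing computation amounts to shortest-path search over a latency-weighted overlay graph. A minimal sketch (the overlay topology and latencies below are invented; LiveNet's production computation also accounts for capacity and session assignment):

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a weighted overlay {node: {neighbor: latency_ms}} --
    a simplified stand-in for the controller's global routing step."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Walk predecessors back from dst to recover the path.
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return list(reversed(path)), dist[dst]

overlay = {"A": {"B": 10, "C": 3}, "C": {"B": 4, "D": 8}, "B": {"D": 2}}
print(shortest_path(overlay, "A", "D"))  # (['A', 'C', 'B', 'D'], 9)
```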

Speaker Bio: Jinyang Li is a Ph.D. candidate at the Institute of Computing Technology (ICT), Chinese Academy of Sciences, advised by Prof. Zhenyu Li. He received his B.S. in computer science from Sichuan University (Elite Class) in 2017. His research mainly focuses on low-latency transmission. He has published papers in SIGCOMM/INFOCOM.

Rui Miao

Senior engineer/researcher at Alibaba

Paper Title:

𝜇Fabric: Predictable Virtual Fabric on Informative Data Plane

Abstract: In multi-tenant data centers, each tenant desires reassuring predictability from the virtual network fabric – bandwidth guarantees, work conservation, and bounded tail latency. Achieving these goals simultaneously relies on rapid and precise traffic admission. However, the slow convergence (tens of milliseconds) of prior works can hardly satisfy the increasingly rigorous performance demands under dynamic traffic patterns. Further, state-of-the-art load balancing schemes are all guarantee-agnostic and bring great risk of breaking bandwidth guarantees, an issue overlooked in prior works. In this talk, I will present 𝜇FAB, a predictable virtual fabric solution that can (1) explicitly select proper paths for all flows and (2) converge to the ideal bandwidth allocation at sub-millisecond timescales. The core idea of 𝜇FAB is to leverage the programmable data plane to build a fusion of an active edge (e.g., NIC) and an informative core (e.g., switch), where the core sends link status and tenant information to the edge via telemetry to help the latter make timely and accurate decisions on path selection and traffic admission. We fully implement 𝜇FAB with commodity SmartNICs and programmable switches. Evaluations show that 𝜇FAB maintains minimum bandwidth guarantees with high bandwidth utilization and near-optimal transmission latency in various network situations, with limited probing bandwidth overhead. Application-level experiments, e.g., in compute and storage scenarios, show that 𝜇FAB can improve QPS by 2.5% and cut tail latency by more than 21% compared to the alternatives.

Speaker Bio: Rui Miao is a senior engineer/researcher at Alibaba. His research focuses on building predictable and high-performance datacenter networks by leveraging the programmable data plane. His research has been deployed in AliCloud's network infrastructure, supporting core services such as storage and HPC/AI. His works have also been adopted by many silicon vendors in their latest products, including Intel/Barefoot, Nvidia/Mellanox, Broadcom, Cisco, Marvell, etc. Before joining Alibaba, he received his Ph.D. in CS from the University of Southern California under the guidance of Prof. Minlan Yu. He received his M.S. from Tsinghua University and B.S. from UESTC. He has more than 10 publications in SIGCOMM and NSDI.

Abstract: Real-time communication (RTC) applications like video conferencing and cloud gaming require consistently low latency to provide a seamless interactive experience. However, wireless networks, including WiFi and cellular, albeit providing satisfactory median latency, drastically degrade at the tail due to frequent and substantial wireless bandwidth fluctuations. We observe that the control loop for the sending rate of RTC applications is inflated when congestion happens at the wireless access point (AP), resulting in untimely rate adaptation to wireless dynamics. Existing solutions suffer from this inflated control loop and fail to quickly adapt to bandwidth fluctuations. In this talk, I will present Zhuge, a pure wireless-AP-based solution that shortens the control loop of RTC applications by separating congestion feedback from congested queues. We design a Fortune Teller to precisely estimate per-packet wireless latency upon a packet's arrival at the wireless AP. To make Zhuge deployable at scale, we also design a Feedback Updater that translates the estimated latency into comprehensible feedback messages for various protocols and immediately delivers them back to senders for rate adaptation. Trace-driven and real-world evaluations show that Zhuge reduces the ratio of large tail latency and RTC performance degradation by 17% to 95%.
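The core of the per-packet estimate is simple: an arriving packet's queuing delay is roughly the backlog ahead of it divided by the current wireless drain rate. A minimal sketch of that idea (function and parameter names are ours, not Zhuge's API):

```python
def estimate_latency_ms(queued_bytes, drain_rate_bps):
    """Estimate the queuing delay a packet will see on arrival at the AP:
    bytes already queued, divided by the current wireless drain rate."""
    if drain_rate_bps <= 0:
        return float("inf")
    return queued_bytes * 8 * 1000 / drain_rate_bps  # bytes -> bits -> ms

# The same 125 KB backlog looks fine at 10 Mbps but terrible after a
# bandwidth drop to 2 Mbps -- exactly the tail the AP must report early,
# instead of waiting for the sender's end-to-end feedback loop.
print(estimate_latency_ms(125_000, 10_000_000))  # 100.0
print(estimate_latency_ms(125_000, 2_000_000))   # 500.0
```

By computing this at the AP and feeding it back immediately, the sender learns about a bandwidth drop without waiting for the congested queue itself to deliver the signal.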

Speaker Bio: Zili is a 3rd-year Ph.D. student at Tsinghua University, advised by Prof. Mingwei Xu. He received his bachelor's degree in electronic engineering from Tsinghua University. His current research focuses on real-time communications. He has published several papers in SIGCOMM/NSDI and received the Gold Medal of the SIGCOMM 2018 SRC, a Microsoft Research Asia PhD Fellowship, and two best paper awards.

Abstract: Adaptive neural networks (NNs) have been used to optimize OS kernel datapath functions because they can achieve superior performance under changing environments. However, how to deploy these NNs remains a challenge. One approach is to deploy them in userspace, but such deployments suffer from either high cross-space communication overhead or low responsiveness, significantly compromising function performance. On the other hand, pure kernel-space deployments also incur a large performance degradation, because the computation logic of model-tuning algorithms is complicated and interferes with normal datapath execution. In this talk, I will present LiteFlow, a hybrid solution to build high-performance adaptive NNs for the kernel datapath. At its core, LiteFlow decouples the control path of adaptive NNs into (1) a kernel-space fast path for efficient inference execution and (2) a userspace slow path for efficient model tuning. We have implemented LiteFlow with the Linux kernel datapath and evaluated it with three real-world datapath functions: congestion control, flow scheduling, and load balancing. Compared to prior works, LiteFlow achieves 48.4% better goodput for congestion control, and improves the completion time of long flows by 33.7% and 56.7% for flow scheduling and load balancing, respectively.
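The fast-path/slow-path split can be sketched in userspace Python (class and method names are our own; LiteFlow itself lives in the Linux kernel and uses real NNs): inference always reads a frozen snapshot, while tuning produces new weights off the critical path and publishes them atomically.

```python
import threading

class HybridModel:
    """Toy analogue of LiteFlow's split: infer() is the 'fast path' and never
    blocks on tuning; publish() is the 'slow path' swapping in new weights."""
    def __init__(self, weights):
        self._snapshot = weights       # read-only from the fast path
        self._lock = threading.Lock()  # serializes publishers only

    def infer(self, x):
        w = self._snapshot             # single reference read; no lock taken
        return sum(wi * xi for wi, xi in zip(w, x))

    def publish(self, new_weights):
        with self._lock:
            self._snapshot = new_weights

model = HybridModel([0.5, 0.5])
print(model.infer([2.0, 4.0]))  # 3.0, using the old weights
model.publish([1.0, 0.0])      # slow path swaps in a tuned model
print(model.infer([2.0, 4.0]))  # 2.0, next inference sees the new weights
```

The design point this illustrates is that tuning cost is paid entirely outside the datapath; inference only ever pays for a snapshot read.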

Speaker Bio: Junxue Zhang is a Ph.D. candidate in computer science at iSINGLab, Hong Kong University of Science and Technology, under the supervision of Prof. Kai Chen. His research interests are data center networking, AI systems, and privacy-preserving computation. He is the co-founder/CTO of Clustar Technology Co., Ltd. Before joining HKUST, he received his BSc and MSc from Southeast University. He has published papers in SIGCOMM/CoNEXT.

Peng Zhang

Professor in the School of Computer Science and Technology, Xi’an Jiaotong University

Paper Title:

Symbolic Router Execution

Abstract: Network verifiers enable operators to proactively reason about a network’s forwarding behaviors to avoid potential problems. Existing verifiers target specific spaces and models: they efficiently cover the header space or the failure space, but not both; they assume deterministic failures to compute failure tolerance, or probabilistic failures to compute probabilities. As a result, no single verifier can support all analyses, which require different space coverage and failure models. In this talk, I will introduce Symbolic Router Execution (SRE), a general and scalable verification engine that supports various analyses. SRE symbolically executes the network model to discover what we call packet failure equivalence classes (PFECs), each of which characterizes a unique forwarding behavior across the product space of headers and failures. SRE enables various optimizations during symbolic execution while remaining agnostic to the failure model, so it scales to the product space in a general way. By using BDDs to encode symbolic headers and failures, various analyses reduce to graph algorithms (e.g., shortest path) on the BDDs. Our evaluation on real and synthetic topologies shows that SRE achieves better or comparable performance when checking properties, mining specifications, computing probabilities, etc., compared to state-of-the-art methods.
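The PFEC idea can be shown by brute force on a tiny example (the forwarding policy and link names below are invented): enumerate the header × failure product space and group points by the behavior they induce. SRE does this symbolically with BDDs, which is what makes it scale; the exhaustive version is only an illustration of what is being computed.

```python
from itertools import product

def discover_pfecs(headers, failure_scenarios, forward):
    """Group (header, failure) points by the forwarding behavior they induce.
    Each group is one packet failure equivalence class (PFEC)."""
    classes = {}
    for h, f in product(headers, failure_scenarios):
        classes.setdefault(forward(h, f), []).append((h, f))
    return classes

def forward(header, failed_links):
    # Hypothetical policy: 10.0/16 goes via l1, fails over to l2, else drops.
    if not header.startswith("10.0"):
        return "drop"
    if "l1" not in failed_links:
        return "via l1"
    if "l2" not in failed_links:
        return "via l2"
    return "drop"

headers = ["10.0.1.1", "192.168.0.1"]
failures = [frozenset(), frozenset({"l1"}), frozenset({"l1", "l2"})]
pfecs = discover_pfecs(headers, failures, forward)
print(sorted(pfecs))  # ['drop', 'via l1', 'via l2']
```

Once the classes exist, analyses like failure tolerance or failure probability become questions about which classes a property holds in, rather than about individual points.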

Speaker Bio: Peng Zhang is a professor in the School of Computer Science and Technology, Xi’an Jiaotong University. Before joining Xi’an Jiaotong University, he received his B.E. degree from Beijing University of Posts and Telecommunications in 2008 and his Ph.D. (with honors) from Tsinghua University in 2013, both in computer science. He was a visiting researcher at the Chinese University of Hong Kong (2009) and Yale University (2011-2012). His research interests include network verification, programmable networks, and network security. He has published papers at major networking conferences, including SIGCOMM and NSDI.

Yihao Zhao

Ph.D. student at Peking University

Paper Title:

Multi-Resource Interleaving for Deep Learning Training

Abstract: Training deep learning (DL) models requires multiple resource types, including CPUs, GPUs, storage IO, and network IO. Advancements in DL have produced a wide spectrum of models with diverse usage patterns across resource types. Existing DL schedulers focus only on GPU allocation and miss the opportunity to pack jobs along multiple resource types. We present Muri, a multi-resource cluster scheduler for DL workloads. Muri exploits multi-resource interleaving of DL training jobs to achieve high resource utilization and reduce job completion time (JCT). DL jobs have a unique staged, iterative computation pattern. In contrast to multi-resource schedulers for big-data workloads that pack jobs in the space dimension, Muri leverages this pattern to interleave jobs on the same set of resources in the time dimension. Muri adapts the Blossom algorithm to find an optimal grouping plan for single-GPU jobs with two resource types, and generalizes the algorithm to handle multi-GPU jobs with more than two types. We build a prototype of Muri and integrate it with PyTorch. Experiments on a cluster with 64 GPUs demonstrate that Muri improves the average JCT by up to 3.6x and the makespan by up to 1.6x over existing DL schedulers.
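The two-resource grouping step is a maximum-weight matching problem: pair jobs so that their resource demands interleave rather than collide. A toy version (the job utilizations and scoring function are invented, and we use exhaustive search in place of the Blossom algorithm, which solves the same matching problem at scale):

```python
# Per-job average utilization of two resource types (GPU, IO) -- made-up numbers.
UTIL = {"A": (0.9, 0.1), "B": (0.2, 0.8), "C": (0.8, 0.2), "D": (0.1, 0.9)}

def score(a, b):
    """Reward pairs whose demands interleave; penalize oversubscription."""
    ga, ia = UTIL[a]
    gb, ib = UTIL[b]
    return 2 - max(0, ga + gb - 1) - max(0, ia + ib - 1)

def best_pairing(jobs, score):
    """Exhaustive maximum-weight pairing (assumes an even number of jobs).
    Stands in for Blossom-based matching on this tiny instance."""
    if not jobs:
        return [], 0.0
    first, rest = jobs[0], jobs[1:]
    best_plan, best_total = None, float("-inf")
    for partner in rest:
        remaining = [j for j in rest if j != partner]
        plan, total = best_pairing(remaining, score)
        total += score(first, partner)
        if total > best_total:
            best_plan, best_total = [(first, partner)] + plan, total
    return best_plan, best_total

plan, total = best_pairing(["A", "B", "C", "D"], score)
print(plan)  # pairs the GPU-heavy jobs with the IO-heavy ones: A-D and B-C
```

Here the GPU-heavy jobs (A, C) each get an IO-heavy partner (D, B), so each pair can run one job's GPU stage while the other does IO.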

Speaker Bio: Yihao Zhao is a Ph.D. student in the School of Computer Science at Peking University, advised by Prof. Xuanzhe Liu and Prof. Xin Jin. His research is in computer systems and networks, with a focus on cloud computing. He received his BS in computer science from Peking University (Turing Class) in 2021.

Abstract: Network measurement is important to datacenter operators. Most existing efforts focus on developing new implementation schemes for measurement tasks; little attention has been paid to on-the-fly reconfiguration of measurement tasks. Due to resource constraints, it is usually impossible to configure all needed tasks at start-up and simply turn them on/off dynamically. To support real-time reconfiguration of a large number of different tasks, a key observation is that it is unnecessary to bind a task to its implementation. We design FlyMon, the first on-the-fly reconfiguration system that can accommodate a large number of measurement tasks. FlyMon realizes the Composable Measurement Unit (CMU), a general operation unit that supports reconfigurable implementations of various measurement tasks composed from different keys and attributes. FlyMon maps the CMU design to the data plane of programmable switches so that the number of compacted CMUs is maximized. FlyMon also provides dynamic memory management. We prototype FlyMon on Tofino and currently enable four frequently used attributes. Each CMU-Group (with 3 CMUs) can concurrently perform up to 96 measurement tasks with less than 8.3% of the hardware resources in Tofino. By cross-stacking, FlyMon can deploy up to 27 CMUs on one Tofino pipeline. Besides, network operators can dynamically deploy measurement tasks with configurable memory at the millisecond level.
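The key-and-attribute decoupling can be sketched in software (this is our simplification; real CMUs are hardware match-action units on Tofino, and the class and method names here are invented): a task is just a (key function, attribute function) pair bound at reconfiguration time, not a fixed implementation.

```python
class CMU:
    """Software sketch of a Composable Measurement Unit: one hash-indexed
    table whose meaning is set by the currently bound key/attribute pair."""
    def __init__(self, slots=1024):
        self.slots = slots
        self.table = [0] * slots
        self.key_fn = None
        self.attr_fn = None

    def reconfigure(self, key_fn, attr_fn):
        # On-the-fly task change: rebind key/attribute and reset state,
        # without touching the processing logic below.
        self.key_fn, self.attr_fn = key_fn, attr_fn
        self.table = [0] * self.slots

    def process(self, pkt):
        idx = hash(self.key_fn(pkt)) % self.slots
        self.table[idx] += self.attr_fn(pkt)

    def query(self, key):
        return self.table[hash(key) % self.slots]

cmu = CMU()
# Task 1: per-source byte counts.
cmu.reconfigure(key_fn=lambda p: p["src"], attr_fn=lambda p: p["len"])
cmu.process({"src": "10.0.0.1", "dst": "10.0.0.2", "len": 1500})
cmu.process({"src": "10.0.0.1", "dst": "10.0.0.3", "len": 500})
print(cmu.query("10.0.0.1"))  # 2000
# Task 2, same unit: per-flow packet counts.
cmu.reconfigure(key_fn=lambda p: (p["src"], p["dst"]), attr_fn=lambda p: 1)
```

Because the unit's logic never changes, retargeting it to a new task is a small configuration update, which is what makes millisecond-level redeployment plausible.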

Speaker Bio: Hao Zheng received his B.S. degree from the Department of Software Engineering, Southeast University, China, in 2020. He is working toward a Ph.D. in the Department of Computer Science and Technology at Nanjing University, China. His research interests include software-defined networking and network measurement.