ACM SIGCOMM 2023, New York City, USA

ACM SIGCOMM Workshop on Emerging Multimedia Systems (EMS)

Workshop Program

  • Sunday, September 10, 2023

  • 8:50am–9:00am      Opening Note

  • 9:00am–10:00am      Keynote 1

  • Title: Video Content Delivery: The First Quarter Century and Beyond

    Speaker: Dr. Ramesh Sitaraman, Distinguished University Professor, UMass Amherst and Chief Consulting Scientist, Akamai Technologies

    • Abstract: As the video content delivery network turns 25, we look back at its birth in the late 1990s and its subsequent evolution that fundamentally reshaped the Internet. The need to deliver high-performance videos at scale catalyzed the deployment of the edge, setting the stage for cloud and edge computing services. The challenge of monetizing online videos led to major innovations in the measurement and optimization of video QoE and the design of cost-optimized video delivery systems. Peeking into the future, the quest to deliver immersive and interactive media experiences over the Internet is an unsolved challenge that drives much of the current research. Further, the rapid growth in the carbon footprint of video delivery requires us to rethink the basics of video delivery with carbon as a first principle.

      Biography: Ramesh Sitaraman is a Distinguished University Professor at the University of Massachusetts at Amherst. He is best known for his role in pioneering content delivery and edge computing services that currently deliver much of the world’s web content, streaming videos, and online applications. As a principal architect at Akamai, he helped create the world’s first major content delivery networks (CDNs). He retains a part-time role as Akamai’s Chief Consulting Scientist.
      Prof. Sitaraman is a Fellow of the ACM and the IEEE. He is a recipient of the inaugural ACM SIGCOMM Networking Systems Award for his work on the Akamai CDN, the IEEE William R. Bennett Prize for his work on adaptive bitrate (ABR) algorithms for video streaming that are widely used in practice, and an Excellence in DASH award for his contributions to the MPEG-DASH standard for video streaming on the Internet. He is also a recipient of the Distinguished Teaching Award (DTA), his university’s highest recognition of teaching. He received a B. Tech. in electrical engineering from the Indian Institute of Technology, Madras, and a Ph.D. in computer science from Princeton University.


  • 10:00am–10:30am      Break

  • 10:30am–11:45am      Technical Paper Session 1

  • Resource-Efficient and Privacy-Preserving Edge for Augmented Reality

    Tian Guo (Worcester Polytechnic Institute)

  • The Power of Asynchronous SLAM in Multi-User AR over Cellular Networks: A Measurement Study

    Yuting Guo, Sizhe Wang, Moinak Ghoshal (Northeastern University); Y. Charlie Hu (Purdue University); Dimitrios Koutsonikolas (Northeastern University)

  • Optimizing Real-Time Video Experience with Data Scalable Codec

    Hanchen Li, Yihua Cheng, Ziyi Zhang (University of Chicago); Qizheng Zhang (Stanford University); Anton Arapin, Nick Feamster (University of Chicago); Amrita Mazumdar (Nvidia)

  • Learning-based Homography Matrix Optimization for Dual-fisheye Video Stitching

    Mufeng Zhu, Yang Sui, Bo Yuan, Yao Liu (Rutgers University)

  • Mobile Volumetric Video Streaming System through Implicit Neural Representation

    Junhua Liu (FNii, CUHK-Shenzhen, Sensetime Research); Yuanyuan Wang (Sensetime Research); Yan Wang (Institute for AI Industry Research (AIR), Tsinghua University); Yufeng Wang (Tsinghua University); Shuguang Cui, Fangxin Wang (SSE, CUHK-Shenzhen, FNii, CUHK-Shenzhen)

  • 11:45am–12:30pm      Pitch Your Lab Session

  • 12:30pm–2:00pm      Lunch

  • 2:00pm–3:00pm      Keynote 2

  • Title: Practical and Robust Neural Compression for Video Conferencing

    Speaker: Dr. Mohammad Alizadeh, Associate Professor, Department of Electrical Engineering and Computer Science, MIT

    • Abstract: Video conferencing systems suffer from poor user experience when network conditions deteriorate because current video codecs cannot operate at low bitrates or handle significant packet loss. These problems are particularly acute in regions with lower broadband penetration and disproportionately impact people in lower income brackets. Recently, several deep learning methods have been proposed that reconstruct talking head videos at very low bitrates using sparse representations of each frame, such as facial landmark information. However, although these approaches work well in ideal conditions, they fail in many realistic scenarios, such as scenes with large movements or occlusions, and they do not scale to high resolutions. In this talk, I will present Gemino, a neural compression system for video conferencing based on a novel high-frequency-conditional super-resolution technique. Gemino upsamples a very low-resolution version of each target frame while enhancing high-frequency details (e.g., skin texture, hair, etc.) based on information extracted from a high-resolution reference image. Gemino uses a multi-scale architecture that runs different components of the model at different resolutions, allowing it to scale to resolutions comparable to 720p, and it personalizes the model to learn specific details of each person, achieving much better fidelity at low bitrates. Our prototype system atop an open-source WebRTC implementation achieves ~2.2–5x lower bitrate than traditional codecs for the same perceptual quality and can operate on 1024 x 1024 video in real time on a Titan X GPU. If time permits, I will also discuss recent work on Reparo, a loss-resilient codec for video conferencing that uses generative deep learning models to generate missing information when a frame or part of a frame is lost.

      Biography: Mohammad Alizadeh is an Associate Professor of Computer Science at the Massachusetts Institute of Technology. His research interests are in the areas of computer networks, systems, and applied machine learning. His current research focuses on machine learning for systems, and network protocols and algorithms for a broad range of applications, including Internet video delivery, cloud computing, and blockchain systems. Mohammad's research on datacenter networks has led to protocols now implemented in Linux and Windows and commercial switching products, and deployed by large network operators. Mohammad earned his MS and PhD in Electrical Engineering from Stanford University, and his BS from the Sharif University of Technology. He is a recipient of several awards, including the ACM Grace Murray Hopper Award, Microsoft Research Faculty Fellowship, VMware Systems Research Award, SIGCOMM Rising Star Award, NSF CAREER Award, Alfred P. Sloan Research Fellowship, SIGCOMM Test of Time Award, and multiple best paper awards.


  • 3:00pm–3:30pm      Break

  • 3:30pm–4:30pm      Technical Paper Session 2

  • Understanding the Impact of Wi-Fi Configuration on Volumetric Video Streaming Applications

    Umakant Kulkarni (Hewlett Packard Labs and Purdue University); Khaled Diab, Shivang Aggarwal, Lianjie Cao, Faraz Ahmed, Puneet Sharma (Hewlett Packard Labs); Sonia Fahmy (Purdue University)

  • Text-to-3D Generative AI on Mobile Devices: Measurements and Optimizations

    Xuechen Zhang (University of California, Riverside); Zheng Li, Samet Oymak, Jiasi Chen (University of Michigan)

  • RTCSR: Zero-latency Aware Super-resolution for WebRTC Mobile Video Streaming

    Qian Yu (Southern University of Science and Technology); Qing Li (Peng Cheng Laboratory, Shenzhen, China); Rui He (Southern University of Science and Technology); Wanxin Shi (Tsinghua University International Graduate School); Yong Jiang (Tsinghua Shenzhen International Graduate School)

  • LiveAE: Attention-based and Edge-assisted Viewport Prediction for Live 360° Video Streaming

    Zipeng Pan (Communication University of China); Yuan Zhang, Tao Lin, Jinyao Yan (State Key Laboratory of Media Convergence and Communication, Communication University of China)

  • 4:30pm–4:40pm      Concluding Remarks

Call for Papers

Multimedia has played a significant role in driving Internet usage and has led to a range of technological advancements such as content delivery networks, compression algorithms, and streaming protocols. With emerging applications including (but not limited to) augmented and virtual reality (AR/VR), real-time conferencing, AI-generated content, and video analytics, multimedia is fundamentally reshaping how experiences are shared online and continues to drive the future of the Internet. Techniques developed for traditional video streaming must be revisited in light of these next-generation immersive technologies, which demand new ways to optimize and innovate in emerging multimedia systems. This workshop will bring together multimedia experts to exchange ideas on challenges and opportunities in designing networked systems for emerging multimedia technologies.

Topics of interest include (but are not limited to) the following:

  • Networked systems for immersive content capture, streaming, and display
  • Networked systems for AI-driven video applications
  • Networked systems for multimedia generative AI
  • Machine learning for emerging multimedia distribution
  • Ultra-low-latency networking for AR/VR applications
  • High-throughput transport and distribution for emerging media
  • Adaptive streaming under network/user constraints for immersive media
  • Novel content distribution networks for AR/VR applications
  • Management of AR/VR networked systems
  • Wireless and mobile immersive systems
  • AR/VR applications in 5G & 6G wireless networks
  • Compression and transmission design for 3D content
  • Edge cloud systems for immersive experiences
  • Quality of Experience for emerging multimedia
  • Security and privacy in AR/VR applications

Submission Instructions

Submissions must be original, unpublished work not under submission to other venues. Submitted papers must be at most six pages long, including all figures, tables, references, and appendices, in two-column 10pt ACM format. We also encourage research demos, for which a two-page extended abstract must be submitted in the same format as the workshop papers. All submissions are double-blind.

Please submit your paper via

If you have any questions, please contact Mallesham Dasari at

Important Dates

  • June 19, 2023, 23:59 AoE

    Submission deadline

  • July 02, 2023

    Acceptance notification

  • July 16, 2023

    Camera-ready deadline

  • September 10, 2023

    Workshop


  • Workshop Chairs
  • Mallesham Dasari

    Carnegie Mellon University

  • Junchen Jiang

    University of Chicago

  • Maria Gorlatova

    Duke University

  • Technical Program Committee (TPC) Members
  • Klara Nahrstedt


  • Feng Qian


  • Robert LiKamWa


  • Maria Gorlatova (co-chair)


  • Zili Meng


  • Wei Cai


  • Mallesham Dasari (co-chair)


  • Christian Timmerer


  • Wei Tsang Ooi


  • Amrita Mazumdar


  • Junchen Jiang (co-chair)


  • Sandip Chakraborty


  • Jiasi Chen


  • Yao Liu


  • Mario Montagud


  • Pengyuan Zhou


  • Bo Han


  • Carlee Joe-Wong