1st Workshop on Memory, Systems and Interconnect Co-design for AI (MOSAIC '26)
The era of trillion-parameter AI models has placed unprecedented pressure on traditional server architectures. While compute logic has scaled rapidly, the ability to feed data to these engines is hitting a critical barrier: the "memory and communication wall." As AI clusters scale to tens of thousands of XPUs, traditional networking stacks struggle to deliver predictable latency, to meet the sheer bandwidth requirements of synchronous training collectives (AllReduce, AllToAll, etc.) and long-context inference (KV-cache transfer and offloading), and to keep the energy cost of data movement (pJ/bit) in check.
The community faces open questions: How can we address the limitations of copper interconnects? What is the optimal combination of interconnect technology and topology for massive scale-up networks? Does the answer lie in tighter integration, or in radical disaggregation?
This workshop aims to bring together academic researchers and industry practitioners to collectively define the future of AI networking infrastructure. We aim to bridge the gap between theoretical exploration and real-world deployment, inviting discussions on paradigms ranging from node-centric optimizations to emerging fabric-centric designs.
We encourage submissions on topics including, but not limited to:
- **Next-Generation Interconnects & Physical Fabrics**
  - Scale-Up to Scale-Out Integration: Bridging the gap between high-bandwidth intra-pod fabrics (e.g., NVLink, UALink) and datacenter scale-out networks (Ethernet, InfiniBand) for massive AI clusters through innovative fabric design.
  - Optical Interconnects & Photonics: Applying silicon photonics, co-packaged optics (CPO), or optical circuit switching (OCS) to enable scale-up datacenter networks and optically connected disaggregated memory, and to break bandwidth/energy barriers.
  - Energy-Efficient Data Movement: Innovations in system topology, link-level technologies, and routing to minimize the energy cost (pJ/bit) of data transfer during massive-scale training and inference.
  - Rack-Scale Topologies: Comparative analysis of monolithic versus disaggregated system designs, evaluating thermal, power, and throughput trade-offs.
- **Memory Tiering & Disaggregation for AI**
  - Fabric-Attached, Pooled, or Shared Memory: Architectures, protocols, and coherence mechanisms for CXL- and UCIe-based memory expansion, targeting AI challenges such as KV-cache offloading.
  - Heterogeneous Memory Hierarchies: Architectural exploration combining high-bandwidth memory (HBM) and high-capacity memory technologies, including hardware or software mechanisms for transparent tiering and migration.
  - Emerging Memory Technologies: Integration of novel memory devices, 3D stacking, and near-data processing into the AI memory hierarchy to alleviate bandwidth walls.
- **Hardware-Software Co-Design & AI Workloads**
  - Full-Stack System Optimization: Co-design across the OS, compiler, and runtime stack to expose heterogeneous memory tiers and new interconnect topologies to modern AI serving frameworks (e.g., PyTorch, vLLM).
  - AI Workload Characterization: In-depth benchmarking of foundation models, including Mixture-of-Experts (MoE) and long-context LLM architectures, on emerging hardware, identifying critical bottlenecks in capacity, bandwidth, and serving latency.
  - Model Design Beyond Hardware Limits: Investigating future AI model architectures under the assumption of unconstrained memory capacity and interconnect bandwidth. This topic aims to decouple algorithmic innovation from current hardware constraints, identifying models that could emerge if the memory and communication walls were removed.
We invite researchers and industry practitioners to submit research papers, whitepapers, and position papers that address critical memory, systems, and interconnect challenges in AI systems. We welcome diverse contributions, ranging from early-stage ideas and novel architectural concepts to rigorous simulation- or emulation-based studies. To bridge the gap between theory and practice, we also solicit industry perspectives on new standards, product designs, and emerging system challenges, including retrospectives on the limitations of current hardware. Furthermore, we encourage cross-layer studies on software-stack optimizations, spanning compilers, operating systems, and runtimes, that are necessary to tackle infrastructure bottlenecks and support new memory and interconnect designs.
Submissions must be original, unpublished work that is not under consideration at another conference or journal. We will accept:
- Research papers: at most six (6) pages, including all figures, tables, references, and appendices, in two-column 10 pt ACM format.
- Whitepapers/position papers: same format as research papers, but at most two (2) pages, with one additional page for references only.
Please submit your paper via HotCRP: https://mosaic26.hotcrp.com
| Important Dates | |
|---|---|
| Submission deadline | May 8, 2026, AoE |
| Acceptance notification | Jun 5, 2026, AoE |
| Camera-ready deadline | Jun 20, 2026, AoE |
| Workshop date | Aug 17, 2026 |
| Organizers | Affiliations |
|---|---|
| Yash Nishant | Marvell |
| Ravi Mahatme | Marvell |
| Jing Ding | Marvell |
| Chandrish Ambati | Marvell |

| Technical Program Committee | Affiliations |
|---|---|
| John Paul Shen | CMU |
| Trung Diep | Marvell |
| Martinus Bos | Marvell |
| Ganesh Balamurugan | Marvell |
| Mathieu Le Goc | Marvell |
Contact: ynishant@marvell.com, rmahatme@marvell.com, jingd@marvell.com, cambati@marvell.com