1st Workshop on Memory-Semantic Networking for AI-Scale Systems (MemNetAI)
The rapid scaling of foundation model training and AI inference is reshaping data center architectures. AI systems are increasingly limited not only by compute or packet-network bandwidth but also by memory capacity, memory bandwidth, and tail-latency sensitivity, making memory a primary bottleneck.
Modern AI servers span multiple communication tiers: accelerator-scale interconnects (e.g., NVLink-class fabrics) provide high-bandwidth intra-node communication, while cluster-wide coordination relies on RDMA over InfiniBand or Ethernet. Emerging technologies such as silicon photonics and CXL introduce switched, memory-semantic fabrics that enable rack-scale memory pooling and load/store communication across hosts. While server CPUs support CXL-based memory expansion, coherent memory-fabric integration across heterogeneous AI accelerators remains limited. This gap complicates the design of scale-up memory architectures and motivates research into interoperable memory-fabric designs.
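To make the contrast concrete, here is a minimal, runnable sketch of memory-semantic (load/store) access versus message-semantic communication. An anonymous mmap'd buffer stands in for fabric-attached memory, and the region size and offset are made up for illustration; a real CXL pool would instead be mapped through the OS (for example, as a DAX device).

```python
import mmap
import socket
import struct

# Stand-in for a byte-addressable, fabric-attached memory region.
# (Illustrative only: a real CXL pool would be mapped via the OS,
# not created as an anonymous mapping.)
REGION_SIZE = 1 << 20   # 1 MiB, hypothetical
OFFSET = 0x100          # hypothetical offset of a shared counter
region = mmap.mmap(-1, REGION_SIZE)

# Memory semantics: plain stores and loads at a byte offset.
# There is no explicit message and no receiver-side action.
struct.pack_into("<Q", region, OFFSET, 42)          # store
value, = struct.unpack_from("<Q", region, OFFSET)   # load
print(f"load/store path: read {value} at offset {OFFSET:#x}")

# Message semantics: the same value must be serialized, sent,
# and explicitly received on the other side.
a, b = socket.socketpair()
a.send(struct.pack("<Q", 42))               # explicit send
value, = struct.unpack("<Q", b.recv(8))     # explicit receive
print(f"message path:    received {value}")

a.close(); b.close(); region.close()
```

This operational difference is what makes memory traffic network-sensitive: a load can stall a core or accelerator pipeline, so switch-induced queueing is felt as memory latency rather than as message latency.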
This evolution creates a new architectural tier between backplane interconnects and data center networks. Memory traffic now traverses switched infrastructure and experiences network-like effects (congestion, contention, unfairness, and failures), making memory performance dependent on scheduling and congestion control.
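A toy queueing model illustrates the point; this is a sketch, not a calibrated simulator, and the service time, utilization, and burst size below are arbitrary assumptions. It pushes the same offered load through a single FIFO "switch port" with smooth versus synchronized-burst arrivals and compares 99th-percentile queueing delay.

```python
import random

def fifo_delays(arrivals, service_time):
    """Queueing delay of each request at a single FIFO output port."""
    free_at, delays = 0.0, []
    for t in arrivals:               # arrivals must be time-ordered
        start = max(t, free_at)      # wait while the port is busy
        delays.append(start - t)
        free_at = start + service_time
    return delays

def p99(xs):
    return sorted(xs)[int(0.99 * len(xs)) - 1]

random.seed(0)
SERVICE = 1.0            # hypothetical service time per request
N, LOAD = 100_000, 0.8   # same 80% utilization in both scenarios

# Smooth Poisson arrivals at rate LOAD / SERVICE.
t, smooth = 0.0, []
for _ in range(N):
    t += random.expovariate(LOAD / SERVICE)
    smooth.append(t)

# Same average rate, but arrivals come in synchronized bursts,
# mimicking bulk-synchronous training steps.
BURST = 64
t, bursty = 0.0, []
for _ in range(N // BURST):
    t += BURST * SERVICE / LOAD      # gap chosen to keep load equal
    bursty.extend([t] * BURST)

print(f"p99 delay, smooth arrivals: {p99(fifo_delays(smooth, SERVICE)):6.1f}")
print(f"p99 delay, bursty arrivals: {p99(fifo_delays(bursty, SERVICE)):6.1f}")
```

At equal average load, the bursty pattern inflates tail delay by roughly the burst size, which is the kind of tail-latency amplification the topics below target.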
AI workloads amplify these challenges: training induces bursty synchronization and memory amplification; inference requires distributed KV-cache capacity under strict tail-latency constraints; and mixture-of-experts models create skewed, dynamic access patterns. As accelerator, memory, storage, and packet-network tiers increasingly interact, performance becomes tightly coupled across layers. Decisions in one tier, such as congestion control, flow scheduling, or memory placement, can cascade into latency amplification, throughput degradation, or instability in another.
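The KV-cache pressure is easy to quantify with back-of-the-envelope arithmetic. The sketch below uses hypothetical model dimensions (layers, heads, context length, batch size) chosen only for illustration, not taken from any particular system.

```python
# Back-of-the-envelope KV-cache sizing for a hypothetical dense
# transformer; every dimension here is an illustrative assumption.
layers      = 80        # transformer layers
kv_heads    = 8         # KV heads (grouped-query attention)
head_dim    = 128       # dimension per head
dtype_bytes = 2         # fp16 / bf16
ctx_len     = 128_000   # context tokens per request
batch       = 256       # concurrent requests per serving replica

# Each layer caches one K and one V vector per KV head per token.
per_token   = 2 * layers * kv_heads * head_dim * dtype_bytes
per_request = per_token * ctx_len
per_replica = per_request * batch

GiB = 1 << 30
print(f"KV cache per token:   {per_token / 1024:6.1f} KiB")
print(f"KV cache per request: {per_request / GiB:6.1f} GiB")
print(f"KV cache per replica: {per_replica / GiB:8,.0f} GiB")
```

At these assumed dimensions, a single replica's cache runs to several terabytes, far beyond the HBM of any one accelerator, which is precisely what pushes KV state onto pooled, fabric-attached memory with strict tail-latency requirements.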
Yet the networking community lacks a unified abstraction for memory-semantic fabrics, principled congestion models for load/store traffic over switched infrastructure, and comprehensive simulation and benchmarking methodologies tailored to AI-scale systems.
MemNetAI aims to define the networking principles of memory-semantic fabrics in AI-scale environments and to build a research community around this emerging frontier.
Topics of interest include, but are not limited to:
A. Scale-Up and Cross-Tier Fabric Interaction (Primary Focus)
- Interaction between accelerator-scale fabrics and rack-scale memory-semantic fabrics
- Congestion propagation and feedback across scale-up, memory, and network tiers
- Congestion control and fairness for load/store traffic
- Tail-latency amplification across fabric boundaries
- Coherence and consistency across heterogeneous fabric domains
B. Memory-Semantic Fabric Design
- Switched load/store fabrics and rack-scale memory pooling
- Credit-based flow control and congestion management
- Multi-tier memory fabrics extending beyond a single rack
- Failure domains, resilience, and recovery in pooled memory systems
- Telemetry and observability for memory-semantic fabrics
- Optical and photonic extensions of memory fabrics
- Compute-enabled memory fabrics and near-memory compute
C. AI-Driven Memory and Tiered Architectures
- Distributed KV-cache architectures
- Memory amplification and burst dynamics in large-scale training
- Elastic and disaggregated memory provisioning
- Tiered memory systems, including interaction with disaggregated storage tiers in AI-scale systems
D. Scheduling and Cross-Layer Control
- Memory placement and migration policies
- Coupling between cluster schedulers and fabric congestion
- Cross-layer coordination across compute, memory, storage, and network tiers
E. Simulation, Metrics, and Benchmark Standardization
- Load/store-aware simulation frameworks
- Cross-layer modeling of scale-up and memory fabrics
- Trace-driven AI workload generation
- Metrics for congestion sensitivity, fairness, and iteration-time impact (a minimal fairness-metric sketch follows this list)
- Benchmark standardization efforts, including workload suites and reproducible evaluation methodologies for memory-semantic networking
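As a small example of the fairness metrics in scope, the sketch below computes Jain's fairness index over per-tenant memory-bandwidth shares; the tenant bandwidth values are hypothetical.

```python
def jains_index(shares):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).
    Equals 1.0 for perfectly equal shares and 1/n when one
    tenant receives everything."""
    n = len(shares)
    total = sum(shares)
    sum_sq = sum(x * x for x in shares)
    return (total * total) / (n * sum_sq) if sum_sq else 1.0

# Hypothetical per-tenant memory bandwidth (GB/s) at a shared port.
equal  = [25, 25, 25, 25]
skewed = [70, 20, 5, 5]
print(f"equal shares:  {jains_index(equal):.3f}")   # -> 1.000
print(f"skewed shares: {jains_index(skewed):.3f}")  # -> ~0.467
```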
Out of scope:
- Pure DRAM device design
- GPU microarchitecture without fabric implications
- AI model optimization without networking relevance
- Storage-only disaggregation without memory semantics
We invite researchers and practitioners to submit original research papers, including position papers on disruptive ideas and early-stage work with the potential to grow into full papers.
Reviewing will be double-blind. Authors must make a good-faith effort to anonymize their submissions. Papers must not include author names or affiliations and must avoid implicitly disclosing the authors’ identity (e.g., via self-citations or funding acknowledgments).
We accept two types of submissions:
Regular research papers of up to 6 pages, excluding references and appendices. Submissions must be original, unpublished work, and not under consideration at another conference or journal. Authors of accepted submissions are expected to present their work at the workshop. Accepted submissions will be included in the workshop proceedings.
Extended abstracts of up to 2 pages, excluding references, in the same format as regular papers. This track is for early-stage work and position papers still in progress, allowing authors to showcase preliminary ideas and receive early feedback at the workshop. Authors are expected to present their work as a lightning talk and/or a poster during the workshop. Authors of accepted extended abstracts may opt out of inclusion in the workshop proceedings.
Please submit your paper via https://memnetai26.hotcrp.com
Submissions must be printable PDF files. When creating your submission, you must use the sigconf proceedings template (two-column format, 10-pt font) available on the official ACM site. LaTeX submissions should use the acmart.cls template with the sigconf option and 10-pt font.
| Milestone | Date |
|---|---|
| Submission deadline | April 27, 2026 |
| Acceptance notification | May 24, 2026 |
| Camera-ready deadline | June 20, 2026 |
| Workshop date | August 17, 2026 |
| Organizers | Institution |
|---|---|
| Rinku Shah | IIIT Delhi |
| Praveen Tammana | IIT Hyderabad |
| Abed Mohammad Kamaluddin | Marvell Technology |
| Satananda Burla | Marvell Technology |
| Technical Program Committee | Institution |
|---|---|
| Abed Mohammad Kamaluddin | Marvell, India |
| Abhijit Das | IIT Hyderabad |
| Arnab Kumar Paul | BITS Goa |
| Divyanshu Saxena | UT Austin |
| Jinsun Yoo | Georgia Institute of Technology |
| Michal Kalderon | Marvell, Israel |
| Mythili Vutukuru | IIT Bombay |
| Nathan Tallent | PNNL |
| Praveen Tammana | IIT Hyderabad |
| Priyanka Naik | IBM Research, India |
| Purushottam Kulkarni (Puru) | IIT Bombay |
| Ravichandra Mynidi | Marvell, India |
| Rinku Shah | IIIT Delhi |
| Satananda Burla | Marvell, USA |
| Sathya Peri | IIT Hyderabad |
| Senad Durakovic | Marvell, USA |
| Ulf Hanebutte | Marvell, USA |
| Annus Zulfiqar | University of Michigan, USA |
| Muhammad Shahbaz | University of Michigan, USA |
| Rip Sohan | AMD, USA |
| Prankur Gupta | Meta, USA |