Conference Program
Please note that the schedule below is tentative and subject to updates. You may also subscribe to the SIGCOMM conference schedule on Google Calendar.
-
Room: Agata
- Topic Preview 1
-
Topic Preview 1 will take place in the Agata room. The topics covered will be Scheduling, Software-Defined Networking & Network Function Virtualization, and Datacenter Networking.
-
7:00pm - 9:00pm Welcome Reception
-
The Welcome Reception will take place at the Oceania Convention Center. Drinks and hors d'oeuvres will be served. Please visit the Social Events page for further information.
-
8:00pm - 10:30pm N2Women Dinner
-
SIGCOMM'16 is experimenting with a new format for the N2Women event, which will be held over dinner at the Canto do Mar Ingleses Restaurant. Check out the Social Events page for further information, and please RSVP if you plan to attend the dinner.
-
8:00pm - 11:00pm Award Dinner
-
Participation in this event is by invitation only.
-
8:45am - 10:30am Opening Session and Keynote
Session Chairs: Marinho Barcellos (UFRGS), Jon Crowcroft (University of Cambridge)
Room: Diamante
-
Keynote: Networking Research, Education, Mentoring and Service: Ten Insights
Jim Kurose (University of Massachusetts, Amherst), 2016 ACM SIGCOMM Lifetime Achievement Award Recipient
Bio: Jim Kurose received a B.A. degree in physics from Wesleyan University and his Ph.D. degree in computer science from Columbia University. He is currently a Distinguished University Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He has been a Visiting Scientist at IBM Research, INRIA, Institut EURECOM, the University of Paris, the Laboratory for Information, Network and Communication Sciences, and Technicolor Research Labs.
His research interests include network protocols and architecture, network measurement, sensor networks, and multimedia communication. Dr. Kurose has served as Editor-in-Chief of the IEEE Transactions on Communications and was the founding Editor-in-Chief of the IEEE/ACM Transactions on Networking. He has been Technical Program Co-Chair for the IEEE INFOCOM, ACM SIGCOMM, ACM SIGMETRICS, and ACM Internet Measurement conferences. He has won several conference best paper awards and received the IEEE INFOCOM Achievement Award and the ACM SIGCOMM Test of Time Award.
He has received a number of awards for his teaching, including outstanding teacher awards from the National Technological University, the UMass College of Natural Science and Mathematics, and the Northeast Association of Graduate Schools, as well as the IEEE Taylor Booth Education Medal. He has recently served on the Board of Directors of the Computing Research Association, and on the scientific advisory boards of IMDEA Networks in Madrid and the Laboratory for Information, Network and Communication Sciences (LINCS) in Paris. With Keith Ross, he is the co-author of the textbook Computer Networking: A Top-Down Approach (7th edition), published by Addison-Wesley/Pearson. He is a Fellow of the ACM and the IEEE.
Since January 2015, he has been serving as an Assistant Director of the US National Science Foundation, where he leads the Directorate for Computer and Information Science and Engineering (CISE) in its mission to support fundamental research in computer and information science and engineering and transformative advances in cyberinfrastructure. Dr. Kurose oversees the CISE budget of more than $900 million. He also serves as co-chair of the Networking and Information Technology Research and Development (NITRD) Subcommittee of the National Science and Technology Council Committee on Technology, helping coordinate the activities of 17 government agencies.
-
10:30am - 11:00am Coffee Break
-
11:00am - 12:40pm Session 1 - SDN & NFV I
Session Chair: Nate Foster (Cornell University)
Room: Diamante
-
Bojie Li (USTC / Microsoft Research), Kun Tan (Microsoft Research), Layong (Larry) Luo (Microsoft), Yanqing Peng (SJTU / Microsoft Research), Renqian Luo (USTC / Microsoft Research), Ningyi Xu (Microsoft Research), Yongqiang Xiong (Microsoft Research), Peng Cheng (Microsoft Research), Enhong Chen (USTC)
Abstract: Highly flexible software network functions (NFs) are crucial components to enable multi-tenancy in the clouds. However, software packet processing on a commodity server has limited capacity and induces high latency. While software NFs could scale out using more servers, doing so adds significant cost. This paper focuses on accelerating NFs with programmable hardware, i.e., FPGA, which is now a mature technology and inexpensive for datacenters. However, FPGA is predominately programmed using low-level hardware description languages (HDLs), which are hard to code and difficult to debug. More importantly, HDLs are almost inaccessible for most software programmers. This paper presents ClickNP, an FPGA-accelerated platform for highly flexible and high-performance NFs with commodity servers. ClickNP is highly flexible as it is completely programmable using high-level C-like languages, and exposes a modular programming abstraction that resembles the Click Modular Router. ClickNP is also high performance. Our prototype NFs show that they can process traffic at up to 200 million packets per second with ultra-low latency (< 2 μs). Compared to existing software counterparts, with FPGA, ClickNP improves throughput by 10x, while reducing latency by 10x. To the best of our knowledge, ClickNP is the first FPGA-accelerated platform for NFs, written completely in a high-level language and achieving 40 Gbps line rate at any packet size.
-
Anirudh Sivaraman (MIT CSAIL), Alvin Cheung (University of Washington, Seattle), Mihai Budiu (VMWare Research), Changhoon Kim (Barefoot Networks), Mohammad Alizadeh (MIT CSAIL), Hari Balakrishnan (MIT CSAIL), George Varghese (Microsoft Research), Nick McKeown (Stanford University), Steve Licking (Barefoot Networks)
Abstract: Many algorithms for congestion control, scheduling, network measurement, active queue management, and traffic engineering require custom processing of packets in the data plane of a network switch. To run at line rate, these data-plane algorithms must be implemented in hardware. With today's switch hardware, algorithms cannot be changed, nor new algorithms installed, after a switch has been built.
This paper shows how to program data-plane algorithms in a high-level language and compile those programs into low-level microcode that can run on emerging programmable line-rate switching chips. The key challenge is that many data-plane algorithms create and modify algorithmic state. To achieve line-rate programmability for stateful algorithms, we introduce the notion of a packet transaction: a sequential packet-processing code block that is atomic and isolated from other such code blocks.
We have developed this idea in Domino, a C-like imperative language to express data-plane algorithms. We show with many examples that Domino provides a convenient way to express sophisticated data-plane algorithms, and show that these algorithms can be run at line rate with modest estimated chip-area overhead.
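To give a feel for the abstraction, here is a minimal sketch of a packet transaction in Python. Domino itself is a C-like language, and the flowlet-switching example below, its field names, and its constants are illustrative stand-ins, not Domino syntax:

    FLOWLET_GAP = 0.005        # idle gap (s) that starts a new flowlet; illustrative
    last_time = {}             # per-flow state: arrival time of the previous packet
    saved_hop = {}             # per-flow state: next hop chosen for the current flowlet

    def flowlet_transaction(pkt):
        """Toy packet transaction: runs atomically, start to finish, per packet.

        Domino's compiler would map this read-modify-write state onto switch
        hardware; here we only model the sequential semantics in software.
        """
        flow = (pkt["src"], pkt["dst"])
        now = pkt["arrival"]
        prev = last_time.get(flow)
        if prev is None or now - prev > FLOWLET_GAP:
            # New flowlet: pick a (pseudo-random) next hop and remember it.
            saved_hop[flow] = (hash(flow) + int(now * 1e6)) % 4
        last_time[flow] = now
        return saved_hop[flow]

    # Packets that arrive close together stay on the same flowlet's path.
    p1 = {"src": "10.0.0.1", "dst": "10.0.0.2", "arrival": 0.000}
    p2 = {"src": "10.0.0.1", "dst": "10.0.0.2", "arrival": 0.001}
    assert flowlet_transaction(p1) == flowlet_transaction(p2)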
-
Mina Tahmasbi Arashloo (Princeton University), Yaron Koral (Princeton University), Michael Greenberg (Pomona College), Jennifer Rexford (Princeton University), David Walker (Princeton University)
Abstract: Early programming languages for software-defined networking (SDN) were built on top of the simple match-action paradigm offered by OpenFlow 1.0. However, emerging hardware and software switches offer much more sophisticated support for persistent state in the data plane, without involving a central controller. Nevertheless, managing stateful, distributed systems efficiently and correctly is known to be one of the most challenging programming problems. To simplify this new SDN problem, we introduce SNAP.
SNAP offers a simpler "centralized" stateful programming model, by allowing programmers to develop programs on top of one big switch rather than many. These programs may contain reads and writes to global, persistent arrays, and as a result, programmers can implement a broad range of applications, from stateful firewalls to fine-grained traffic monitoring. The SNAP compiler relieves programmers of having to worry about how to distribute, place, and optimize access to these stateful arrays by doing it all for them. More specifically, the compiler discovers read/write dependencies between arrays and translates one-big-switch programs into an efficient internal representation based on a novel variant of binary decision diagrams. This internal representation is used to construct a mixed-integer linear program, which jointly optimizes the placement of state and the routing of traffic across the underlying physical topology. We have implemented a prototype compiler and applied it to about 20 SNAP programs over various topologies to demonstrate our techniques' scalability.
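As a small illustration of the one-big-switch model (written in Python rather than SNAP's own syntax; the array name and the policy are invented for the example), a stateful firewall reads and writes a single global, persistent array, and the compiler, not the programmer, would decide where in the physical network that state lives:

    seen = {}  # global, persistent array: internal host -> has it sent traffic out?

    def one_big_switch(pkt):
        """Stateful firewall on the one big switch: drop unsolicited inbound."""
        if pkt["dir"] == "out":
            seen[pkt["src"]] = True          # write to persistent state
            return "forward"
        # Inbound: read persistent state written by earlier packets.
        return "forward" if seen.get(pkt["dst"], False) else "drop"

    assert one_big_switch({"dir": "in", "src": "8.8.8.8", "dst": "10.0.0.5"}) == "drop"
    assert one_big_switch({"dir": "out", "src": "10.0.0.5", "dst": "8.8.8.8"}) == "forward"
    assert one_big_switch({"dir": "in", "src": "8.8.8.8", "dst": "10.0.0.5"}) == "forward"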
-
Anirudh Sivaraman (MIT CSAIL), Suvinay Subramanian (MIT CSAIL), Mohammad Alizadeh (MIT CSAIL), Sharad Chole (Cisco Systems), Shang-Tse Chuang (Cisco Systems), Anurag Agrawal (Barefoot Networks), Hari Balakrishnan (MIT CSAIL), Tom Edsall (Cisco Systems), Sachin Katti (Stanford University), Nick McKeown (Stanford University)
Abstract: Switches today provide a small menu of scheduling algorithms. While we can tweak scheduling parameters, we cannot modify algorithmic logic, or add a completely new algorithm, after the switch has been designed. This paper presents a design for a programmable packet scheduler, which allows scheduling algorithms, potentially algorithms that are unknown today, to be programmed into a switch without requiring hardware redesign.
Our design uses the property that scheduling algorithms make two decisions: in what order to schedule packets and when to schedule them. Further, we observe that in many scheduling algorithms, definitive decisions on these two questions can be made when packets are enqueued. We use these observations to build a programmable scheduler using a single abstraction: the push-in first-out queue (PIFO), a priority queue that maintains the scheduling order or time.
We show that a PIFO-based scheduler lets us program a wide variety of scheduling algorithms. We present a hardware design for this scheduler for a 64-port 10 Gbit/s shared-memory (output-queued) switch. Our design costs an additional 4% in chip area. In return, it lets us program many sophisticated algorithms, such as a 5-level hierarchical scheduler with programmable decisions at each level.
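The PIFO abstraction is easy to model in software. The sketch below (a behavioral model in Python, not the paper's hardware design) uses a binary heap; programming the scheduler then amounts to programming the rank computation:

    import heapq
    import itertools

    class PIFO:
        """Push-in first-out queue: entries are pushed in at an arbitrary rank
        (scheduling order or departure time) and always dequeued from the head."""
        def __init__(self):
            self._heap = []
            self._seq = itertools.count()   # tie-breaker: FIFO among equal ranks

        def enqueue(self, rank, pkt):
            heapq.heappush(self._heap, (rank, next(self._seq), pkt))

        def dequeue(self):
            return heapq.heappop(self._heap)[2]

    # Example rank computation: strict priority uses the packet's class as its rank.
    q = PIFO()
    q.enqueue(rank=2, pkt="bulk")
    q.enqueue(rank=0, pkt="voice")
    q.enqueue(rank=1, pkt="video")
    assert [q.dequeue() for _ in range(3)] == ["voice", "video", "bulk"]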
-
12:40pm - 2:00pm Lunch Break
-
2:00pm - 3:15pm Session 2 - Wide Area Networks
Session Chair: Bruce Maggs (Duke University)
Room: Diamante
-
Ramesh Govindan (Google / USC), Ina Minei (Google), Mahesh Kallahalla (Google), Bikash Koley (Google), Amin Vahdat (Google)
Abstract: Maintaining the highest levels of availability for content providers is challenging in the face of scale, network evolution and complexity. Little, however, is known about failures large content providers are susceptible to, and what mechanisms they employ to ensure high availability. From a detailed analysis of over 100 high-impact failure events in a global-scale content provider encompassing several data centers and two WANs, we quantify several dimensions of availability failures. We find that failures are evenly distributed across different network types and planes, but that a large number of failures happen when a management operation is in progress within the network. We discuss some of these failures in detail, and also describe our design principles for high availability motivated by these failures, including using defense in depth, maintaining consistency across planes, failing open on large failures, carefully preventing and avoiding failures, and assessing root cause quickly. Our findings suggest that, as networks become more complicated, failures lurk everywhere, and, counter-intuitively, continuous incremental evolution of the network can, when applied together with our design principles, result in a more robust network.
-
Xin Jin (Princeton University), Yiran Li (Tsinghua University), Da Wei (Tsinghua University), Siming Li (Stony Brook University), Jie Gao (Stony Brook University), Lei Xu (Sodero Networks), Guangzhi Li (AT&T Labs), Wei Xu (Tsinghua University), Jennifer Rexford (Princeton University)
Abstract: Bulk transfer on the wide-area network (WAN) is a fundamental service to many globally-distributed applications. It is challenging to efficiently utilize expensive WAN bandwidth to achieve short transfer completion time and meet mission-critical deadlines. Advancements in software-defined networking (SDN) and optical hardware make it feasible and beneficial to quickly reconfigure optical devices in the optical layer, which brings a new opportunity for traffic management on the WAN.
We present Owan, a novel traffic management system that optimizes wide-area bulk transfers with centralized joint control of the optical and network layers. Owan can dynamically change the network-layer topology by reconfiguring the optical devices. We develop efficient algorithms to jointly optimize optical circuit setup, routing and rate allocation, and dynamically adapt them to traffic demand changes. We have built a prototype of Owan with commodity optical and electrical hardware. Testbed experiments and large-scale simulations on two ISP topologies and one inter-DC topology show that Owan completes transfers up to 4.45x faster on average, and up to 1.36x more transfers meet their deadlines, as compared to prior methods that only control the network layer.
-
Virajith Jalaparti (Microsoft), Ivan Bliznets (St. Petersburg Academic University), Srikanth Kandula (Microsoft), Brendan Lucier (Microsoft), Ishai Menache (Microsoft)
Abstract: Neither traffic engineering nor fixed prices (e.g., $/GB) alone fully address the challenges of highly utilized inter-datacenter WANs. The former offers more service to users who overstate their demands and poor service overall. The latter offers no service guarantees to customers, and providers have no lever to steer customer demand to lightly loaded paths/times. To address these issues, we design and evaluate Pretium -- a framework that combines dynamic pricing with traffic engineering for inter-datacenter bandwidth. In Pretium, users specify their required rates or transfer sizes with deadlines, and a price module generates a price quote for different guarantees (promises) on these requests. The price quote is generated using internal prices (which can vary over time and links) which are maintained and periodically updated by Pretium based on history. A supplementary schedule adjustment module gears the agreed-upon network transfers towards an efficient operating point by optimizing time-varying operation costs. Experiments using traces from a large production WAN show that Pretium improves total system efficiency (value of routed transfers minus operation costs) by more than 3.5X relative to current usage-based pricing schemes, while increasing the provider profits by 2X.
-
3:15pm - 4:15pm Posters and Demos I (includes coffee break from 3:35pm-4:15pm)
Room: Topazio and Agata
-
The posters session will take place in the Topazio room and the demos session in the Agata room. A coffee break is included from 3:35pm to 4:15pm.
-
4:15pm - 5:30pm Session 3 - Monitoring and Diagnostics
Session Chair: Jeff Mogul (Google)
Room: Diamante
-
Zaoxing Liu (Johns Hopkins University), Antonis Manousis (Carnegie Mellon University), Gregory Vorsanger (Johns Hopkins University), Vyas Sekar (Carnegie Mellon University), Vladimir Braverman (Johns Hopkins University)
Abstract: Network management requires accurate estimates of metrics for traffic engineering (e.g., heavy hitters), anomaly detection (e.g., entropy of source addresses), and security (e.g., DDoS detection). Obtaining accurate estimates given router CPU and memory constraints is a challenging problem. Existing approaches fall in one of two undesirable extremes: (1) low fidelity general-purpose approaches such as sampling, or (2) high fidelity but complex algorithms customized to specific application-level metrics. Ideally, a solution should be both general (i.e., supports many applications) and provide accuracy comparable to custom algorithms. This paper presents UnivMon, a framework for flow monitoring which leverages recent theoretical advances and demonstrates that it is possible to achieve both generality and high accuracy. UnivMon uses an application-agnostic data plane monitoring primitive; different (and possibly unforeseen) estimation algorithms run in the control plane, and use the statistics from the data plane to compute application-level metrics. We present a proof-of-concept implementation of UnivMon using P4 and develop simple coordination techniques to provide a "one-big-switch" abstraction for network-wide monitoring. We evaluate the effectiveness of UnivMon using a range of trace-driven evaluations and show that it offers comparable (and sometimes better) accuracy relative to custom sketching solutions.
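For readers unfamiliar with sketching, the toy count-min sketch below shows the flavor of fixed-memory, hash-based counting that such data planes build on. It is a simplified stand-in, not UnivMon's universal sketch, and the width and depth are arbitrary:

    import hashlib

    DEPTH, WIDTH = 4, 1024
    counters = [[0] * WIDTH for _ in range(DEPTH)]

    def _buckets(key):
        for row in range(DEPTH):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:4], "big") % WIDTH

    def update(key, count=1):
        for row, col in _buckets(key):
            counters[row][col] += count

    def estimate(key):
        # True count <= estimate; overestimation comes only from hash collisions.
        return min(counters[row][col] for row, col in _buckets(key))

    for _ in range(1000):
        update("10.0.0.1->10.0.0.2")   # a heavy hitter
    update("10.0.0.3->10.0.0.4")
    assert estimate("10.0.0.1->10.0.0.2") >= 1000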
-
Masoud Moshref (University of Southern California), Minlan Yu (University of Southern California), Ramesh Govindan (University of Southern California), Amin Vahdat (Google)
Abstract: As data centers grow larger and strive to provide tight performance and availability SLAs, their monitoring infrastructure must move from passive systems that provide aggregated inputs to human operators, to active systems that enable programmed control. In this paper, we propose Trumpet, an event monitoring system that leverages CPU resources and end-host programmability, to monitor every packet and report events at millisecond timescales. Trumpet users can express many network-wide events, and the system efficiently detects these events using triggers at end-hosts. Using careful design, Trumpet can evaluate triggers by inspecting every packet at full line rate even on future generations of NICs, scale to thousands of triggers per end-host while bounding packet processing delay to a few microseconds, and report events to a controller within 10 milliseconds, even in the presence of attacks. We demonstrate these properties using an implementation of Trumpet, and also show that it allows operators to describe new network events such as detecting correlated bursts and loss, identifying the root cause of transient congestion, and detecting short-term anomalies at the scale of a data center tenant.
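A toy model of the trigger idea appears below; the API, the predicate, and the window sweep are invented for illustration, and Trumpet's real matching engine is carefully engineered to inspect every packet at line rate:

    from collections import defaultdict

    class Trigger:
        """A trigger: a packet filter, a per-window statistic, and a threshold."""
        def __init__(self, name, match, threshold_bytes):
            self.name, self.match, self.threshold = name, match, threshold_bytes
            self.bytes_in_window = defaultdict(int)   # flow -> bytes this window

        def on_packet(self, pkt):
            if self.match(pkt):
                self.bytes_in_window[pkt["flow"]] += pkt["len"]

        def end_window(self):
            """Sweep at the (e.g., 10 ms) window boundary: report and reset."""
            events = [(self.name, flow) for flow, b in self.bytes_in_window.items()
                      if b >= self.threshold]
            self.bytes_in_window.clear()
            return events

    t = Trigger("burst", match=lambda p: p["dport"] == 80, threshold_bytes=3000)
    for _ in range(3):
        t.on_packet({"flow": "A", "dport": 80, "len": 1500})
    print(t.end_window())   # [('burst', 'A')]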
-
Ang Chen (University of Pennsylvania), Yang Wu (University of Pennsylvania), Andreas Haeberlen (University of Pennsylvania), Wenchao Zhou (Georgetown University), Boon Thau Loo (University of Pennsylvania)
Abstract: In this paper, we propose a new approach to diagnosing problems in complex distributed systems. Our approach is based on the insight that many of the trickiest problems are anomalies. For instance, in a network, problems often affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they only manifest infrequently. Thus, it is quite common for the operator to have “examples” of both working and non-working traffic readily available – perhaps a packet that was misrouted, and a similar packet that was routed correctly. In this case, the cause of the problem is likely to be wherever the two packets were treated differently by the network.
We present the design of a debugger that can leverage this information using a novel concept that we call differential provenance. Differential provenance tracks the causal connections between network states and state changes, just like classical provenance, but it can additionally perform root-cause analysis by reasoning about the differences between two provenance trees. We have built a diagnostic tool that is based on differential provenance, and we have used our tool to debug a number of complex, realistic problems in two scenarios: software-defined networks and MapReduce jobs. Our results show that differential provenance can be maintained at relatively low cost, and that it can deliver very precise diagnostic information; in many cases, it can even identify the precise root cause of the problem.
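The diffing step at the heart of the idea can be sketched in a few lines, assuming provenance is a plain labeled tree; the encoding and the labels below are invented, and the actual system additionally aligns trees and reasons about state changes:

    def first_divergence(good, bad, path=()):
        """Trees are (label, [children]); return the path to the first mismatch."""
        if good[0] != bad[0]:
            return path + (f"{good[0]} vs {bad[0]}",)
        for g, b in zip(good[1], bad[1]):
            diff = first_divergence(g, b, path + (good[0],))
            if diff:
                return diff
        return None

    good = ("pkt(h1->h2) at s1", [("matched rule r1", [("r1 from app A", [])])])
    bad  = ("pkt(h1->h2) at s1", [("matched rule r9", [("r9 from app B", [])])])
    print(first_divergence(good, bad))
    # ('pkt(h1->h2) at s1', 'matched rule r1 vs matched rule r9')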
-
7:00pm - 11:30pm Conference Banquet
-
The Conference Banquet will take place at the Slaviero Essential Florianópolis Ingleses. Please visit the Social Events page for further information.
-
8:30am - 10:10am Session 4 - Scheduling
Session Chair: Sergey Gorinsky (IMDEA Networks Institute)
Room: Diamante
-
Jonathan Mace (Brown University), Peter Bodik (Microsoft), Madanlal Musuvathi (Microsoft), Rodrigo Fonseca (Brown University), Krishnan Varadarajan (Microsoft)
Abstract: In many important cloud services, different tenants execute their requests in the thread pool of the same process, requiring fair sharing of resources. However, using fair queue schedulers to provide fairness in this context is difficult because of high execution concurrency, and because request costs are unknown and have high variance. Using fair schedulers like WFQ and WF²Q in such settings leads to bursty schedules, where large requests block small ones for long periods of time. In this paper, we propose Two-Dimensional Fair Queueing (2DFQ), which spreads requests of different costs across different threads and minimizes the impact of tenants with unpredictable requests. In evaluation on production workloads from Azure Storage, a large-scale cloud system at Microsoft, we show that 2DFQ reduces the burstiness of service by 1-2 orders of magnitude. On workloads where many large requests compete with small ones, 2DFQ improves 99th percentile latencies by up to 2 orders of magnitude.
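The sketch below is a much-simplified rendering of the intuition, not the paper's algorithm: requests carry classic fair-queueing finish tags, but threads are partitioned into cost bands so that small, predictable requests are never scheduled behind expensive ones. The band boundaries and weights are invented:

    import heapq
    import math

    BANDS = [(0.0, 1e-3), (1e-3, 1e-1), (1e-1, math.inf)]  # request cost (s) per band
    vtime = {}                                   # (band, tenant) -> virtual finish tag

    def enqueue(queues, tenant, cost, weight=1.0):
        band = next(i for i, (lo, hi) in enumerate(BANDS) if lo <= cost < hi)
        start = vtime.get((band, tenant), 0.0)
        finish = start + cost / weight           # classic fair-queueing finish tag
        vtime[(band, tenant)] = finish
        heapq.heappush(queues[band], (finish, tenant, cost))

    queues = {i: [] for i in range(len(BANDS))}
    enqueue(queues, "A", 0.0005)
    enqueue(queues, "B", 0.5)                    # a 500 ms request lands in the top band
    enqueue(queues, "A", 0.0005)
    # Tenant A's small requests sit in band 0 and never wait behind B's request.
    assert [heapq.heappop(queues[0])[1] for _ in range(2)] == ["A", "A"]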
-
Hong Zhang (Hong Kong University of Science and Technology), Li Chen (Hong Kong University of Science and Technology), Bairen Yi (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology), Mosharaf Chowdhury (University of Michigan), Yanhui Geng (Huawei Noah's Ark Lab)
Abstract: Leveraging application-level requirements using coflows has recently been shown to improve application-level communication performance in data-parallel clusters. However, existing coflow-based solutions rely on modifying applications to extract coflows, making them inapplicable to many practical scenarios.
In this paper, we present CODA, a first attempt at automatically identifying and scheduling coflows without any application-level modifications. We employ an incremental clustering algorithm to perform fast, application-transparent coflow identification and complement it by proposing an error-tolerant coflow scheduler to mitigate occasional identification errors. Testbed experiments and large-scale simulations with production workloads show that CODA can identify coflows with over 90% accuracy, and its scheduler is robust to inaccuracies, enabling communication stages to complete 2.4x (5.1x) faster on average (95th percentile) compared to per-flow mechanisms. Overall, CODA's performance is comparable to that of solutions requiring application modifications.
-
Li Chen (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology), Wei Bai (Hong Kong University of Science and Technology), Mohammad Alizadeh (MIT)
Abstract: Cloud applications generate a mix of flows with and without deadlines. Scheduling such mix-flows is a key challenge; our experiments show that trivially combining existing schemes for deadline/non-deadline flows is problematic. For example, prioritizing deadline flows hurts flow completion time (FCT) for non-deadline flows, with minor improvement for deadline miss rate.
We present Karuna, a first systematic solution for scheduling mix-flows. Our key insight is that deadline flows should meet their deadlines while minimally impacting the FCT of non-deadline flows. To achieve this goal, we design a novel Minimal-impact Congestion control Protocol (MCP) that handles deadline flows with as little bandwidth as possible. For non-deadline flows, we extend an existing FCT minimization scheme to schedule flows with known and unknown sizes. Karuna requires no switch modifications and is backward compatible with legacy TCP/IP stacks. Our testbed experiments and simulations show that Karuna effectively schedules mix-flows, for example, reducing the 95th percentile FCT of non-deadline flows by up to 47.78% at high load compared to pFabric, while maintaining a low (<5.8%) deadline miss rate.
-
Kanthi Nagaraj (Stanford University), Dinesh Bharadia (Stanford University), Hongzi Mao (MIT), Sandeep Chinchali (Stanford University), Mohammad Alizadeh (MIT), Sachin Katti (Stanford University)
Abstract: We present xFabric, a novel datacenter transport design that provides flexible and fast bandwidth allocation control. xFabric is flexible: it enables operators to specify how bandwidth is allocated amongst contending flows to optimize for different service-level objectives such as minimizing flow completion times, weighted allocations, different notions of fairness, etc. xFabric is also very fast: it converges to the specified allocation one to two orders of magnitude faster than prior schemes. Underlying xFabric is a novel distributed algorithm that uses in-network packet scheduling to rapidly solve general network utility maximization problems for bandwidth allocation. We evaluate xFabric using realistic datacenter topologies and highly dynamic workloads and show that it is able to provide flexibility and fast convergence in such stressful environments.
-
10:10am - 10:40am Coffee Break
-
10:40am - 12:20pm Session 5 - Datacenters I
Session Chair: Alex Snoeren (UC San Diego)
Room: Diamante
-
Ki Suh Lee (Cornell University), Han Wang (Cornell University), Vishal Shrivastav (Cornell University), Hakim Weatherspoon (Cornell University)
Abstract: In this paper, we present the Datacenter Time Protocol (DTP), a clock synchronization protocol that does not use packets at all, but is able to achieve nanosecond precision. In essence, DTP uses the physical layer of network devices to implement a decentralized clock synchronization protocol. By doing so, DTP eliminates most non-deterministic elements in clock synchronization protocols. Further, DTP uses control messages in the physical layer for communicating hundreds of thousands of protocol messages without interfering with higher layer packets. Thus, DTP has virtually zero overhead since it does not add load at layer 2 or above. It does require replacing network devices, which can be done incrementally. We demonstrate that the precision provided by DTP is bounded by 25.6 nanoseconds for directly connected nodes and, in general, is bounded by 4TD, where D is the longest distance between any two servers in a network in terms of number of hops and T is the period of the fastest clock (≈ 6.4ns). Moreover, in software, a DTP daemon can access the DTP clock with usually better than 4T (≈ 25.6ns) precision. As a result, the end-to-end precision can be better than 4TD + 8T nanoseconds. By contrast, the precision of the state-of-the-art protocol is not bounded: the precision is hundreds of nanoseconds when a network is idle and can decrease to hundreds of microseconds when a network is heavily congested.
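The quoted bounds are simple to reproduce. The short script below plugs T = 6.4 ns into the 4TD and 4TD + 8T expressions from the abstract for a few network diameters D:

    T = 6.4e-9                            # period of the fastest clock, ~6.4 ns
    for D in (1, 2, 6):                   # network diameter in hops
        hw_bound = 4 * T * D              # bound between hardware clocks: 4TD
        e2e_bound = 4 * T * D + 8 * T     # adding the software-access term: 4TD + 8T
        print(f"D={D}: hardware {hw_bound * 1e9:.1f} ns, "
              f"end-to-end {e2e_bound * 1e9:.1f} ns")
    # D=1 reproduces the 25.6 ns figure quoted for directly connected nodes.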
-
Yu-Wei Eric Sung (Facebook), Xiaozheng Tie (Facebook), Starsky H.Y. Wong (Facebook), Hongyi Zeng (Facebook)
Abstract: Network management facilitates a healthy and sustainable network. However, its practice is not well understood outside the network engineering community. In this paper, we present Robotron, a system for managing a massive production network in a top-down fashion. The system's goal is to reduce effort and errors on management tasks by minimizing direct human interaction with network devices. Engineers use Robotron to express high-level design intent, which is translated into low-level device configurations and deployed safely. Robotron also monitors devices' operational state to ensure it does not deviate from the desired state. Since 2008, Robotron has been used to manage tens of thousands of network devices connecting hundreds of thousands of servers globally at Facebook.
-
Chuanxiong Guo (Microsoft Research), Haitao Wu (Microsoft), Zhong Deng (Microsoft), Gaurav Soni (Microsoft), Jianxi Ye (Microsoft), Jitu Padhye (Microsoft Research), Marina Lipshteyn (Microsoft)
Abstract: Over the past one and a half years, we have been using RDMA over commodity Ethernet (RoCEv2) to support some of Microsoft's highly-reliable, latency-sensitive services. This paper describes the challenges we encountered during the process and the solutions we devised to address them. In order to scale RoCEv2 beyond VLAN, we have designed a DSCP-based priority flow-control (PFC) mechanism to ensure large-scale deployment. We have addressed the safety challenges brought by PFC-induced deadlock (yes, it happened!), RDMA transport livelock, and the NIC PFC pause frame storm problem. We have also built the monitoring and management systems to make sure RDMA works as expected. Our experiences show that the safety and scalability issues of running RoCEv2 at scale can all be addressed, and RDMA can replace TCP for intra data center communications and achieve low latency, low CPU overhead, and high throughput.
-
Monia Ghobadi (Microsoft Research), Ratul Mahajan (Microsoft Research), Amar Phanishayee (Microsoft Research), Nikhil Devanur (Microsoft Research), Janardhan Kulkarni (Microsoft Research), Gireeja Ranade (Microsoft Research), Pierre-Alexandre Blanche (University of Arizona), Houman Rastegarfar (University of Arizona), Madeleine Glick (University of Arizona), Daniel Kilper (University of Arizona)
Abstract: We explore a novel, free-space optics based approach for building data center interconnects. It uses a digital micromirror device (DMD) and mirror assembly combination as a transmitter and a photodetector on top of the rack as a receiver. Our approach enables all pairs of racks to establish direct links, and we can reconfigure such links (i.e., connect different rack pairs) within 12 us. To carry traffic from a source to a destination rack, transmitters and receivers in our interconnect can be dynamically linked in millions of ways. We develop topology construction and routing methods to exploit this flexibility, including a flow scheduling algorithm that is a constant factor approximation to the offline optimal solution. Experiments with a small prototype point to the feasibility of our approach. Simulations using realistic data center workloads show that, compared to the conventional folded-Clos interconnect, our approach can improve mean flow completion time by 30-95% and reduce cost by 25-40%.
-
12:20pm - 1:30pm Lunch Break
-
Room: Diamante
- Topic Preview 2
-
Topic Preview 2 will take place in the Diamante room. The topics covered will be Verification, Datacenter Networking / TCP, Policies, and Neutrality.
-
1:30pm - 3:10pm Session 6 - Verification
Session Chair: Ramesh Govindan (University of Southern California)
Room: Diamante
-
Aaron Gember-Jacobson (University of Wisconsin-Madison), Raajay Viswanathan (University of Wisconsin-Madison), Aditya Akella (University of Wisconsin-Madison), Ratul Mahajan (Microsoft Research)
Abstract: Networks employ complex, and hence error-prone, routing control plane configurations. In many cases, the impact of errors manifests only under failures and leads to devastating effects. Thus, it is important to proactively verify control plane behavior under arbitrary link failures. State-of-the-art verifiers are either too slow or impractical to use for such verification tasks. In this paper we propose a new high-level abstraction for control planes, ARC, that supports fast control plane analyses under arbitrary failures. ARC can check key invariants without generating the data plane, which is the main reason for current tools' ineffectiveness. This is possible because of the nature of verification tasks and the constrained nature of control plane designs in networks today. We develop algorithms to derive a network's ARC from its configuration files. Our evaluation over 314 networks shows that ARC computation is quick, and that ARC can verify key invariants in under 1s in most cases, which is orders of magnitude faster than the state of the art.
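The style of question ARC answers is easy to state on a graph. The brute-force check below (a toy in Python, with an invented four-node topology) decides whether a source stays connected to a destination under every combination of k link failures; ARC's contribution is answering such questions efficiently on an abstraction derived from the configuration, for instance via min-cut arguments instead of enumeration:

    from itertools import combinations

    def connected(edges, src, dst):
        adj = {}
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        seen, stack = {src}, [src]
        while stack:
            for nxt in adj.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return dst in seen

    def robust_under_k_failures(edges, src, dst, k):
        return all(connected(set(edges) - set(gone), src, dst)
                   for gone in combinations(edges, k))

    # Two link-disjoint paths A-B-D and A-C-D: survives any 1 failure, not any 2.
    edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
    assert robust_under_k_failures(edges, "A", "D", 1)
    assert not robust_under_k_failures(edges, "A", "D", 2)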
-
Radu Stoenescu (University Politehnica of Bucharest), Matei Popovici (University Politehnica of Bucharest), Lorina Negreanu (University Politehnica of Bucharest), Costin Raiciu (University Politehnica of Bucharest)
Abstract: We present SymNet, a network static analysis tool based on symbolic execution. SymNet injects symbolic packets and tracks their evolution through the network. Our key novelty is SEFL, a language we designed for expressing data plane processing in a symbolic-execution friendly manner. SymNet statically analyzes an abstract data plane model that consists of the SEFL code for every node and the links between nodes. SymNet can check networks containing routers with hundreds of thousands of prefixes and NATs in seconds, while verifying packet header memory-safety and covering network functionality such as dynamic tunneling, stateful processing and encryption. We used SymNet to debug middlebox interactions from the literature, to check properties of our department's network and the Stanford backbone. Modeling network functionality is not easy. To aid users we have developed parsers that automatically generate SEFL models from router and switch tables, firewall configurations and arbitrary Click modular router configurations. The parsers rely on prebuilt models that are exact and fast to analyze. Finally, we have built an automated testing tool that combines symbolic execution and testing to check whether the model is an accurate representation of the real code.
-
Ryan Beckett (Princeton University), Ratul Mahajan (Microsoft), Todd Millstein (University of California, Los Angeles), Jitendra Padhye (Microsoft), David Walker (Princeton University)
Abstract: We develop Propane, a language and compiler to help network operators with a challenging, error-prone task—bridging the gap between network-wide routing objectives and low-level configurations of devices that run complex, distributed protocols. The language allows operators to specify their objectives naturally, using high-level constraints on both the shape and relative preference of traffic paths. The compiler automatically translates these specifications to router-level BGP configurations, using an effective intermediate representation that compactly encodes the flow of routing information along policy-compliant paths. It guarantees that the compiled configurations correctly implement the specified policy under all possible combinations of failures. We show that Propane can effectively express the policies of datacenter and backbone networks of a large cloud provider; and despite its strong guarantees, our compiler scales to networks with hundreds or thousands of routers.
-
Avichai Cohen (Hebrew University), Yossi Gilad (Boston University and MIT), Amir Herzberg (Bar Ilan University), Michael Schapira (Hebrew University)
Abstract: Extensive standardization and R&D efforts are dedicated to establishing secure interdomain routing. These efforts focus on two mechanisms: origin authentication with RPKI, and path validation with BGPsec. However, while RPKI is finally gaining traction, the adoption of BGPsec seems not even on the horizon due to inherent, possibly insurmountable, obstacles, including the need to replace today's routing infrastructure, the overhead of online cryptography, and meagre benefits in partial deployment. Consequently, secure interdomain routing remains a distant dream. We propose an easily deployable, modest extension to RPKI, called "path-end validation", which requires neither replacing/upgrading today's BGP routers nor online cryptographic operations. We show, through rigorous security analyses and extensive simulations on empirically-derived datasets, that path-end validation yields significant security benefits even in very limited partial adoption. We present an open-source, readily deployable prototype implementation of path-end validation.
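A sketch of the check itself, with an invented data model: alongside RPKI's prefix-to-origin mapping, each participating AS publishes which neighbors may appear immediately before it on an AS path, and a route is rejected if either record is contradicted (the prefix and AS numbers below are from documentation ranges):

    ROA = {"203.0.113.0/24": 64500}        # prefix -> authorized origin AS
    PATH_END = {64500: {64496, 64499}}     # AS -> neighbors allowed just before it

    def valid(prefix, as_path):
        origin = as_path[-1]
        if ROA.get(prefix) != origin:
            return False                   # RPKI origin check
        allowed = PATH_END.get(origin)
        if allowed is None or len(as_path) == 1:
            return True                    # origin not participating, or local route
        return as_path[-2] in allowed      # path-end (last hop) check

    assert valid("203.0.113.0/24", [64510, 64496, 64500])
    assert not valid("203.0.113.0/24", [64510, 64666, 64500])  # forged last hop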
-
3:10pm - 4:10pm Posters and Demos II (includes coffee break from 3:30pm-4:10pm)
Room: Topazio and Agata
-
The posters session will take place in the Topazio room and the demos session in the Agata room. A coffee break is included from 3:30pm to 4:10pm.
-
4:10pm - 5:25pm Session 7 - Networked Applications
Session Chair: Nick Feamster (Princeton University)
Room: Diamante
-
Yurong Jiang (University of Southern California), Lenin Ravindranath (Microsoft Research), Suman Nath (Microsoft Research), Ramesh Govindan (University of Southern California)
Abstract: Developers deploying web applications in the cloud often need to determine how changes such as service tiers or runtime loads may affect user-perceived page load time. We devise and evaluate a systematic methodology for exploring such "what-if" questions at the time a web application is deployed. Given a website, a web request, and a "what-if" scenario, with a hypothetical configuration and runtime condition, our methodology, embedded in a system called WebPerf, can estimate a distribution of end-to-end response times for the request under the "what-if" scenario. WebPerf makes three contributions: (1) automated instrumentation of web sites written with increasingly popular task parallel libraries, to capture causal call dependencies of various computation and asynchronous I/O calls; (2) an algorithm to use the call dependencies, together with online- and offline-profiled models of various I/O calls, to estimate a distribution of end-to-end latency of the request; and (3) an algorithm to optimize modeling errors by deciding how many measurements to take within a limited time. We have implemented WebPerf for Microsoft Azure. Our experiments with five real websites and seven scenarios show that the median error of WebPerf's estimation is within 7% for all applications and scenarios.
-
Yi Sun (ICT/CAS), Xiaoqi Yin (CMU), Junchen Jiang (CMU), Vyas Sekar (CMU), Fuyuan Lin (ICT/CAS), Nanshu Wang (ICT/CAS), Tao Liu (iQIYI), Bruno Sinopoli (CMU)
Abstract: Bitrate adaptation is critical to ensure good user quality-of-experience (QoE) in Internet video delivery systems. Several efforts have argued that accurate throughput prediction can dramatically improve (1) initial bitrate selection for low startup delay and high initial resolution; (2) midstream bitrate adaptation for high QoE. However, prior efforts did not systematically quantify real-world throughput predictability or develop good prediction algorithms. To bridge this gap, this paper makes three key technical contributions. First, we analyze the throughput characteristics in a dataset with 20M+ sessions. We find: (a) sessions sharing similar key features (e.g., ISP, region) present similar initial values and dynamic patterns; (b) there is a natural "stateful" dynamical behavior within a given session. Second, building on these insights, we develop CS2P, a better throughput prediction system. CS2P leverages a data-driven approach to learn (a) clusters of similar sessions, (b) an initial throughput predictor, and (c) a Hidden-Markov-Model based midstream predictor modeling the stateful evolution of throughput. Third, we develop a prototype system and show by trace-driven simulation and real-world experiments that CS2P outperforms the state of the art by 40% and 50% in median prediction error for initial and midstream throughput, respectively, and improves QoE by 14% over a buffer-based adaptation algorithm.
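To make the midstream idea concrete, the toy predictor below quantizes throughput into discrete states and predicts the next epoch from a state-transition table. The states and probabilities here are invented; CS2P learns a hidden Markov model per cluster of similar sessions from data:

    STATES = [1.0, 2.5, 5.0]                 # throughput levels (Mbps)
    TRANS = {0: [0.8, 0.2, 0.0],             # row: P(next state | current state)
             1: [0.1, 0.8, 0.1],
             2: [0.0, 0.2, 0.8]}

    def nearest_state(mbps):
        return min(range(len(STATES)), key=lambda i: abs(STATES[i] - mbps))

    def predict_next(measured_mbps):
        """Expected throughput next epoch, given the latest measurement."""
        state = nearest_state(measured_mbps)
        return sum(p * level for p, level in zip(TRANS[state], STATES))

    print(predict_next(2.3))   # about 2.6: most mass stays in the middle state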
-
Junchen Jiang (Microsoft Research / CMU), Rajdeep Das (Microsoft Research), Ganesh Ananthanarayanan (Microsoft Research), Philip A. Chou (Microsoft Research), Venkata Padmanabhan (Microsoft Research), Vyas Sekar (CMU), Esbjorn Dominique (Microsoft), Marcin Goliszewski (Microsoft), Dalibor Kukoleca (Microsoft), Renat Vafin (Microsoft), Hui Zhang (CMU)
Abstract: Interactive real-time streaming applications such as audio-video conferencing, online gaming and app streaming, place stringent requirements on the network in terms of delay, jitter, and packet loss. Many of these applications inherently involve client-to-client communication, which is particularly challenging since the performance requirements need to be met while traversing the public wide-area network (WAN). This is different from the typical situation of cloud-to-client communication, where the WAN can often be bypassed by moving a communication end-point to a cloud “edge”, close to the client. Can we nevertheless take advantage of cloud resources to improve the performance of real-time client-to-client streaming over the WAN?
In this paper, we start by analyzing data from a large VoIP provider whose clients are spread across more than 21,000 ASes and nearly all countries, to understand the challenges faced by interactive audio streaming in the wild. We find that while inter-AS and international paths exhibit significantly worse performance than intra-AS and domestic paths, the pattern of poor performance is nevertheless quite scattered, both temporally and spatially. So any effort to improve performance would have to be fine-grained and dynamic.
Then, we turn to the idea of overlay routing, but in the context of the well-provisioned, managed network of a cloud provider rather than peer-to-peer as has been considered in past work. Such a network typically has a global footprint and peers with a large number of network providers. When the performance of a call via the direct path is predicted to be poor, the call traffic could be directed to enter the managed network close to one end point and exit it close to the other end point, thereby avoiding wide-area communication over the public Internet. We present and evaluate data-driven techniques for deciding whether to relay a call through the managed network and, if so, how to pick the ingress and egress relays to maximize performance, all while operating within a budget for relaying calls via the managed overlay network. We show that call performance can potentially improve by 40%-80% on average, with our techniques closely approaching this potential.
-
5:30pm - 6:30pm Community Feedback
Room: Diamante
-
7:30pm - 10:00pm Student Dinner
-
The SIGCOMM 2016 Student Dinner will take place at Ataliba Churrascarias Florianópolis. Please visit the Social Events page for further information.
-
8:30am - 10:35am Session 8 - Wireless
Session Chair: Bruce Maggs (Duke University)
Room: Diamante
-
Vikram Iyer (University of Washington), Vamsi Talla (University of Washington), Bryce Kellogg (University of Washington), Shyamnath Gollakota (University of Washington), Joshua Smith (University of Washington)
Abstract: We introduce inter-technology backscatter, a novel approach that transforms wireless transmissions from one technology to another, on the air. Specifically, we show for the first time that Bluetooth transmissions can be used to create Wi-Fi and ZigBee-compatible signals using backscatter communication. Since Bluetooth, Wi-Fi and ZigBee radios are widely available, this approach enables a backscatter design that works using only commodity devices.
We build prototype backscatter hardware using an FPGA and experiment with various Wi-Fi, Bluetooth and ZigBee devices. Our experiments show we can create 2-11 Mbps Wi-Fi standards-compliant signals by backscattering Bluetooth transmissions. To show the generality of our approach, we also demonstrate generation of standards-compliant ZigBee signals by backscattering Bluetooth transmissions. Finally, we build proof-of-concepts for previously infeasible applications, including the first contact lens form-factor antenna prototype and an implantable neural recording interface that communicate directly with commodity devices such as smartphones and watches, thus enabling the vision of Internet-connected implanted devices.
-
Pengyu Zhang (University of Massachusetts Amherst), Mohammad Rostami (University of Massachusetts Amherst), Pan Hu (University of Massachusetts Amherst), Deepak Ganesan (University of Massachusetts Amherst)
Abstract: In this paper, we look at making backscatter practical for ultra-low power on-body sensors by leveraging radios on existing smartphones and wearables (e.g. WiFi and Bluetooth). The difficulty lies in the fact that in order to extract the weak backscattered signal, the system needs to deal with self-interference from the wireless carrier (WiFi or Bluetooth) without relying on built-in capability to cancel or reject the carrier interference.
Frequency-shifted backscatter (or FS-Backscatter) is based on a novel idea: the backscatter tag shifts the carrier signal to an adjacent non-overlapping frequency band (i.e., an adjacent WiFi or Bluetooth band) and isolates the spectrum of the backscattered signal from the spectrum of the primary signal to enable more robust decoding. We show that this enables communication of up to 4.8 meters using commercial WiFi and Bluetooth radios as the carrier generator and receiver. We also show that we can support a range of bitrates using packet-level and bit-level decoding methods. We build on this idea and show that we can also leverage multiple radios typically present on mobile and wearable devices to construct multi-carrier or multi-receiver scenarios to improve robustness. Finally, we also address the problem of designing an ultra-low power tag that can frequency shift by 20 MHz while consuming tens of microwatts. Our results show that FS-Backscatter is practical in typical mobile and static on-body sensing scenarios while only using commodity radios and antennas.
-
Pan Hu (University of Massachusetts Amherst), Pengyu Zhang (University of Massachusetts Amherst), Mohammad Rostami (University of Massachusetts Amherst), Deepak Ganesan (University of Massachusetts Amherst)
Abstract: While many radio technologies are available for mobile devices, none of them are designed to deal with asymmetric available energy. Battery capacities of mobile devices vary by up to three orders of magnitude between laptops and wearables, and our inability to deal with such asymmetry has limited the lifetime of constrained portable devices.
This paper presents a radically new design for low-power radios: one that is capable of dynamically splitting the power burden of communication between the transmitter and receiver in proportion to the available energy on the two devices. We achieve this with a novel carrier offload method that dynamically moves carrier generation across end points. While such a design might raise the specter of a high-power, large form-factor radio, we show that this integration can be achieved with no more than a BLE-style active radio augmented with a few additional components. Our design, Braidio, is a low-power, tightly integrated, low-cost radio capable of operating as an active and passive transceiver. When these modes operate in an interleaved (braided) manner, the end result is a power-proportional low-power radio that is able to achieve 1:2546 to 3546:1 power consumption ratios between a transmitter and a receiver, all while operating at low power.
-
Ezzeldin Hamed (Massachusetts Institute of Technology), Hariharan Rahul (Massachusetts Institute of Technology), Mohammed A. Abdelghany (Massachusetts Institute of Technology), Dina Katabi (Massachusetts Institute of Technology)
Abstract: Recent years have seen a lot of work in moving distributed MIMO from theory to practice. While this prior work demonstrates the feasibility of synchronizing multiple transmitters in time, frequency, and phase, none of them deliver a full-fledged PHY capable of supporting distributed MIMO in real-time. Further, none of them can address dynamic environments or mobile clients. Addressing these challenges requires new solutions for low-overhead and fast tracking of wireless channels, which are the key parameters of any distributed MIMO system. It also requires a software-hardware architecture that can deliver distributed MIMO within a full-fledged 802.11 PHY, while still meeting the tight timing constraints of the 802.11 protocol. This architecture also needs to perform coordinated power control across distributed MIMO nodes, as opposed to simply letting each node perform power control as if it were operating alone. This paper describes the design and implementation of MegaMIMO 2.0, a system that achieves these goals and delivers the first real-time fully distributed 802.11 MIMO system.
-
Deepak Vasisht (MIT), Swarun Kumar (CMU), Hariharan Rahul (MIT), Dina Katabi (MIT)
Abstract: This paper focuses on a simple, yet fundamental question: "Can a node infer the wireless channels on one frequency band by observing the channels on a different frequency band?" This question arises in cellular networks, where the uplink and the downlink operate on different frequencies. Addressing this question is critical for the deployment of key 5G solutions such as massive MIMO, multi-user MIMO, and distributed MIMO, which require channel state information.
We introduce R2-F2, a system that enables LTE base stations to infer the downlink channels to a client by observing the uplink channels from that client. By doing so, R2-F2 extends the concept of reciprocity to LTE cellular networks, where downlink and uplink transmissions occur on different frequency bands. It also removes a major hurdle for the deployment of 5G MIMO solutions. We have implemented R2-F2 in software radios and integrated it within the LTE OFDM physical layer. Our results show that the channels computed by R2-F2 deliver accurate MIMO beamforming (to within 0.7 dB of beamforming gains with ground truth channels) while eliminating channel feedback overhead.
-
10:35am - 11:05am Coffee Break
-
11:05am - 12:20pm Session 9 - Datacenters II
Session Chair: Sergey Gorinsky (IMDEA Networks Institute)
Room: Diamante
-
Bryce Cronkite-Ratcliff (VMware / Stanford), Aran Bergman (Technion), Shay Vargaftik (Technion), Madhusudhan Ravi (VMware), Nick McKeown (Stanford), Ittai Abraham (VMware), Isaac Keslassy (VMware / Stanford / Technion)
Abstract: New congestion control algorithms are rapidly improving datacenters by reducing latency, overcoming incast, increasing throughput and improving fairness. Ideally, the operating system in every server and virtual machine is updated to support new congestion control algorithms. However, legacy applications often cannot be upgraded to a new operating system version, which means the advances are off-limits to them. Worse, as we show, legacy applications can be squeezed out, which in the worst case prevents the entire network from adopting new algorithms.
Our goal is to make it easy to deploy new and improved congestion control algorithms into multitenant datacenters, without having to worry about TCP-friendliness with non-participating virtual machines. This paper presents a solution we call virtualized congestion control. The datacenter owner may introduce a new congestion control algorithm in the hypervisors. Internally, the hypervisors translate between the new congestion control algorithm and the old legacy congestion control, allowing legacy applications to enjoy the benefits of the new algorithm. We have implemented proof-of-concept systems for virtualized congestion control in the Linux kernel and in VMware’s ESXi hypervisor, achieving improved fairness, performance, and control over guest bandwidth allocations.
-
Keqiang He (University of Wisconsin-Madison), Eric Rozner (IBM Research), Kanak Agarwal (IBM), Yu (Jason) Gu (IBM), Wes Felter (IBM Research), John Carter (IBM), Aditya Akella (University of Wisconsin-Madison)
Abstract: Multi-tenant datacenters are successful because tenants can seamlessly port their applications and services to the cloud. Virtual Machine (VM) technology plays an integral role in this success by enabling a diverse set of software to be run on a unified underlying framework. This flexibility, however, comes at the cost of dealing with outdated, inefficient, or misconfigured TCP stacks implemented in the VMs. This paper investigates whether administrators can take control of a VM's TCP congestion control algorithm without making changes to the VM or network hardware. We propose AC/DC TCP, a scheme that exerts fine-grained control over arbitrary tenant TCP stacks by enforcing per-flow congestion control in the virtual switch (vSwitch). Our scheme is light-weight, flexible, scalable and can police non-conforming flows. In our evaluation the computational overhead of AC/DC TCP is less than one percentage point, and we show that implementing an administrator-defined congestion control algorithm in the vSwitch (i.e., DCTCP) closely tracks its native performance, regardless of the VM's TCP stack.
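A sketch of the enforcement idea, under stated assumptions (the constants, the once-per-window update, and rewriting the advertised receive window on ACKs are simplifications): the vSwitch tracks ECN feedback per flow, maintains a DCTCP-style smoothed marking fraction, and caps the window the guest's stack can use, whatever that stack is:

    G = 1 / 16                                # DCTCP smoothing gain (illustrative)
    MSS = 1460

    class FlowState:
        def __init__(self, init_wnd=10 * MSS):
            self.alpha, self.wnd = 0.0, float(init_wnd)
            self.acked = self.marked = 0

        def on_ack(self, bytes_acked, ecn_echo):
            self.acked += bytes_acked
            self.marked += bytes_acked if ecn_echo else 0
            if self.acked >= self.wnd:        # roughly once per window/RTT
                frac = self.marked / self.acked
                self.alpha = (1 - G) * self.alpha + G * frac
                self.wnd = (max(MSS, self.wnd * (1 - self.alpha / 2))
                            if frac > 0 else self.wnd + MSS)
                self.acked = self.marked = 0
            return min(self.wnd, 65535)       # cap to stamp into the ACK's rwnd

    flow = FlowState()
    for _ in range(20):
        flow.on_ack(MSS, ecn_echo=False)      # no marks: additive increase
    print(flow.wnd)                           # grew past the initial 14600 bytes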
-
Behnaz Arzani (University of Pennsylvania), Selim Ciraci (Microsoft), Boon Thau Loo (University of Pennsylvania), Assaf Schuster (Technion - Israel Institute of Technology), Geoff Outhred (Microsoft)
Abstract: Today, root cause analysis of failures in data centers is mostly done through manual inspection. More often than not, customers blame the network as the culprit. However, other components of the system might have caused these failures. To troubleshoot, huge volumes of data are collected over the entire data center. Correlating such large volumes of diverse data collected from different vantage points is a daunting task even for the most skilled technicians. In this paper, we revisit the question: how much can you infer about a failure in the data center using TCP statistics collected at one of the endpoints? Using an agent that captures TCP statistics, we devised a classification algorithm that identifies the root cause of failure using this information at a single endpoint. Using insights derived from this classification algorithm, we identify dominant TCP metrics that indicate where/why problems occur in the network. We validate and test these methods using data that we collect over a period of six months in a production data center.
-
12:20pm - 1:30pm Lunch Break
-
1:30pm - 2:45pm Session 10 - Censorship and Choice
Session Chair: Nick Feamster (Princeton University)
Room: Diamante
-
Tobias Flach (University of Southern California / Google), Pavlos Papageorge (Google), Andreas Terzis (Google), Luis Pedrosa (University of Southern California), Yuchung Cheng (Google), Tayeb Karim (Google), Ethan Katz-Bassett (University of Southern California), Ramesh Govindan (University of Southern California / Google)
Abstract: Large flows like videos consume significant bandwidth. Some ISPs actively manage these high volume flows with techniques like policing, which enforces a flow rate by dropping excess traffic. While the existence of policing is well known, our contribution is an Internet-wide study quantifying its prevalence and impact on video quality metrics. We developed a heuristic to identify policing from server-side traces and built a pipeline to deploy it at scale on traces from a large online content provider, collected from hundreds of servers worldwide. Using a dataset of 270 billion packets served to 28,400 client ASes, we find that, depending on region, up to 7% of lossy transfers are policed. Loss rates are on average six times higher when a trace is policed, and it impacts video playback quality. We show that alternatives to policing, like pacing and shaping, can achieve traffic management goals while avoiding the deleterious effects of policing.
-
Yiannis Yiakoumis (Stanford University), Sachin Katti (Stanford University), Nick McKeown (Stanford University)
Abstract: Should applications receive special treatment from the network? And if so, who decides which applications are preferred? This discussion, known as net neutrality, goes beyond technology and is a hot political topic. In this paper we approach net neutrality from a user's perspective. Through user studies, we demonstrate that users do indeed want some services to receive preferential treatment; and their preferences have a heavy tail: a one-size-fits-all approach is unlikely to work. This suggests that users should be able to decide how their traffic is treated. A crucial part of enabling user preferences is the mechanism to express them. To this end, we present network cookies, a general mechanism to express user preferences to the network. Using cookies, we prototype Boost, a user-defined fast lane, and deploy it in 161 homes.
-
James McCauley (UC Berkeley / ICSI), Mingjie Zhao (UESTC / ICSI), Ethan J. Jackson (UC Berkeley), Barath Raghavan (ICSI), Sylvia Ratnasamy (UC Berkeley / ICSI), Scott Shenker (UC Berkeley / ICSI)
Abstract: A major staple of layer 2 has long been the combination of flood-and-learn Ethernet switches with some variant of the Spanning Tree Protocol. However, STP has significant shortcomings -- chiefly, that it throws away network capacity by removing links, and that it can be relatively slow to reconverge after topology changes. In recent years, attempts to rectify these shortcomings have been made by either making L2 look more like L3 (notably TRILL and SPB, which both incorporate L3-like routing) or by replacing L2 switches with "L3 switching" hardware and extending IP all the way to the host. In this paper, we examine an alternate point in the L2 design space, which is simple (in that it is a single data plane mechanism with no separate control plane), converges quickly, delivers packets during convergence, utilizes all available links, and can be extended to support both equal-cost multipath and efficient multicast.
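To make the design point concrete, here is a toy Python model of a flood-and-learn data plane that keeps loops harmless by deduplicating packets instead of disabling links. The packet format (src, dst, nonce), the unbounded seen-set, and the topology are simplifications invented for this sketch; the paper's actual mechanism bounds its deduplication state and uses additional safeguards such as hop counts.

```python
# Toy flood-and-learn switch with duplicate suppression (illustration only).

class Switch:
    def __init__(self, name):
        self.name = name
        self.links = {}     # port -> (neighbor Switch, port) or ("host", name)
        self.table = {}     # learned: address -> port
        self.seen = set()   # dedup: (src, nonce) pairs already handled

def connect(a, pa, b, pb):
    a.links[pa] = (b, pb)
    b.links[pb] = (a, pa)

def receive(sw, pkt, in_port):
    src, dst, nonce = pkt
    if (src, nonce) in sw.seen:
        return                       # flooded duplicate: drop, loop is harmless
    sw.seen.add((src, nonce))
    sw.table[src] = in_port          # learn the reverse path on every packet
    ports = ([sw.table[dst]] if dst in sw.table
             else [p for p in sw.links if p != in_port])
    for p in ports:
        peer = sw.links[p]
        if peer[0] == "host":
            if peer[1] == dst:
                print(f"{sw.name}: delivered {src}->{dst}")
        else:
            receive(peer[0], pkt, peer[1])

# A physical loop (s1-s2-s3-s1) that would force STP to disable a link.
s1, s2, s3 = Switch("s1"), Switch("s2"), Switch("s3")
connect(s1, 1, s2, 1); connect(s2, 2, s3, 1); connect(s3, 2, s1, 2)
s1.links[0] = ("host", "h1"); s2.links[0] = ("host", "h2")
receive(s1, ("h1", "h2", 1), 0)   # unknown dst: floods, dedup stops the loop
receive(s2, ("h2", "h1", 2), 0)   # reply follows the learned path directly
```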
-
2:45pm - 3:45pm Posters and Demos III (includes coffee break from 3:05pm-3:45pm)
Room: Topazio and Agata
- Posters and Demos III (includes coffee break from 3:05pm-3:45pm)
-
Posters and demos from the main-track papers will take place at the Topazio room; the industrial demos session will take place at the Agata room. A coffee break is included from 3:05pm to 3:45pm.
-
3:45pm - 5:00pm Session 11 - SDN & NFV II
Session Chair: Aditya Akella (University of Wisconsin Madison)
Room: Diamante
- Session 11 - SDN & NFV II
-
Anat Bremler-Barr (The Interdisciplinary Center, Herzliya), Yotam Harchol (The Hebrew University of Jerusalem), David Hay (The Hebrew University of Jerusalem)
Abstract: We present OpenBox, a software-defined framework for network-wide development, deployment, and management of network functions (NFs). OpenBox effectively decouples the control plane of NFs from their data plane, similarly to SDN solutions that address only the network's forwarding plane. OpenBox consists of three logical components. First, user-defined OpenBox applications provide NF specifications through the OpenBox north-bound API. Second, a logically-centralized OpenBox controller is able to merge logic of multiple NFs, possibly from multiple tenants, and to use a network-wide view to efficiently deploy and scale NFs across the network data plane. Finally, OpenBox instances constitute OpenBox's data plane and are implemented either purely in software or contain specific hardware accelerators (e.g., a TCAM). In practice, different NFs carry out similar processing steps on the same packet, and our experiments indeed show a significant improvement of the network performance when using OpenBox. Moreover, OpenBox readily supports smart NF placement, NF scaling, and multi-tenancy through its controller.
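The merging idea lends itself to a very small illustration: when two NFs share a processing prefix, a controller can deduplicate it so the shared stages run once per packet. The flat stage lists below are an invented stand-in for OpenBox's actual graph-of-blocks abstraction.

```python
# Illustration of NF pipeline merging: shared prefix stages run once.
# Stage names are invented; OpenBox models NFs as processing graphs.

firewall = ["parse_headers", "classify_5tuple", "drop_blacklisted"]
ids      = ["parse_headers", "classify_5tuple", "match_signatures", "raise_alert"]

def merge(a, b):
    """Merge two stage lists so their common prefix is executed once."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared + [("branch", a[len(shared):], b[len(shared):])]

print(merge(firewall, ids))
# ['parse_headers', 'classify_5tuple',
#  ('branch', ['drop_blacklisted'], ['match_signatures', 'raise_alert'])]
```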
-
Muhammad Shahbaz (Princeton University), Sean Choi (Stanford University), Ben Pfaff (VMware), Changhoon Kim (Barefoot Networks), Nick Feamster (Princeton University), Nick McKeown (Stanford University), Jennifer Rexford (Princeton University)
Abstract: Hypervisors use software switches to steer packets to and from virtual machines (VMs). These switches frequently need upgrading and customization—to support new protocol headers or encapsulations for tunneling and overlays, to improve measurement and debugging features, and even to add middlebox-like functions. Software switches are typically based on a large body of code, including kernel code, and changing the switch is a formidable undertaking requiring domain mastery of network protocol design and developing, testing, and maintaining a large, complex codebase. Changing how a software switch forwards packets should not require intimate knowledge of its implementation. Instead, it should be possible to specify how packets are processed and forwarded in a high-level domain-specific language (DSL) such as P4, and compiled to run on a software switch. We present PISCES, a software switch derived from Open vSwitch (OVS), a hard-wired hypervisor switch, whose behavior is customized using P4. PISCES is not hard-wired to specific protocols; this independence makes it easy to add new features. We also show how the compiler can analyze the high-level specification to optimize forwarding performance. Our evaluation shows that PISCES performs comparably to OVS and that PISCES programs are about 40 times shorter than equivalent changes to OVS source code.
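Staying in Python rather than P4, the sketch below captures the paper's central move: packet processing is stated declaratively and then compiled into the switch's forwarding path, so changing protocols means changing the spec, not the switch internals. The header spec and table format are invented miniatures, not PISCES's real interfaces.

```python
# Miniature "spec-compiled" switch: declarative fields + a match-action table.

# Declarative header spec: (field name, byte offset, length in bytes).
HEADER_SPEC = [("dst_mac", 0, 6), ("src_mac", 6, 6), ("ethertype", 12, 2)]

def compile_parser(spec):
    """'Compile' the field spec into a packet-bytes -> field-dict parser."""
    def parse(pkt: bytes):
        return {name: pkt[off:off + ln] for name, off, ln in spec}
    return parse

def compile_pipeline(spec, table):
    parse = compile_parser(spec)
    def process(pkt):
        hdrs = parse(pkt)
        return table.get(hdrs["dst_mac"], "flood")    # match-action lookup
    return process

process = compile_pipeline(HEADER_SPEC, {b"\xaa\xbb\xcc\xdd\xee\xff": "port1"})
pkt = b"\xaa\xbb\xcc\xdd\xee\xff" + b"\x11" * 6 + b"\x08\x00" + b"payload"
print(process(pkt))   # -> 'port1'
```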
-
László Molnár (Ericsson Research, Hungary), Gergely Pongrácz (Ericsson Research, Hungary), Gábor Enyedi (Ericsson Research, Hungary), Zoltán Kis (Ericsson Research, Hungary), Levente Csikor (Budapest University of Technology and Economics), Ferenc Juhász (Budapest University of Technology and Economics), Attila Körösi (Budapest University of Technology and Economics), Gábor Rétvári (Budapest University of Technology and Economics)
Abstract: OpenFlow is an amazingly expressive dataplane programming language, but this expressiveness comes at a severe performance price as switches must do excessive packet classification in the fast path. The prevalent OpenFlow software switch architecture is therefore built on flow caching, but this imposes intricate limitations on the workloads that can be supported efficiently and may even open the door to malicious cache overflow attacks. In this paper we argue that instead of enforcing the same universal flow-cache semantics on all OpenFlow applications and optimizing for the common case, a switch should rather automatically specialize its dataplane piecemeal with respect to the configured workload. We introduce ESwitch, a novel switch architecture that uses on-the-fly template-based code generation to compile any OpenFlow pipeline into efficient machine code, which can then be readily used as the fast path. We present a proof-of-concept prototype and we demonstrate on illustrative use cases that ESwitch yields a simpler architecture, superior packet processing speed, improved latency and CPU scalability, and predictable performance. Our prototype can easily scale beyond 100 Gbps on a single Intel blade even with complex OpenFlow pipelines.
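Template-based specialization can be mimicked in a few lines of Python: rather than interpreting a generic rule set on every packet, generate source code hard-wired to the currently installed flow table and compile it once. The rule format below is a toy stand-in, and ESwitch emits native machine code rather than exec()'d Python.

```python
# Toy dataplane specialization via code generation (illustration only).

def specialize(flow_table):
    """Generate and compile a lookup function hard-wired to this flow table."""
    lines = ["def fast_path(pkt):"]
    for match, action in flow_table:
        cond = " and ".join(f"pkt[{k!r}] == {v!r}" for k, v in match.items())
        lines.append(f"    if {cond}: return {action!r}")
    lines.append("    return 'send_to_slow_path'")
    namespace = {}
    exec("\n".join(lines), namespace)    # straight-line code, no generic lookup
    return namespace["fast_path"]

fast_path = specialize([
    ({"eth_type": 0x0800, "ip_dst": "10.0.0.1"}, "output:1"),
    ({"eth_type": 0x0806}, "output:controller"),
])
print(fast_path({"eth_type": 0x0800, "ip_dst": "10.0.0.1"}))  # -> 'output:1'
print(fast_path({"eth_type": 0x86DD, "ip_dst": "::1"}))       # -> slow path
```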
-
5:00pm - 5:50pm Best of CCR
Session Chair: Srinivasan Keshav (University of Waterloo)
Room: Diamante
- Best of CCR
-
H. Metwalley (Politecnico di Torino), S. Traverso (Politecnico di Torino), M. Mellia (Politecnico di Torino), S. Miskovic (Symantec Corp.), M. Baldi (Politecnico di Torino)
Abstract: Individuals lack proper means to supervise the services they contact and the information they exchange when surfing the web. This security task has become challenging due to the complexity of the modern web and of its data-delivery technology, and even to the adoption of encryption, which, while improving privacy, makes in-network services ineffective. The implications are serious, from a person contacting undesired services or unwillingly exposing private information, to a company being unable to control the flow of its information to the outside world. To empower transparency and the capability of making informed choices on the web, we propose CROWDSURF, a system for comprehensive and collaborative auditing of data exchanged with Internet services. Similarly to crowdsourced efforts, we enable users to contribute to building awareness, supported by the semi-automatic analysis of data offered by a cloud-based system. The result is the creation of "suggestions" that individuals can transform into enforceable "rules" to customize their web browsing policy. CROWDSURF provides the core infrastructure to let individuals and enterprises regain visibility and control over their web activity. Preliminary results obtained by executing a prototype implementation demonstrate the feasibility and potential of CROWDSURF.
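A minimal sketch of the suggestions-to-rules flow described above, with all fields invented for illustration: crowd-derived suggestions are reviewed by the user, promoted into rules, and applied to outgoing requests.

```python
# Toy suggestions -> rules -> enforcement loop (all fields invented).
from urllib.parse import urlparse

suggestions = [
    {"domain": "tracker.example", "reason": "known tracker", "action": "block"},
    {"domain": "cdn.example", "reason": "third-party CDN", "action": "allow"},
]

# The user reviews the crowd's suggestions and promotes some into rules.
rules = {s["domain"]: s["action"] for s in suggestions if s["action"] == "block"}

def check_request(url):
    """Apply the user's rules to an outgoing request."""
    host = urlparse(url).hostname or ""
    for domain, action in rules.items():
        if host == domain or host.endswith("." + domain):
            return action
    return "allow"

print(check_request("https://ads.tracker.example/pixel.gif"))  # -> 'block'
print(check_request("https://news.example/article"))           # -> 'allow'
```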
-
Carsten Orwat (Karlsruhe Institute of Technology), Roland Bless (Karlsruhe Institute of Technology)
Abstract: Many technical systems of the Information and Communication Technology (ICT) sector enable, structure, and/or constrain social interactions. Thereby, they influence or implement certain values, including human rights, and affect or raise conflicts among values. The ongoing developments toward an "Internet of everything" are likely to lead to further value conflicts. This trend illustrates that a better understanding of the relationships between social values and networks is urgently needed, because it is largely unknown what values lie behind protocols, design principles, or technical and organizational options of the Internet. This paper focuses on the complex steps of realizing human rights in Internet architectures and protocols as well as in Internet-based products and services. Besides direct implementation of values in Internet protocols, there are several other options that can indirectly contribute to realizing human rights via political processes and market choices. Eventually, a better understanding of what values can be realized by networks in general, what technical measures may affect certain values, and where complementary institutional developments are needed may lead toward a methodology for considering technical and institutional systems together.
-
5:50pm - 6:20pm Closing
Room: Diamante
- Closing