Half-Day Tutorial: Network Verification
Monday 17th August, Afternoon Session
Presenters
George Varghese, Microsoft Research
Nikolaj Bjorner, Microsoft Research
Tutorial location
The tutorial will take place in Sherfield Building, room SALC10. For directions inside Imperial College check the campus map (building number 20).
Tutorial timetable
12:30-14:00 Buffet lunch
14:00 Afternoon tutorial starts
15:30-16:00 Coffee break
17:30 Tutorial ends
Summary
This half-day, classroom-format tutorial collects the recent surge of work on network verification from research groups at Cornell (Foster et al.), Princeton (Walker et al.), UIUC (Godfrey et al.), Stanford (McKeown et al.), Berkeley (Shenker et al.), UT Austin (Lam et al.), Purdue (Rao et al.), the Naval Postgraduate School (Xie et al.), and many other groups including MSR, and presents these results coherently through the lens of 30 years of program verification (model checking, proof assistants, SAT solvers, and automatic test generation tools such as DART and KLEE). Beyond specific tools, we wish to survey some of the major ideas in verification, such as symbolic model checking, abstraction and partial order methods for reducing state space explosion, and white- and black-box test generation. We will also present specific bugs found in production networks.
We hope networking students will learn about verification methods, and also about what makes the networking domain different from standard verification settings, forcing old ideas to be rethought. We also hope they will be exposed to new research directions and methods.
Network Verification: On the surface, networks, with their collections of routers, network interface cards, and cabling, do not resemble code. But one can view the entire network as a "program" that takes packets from the input edges of the network and, after possibly rewriting them, outputs packets at the output edges. The forwarding table at a router can, for instance, be viewed as a large switch statement over destination addresses. One can also view a network as a circuit through an EDA (Electronic Design Automation) lens [1]. If design rule checking is analogous to static checking, what is the analog of synthesis?
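To make the analogy concrete, here is a minimal sketch of a forwarding table acting as a switch statement over destination addresses, using longest-prefix match. The prefixes, port names, and helper function are our own hypothetical illustration.

```python
# A minimal sketch of the "forwarding table as a switch statement" analogy.
# All prefixes and port names below are hypothetical.
import ipaddress

# Hypothetical forwarding table: destination prefix -> output port.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "port1",
    ipaddress.ip_network("10.1.0.0/16"): "port2",
    ipaddress.ip_network("0.0.0.0/0"): "port3",  # default route
}

def forward(dst: str) -> str:
    """Longest-prefix match: the router's 'switch statement' over addresses."""
    addr = ipaddress.ip_address(dst)
    matches = [p for p in FORWARDING_TABLE if addr in p]
    best = max(matches, key=lambda p: p.prefixlen)  # most specific prefix wins
    return FORWARDING_TABLE[best]

print(forward("10.1.2.3"))  # -> port2 (more specific than 10.0.0.0/8)
```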
These analogies have led networking researchers to frame a new research agenda, made compelling by the ubiquity of cloud services, called network verification. They ask: what are the equivalents of compilers/synthesis tools, debuggers, and static checkers for networks? Early work has already produced static reachability checkers of impressive speed and scale [4, 5], algorithms to synthesize network firewalls from policy specifications [6], debuggers [7, 8], and automatic network testers [2]. While networking research has a long history of producing tools, such as those for network tomography, tools inspired by EDA and software were rarer (see [9] for an early example).
Motivation
In a world of services accessed from tablets and mobile devices, network failures are a debilitating source of operational expenditures. Gartner estimates the public cloud market at $139B, and TBR estimates the private cloud market at $10B. Companies such as Microsoft, which traditionally sold software on disks, are transitioning to Xbox and Office Online. McKeown [1] suggests the potential market opportunity for network verification by noting that in both the software and chip worlds, a $10B tool industry supports a $250B (software) and a $300B (chip) industry. While money is no guarantee of research importance, it suggests this is a problem industry cares about. A survey conducted on the NANOG (North American Network Operators' Group) mailing list revealed that 35% of respondents had at least 25 tickets per month that took more than an hour to resolve [2].
At the same time, the router market is transitioning to software-defined networking (SDN). Merchant silicon vendors such as Broadcom provide cheap ASICs that can be assembled into inexpensive routers. A language called P4 [3] promises to go beyond the first generation of SDN routers and allow even the forwarding process of routers to be redefined at run time. While SDNs enable simple, efficient routers that can be tailor-made to an organization's needs, they also amplify the opportunity for errors.
Topics Covered
Introduction to Verification: We will cover basic topics in verification such as model checking, symbolic model checking, SAT solvers, symmetry reduction, and abstraction.
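As a taste of how SAT/SMT solvers can answer network questions, here is a minimal sketch using the Z3 Python bindings (the z3-solver package), which asks whether any destination address is permitted by one hypothetical prefix rule but not by another. The specific prefixes and variable names are our own illustration, not drawn from any tool discussed above.

```python
# A minimal sketch: use an SMT solver to find a header that one rule permits
# but another drops. The two rules below are hypothetical.
from z3 import BitVec, Solver, Not, sat

dst = BitVec("dst_ip", 32)  # symbolic 32-bit destination address

def in_prefix(addr, prefix, plen):
    """Constraint: addr falls inside prefix/plen."""
    mask = ((1 << plen) - 1) << (32 - plen)
    return (addr & mask) == (prefix & mask)

rule_a = in_prefix(dst, 0x0A000000, 8)    # intended policy: permit 10.0.0.0/8
rule_b = in_prefix(dst, 0x0A010000, 16)   # deployed ACL: permits 10.1.0.0/16

s = Solver()
s.add(rule_a, Not(rule_b))  # is there a packet A allows but B does not?
if s.check() == sat:
    print("counterexample header:", s.model()[dst])
```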
Static Checkers: We will cover the range of work on static checking, starting with the pioneering work of Xie on IP reachability and continuing through Anteater, HSA, VeriFlow, and NetPlumber. We will also cover some of Simon Lam's work on predicate abstraction, as well as recent work by Xie and Sanjay Rao at Purdue.
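To give a flavor of static reachability checking in the spirit of these tools, here is a minimal sketch that models each hop as a transfer function on sets of headers and composes the functions along a path. The toy 4-bit header space and the per-hop functions are hypothetical; real tools operate on huge header spaces using compact symbolic representations.

```python
# A minimal sketch of transfer-function-based reachability checking.
# All names and the toy header space are hypothetical.

ALL_HEADERS = set(range(16))  # toy 4-bit header space

def router_a(headers):
    return {h for h in headers if h % 2 == 0}  # ACL: drop odd headers

def router_b(headers):
    return {h | 1 for h in headers}            # rewrite: set the low bit

def reachable(path, headers=ALL_HEADERS):
    """Compose transfer functions along a path; empty set means unreachable."""
    for hop in path:
        headers = hop(headers)
    return headers

print(reachable([router_a, router_b]))  # headers that survive A then B
```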
Semantics: Assigning programs mathematical meaning enables mathematical proofs of correctness and can suggest optimizations in a principled way. For example, work at Cornell and Princeton on NetKAT [19] provides a denotational and axiomatic semantics for reasoning about networks. Are there analogs of symmetry reduction from model checking [20]? We will try to introduce networking students to semantics.
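A minimal sketch of what "policies denote functions" means, simplified to single packets (full NetKAT [19] denotes functions on packet histories and comes with a sound and complete equational theory): filters, modifications, and sequential and parallel composition all become operations on packet-to-packet-set functions. The field names and example policy here are hypothetical.

```python
# A minimal NetKAT-style denotational sketch: a policy maps a packet to a set
# of packets. Simplified from [19]; all field names are hypothetical.
from typing import Callable, Dict

Packet = Dict[str, int]
Policy = Callable[[Packet], frozenset]

def freeze(pkt: Packet):
    return frozenset(pkt.items())  # hashable packet representation

def test(field, value) -> Policy:
    """Filter: pass the packet through iff pkt[field] == value."""
    return lambda pkt: frozenset([freeze(pkt)]) if pkt.get(field) == value else frozenset()

def modify(field, value) -> Policy:
    """Modification: rewrite one header field."""
    return lambda pkt: frozenset([freeze({**pkt, field: value})])

def seq(p: Policy, q: Policy) -> Policy:
    """Sequential composition: run q on every output of p."""
    return lambda pkt: frozenset(out for mid in p(pkt) for out in q(dict(mid)))

def union(p: Policy, q: Policy) -> Policy:
    """Parallel composition: union of both behaviors."""
    return lambda pkt: p(pkt) | q(pkt)

# Example: if dst == 1, forward out port 2.
policy = seq(test("dst", 1), modify("port", 2))
print(policy({"dst": 1, "port": 0}))  # one output packet with port rewritten
```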
Language Design and Synthesis: Designing either software or chips today is inconceivable without languages such as C++ or Verilog. For networks, there is a need for languages at two very different layers of abstraction. At the router level, there are already languages for reprogramming routers, such as OpenFlow [24] and P4 [3]. However, they operate at a very low level of abstraction, akin to machine code, without support for modularity, types, or other higher-level features. We need a richer set of language features while retaining the ability to implement the language specification at terabit speeds.
On the other hand, there is a need for a higher-level language to express network policy, such as access control and quality of service, at an explicit, declarative level. One way to do this is to regard the network as "one big switch" [6]. Managers specify connectivity among "switch" endpoints, and this "policy" is compiled into forwarding rules at routers while respecting resource constraints. While this is a very good start, in practice our networks are too large to specify every pair of addresses that can communicate. Taking a programming language viewpoint, we can ask what the equivalents of types are (e.g., customer VMs, router controllers, etc.).
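A minimal, heavily simplified sketch of the compilation idea: an endpoint-level connectivity policy is turned into per-router rules along shortest paths. The topology, policy format, and rule format below are hypothetical; the real problem studied in [6] must also respect per-router rule-table resource constraints.

```python
# A minimal sketch of "one big switch" policy compilation. All names are
# hypothetical.
from collections import deque

TOPO = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}   # router adjacency
ATTACHED = {"h1": "A", "h2": "C"}                  # host -> edge router
POLICY = [("h1", "h2")]                            # allowed host pairs

def shortest_path(src, dst):
    """Plain BFS over the router graph."""
    q, seen = deque([[src]]), {src}
    while q:
        path = q.popleft()
        if path[-1] == dst:
            return path
        for nxt in TOPO[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                q.append(path + [nxt])

def compile_policy(policy):
    """Emit one (router, match, action) rule per hop of each allowed pair."""
    rules = []
    for src, dst in policy:
        path = shortest_path(ATTACHED[src], ATTACHED[dst])
        for router, nxt in zip(path, path[1:]):
            rules.append((router, f"dst=={dst}", f"fwd->{nxt}"))
        rules.append((path[-1], f"dst=={dst}", "deliver"))
    return rules

for rule in compile_policy(POLICY):
    print(rule)
```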
Dynamic Verification: Inspired by software testing, where inputs are selected to maximize code coverage, ATPG [2] selects packets as inputs to maximize coverage of every link, queue, or forwarding rule. Unlike typical software testing, ATPG has also been used for performance testing and for isolating links or queues that cause end-to-end performance issues. While such testing is adequate for smaller networks, analogs of white- or black-box fuzz testing [23] may be needed for larger networks.
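At its core, selecting a small set of test packets that exercises every rule resembles set cover. Here is a minimal greedy sketch; the candidate packets and the rule sets they cover are hypothetical (ATPG derives them from an all-pairs reachability computation).

```python
# A minimal sketch of ATPG-style test selection as greedy set cover.
# Candidate packets and covered rules are hypothetical.

CANDIDATES = {            # test packet -> set of rules its path exercises
    "pkt1": {"r1", "r2"},
    "pkt2": {"r2", "r3"},
    "pkt3": {"r1", "r3", "r4"},
}

def select_tests(candidates):
    uncovered = set().union(*candidates.values())
    chosen = []
    while uncovered:
        # Greedy choice: the packet covering the most still-uncovered rules.
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen

print(select_tests(CANDIDATES))  # e.g. ['pkt3', 'pkt1'] covers r1..r4
```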
Future directions: Many intriguing directions are suggested by existing work in programming languages. For example, are there network equivalents of work on discovering specifications [15, 16]? If most customer VMs cannot access a service but a few can, the inconsistency may indicate a possible bug, without a priori knowledge of which services a VM can access. Second, we must move beyond modeling the data plane to modeling the routing process [17] that computes forwarding entries, to answer questions such as "How is link load affected by single link failures?" Third, while current tools verify a model of a router's code, a recent paper broke new ground by modeling the actual code of a Click software router [18]. Can this be done for router hardware specified in Verilog? Finally, there has been recent work on data center networks that provide deadlines; despite much prior work in embedded systems, the challenge is to make real-time verification scale to large header spaces.
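A minimal sketch of the first direction, in the spirit of "bugs as deviant behavior" [15]: flag VMs whose observed reachability disagrees with a strong majority, without any a priori specification. The access data and threshold below are hypothetical.

```python
# A minimal sketch of majority-based inconsistency detection. The observed
# access matrix is hypothetical.
from collections import Counter

ACCESS = {                     # vm -> observed reachability to "serviceX"
    "vm1": False, "vm2": False, "vm3": False, "vm4": True,
}

def deviants(access, threshold=0.75):
    """Flag VMs that disagree with a strong majority behavior."""
    counts = Counter(access.values())
    majority, votes = counts.most_common(1)[0]
    if votes / len(access) < threshold:
        return []              # no strong consensus, nothing to infer
    return [vm for vm, ok in access.items() if ok != majority]

print(deviants(ACCESS))  # ['vm4'] -- can reach the service when peers cannot
```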
Biographies
The course will be co-taught by George Varghese (networking) and Nikolaj Bjorner, who co-wrote the widely used Z3 verification tool and is currently working on checking tools for Azure networks. The planned structure is a class incorporating several demos.
References
[1] N. McKeown. Mind the Gap. SIGCOMM keynote, 2014.
[2] H. Zeng et al. Automatic Test Packet Generation. CoNEXT, 2012.
[3] P. Bosshart et al. P4: Programming Protocol-Independent Packet Processors. Computer Communication Review, 44(3):87-95, 2014.
[4] A. Khurshid, X. Zou, W. Zhou, M. Caesar, and P. B. Godfrey. VeriFlow: Verifying Network-Wide Invariants in Real Time. NSDI, 2013.
[5] P. Kazemian, M. Chang, H. Zeng, G. Varghese, N. McKeown, and S. Whyte. Real Time Network Policy Checking Using Header Space Analysis. NSDI, 2013.
[6] N. Kang, Z. Liu, J. Rexford, and D. Walker. Optimizing the "One Big Switch" Abstraction in Software-Defined Networks. CoNEXT, 2013.
[7] R. Fonseca, G. Porter, R. Katz, S. Shenker, and I. Stoica. X-Trace: A Pervasive Network Tracing Framework. NSDI, 2007.
[8] N. Handigol, B. Heller, V. Jeyakumar, D. Mazieres, and N. McKeown. I Know What Your Packet Did Last Hop: Using Packet Histories to Troubleshoot Networks. NSDI, 2014.
[9] N. Feamster and H. Balakrishnan. Detecting BGP Configuration Faults with Static Analysis. NSDI, 2005.
[10] P. Kazemian, G. Varghese, and N. McKeown. Header Space Analysis: Static Checking for Networks. NSDI, 2012.
[11] H. Mai, A. Khurshid, R. Agarwal, M. Caesar, P. B. Godfrey, and S. T. King. Debugging the Data Plane with Anteater. SIGCOMM, 2011.
[12] J. R. Burch, E. M. Clarke, et al. Symbolic Model Checking: 10^20 States and Beyond. LICS, 1990.
[13] R. E. Bryant. Graph-Based Algorithms for Boolean Function Manipulation. IEEE Transactions on Computers, C-35(8):677-691, 1986.
[14] N. Lopes et al. Checking Beliefs in Dynamic Networks. NSDI, 2015.
[15] D. Engler, D. Chen, and A. Chou. Bugs as Deviant Behavior. SOSP, 2001.
[16] W. Li, A. Forin, and S. Seshia. Scalable Specification Mining for Verification and Diagnosis. 47th Design Automation Conference (DAC), 2010.
[17] A. Fogel et al. A General Approach to Network Configuration Analysis. NSDI, 2015.
[18] M. Dobrescu and K. Argyraki. Software Dataplane Verification. NSDI, 2014.
[19] C. Anderson, N. Foster, A. Guha, J. Jeannin, D. Kozen, and D. Walker. NetKAT: Semantic Foundations for Networks. POPL, 2014.
[20] E. Clarke, A. Emerson, S. Jha, and A. Sistla. Symmetry Reductions in Model Checking. CAV, 1998.
[21] M. Budiu. A Brief Critique of P4. Unpublished note.
[22] L. Yan, L. Jose, G. Varghese, and N. McKeown. Compiling Packet Programs to Reconfigurable Switches. NSDI, 2015.
[23] P. Godefroid, M. Levin, and D. Molnar. SAGE: Whitebox Fuzzing for Security Testing. CACM, 2012.
[24] OpenFlow Switch Specification, Version 1.4.0. https://www.opennetworking.org