Session 6: Routing and Forwarding (Chair: Jia Wang, AT&T Research) Scribe: fabienne.anhalt@ens-lyon.fr -------------------------------------------------------------------------------------- Stable and Flexible iBGP Ashley Flavel (University of Adelaide), Matthew Roughan (University of Adelaide) Presenter: Ashley Flavel (University of Adelaide) Problem: BGP can oscillate (it does not converge) within a single ISP (iBGP). This makes routing becoming unpredictable. Solution: To avoid oscillation, one BGP decision step is added prior to checking the MED attribute whithout any changes in the information. Why is this new ? The problem is well know, previous solutions were - to ingore MED - the design of the networks could be limited to stable designs, but this restricts routing flexibility. Instead of modifying the design of the network, the approach of this work is to create simple routing protocol building blocks and combine them. This shifts the responsibility from the design to the protocol. BGP basics: it distributes routes beween ASes (eBGP) and inside ASes (iBGP), and propagates the routes to the neighbors until they are stable. It sends the information on only one hop interiorly. To disbribute external routes to internal routers, creating a hierarchy (route-reflection) limits the iBGP sessions. In this hierarchy, route reflectors are considered as parents, clients as children. Multiple layers are possible. How oscillation occurs: - Signalling iBGP network is different from the underlying topology. - Only best routes are revealed, so the diversity of learned routes is limited. Existing solutions how to fix it: - Check all cases for stability. - Detect and fix oscillations, which takes time. - Design the network for stability, which reduces flexiblity (the networks change over time). The proposed solution: fix the protocol instead of changing the network design. This requires only small changes. - Use of routing algebra to describe route-reflection iBGP topology, which consists of lables, signatures, weights, and functions that maps signatures into weights. - Routes are constructed from link labels and signatures. Invalid results are note considered. - The criteria which guarantees convergence is strict monotonicity: the preference of route strictly decreases when it is propagated. - To guarantee strict monotonicity, routes with the fewest iBGP hops are chosen. - For stability, the iBGP hop step is included before the MED step. Similar concepts show up in router virtualization where the control plane is virtualized. Q: About the flexibility ? A: Yes, you could use weights for more flexibility Q: Does inserting the hops decision before the iBGP decision affect on the ability to find the shortest exit point ? A: You can pick a longer iBGP route. Stability is the first concern, length is not so important if the network is not slow. It is an optional step: if you don't want to use it, don't. -------------------------------------------------------------------------------------- LIPSIN: Line speed Publish/Subscribe Inter-Networking Petri Jokela (Ericsson), Andras Zahemszky (Ericsson), Somaya Arianfar (Ericsson), Pekka Nikander (Ericsson), Christian Esteve (University of Campinas) Presenter: Petri Jokela (Ericsson) Excuse me, Sir, but can we deliver packets without ip address ? - Yes we can. Context: Content deliveries. - Data is considered as a first class citizen (without caring about the host it comes from). - To avoid DDoS problems, data is delivered without end-to-end adressing and only when it is explicitly requested. - When we publish data, multicast is used to disseminate it. - Consumers and producers are timely separated, so data cache is used in the network. Publisher/Subscriber based network architecture: - A publisher creates an ID for the topic. - When the subscriber comes with the same topic ID, the system finds out where the data is. - The topology is not touched. - The publisher can deliver data to the subscriber. Problems: - In topology based networks, routing can be based on topic IDs, but there are too much states in the routers which leads to a scalability problem. - In IP networks, a source routing system is used. But if instead of IPs, for example node IDs are used in a sequence for the routing, this sequence can be huge, and packets become large. They can be compressed into a Bloom filter and the path is not seen anymore. Proposed solution: - Not nodes, but links are identified by IDs (a chain of bits), one for each direction (A->B, B->A). - Trees are created with the link ID's connection information and a source routing approach is used to forward packets. - All the link ID's of a tree are incoded into a Bloom filter placed in the packet header (zFilter) (A->B, B->C : A->B->C). - Hashing is skipped, not needed here. - Multicast is supported - Forwarding decisions are made as follows: The zFilter is compared to the outgoing interface links. If the link ID is ANDed with the zFilter, the packet is forwarded along the link. Another problem is that false positives in Bloom filters can lead to wrong forwarding decisions. Solution: - Several LIT (Link Identity Tags) are added to each link ID. This makes the results better because the topology manager calculates different candidate zFilters and can select the one which performs best according to different criterias. Simulation results: An ns3 simulation is done on intra-domain AS topologies from Rocketfuel and SNDlib data-sets. - With up to 23 subscribers, the forwarding efficiency is above 90%. - With an optimized number of LITs, there is more than 92% forwarding efficiency and a lot less additionnal traffic in the network. An other problem is if there are dense multicast trees or long paths. - The solution can be optimized using virtual trees. A virtual tree is given a single link ID configured to affect all nodes. Results make things a little bit better. Implementation: - FreeBSD7.x end-host with forwarding - NetFPGA-based forwarding node. available on http://www.psirp.org Summary: - It's a link ID based source-routing system - Extremly small forwarding tables on the routers - Very simple forwarding decisions with "AND" and verification operations - Can prevent unwanted traffic - Forwarding implemented previously on software, and with NetFPGA on hardware Q: The link ID is a routing metric. If you have n servers, do you need to create n^2 number of links? A: If you have 5 nodes you can see directly, you have 5 link IDs. Q: But everyone can see everyone (mesh) A: A single router is only concerned on which interface it has to use. Q: Links are physical ? A: Yes Q: The forwarding efficiency goes down when a certain number of subscribers go to a single object. What happens if you could have millions of subscribers? A: Virtual trees with single link ID cover a lot of subscribers. -------------------------------------------------------------------------------------- PLUG: Flexible lookup modules for rapid deployment of new protocols in high-speed routers Lorenzo De Carli (University of Wisconsin-Madison), Yi Pan (University of Wisconsin-Madison), Amit Kumar (University of Wisconsin-Madison), Cristian Estan (University of Wisconsin-Madison), Karthikeyan Sankaralingam (University of Wisconsin-Madison) Presenter: Amit Kumar (University of Wisconsin-Madison) Problem: Deployment of new protocols in high performance and low power routers. - Need for many lookup modules and expensive and long design cycles - Limited upgradability of physical equipments - The functionality is defined in hardware Can there be a single flexible and efficient lookup module ? Goal: Deploy new data plane protocols without changing the hardware The proposition: - A future line card with a PLUG (Pipelined Lookup Grid) chip - PLUG replaces the lookup modules at layer 2 and above Example of forwarding table implementation: - In Ethernet the lookup is in hashtable - IP lookup uses a tree These lookup modules implement different datastructures in hardware, with different sizes of memeory, and different processing elements. Similarities are that they use simple processing elements and that computation is moved to data structures. PLUG captures similarity in hardware and explores difference in software. Design: - Lookup objects are specified with dataflow graphs of data blocks and accessed with a single lookup up. - The dataflow graph consists of logical pages connected with each other. - Codeblocks read the pages and specify which is the next logical page. - Hash table variants are used for the lookup objects. - Keys can be split if they are too long for the data-block (e. g. Ethane). Architecture: - Scalable tiled architecture with simple processing elements and a simple on-chip network. - A couple of routers for communication and connect tiles together to form the PLUG chip. - Delays are fixed and instruction ordering is static. PLUG compilation: - The collection of log pages may not match with hardware size: logical pages are converted into smaller physical pages. - Small physical pages are mapped to physical tiles. Execution in hardware: - Lookup is transformed into a physical pipline on the chip. - Multiples of these pipelines can be activated. - There is static scheduling. - It is a memory dominated design, no memory conflicts will arise. PLUG is implemented using a C++ framework. Evaluations: - Parameters: 256kb SRAM, 16 tiles, 4MB of storage, 32 processing elements - Comparison to netlogic NLA9000 specific design Results: - Latency is the sum of the code-block latencies (between 10 and 82 cycles). - Memory: there is no overhead for hash table using protocols, for the others there is acceptable overhead. - The power consumption is under 2 Watts for all applications. Compared to NLA9000, it saves power, latency and throughput. Simple hardware provides the efficiency. Conclusion: PLUG is a flexible and programmable lookup module which enables the deployment of new functionnality The core ideas are an intuitive dataflow programming model and simple tiled hardware. Future work are optimizing the scheduler, the code generator and the compiler Q: How dependent is the framework on the network processor ? A: We add as much work as possible on the PLUG itself, it is very easy to take other processors. Q: If you use just regular Intel processors or change the core number ? A: We don't exepect any major changes. Q: Is there a bottleneck in the bandwidth ? A: It depends on how much throughput you want ? Q: How do you map a regluar data flow ? A: It is converted into logical pages.