Routing Stability and Convergence
An Experimental Study of Delayed Internet Routing Convergence
Craig Labovitz (Microsoft Research)
Abha Ahuja (University of Michigan)
Abhijit Bose (University of Michigan)
Farnam Jahanian (University of Michigan)
This paper examines the latency in Internet path failure, fail-over and repair due to the convergence properties of inter-domain routing. We show that, unlike switches in the public telephony network, which exhibit fail-over on the order of milliseconds, inter-domain routers in the packet-switched Internet may take several minutes to reach a consistent view of the network topology after a fault. This delay stems from the independent computation and route selection of the BGP path vector algorithm on each backbone router. During these periods of delayed convergence, end-to-end Internet paths will experience intermittent loss of connectivity, as well as increased packet loss and latency. We present a two-year study of Internet routing convergence based on the experimental instrumentation of key portions of the Internet infrastructure, including both passive data collection and fault-injection machines at major Internet exchange points. Based on data from the injection and measurement of several hundred thousand inter-domain routing faults, we describe several unexpected properties of convergence and show that the measured upper bound on Internet