Measured Impact of Crooked Traceroute

By: 
Matthew Luckie, Amogh Dhamdhere, kc claffy, and David Murrell
Appears in: 
CCR January 2011

Data collected using traceroute-based algorithms underpins research into the Internet’s router-level topology, though it is possible to infer false links from this data. One source of false inference is the combination of per-flow load-balancing, in which more than one path is active from a given source to destination, and classic traceroute, which varies the UDP destination port number or ICMP checksum of successive probe packets, which can cause per-flow load-balancers to treat successive packets as distinct flows and forward them along different paths. Consequently, successive probe packets can solicit responses from unconnected routers, leading to the inference of false links. This paper examines the inaccuracies induced from such false inferences, both on macroscopic and ISP topology mapping. We collected macroscopic topology data to 365k destinations, with techniques that both do and do not try to capture load balancing phenomena. We then use alias resolution techniques to infer if a measurement artifact of classic traceroute induces a false router-level link. This technique detected that 2.71% and 0.76% of the links in our UDP and ICMP graphs were falsely inferred due to the presence of load-balancing. We conclude that most per-flow load-balancing does not induce false links when macroscopic topology is inferred using classic traceroute. The effect of false links on ISP topology mapping is possibly much worse, because the degrees of a tier-1 ISP’s routers derived from classic traceroute were inflated by a median factor of 2.9 as compared to those inferred with Paris traceroute.
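The false-link mechanism described in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the authors' measurement method. It assumes a hypothetical diamond topology (routers L, A, B, C, D, E), a load balancer that hashes only on the UDP destination port using `port % 2`, one probe per hop, and a base port of 33434. Real load balancers hash on more header fields, and all names here are invented for illustration.

```python
# Toy model (all names hypothetical): load balancer L splits flows over two
# disjoint branches, L-A-C-E and L-B-D-E, which rejoin at router E.
PATHS = [["L", "A", "C", "E"], ["L", "B", "D", "E"]]
REAL_LINKS = {(p[i], p[i + 1]) for p in PATHS for i in range(len(p) - 1)}
BASE_PORT = 33434  # assumed starting UDP destination port


def responder(ttl, dst_port):
    """Return the router that answers a probe with this TTL and port.

    The per-flow load balancer is modelled as a hash on the destination
    port alone (port % 2), which is a deliberate simplification.
    """
    branch = dst_port % len(PATHS)
    return PATHS[branch][ttl - 1]


def trace(vary_port):
    """Send one probe per hop; return the list of responding routers.

    vary_port=True mimics classic traceroute (port changes per probe).
    vary_port=False mimics Paris traceroute (flow identifier held constant).
    """
    hops = []
    for ttl in range(1, len(PATHS[0]) + 1):
        port = BASE_PORT + (ttl - 1) if vary_port else BASE_PORT
        hops.append(responder(ttl, port))
    return hops


def false_links(hops):
    """Links inferred from consecutive hops that do not exist in the topology."""
    inferred = set(zip(hops, hops[1:]))
    return inferred - REAL_LINKS


classic = trace(vary_port=True)
paris = trace(vary_port=False)
print("classic:", classic, "false links:", false_links(classic))
print("paris:  ", paris, "false links:", false_links(paris))
```

In this sketch the classic-style trace alternates between branches and infers a link between B and C, two routers that are not connected, while the constant-port trace stays on a single branch and infers only real links.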

Public Review By: 
R. Teixeira

The research community has applied traceroute-style probing to measure Internet topologies for more than a decade with systems such as Skitter/Ark, Dimes, or Rocketfuel. These topologies are the basis of many other research efforts. Unfortunately, recent studies showed that classic traceroute can report false links when a router in the path performs load balancing. Although new probing techniques correct measurement artifacts under per-flow load balancing, we cannot correct topologies that have already been collected using classic traceroute and no prior work has studied how these errors affect inferred topologies. A natural question is then: how accurate are the topologies that we have all been using in our research?
This paper gives us a mixed answer. Measurement artifacts due to per-flow load balancing introduce only a few errors when traceroute is used to discover a macroscopic topology (i.e., an Internet-wide topology), but they introduce significant errors when discovering the topology of an ISP. Such a sharp difference in the fraction of false links between the macroscopic topology and the ISP topology suggests that the error depends on the set of vantage points and the networks traversed. This paper studies only one source of errors in inferred Internet topologies. As the authors point out: "the state of the art in Internet topology measurement is essentially and necessarily a set of hacks, which introduce many sources of possible errors". Hopefully, new studies will follow to understand the caveats of measured Internet topologies and to measure more accurate topologies. In the meantime, this paper confirms that we should be cautious when using inferred Internet topologies.