Spurious routes in public BGP data

M. Luckie
Appears in: CCR July 2014

Researchers depend on public BGP data to understand the structure and evolution of the AS topology, as well as the operational security and resiliency of BGP. BGP data is provided voluntarily by network operators who establish BGP sessions with route collectors that record this data. In this paper, we show how trivial it is for a single vantage point (VP) to introduce thousands of spurious routes into the collection by providing examples of five VPs that did so. We explore the impact these misbehaving VPs had on AS relationship inference, showing that they introduced thousands of AS links that did not exist and corrupted relationship inferences for links that did exist. We evaluate methods to automatically identify misbehaving VPs, although we find the result unsatisfying because the limitations of real-world BGP practices and AS relationship inference algorithms produce signatures similar to those created by misbehaving VPs. The most recent misbehaving VP we discovered added thousands of spurious routes for nine consecutive months until 8 November 2012. This misbehaving VP barely impacts (0.1%) the validation of our AS relationship inferences, but this number may be misleading, since most of our validation data comes from BGP and RPSL, which can validate only existing links rather than assert the non-existence of links. We have only a few assertions of non-existent routes, all received via our public-facing website, which allows operators to provide validation data through an interactive feedback mechanism. We discovered this misbehavior only because two independent operators corrected some inferences, and we noticed that the spurious routes all came from the same VP. This event highlights the limitations of even the best available topology data, and provides additional evidence that comprehensive ground truth validation from operators is essential to scientific research on Internet topology.
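
The observation that the spurious routes all came from the same VP suggests one simple check: count, for each VP, the AS links that no other VP reports. The Python sketch below illustrates that idea only and is not the paper's method. The input format (pairs of a VP name and an AS path), the function names, and the sample routes are assumptions made for illustration, and the ASNs are taken from the documentation and private-use ranges. As the abstract cautions, legitimate BGP practices can produce similar signatures, so a high count is a reason to look closer, not proof of misbehavior.

```python
from collections import defaultdict


def links_in_path(as_path):
    """Return the set of undirected AS links in one AS path."""
    links = set()
    for a, b in zip(as_path, as_path[1:]):
        if a != b:  # a repeated ASN is path prepending, not a link
            links.add((min(a, b), max(a, b)))
    return links


def exclusive_links_by_vp(routes):
    """Map each VP to the AS links that only that VP reported.

    `routes` is an iterable of (vp, as_path) pairs (hypothetical format).
    """
    seen_by = defaultdict(set)  # link -> set of VPs that reported it
    for vp, as_path in routes:
        for link in links_in_path(as_path):
            seen_by[link].add(vp)

    exclusive = defaultdict(set)  # vp -> links seen by no other VP
    for link, vps in seen_by.items():
        if len(vps) == 1:
            exclusive[next(iter(vps))].add(link)
    return dict(exclusive)


# Made-up sample data: vp3 reports several links that nobody else sees.
routes = [
    ("vp1", [64500, 64501, 64502]),
    ("vp2", [64510, 64501, 64502]),
    ("vp3", [64520, 64530, 64502]),
    ("vp3", [64520, 64531, 64501]),
]

exclusive = exclusive_links_by_vp(routes)
# Rank VPs by how many links only they report, largest first.
ranking = sorted(exclusive, key=lambda vp: len(exclusive[vp]), reverse=True)
```

In this sample, `ranking[0]` is `"vp3"`. In real data, a VP inside a poorly observed region of the topology would also rank high without misbehaving, which is the ambiguity the abstract describes.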

Public Review By: Renata Teixeira

Public BGP datasets provided by RouteViews and RIPE RIS feed research on Internet topology. This paper reveals the presence of spurious routes in these datasets. By spurious route, the author means a route that the origin AS (Autonomous System) never announced, or one that reports an AS path different from the signalling path the route took. In particular, this paper studies one AS that sent thousands of spurious routes to public route collectors because of a route optimizer inside this AS. The paper shows that the spurious routes introduced by this misbehaving AS added some false links to the AS topology (even though the fraction is modest compared to the total number of AS links) and affected the inference of AS relationships for other existing links. Although previous work has pointed out issues with public BGP data, this paper studies a new source of errors. The false links resulting from spurious routes are particularly hard to detect because all the data available for validation can only verify that a given link exists; it cannot confirm that an inferred link is false.

Reviewers had mixed opinions about this paper. They all agreed that it reveals a new type of error in public BGP data. Reviewers are concerned, however, that the impact of spurious routes on inferred topologies may be limited. They would have liked more analysis of how spurious routes affect previous work and more concrete recommendations for how future work should deal with spurious routes. They also asked for a more concrete definition of a spurious route and of how to determine that a route is spurious. The author addressed some, but not all, of the reviewers' concerns in the revision of the paper. In the end, we decided to accept the paper because the issues it reports are new, and researchers working on Internet topology and AS relationship inference should be aware of them.