Estimating internet address space usage through passive measurements

By: 
Alberto Dainotti, Karyn Benson, Alistair King, kc claffy, Michael Kallitsis, Eduard Glatz, Xenofontas Dimitropoulos
Appears in: 
CCR January 2014
One challenge in understanding the evolution of Internet infrastructure is the lack of systematic mechanisms for monitoring the extent to which allocated IP addresses are actually used. Address utilization has been monitored via actively scanning the entire IPv4 address space. We evaluate the potential to leverage passive network traffic measurements in addition to or instead of active probing. Passive traffic measurements introduce no network traffic overhead, do not rely on unfiltered responses to probing, and could potentially apply to IPv6 as well. We investigate two challenges in using passive traffic for address utilization inference: the limited visibility of a single observation point; and the presence of spoofed IP addresses in packets that can distort results by implying faked addresses are active. We propose a methodology for removing such spoofed traffic on both darknets and live networks, which yields results comparable to inferences made from active probing. Our preliminary analysis reveals a number of promising findings, including novel insight into the usage of the IPv4 address space that would expand with additional vantage points.
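As a rough illustration of the inference summarized above (addresses seen in passive traffic, minus those judged to be spoofed), the following minimal Python sketch may help. It is not the authors' implementation: the function names `block_of` and `used_blocks`, the /24 block granularity, and the idea that a spoofing filter has already produced a list of flagged addresses are all assumptions made for this example. The sample addresses come from documentation ranges.

```python
import ipaddress


def block_of(addr, prefix_len=24):
    """Return the enclosing IPv4 block of an address (assumed /24 granularity)."""
    return ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)


def used_blocks(observed, spoofed, prefix_len=24):
    """Blocks containing at least one observed address not flagged as spoofed.

    `spoofed` is assumed to be the output of some separate spoofing filter;
    how that filter works is outside the scope of this sketch.
    """
    spoofed = set(spoofed)
    return {block_of(a, prefix_len) for a in observed if a not in spoofed}


# Hypothetical input: addresses seen at one vantage point, one flagged as spoofed.
observed = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]
spoofed = ["203.0.113.9"]

blocks = used_blocks(observed, spoofed)
print(sorted(str(b) for b in blocks))
```

With this input, the two addresses in 192.0.2.0/24 collapse into one used block, and the block that was only seen through a flagged address is not counted. Combining several vantage points would amount to taking the union of the sets returned for each one.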
Public Review By: 
Renata Teixeira

This paper presents a novel approach for estimating the fraction of the IP address space that is actively used. The state of the art in this area, ISI's Census project, issues active probes to every address block in the IPv4 space. Active probing suffers from high probing overhead, and with the adoption of IPv6, any technique based solely on probing the entire address space may no longer work. The solution presented in this paper passively observes traffic to infer the fraction of the IPv4 address space that is used: the authors consider an address block used if it is sending or receiving traffic. Passive measurements introduce no probing overhead, and hence the technique can potentially scale to IPv6. The use of passive measurements, however, brings two challenges. First, a single vantage point cannot observe traffic from all active addresses. Second, spoofed addresses may cause the technique to infer that an address is active when it is not. The main contributions of this paper are: (i) an empirical demonstration that passive measurements do observe a large fraction of the used address space; and (ii) a technique to filter spoofed addresses.

All reviewers appreciated the well-thought-out approach presented in this paper. Although the estimation technique is simple (i.e., observed addresses minus spoofed ones), reviewers particularly liked the techniques to filter out spoofed addresses in two types of datasets: NetFlow traces and packet traces collected at darknets. Reviewers also acknowledged the validation and evaluation effort in the paper. Reviewers did give a number of suggestions to improve the presentation of the paper, both to clarify explanations and to get the ideas across more concisely. For example, the comparison with the ISI