Mark Allman

Measuring IPv6 adoption

By: 
Jakub Czyz, Mark Allman, Jing Zhang, Scott Iekel-Johnson, Eric Osterweil, Michael Bailey
Appears in: 
CCR August 2014

After several IPv4 address exhaustion milestones in the last three years, it is becoming apparent that the world is running out of IPv4 addresses, and the adoption of the next generation Internet protocol, IPv6, though nascent, is accelerating. In order to better understand this unique and disruptive transition, we explore twelve metrics using ten global-scale datasets to create the longest and broadest measurement of IPv6 adoption to date.

A middlebox-cooperative TCP for a non end-to-end internet

By: 
Ryan Craven, Robert Beverly, Mark Allman
Appears in: 
CCR August 2014

Understanding, measuring, and debugging IP networks, particularly across administrative domains, is challenging. One particularly daunting aspect of the challenge is the presence of transparent middleboxes—which are now common in today’s Internet. In-path middleboxes that modify packet headers are typically transparent to a TCP, yet can impact end-to-end performance or cause blackholes. We develop TCP HICCUPS to reveal packet header manipulation to both endpoints of a TCP connection.
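The abstract does not describe HICCUPS's actual wire encoding; as a rough, hedged illustration of the underlying idea only, the sketch below has one endpoint compute a digest over the header fields it sent so the peer can compare it against the fields it actually received, exposing in-path rewriting. The field names and the out-of-band digest comparison are assumptions for illustration, not the paper's mechanism.

```python
import hashlib

def header_digest(fields):
    """Digest over the header fields an endpoint cares about (illustrative subset)."""
    blob = "|".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return hashlib.sha256(blob.encode()).hexdigest()[:8]

# The sender records a digest over the header it transmitted ...
sent = {"src_port": 40123, "dst_port": 80, "wscale": 7, "mss": 1460}
digest_sent = header_digest(sent)

# ... and the receiver recomputes it over the header it actually saw.
received = dict(sent, mss=1380)            # e.g., a middlebox clamped the MSS
digest_received = header_digest(received)

if digest_sent != digest_received:
    print("in-path header manipulation detected")
```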

On modern DNS behavior and properties

By: 
Thomas Callahan, Mark Allman, Michael Rabinovich
Appears in: 
CCR July 2013

The Internet crucially depends on the Domain Name System (DNS) to both allow users to interact with the system in human-friendly terms and also increasingly as a way to direct traffic to the best content replicas at the instant the content is requested. This paper is an initial study into the behavior and properties of the modern DNS system. We passively monitor DNS and related traffic within a residential network in an effort to understand server behavior--as viewed through DNS responses--and client behavior--as viewed through both DNS requests and traffic that follows DNS responses.

Public Review By: 
Sharad Agarwal

Studies by online content providers including Amazon, Google, and Microsoft, and by network game researchers, have quantified the impact of network latency on user behavior. DNS is an important part of that latency -- both in contributing to initial connection setup latency and in picking a server that has low network distance and low load for the client to use. A number of measurement studies of DNS behavior on the Internet have been published in the past; this paper is a more recent one. The authors have studied 14 months of data from a 90-home neighborhood in the US, served by bi-directional 1 Gbps fiber links. This data includes 200 million DNS queries and 1.1 billion flows. There are a number of notable findings in this paper. 63% of hostnames were requested only once throughout the 14-month window. Google's public DNS resolver served only 1% of queries. 75% of hostnames mapped to only 1 IP address, and those tended not to be optimized for geographic locality to the client. Two-thirds of DNS transactions completed in under 1 ms, but 25% took between 10 ms and 1 s. 40% of DNS responses went unused, perhaps as a result of DNS prefetching. While the contribution of this paper is time-bounded until DNS behavior changes again, there is value to the community here. DNS researchers will find the results of interest, either in confirming that previously observed behavior is still happening or in seeing new behavior. Other researchers may find the data useful in building models for evaluation. However, as all the reviewers pointed out, the findings could be skewed by the small population of fiber-connected homes in the US. For instance, the paper finds heavy use of the Chrome web browser among these users, even though Chrome commands roughly 16% of the browser market. This can skew some numbers, such as DNS prefetching.

Findings and implications from data mining the IMC review process

By: 
Robert Beverly, Mark Allman
Appears in: 
CCR January 2013

The computer science research paper review process is largely human and time-intensive. More worrisome, review processes are frequently questioned, and often non-transparent. This work advocates applying computer science methods and tools to the computer science review process. As an initial exploration, we data mine the submissions, bids, reviews, and decisions from a recent top-tier computer networking conference. We empirically test several common hypotheses, including the existence of readability, citation, call-for-paper adherence, and topical bias.
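The abstract does not specify which readability measure the authors use; as one hedged illustration of the kind of readability test such a study might run, the sketch below scores text with the standard Flesch reading-ease formula and compares submissions by outcome. The metric choice, the data layout, and the example texts are assumptions, not the paper's method.

```python
import re

def flesch_reading_ease(text):
    """Flesch reading ease: higher scores indicate easier-to-read text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

# Hypothetical submission abstracts grouped by review outcome.
accepted = ["We measure the prevalence of middleboxes in the wild."]
rejected = ["In this work we propose and evaluate a novel overlay system."]

mean = lambda xs: sum(xs) / len(xs)
print("accepted mean score:", mean([flesch_reading_ease(t) for t in accepted]))
print("rejected mean score:", mean([flesch_reading_ease(t) for t in rejected]))
```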

Public Review By: 
Sharad Agarwal

The debate on how to improve the conference paper review process rages on. This highly competitive, manual, and lengthy process can have a big impact on the dissemination of new ideas and on author morale and careers. The goal of this paper is to encourage our community to analyze data on the review process, both during and after the review process, to help expose and/or correct biases (or lack thereof). This paper analyzes review data from ACM Internet Measurement Conference 2010. The authors find no bias with respect to readability or reviewer bidding scores. However, they find a topic bias and a citation bias; neither is surprising to me, and both are likely benign. We have to treat the findings with care. This paper uses only one conference's data. The cause of any bias (or lack of bias) has not been uncovered, though that is not a stated goal of the paper. The paper is far from comprehensive in exploring all possible biases. Individual analyses can be improved -- for example, a language sophistication measure is probably not the best fit for technical papers. I expect this paper will generate discussion in the ACM SIGCOMM community. I hope there will be follow-on work by TPC chairs of other conferences and workshops. At the very least, we can help novice authors better understand, with objective metrics, what the bar is for different venues. We can take solace in knowing that no immediate cause for alarm has been identified in this paper.

Comments on bufferbloat

By: 
Mark Allman
Appears in: 
CCR January 2013

While there has been much buzz in the community about the large depth of queues throughout the Internet—the so-called “bufferbloat” problem—there has been little empirical understanding of the scope of the phenomenon. Yet, the supposed problem is being used as input to engineering decisions about the evolution of protocols. While we know from wide-scale measurements that bufferbloat can happen, we have no empirically-based understanding of how often bufferbloat does happen. In this paper we use passive measurements to assess the bufferbloat phenomenon.

Public Review By: 
Nikolaos Laoutaris

The large buffers found in devices across the Internet – the so-called “bufferbloat” problem – have been discussed anecdotally for quite some time. Still, as of now we seem to lack any systematic study shedding light on the scope and frequency of the phenomenon. This paper presents a first such quantification attempt. The analysis is decomposed into two parts: evaluating the prevalence of the phenomenon and then its impact on TCP's initial window, based on traffic collected from an FTTH deployment attached to a university, as well as some laboratory data. The preliminary finding of this empirical study is that while bufferbloat exists, the magnitude of the phenomenon seems to be quite modest, with little or no evidence of large-scale persistent queues. The authors note that although their findings are not conclusive, they raise a useful question about the true extent of the bufferbloat problem. All reviewers agreed with that assessment. Initially bufferbloat was thought to be largely a residential broadband network problem (in the upload direction in particular). While the bufferbloat community later began to suspect it was “everywhere,” the most likely source of bufferbloat remained residential broadband networks (and in particular the upload link). The reviewers note that the “residential” network studied in the paper is not likely representative of such networks. They recommend further analysis based on “more typical” residential broadband networks. This is a provocative “myth-busting” type of paper that can generate interesting and useful discussions.
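As a concrete, hedged illustration of how passive traces can be used to look for bufferbloat (an assumed approach for illustration, not necessarily the paper's exact methodology): treat a connection's minimum observed RTT as a baseline and interpret sustained inflation above it as standing queue.

```python
def queueing_delay_estimates(rtt_samples_ms):
    """Estimate per-sample queueing delay as RTT inflation over the connection's minimum RTT."""
    base = min(rtt_samples_ms)               # proxy for propagation + transmission delay
    return [rtt - base for rtt in rtt_samples_ms]

# Hypothetical per-connection RTT samples from a passive trace (milliseconds).
samples = [22.0, 23.5, 24.1, 180.0, 310.0, 25.2]
delays = queueing_delay_estimates(samples)

THRESHOLD_MS = 100                           # assumed cutoff for "substantial" queueing
bloated = [d for d in delays if d > THRESHOLD_MS]
print(f"{len(bloated)}/{len(delays)} samples show more than {THRESHOLD_MS} ms of queueing")
```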

On building inexpensive network capabilities

By: 
Craig A. Shue, Andrew J. Kalafut, Mark Allman, Curtis R. Taylor
Appears in: 
CCR April 2012

There are many deployed approaches for blocking unwanted traffic, either once it reaches the recipient's network, or closer to its point of origin. One of these schemes is based on the notion of traffic carrying capabilities that grant access to a network and/or end host. However, leveraging capabilities results in added complexity and additional steps in the communication process: before communication starts, a remote host must be vetted and given a capability to use in the subsequent communication.

Public Review By: 
Stefan Saroiu

Leveraging capabilities in network architectures is a hot area of research today. A number of researchers have argued that capabilities could help improve network security (especially against DoS attacks) because an attacker would lack the ability to generate traffic unless it first acquires the appropriate capability. This paper puts forward an interesting insight -- we could try leveraging DNS as a capability system and configure servers to change their IP addresses frequently (perhaps by changing IP translations in the NAT box placed in front of the server). A host then needs to perform a DNS lookup before initiating a connection to the server. The paper does a nice job of describing how DNS could be used as a capability system. All reviewers acknowledged that the paper's observation (i.e., “Hey! Here’s how to turn DNS into a capability system”) is a really nice one. The paper is also well-written and thought-provoking, and thus a very nice new addition to a long line of previous papers on the theme of introducing new functionality by piggy-backing on existing networking systems. The reviewers' main concern was understanding the exact nature of the threats that such a system would prevent. The reviewers felt that many DoS attacks today rely on flooding the network (rather than on sending only a small number of packets) and that this system falls short of preventing such attacks. For example, even without the server's current IP address, an attacker could still flood the NAT box if it knew a previously valid server IP. While the reviewers' concerns are very specific – the paper's threat model is not clearly articulated – they get at a much deeper issue in this research area. The nature of the argument put forward appears to be recursive: on one hand, capabilities can stop DoS attacks on the network, but how do we stop DoS attacks on the capability system itself?
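To make the review's central insight concrete, here is a minimal sketch under assumed parameters (the address pool, validity window, and function names are all hypothetical, not taken from the paper): the DNS answer itself serves as the capability, and the gateway in front of the server forwards traffic only to translations that were recently handed out.

```python
import random
import time

POOL = [f"198.51.100.{i}" for i in range(2, 254)]   # hypothetical external address pool
VALID_WINDOW = 60                                    # seconds a handed-out address stays reachable
active = {}                                          # external address -> expiry timestamp

def dns_answer(hostname):
    """Answer a lookup with a short-TTL address that is currently valid at the gateway."""
    addr = random.choice(POOL)
    active[addr] = time.time() + VALID_WINDOW
    return addr, 30                                  # (address, TTL in seconds)

def gateway_forwards(dst_addr):
    """The NAT/gateway only forwards packets addressed to a recently issued translation."""
    return active.get(dst_addr, 0) > time.time()

addr, ttl = dns_answer("www.example.com")
assert gateway_forwards(addr)                        # client that performed the lookup gets through
assert not gateway_forwards("198.51.100.1")          # address never handed out via DNS is dropped
```

As the reviewers point out, a sketch like this still leaves open how the DNS and gateway components themselves are protected from flooding.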

On grappling with meta-information in the internet

By: 
Tom Callahan, Mark Allman, Michael Rabinovich, and Owen Bell
Appears in: 
CCR October 2011

The Internet has changed dramatically in recent years. In particular, fundamental changes have occurred in who generates most of the content, the variety of applications used, and the diverse ways ordinary users connect to the Internet. These factors have led to an explosion in the amount of user-specific meta-information that is required to access Internet content (e.g., email addresses, URLs, social graphs).

Public Review By: 
Stefan Saroiu

For most users, the Internet is increasingly becoming like a messy drawer. It is full of notes, lists, scraps of paper, old photos, new photos, tools, and so on, that users have accumulated over the years. Users have two choices. The first choice is to use a collection of item-specific organizers (i.e., content-specific applications) – such as an organizer for photos, an organizer for notes, and one for lists. The second choice is to hire a person (i.e., the “cloud”) – an external organizer who will clean up and keep track of everything. The first choice is difficult and the second requires delegating trust. Both are suboptimal. This paper tries to clean up the messy drawer. The authors put forward an architecture for dealing with meta-information – all the user-generated content that people hang on to. The system is a combination of a personal naming system (DNS) and a distributed file store. It provides unified personal naming, user-directed actions on receipt of communication, sharing of application state across devices, and sharing of application configuration across devices. The paper is quick to point out that many of these solutions have already been implemented as point solutions and that the main contribution is simply to show the power and extensibility of the architecture. Although the paper brings together a collection of well-known techniques, its main goal (as the authors themselves point out) is “to start a conversation and not close a door.” The reviewers themselves went back and forth on weighing the paper's motivation against its lack of technical novelty. In the end, this paper felt like a good fit for CCR because it does its job well – it starts a conversation around the need for organizing meta-information in the Internet.

On Building Special-Purpose Social Networks for Emergency Communication

By: 
Mark Allman
Appears in: 
CCR October 2010

In this paper we propose a system that will allow people to communicate their status with friends and family when they find themselves caught up in a large disaster (e.g., sending “I’m fine” in the immediate aftermath of an earthquake). Since communication between a disaster zone and the non-affected world is often highly constrained we design the system around lightweight triggers such that people can communicate status with only crude infrastructure (or even sneaker-nets).

Public Review By: 
S. Saroiu

This paper presents the design of a social network for emergency situations. In the case of a catastrophic event, each user can publish a small notification, which is then relayed to a small number of contacts. There are limits on how large a message can be and on how many contacts one can list in the system. The system's goal is to send at least one notification per hour on behalf of each user.
The system's design is relatively simple. The user registers their list of contacts with a server and receives in return a hard-to-guess ID. These contacts are then stored on a collection of tens to hundreds of servers that self-organize into a DHT. In case of emergency, a user publishes a message using their ID to the server, which in turn sends it to the user's contacts. The design also addresses the system's security needs, using CAPTCHAs for the user registration step and relying on the sparseness of the users' ID space to make it hard for spammers to impersonate a user.
The system’s functionality is constrained. Yet, it is precisely the minimalist design of the system combined with the originality of the scenario that make the paper such an interesting read. As a user, I feel I would use such a system if it were real, and as a system designer, I see little reason for this design not to work in practice.
The reviewers had little criticism to offer on the paper. The design does not take a stance on what communication technology people will use to sign up for the system or to relay the notification messages. While the paper uses e-mail for its “back-of-the-envelope” calculations, it also leaves open the possibility of using multiple channels, such as voice calls, SMS messages, or IM messages. The reviewers also wondered what constitutes an “emergency” and whether people should be allowed to use the system for personal emergencies, such as a car accident. Finally, the reviewers wondered whether the paper's reluctance to make use of today's social networks (e.g., Facebook, Twitter) is warranted given these systems' popularity and ease of use.
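A minimal sketch of the registration and notification flow described above, with assumed names and limits (the ID length, message cap, and delivery function are illustrative; a real deployment would spread contact state across the DHT of servers rather than a single dictionary):

```python
import secrets

users = {}   # user ID -> registered contact addresses (stand-in for the DHT-backed store)

def register(contacts, captcha_ok):
    """Register a contact list and return a hard-to-guess ID drawn from a sparse ID space."""
    if not captcha_ok:
        raise PermissionError("registration requires solving a CAPTCHA")
    user_id = secrets.token_hex(16)          # 128 bits: hard for spammers to guess or enumerate
    users[user_id] = list(contacts)
    return user_id

def publish(user_id, status):
    """Relay a small status notification to the user's registered contacts."""
    if user_id not in users:
        raise KeyError("unknown ID")
    status = status[:140]                    # assumed cap on message size
    for contact in users[user_id]:
        deliver(contact, status)             # e.g., via email, SMS, or IM

def deliver(contact, message):
    print(f"notify {contact}: {message}")

uid = register(["mom@example.com", "sis@example.com"], captcha_ok=True)
publish(uid, "I'm fine")
```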

Comments on Selecting Ephemeral Ports

By: 
Mark Allman
Appears in: 
CCR April 2009

Careless selection of the ephemeral port number portion of a transport protocol’s connection identifier has been shown to potentially degrade security by opening the connection up to injection attacks from “blind” or “off path” attackers—or, attackers that cannot directly observe the connection. This short paper empirically explores a number of algorithms for choosing the ephemeral port number that attempt to obscure the choice from such attackers and hence make mounting these blind attacks more difficult.
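The specific algorithms the paper evaluates are not listed in this summary; as one hedged example of the general approach, the sketch below offsets the port choice by a keyed hash of the connection tuple plus a counter, so the next port is hard for an off-path attacker to predict while consecutive connections still avoid collisions. The secret, port range, and single global counter are simplifying assumptions, not necessarily one of the paper's algorithms.

```python
import hashlib
import itertools

MIN_PORT, MAX_PORT = 1024, 65535
NUM_PORTS = MAX_PORT - MIN_PORT + 1
SECRET = b"per-host secret key"              # hypothetical per-host secret
_counter = itertools.count()                 # single global counter (a simplification)

def choose_ephemeral_port(src_ip, dst_ip, dst_port, in_use):
    """Pick an ephemeral port that an off-path attacker cannot easily guess."""
    tuple_bytes = f"{src_ip}|{dst_ip}|{dst_port}".encode()
    offset = int.from_bytes(hashlib.sha256(SECRET + tuple_bytes).digest()[:4], "big")
    for _ in range(NUM_PORTS):
        port = MIN_PORT + (offset + next(_counter)) % NUM_PORTS
        if port not in in_use:
            return port
    raise RuntimeError("no free ephemeral ports")

print(choose_ephemeral_port("192.0.2.10", "198.51.100.5", 80, in_use={1024, 2048}))
```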

Public Review By: 
Kevin Almeroth

The author describes an algorithm to select “ephemeral ports,” those ports on the client side of a transport session. Instead of using an easily predicted method, which has the disadvantage of being more susceptible to injection attacks, the author evaluates a set of different algorithms for port selection (the last of which is a newly proposed algorithm) and compares their performance in terms of how quickly they can establish connections without port number collisions.
Overall, the paper is quite good and brings awareness to a problem, and a corresponding set of solutions, of widespread relevance. The best part is that, once the author describes the problem, the evaluation is thorough and rigorous; in particular, the author does a good job of considering the impact of NATs.
The real challenge is whether the problem addressed by the paper requires any significant new creative contribution, or whether it is a simple matter of a straightforward problem with a straightforward solution. Further, given the paper’s discussion of cryptography as a way of protecting data within the transport session, how easy is it to inject false data? Is better ephemeral port selection really the best way to solve the problem?
