The proof is in the pudding. Never have truer words been spoken. If only research were a lavish meal that concluded with a pudding. But now, what is a researcher to do?
Sally Floyd wins 2007 SIGCOMM Award
CALL FOR NOMINATIONS: EDITOR-IN-CHIEF, ACM Computer Communications Review (CCR)
Modeling Internet growth is important both for understanding the current network and for predicting and improving its future. To date, Internet models have typically attempted to explain a subset of the following characteristics: network structure, traffic flow, geography, and economy. In this paper we present a discrete, agent-based model that integrates all of them. We show that the model generates networks with topologies, dynamics, and, more speculatively, spatial distributions that are similar to the Internet.
In the last few years, several studies have analyzed the performance of flooding and random walks as querying mechanisms for unstructured wireless sensor networks. However, most of this work is theoretical in nature and, while providing insights into the asymptotic behavior of these querying mechanisms, does not account for the non-idealities faced by the network in real deployments. In this paper, we propose a 3-way handshake protocol as a reliable implementation of a random walk and compare its performance with flooding in real environments. The metrics considered are delay, reliability and transmission cost. Our initial results suggest that flooding is better suited for low-interference environments, while random walks might be a better option in networks with high interference. We also present possible research directions to improve the performance of flooding and random walks.
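The trade-off the abstract describes can be made concrete with idealized cost models of the two querying mechanisms. The sketch below, a simplification that ignores interference and packet loss (precisely the non-idealities the paper measures), counts transmissions for a flood versus a random walk on an arbitrary topology; the graph and function names are illustrative, not the paper's protocol.

```python
import random

def flood_query(neighbors, start):
    # Idealized flooding: every reached node rebroadcasts exactly once,
    # so transmission cost equals the number of nodes reached.
    seen, frontier, cost = {start}, [start], 0
    while frontier:
        nxt = []
        for u in frontier:
            cost += 1
            for v in neighbors[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return cost

def random_walk_query(neighbors, start, target, max_steps=10_000):
    # Idealized random walk: forward the query to a uniformly chosen
    # neighbor until it reaches the node holding the answer; cost is
    # one transmission per hop.
    node, steps = start, 0
    while node != target and steps < max_steps:
        node = random.choice(neighbors[node])
        steps += 1
    return steps
```

In this collision-free model a flood always costs one transmission per node, while a walk's cost is a random variable; interference inverts the comparison in practice, which is the paper's point.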
A key requirement for IETF recognition of new TCP algorithms is having an independent, interoperable implementation. This paper describes our BSD-licensed implementation of H-TCP within FreeBSD 7.0, publicly available as a dynamically loadable kernel module. Based on our implementation experience we provide a summary description of the H-TCP algorithm to assist other groups in building further interoperable implementations. Using data from our live testbed we demonstrate that our version exhibits expected H-TCP behavior, and describe a number of implementation-specific issues that influence H-TCP’s dynamic behavior. Finally, we illustrate the actual collateral impact on path latency of using H-TCP instead of NewReno. In particular we illustrate how, compared to NewReno, H-TCP’s cwnd growth strategy can cause faster fluctuations in queue sizes at, yet lower median latency through, congestion points. We believe these insights will prove valuable predictors of H-TCP’s potential impact if deployed in consumer end-hosts in addition to specialist, high-performance network environments.
A broad spectrum of network measurement applications demand passive multipoint measurements in which data from multiple observation points has to be correlated. Examples are the passive measurement of one-way delay or the observation of the path that a packet takes through a network. However, due to high data rates and the need for fine-granular measurements, the resource consumption of passive measurements can be immense. Furthermore, the resource consumption depends on the traffic in the network, which usually is highly dynamic. Packet- and flow-selection methods provide a solution to reduce and control the resource consumption of passive measurements. In order to apply such techniques to multipoint measurements, the selection processes need to be synchronized. Hash-based selection is a deterministic packet selection based on a hash function computed on selected parts of the packet content. This selection decision is consistent throughout the network and enables packet tracing and the measurement of delay between network nodes. Because the selection is based on a deterministic function, it can introduce bias, which can lead to incorrect estimation of traffic characteristics. In this paper we define a set of quality criteria and selection methods to investigate which hash function is most suitable for hash-based packet selection. We analyze 23 non-cryptographic and 2 cryptographic hash functions. Experiments are performed with real traffic traces from different networks. Based on the results we recommend 2 fast hash functions which show low bias and sample a representative subset of the population.
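As a concrete illustration of the selection principle, the sketch below hashes a packet's invariant fields and selects the packet when the hash falls below a target fraction; SHA-1 stands in purely for illustration, whereas the paper's point is precisely that fast non-cryptographic functions with low bias should be preferred.

```python
import hashlib

def select_packet(invariant_fields: bytes, fraction: float = 0.01) -> bool:
    # Hash only the fields that are identical at every observation point
    # (addresses, ports, IP ID, a payload prefix), so that every router
    # makes the same keep/drop decision for the same packet.
    h = int.from_bytes(hashlib.sha1(invariant_fields).digest()[:4], "big")
    return h / 2**32 < fraction
```

Because the decision is a pure function of packet content, the same packet sampled at two routers yields two timestamps for one packet, which is what makes multipoint delay measurement possible without coordination messages.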
Rate control protocols that utilise explicit feedback from routers are able to achieve fast convergence to an equilibrium which approximates processor-sharing on a single bottleneck link, and hence such protocols allow short flows to complete quickly. For a network, however, processor-sharing is not uniquely defined but corresponds with a choice of fairness criteria, and proportional fairness has a reasonable claim to be the network generalization of processor-sharing.
In this paper, we develop a variant of RCP (rate control protocol) that achieves alpha-fairness when buffers are small, including proportional fairness as the case alpha = 1. At the level of theoretical abstraction treated, our model incorporates a general network topology, and heterogeneous propagation delays. For our variant of the RCP algorithm, we establish a simple decentralized sufficient condition for local stability.
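The difference between fairness criteria is easy to see numerically on the classic two-link line network, an illustration constructed here rather than taken from the paper: one flow crosses both unit-capacity links while two flows each cross one.

```python
import math

def pf_utility(x0: float) -> float:
    # Two-link line network, both links of capacity 1: flow 0 crosses
    # both links; flows 1 and 2 each cross one. At the optimum both
    # links are saturated, so x1 = x2 = 1 - x0.
    return math.log(x0) + 2 * math.log(1 - x0)

# Grid search for the proportionally fair (alpha = 1) allocation.
best = max((i / 1000 for i in range(1, 1000)), key=pf_utility)
```

Proportional fairness gives the two-link flow 1/3 and each one-link flow 2/3, penalizing the flow that consumes more network resources; max-min fairness would instead give every flow 1/2. This is the sense in which the choice of alpha matters once processor-sharing is generalized from a single link to a network.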
With steady improvement in the reliability and performance of communication devices, routing instabilities now contribute to many of the remaining service degradations and interruptions in modern networks. This has led to a renewed interest in centralized routing systems that, compared to distributed routing, can provide greater control over routing decisions and better visibility of the results. One benefit of centralized control is the opportunity to readily eliminate transient routing loops, which arise frequently after network changes because of inconsistent routing states across devices. Translating this conceptual simplicity into a solution with tolerable message complexity is non-trivial. Addressing this issue is the focus of this paper. We identify when and why avoiding transient loops might require a significant number of messages in a centralized routing system, and demonstrate that this is the case under many common failure scenarios. We also establish that minimizing the number of required messages is NP-hard, and propose a greedy heuristic that we show to perform well under many conditions. The paper’s results can facilitate the deployment and evaluation of centralized architectures by leveraging their strengths without incurring unacceptable overhead.
This writeup presents a critique of the field of “Wireless Sensor Networks (WSNs)”. Literature in this domain falls into two main, distinct categories: (1) algorithms or protocols, and (2) application-centric system design. A striking observation is that references across these two categories are minimal, and superficial at best. We argue that this is not accidental, and is the result of three main flaws in the former category of work. Going forward, an application-driven, bottom-up approach is required for meaningful articulation and subsequent solution of any networking issues in WSNs.
When Jim Kurose invited me to write a piece for the "recommended reading" series in CCR, I thought it would be a fun exercise and accepted the invitation right away. Little did I realize how challenging it would be to put together this list, as much because I was only allowed to pick 10 papers as because I had to steer clear of the many fine papers that had already been picked in previous lists. Rather than focusing on a single topic, I decided to pick papers from across sub-areas of networking that have left a lasting impression on me (and quite possibly on other researchers as well) because of their elegance, the insights they provide, and in many cases both of these. I hope many readers will enjoy reading these papers, if they have not already done so.
The Internet Research Task Force’s (IRTF) Internet Measurement Research Group (IMRG) continued its practice of sponsoring workshops on current topics in computer network measurement with the Workshop on Application Classification and Identification (WACI) held October 3, 2007, at BBN Technologies in Cambridge, Massachusetts. There has been much general interest within the community recently in finding techniques to identify traffic without examination of the service ports used in the TCP or UDP header, as these increasingly do not accurately indicate the application generating the traffic. The workshop agenda was formed around six abstracts that were selected from an open call by the workshop organizing committee.
The goal of this column this time was to address major scientific issues and propose novel scientific methods for doing the same thing under different names. However, it turned out to be too complicated as it required mastery of the English alphabet. Then, I had an epiphany (a Greek word): Why propose something new, when you can complain about something that already exists? I think I am turning into an angry old man prematurely.
The Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS) was organized to “bring together conference organizers (past, present, and future) and other interested people to discuss the issues they confront.” In conjunction with WOWCS, we survey some previous publications that discuss open issues related to organizing computer systems conferences, especially concerning conduct and management of the review process. We also list some topics about which we wish WOWCS had received submissions, but did not; these could be good topics for future articles.
Several users of our AS relationship inference data asked us why it contained AS relationship cycles, e.g., cases where AS A is a provider of AS B, B is a provider of C, and C is a provider of A, or other cycle types. Having answered these questions in private communications, we eventually decided to write down our answers here for future reference.
As anyone who has operated a large network can attest, enterprise networks are difficult to manage. That they have remained so despite significant commercial and academic efforts suggests the need for a different network management paradigm. Here we turn to operating systems as an instructive example in taming management complexity...
Flow records gathered by routers provide valuable coarse-granularity traffic information for several measurement-related network applications. However, due to high volumes of traffic, flow records need to be sampled before they are gathered. Current techniques for producing sampled flow records are either focused on selecting flows from which statistical estimates of traffic volume can be inferred, or have simplistic models for applications. Such sampled flow records are not suitable for many applications with more specific needs, such as ones that make decisions across flows.
As a first step towards tailoring the sampling algorithm to an application’s needs, we design a generic language in which any particular application can express the classes of traffic of its interest. Our evaluation investigates the expressive power of our language, and whether flow records have sufficient information to enable sampling of records of relevance to applications. We use templates written in our custom language to instrument sampling tailored to three different applications—BLINC, Snort, and Bro. Our study, based on month-long datasets gathered at two different network locations, shows that by learning local traffic characteristics we can sample relevant flow records near-optimally with low false negatives in diverse applications.
Network virtualization is a powerful way to run multiple architectures or experiments simultaneously on a shared infrastructure. However, making efficient use of the underlying resources requires effective techniques for virtual network embedding—mapping each virtual network to specific nodes and links in the substrate network. Since the general embedding problem is computationally intractable, past research restricted the problem space to allow efficient solutions, or focused on designing heuristic algorithms. In this paper, we advocate a different approach: rethinking the design of the substrate network to enable simpler embedding algorithms and more efficient use of resources, without restricting the problem space. In particular, we simplify virtual link embedding by: i) allowing the substrate network to split a virtual link over multiple substrate paths and ii) employing path migration to periodically re-optimize the utilization of the substrate network. We also explore node-mapping algorithms that are customized to common classes of virtual-network topologies. Our simulation experiments show that path splitting, path migration, and customized embedding algorithms enable a substrate network to satisfy a much larger mix of virtual networks.
Current network protocols must comply with rigid interfaces and rules of behavior to fit into well defined, vertical protocol stacks. It is difficult for network designers to offer a wide spectrum of alternative protocols suitable for diverse situations, and to make the stack evolve to match new needs. The tendency is to design protocols that can adapt to the widest possible spread of use. However, even the best adaptive protocols cannot possibly cope with all situations. When their adaptivity limits are reached, the ability to switch to other protocols becomes a clear advantage.
Our aim in this paper is to present Lightweight Autonomous resIlient Networks (LAIN), a framework that exploits the multiplicity of alternative protocols, and exposes the spectrum of choice to the advantage of the applications. The system runs continuous experiments with alternative protocols online, in parallel as well as serially, in order to select automatically those that best match the application’s needs under the current network conditions. We report first results on the feasibility of the approach and point out the need for such a system in network and protocol evolution.
As the saying goes, “In theory there is no difference between theory and practice. But, in practice, there is.” Networking research has a wealth of good papers on both sides of the theory-practice divide. However, many practical papers stop short of having a sharp problem formulation or a rigorously considered solution, and many theory papers overlook or assume away some key aspect of the system they intend to model. Still, every so often, a paper comes along that nails a practical question with just the right bit of theory. When that happens, it’s a thing of beauty. These are my ten favorite examples. In some cases, I mention survey papers that cover an entire body of work, or a journal paper that presents a more mature overview of one or more conference papers, rather than single out an individual research result. (As an aside, I think good survey papers are a wonderful contribution to the community, and wish more people invested the considerable time and energy required to write them.)
The July 2007 issue of CCR elicited review process horror stories. I expect that everyone has their own vast collection. I certainly do. However, I found picking my favorite story to be like choosing my favorite offspring. Therefore, rather than focusing on a single tale of woe I have tried to extrapolate some key points from across the suboptimal reviewing I have observed. I write this essay from the perspective of an author who has years of accepts and rejects.1 However, this note is also greatly informed by my refereeing activities over the years (on PCs, reviewing for journals, editorial boards, etc.). My intent is to make general observations in the hopes of contributing to a conversation that improves our overall review processes and ultimately helps us establish a stronger set of community values with regard to what we expect and appreciate in papers. While I strive for generality I do not claim the observations are unbiased or that I have closed all my open wounds in this area.
In this article, we discuss the lessons in innovation from the last twenty years of the Internet that might be applied in the cellular telephone industry.
This paper focuses on BGP communities, a particular BGP attribute that has not yet been extensively studied by the research community. It allows an operator to group destinations in a single entity to which the same routing decisions might be applied. In this paper, we show that the usage of this attribute has increased and that it also contributes to routing table growth. In addition, we propose a taxonomy of BGP community attributes to allow operators to better document their communities. We further manually collect information on BGP communities and tag it according to our taxonomy. We show that a large proportion of the BGP communities are used for traffic engineering purposes.
There is a growing sentiment among academics in computing that a shift to multicore processors in commodity computers will demand that all programmers become parallel programmers. This is because future general-purpose processors are not likely to improve the performance of a single thread of execution; instead, the presence of multiple processor cores on a CPU will improve the performance of groups of threads. In this article, I argue that there is another trend underway, namely integration, which will have a greater near-term impact on developers of system software and applications. This integration, and its likely impact on general-purpose computers, is clearly illustrated in the architecture of modern mobile phones.
Despite the large number of papers on network topology modeling and inference, there still exists ambiguity about the real nature of the Internet's AS- and router-level topology. While recent findings have illustrated the inaccuracies in maps inferred from BGP peering and traceroute measurements, existing topology models still produce static topologies, using simplistic assumptions about power law observations and preferential attachment.
Today, topology generators are tightly bound to the observed data used to validate them. Given that the actual properties of the Internet topology are not known, topology generators should strive to reproduce the variability that characterizes the evolution of the Internet topology over time. Future topology generators should be able to express the variations in local connectivity that make up today’s Internet: peering relationships, internal AS topology and routing policies each changing over time due to failures, maintenance, upgrades and business strategies of the network. Topology generators should capture those dimensions by allowing a certain level of randomness in the outcome, rather than enforcing structural assumptions as truths about the Internet’s evolving structure, which may never be discovered.
This whitepaper proposes OpenFlow: a way for researchers to run experimental protocols in the networks they use every day. OpenFlow is based on an Ethernet switch, with an internal flow-table, and a standardized interface to add and remove flow entries. Our goal is to encourage networking vendors to add OpenFlow to their switch products for deployment in college campus backbones and wiring closets. We believe that OpenFlow is a pragmatic compromise: on one hand, it allows researchers to run experiments on heterogeneous switches in a uniform way at line-rate and with high port-density; while on the other hand, vendors do not need to expose the internal workings of their switches. In addition to allowing researchers to evaluate their ideas in real-world traffic settings, OpenFlow could serve as a useful campus component in proposed large-scale testbeds like GENI. Two buildings at Stanford University will soon run OpenFlow networks, using commercial Ethernet switches and routers. We will work to encourage deployment at other schools, and we encourage you to consider deploying OpenFlow in your university network too.
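The flow-table abstraction can be illustrated with a few lines of lookup logic. The field names below are hypothetical placeholders loosely modeled on OpenFlow's header tuple, and a real switch performs this match in hardware (e.g., TCAM); the sketch only shows the match-action idea, including the table-miss path to the controller.

```python
# Each entry pairs a partial match on header fields with an action;
# the first matching entry wins. Field names here are illustrative.
flow_table = [
    ({"eth_type": 0x0800, "ip_dst": "10.0.1.5"}, "forward:port2"),
    ({"eth_type": 0x0806}, "flood"),
]

def lookup(packet: dict) -> str:
    for match, action in flow_table:
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    # Table miss: hand the packet to the controller, which may install
    # a new flow entry covering subsequent packets of this flow.
    return "send_to_controller"
```

The researcher's experimental logic lives in the controller that populates the table, which is why vendors need not expose their switch internals.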
For those of us in academia, tenure is great. Unless you don't have it, in which case it pretty much sucks. In fact, it goes beyond sucking: it kills. Mainly your personal life, sometimes your spirit. I claim that assistant professors are an endangered species. Are we heading towards an environmental disaster that is going to destabilize the academic ecosystem or did I just have too much to drink last night? Either way, I think I deserve the Nobel Peace Prize.
ACM SIGCOMM Computer Communication Review (CCR – www.sigcomm.org/ccr/) fills a unique niche in the spectrum of computer communications literature. It seeks to quickly publish articles containing high-quality research, especially new ideas and visions, in order to allow the community to react and comment. CCR is unique in that its review turnaround is less than 3 months, which guarantees timely publication of high-quality scientific articles.
This paper claims that a Shadow Technical Program Committee (TPC) should be organized on a regular basis for attractive conferences in the networking domain. It helps ensure that young generations of researchers gain experience with the process of reviewing and selecting papers before they actually become part of regular TPCs. We highlight several reasons why a shadow TPC offers a unique educational experience, as compared with the two most traditional learning processes: “delegated review” and “learn on the job”. We report examples taken from the CoNEXT 2007 shadow TPC and announce the CoNEXT 2008 Shadow TPC.
Knowledge of the largest traffic flows in a network is important for many network management applications. The problem of finding these flows is known as the heavy-hitter problem and has been the subject of many studies in recent years. One of the most efficient and well-known algorithms for finding heavy hitters is lossy counting.
In this work we introduce probabilistic lossy counting (PLC), which enhances lossy counting in computing network traffic heavy hitters. PLC uses a tighter error bound on the estimated sizes of traffic flows and provides probabilistic rather than deterministic guarantees on its accuracy. The probabilistic error bound substantially improves the memory consumption of the algorithm. In addition, PLC reduces the rate of false positives of lossy counting and achieves a low estimation error, although slightly higher than that of lossy counting.
We compare PLC with state-of-the-art algorithms for finding heavy hitters. Our experiments using real traffic traces find that PLC has 1) between 34.4% and 74% lower memory consumption, 2) between 37.9% and 40.5% fewer false positives than lossy counting, and 3) a small estimation error.
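For readers unfamiliar with the baseline, the sketch below implements classic lossy counting (Manku and Motwani's deterministic algorithm that PLC builds on); PLC's contribution, replacing the deterministic error term with a probabilistic bound, is not reproduced here.

```python
import math

def lossy_count(stream, epsilon=0.1):
    """Classic lossy counting over a stream of items (e.g., flow keys)."""
    w = math.ceil(1 / epsilon)          # window width
    entries = {}                        # item -> (count, delta)
    for i, item in enumerate(stream, 1):
        bucket = math.ceil(i / w)
        if item in entries:
            count, delta = entries[item]
            entries[item] = (count + 1, delta)
        else:
            # delta bounds how many occurrences may have been missed
            # before the item (re-)entered the table.
            entries[item] = (1, bucket - 1)
        if i % w == 0:                  # prune at each window boundary
            entries = {k: (c, d) for k, (c, d) in entries.items()
                       if c + d > bucket}
    return entries
```

Every item with true frequency at least epsilon * N survives, and its count underestimates the truth by at most epsilon * N; PLC's insight is that this worst-case delta is very pessimistic for heavy-tailed traffic, so a probabilistic delta allows far more aggressive pruning.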
An aggregate congestion control mechanism, namely Probe-Aided MulTCP (PA-MulTCP), is proposed in this paper. It is based on MulTCP, a proposal for enabling an aggregate to emulate the behavior of multiple concurrent TCP connections. The objective of PA-MulTCP is to ensure fair sharing of the bottleneck bandwidth between the aggregate and other TCP or TCP-friendly flows while remaining lightweight and responsive. Unlike MulTCP, there are two congestion window loops in PA-MulTCP, namely the probe window loop and the adjusting window loop. The probe window loop constantly probes the congestion situation and the adjusting window loop dynamically adjusts the congestion window size for the arriving and departing flows within the aggregate. Our simulations demonstrate that PA-MulTCP is more stable and fairer than MulTCP over a wide range of the weight N in steady conditions as well as in varying congestion conditions. PA-MulTCP is also responsive to flow arrival/departure and thus reduces the latency of short-lived transfers. Furthermore, PA-MulTCP is lightweight, since it enjoys the above advantages at the cost of only an extra probe window loop, which has a marginal influence on the implementation complexity. Finally, the design of PA-MulTCP decouples congestion management from the other functionalities in aggregate flow management. As a result, PA-MulTCP could be potentially applied to a wider range of scenarios, e.g., wireless TCP proxies, edge-to-edge overlays, QoS provisioning and mass data transport.
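MulTCP's emulation of N concurrent connections corresponds to a weighted AIMD rule, sketched below per RTT; this is only the baseline being improved upon, and PA-MulTCP's separate probe and adjusting window loops are not modeled here.

```python
def multcp_step(cwnd: float, n: int, loss: bool) -> float:
    """One RTT of MulTCP's weighted AIMD update: an aggregate standing
    in for N TCP flows grows its window N times faster and, on loss,
    backs off by only the fraction 1/(2N) instead of 1/2."""
    if loss:
        return cwnd * (1 - 1 / (2 * n))
    return cwnd + n   # additive increase of N segments per RTT
```

With n = 1 this reduces to standard TCP AIMD, which is the sense in which the aggregate claims roughly N flows' worth of bottleneck bandwidth.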
As significant resources are directed towards clean-slate networking research, it is imperative to understand how clean-slate architectural research compares to the diametrically opposite paradigm of evolutionary research. This paper approaches the “evolution versus clean-slate” debate through a biological metaphor. We argue that evolutionary research can lead to less costly (more competitive) and more robust designs than clean-slate architectural research. We also argue that the Internet architecture is not ossified, as recently claimed, but that its core protocols play the role of “evolutionary kernels”, meaning that they are conserved so that complexity and diversity can emerge at the lower and higher layers. We then discuss the factors that determine the deployment of new architectures or protocols, and argue, based on the notion of “auto-catalytic sets”, that successful innovations are those that become synergistic components in closed loops of existing modules. The paper closes by emphasizing the role of evolutionary Internet research.
Despite the flurry of anomaly-detection papers in recent years, effective ways to validate and compare proposed solutions have remained elusive. We argue that evaluating anomaly detectors on manually labeled traces is both important and unavoidable. In particular, it is important to evaluate detectors on traces from operational networks because it is in this setting that the detectors must ultimately succeed. In addition, manual labeling of such traces is unavoidable because new anomalies will be identified and characterized from manual inspection long before there are realistic models for them. It is well known, however, that manual labeling is slow and error-prone. In order to mitigate these challenges, we present WebClass, a web-based infrastructure that adds rigor to the manual labeling process. WebClass allows researchers to share, inspect, and label traffic time series through a common graphical user interface. We are releasing WebClass to the research community in the hope that it will foster greater collaboration in creating labeled traces and that the traces will be of higher quality because the entire community has access to all the information that led to a given label.
The available bandwidth of a path directly impacts the performance of throughput sensitive applications, e.g., p2p content replication or podcasting. Several tools have been devised to estimate the available bandwidth. The vast majority of these tools follow either the Probe Rate Model (PRM) or the Probe Gap Model (PGM).
Lao et al. and Liu et al. have identified biases in the PGM approach that lead to consistent underestimations of the available bandwidth. Those results were obtained under the ideal assumption of stationary cross traffic.
In this note, we confirm the existence of these biases experimentally, i.e., for the case of non-stationary cross traffic. To do so, we compare one representative of the PRM family, namely Pathload, and one representative of the PGM family, namely Spruce, using long-term (several-day-long) traces collected on an example path.
We first propose a methodology to compare operational results of two available bandwidth measurement tools. Based on the sanitized data obtained using the previous methodology, we next show that the biases identified by previous work are clearly observable over the long term, even with non-stationary cross traffic. We further uncover the formal link that exists between the work by Liu et al. and the one by Lao et al.
The networking research community lacks a tradition of sharing experimental data, or using such data for reproducing results. But are we really that bad? Are we worse than researchers in other fields? And if so, how can we do better?
A major contribution to global warming has been the number of new workshops publishing proceedings with the prefix “hot”. In this article, I propose that we counter this trend in an attempt to remain carbon neutral with a set of “anti-workshops” on cold topics.
We suggest a number of heuristics for detecting when a topic has gone cold, and give some examples of the application of these heuristics. Of course, some cold topics warm up again, and so there is a risk of over-damping in our heuristics. Over a long period, dynamic equilibrium should be assured, but from time to time, our scheme may prejudice against surprising results in boring areas of communications research. Nevertheless, it may leave room for more surprising results in interesting areas of research, which cannot be a bad thing.
In this paper we propose a radical solution to data hosting and delivery for the Internet of the future. The current data delivery architecture is “network centric”, with content stored in data centers connected directly to Internet backbones. This approach has multiple drawbacks among which complexity of deploying data centers, power consumption, and lack of scalability are the most critical ones. We propose a totally innovative and orthogonal approach to traditional data centers, through what we call “nano” data centers, which are essentially boxes deployed at the edge of the network (e.g., in home gateways, set-top-boxes, etc.) that cooperate in a peer-to-peer manner. Unlike traditional peer-to-peer clients, however, our nano data centers operate under a common management authority, e.g., the ISP who installs and maintains the set-top-boxes, and can thus cooperate more effectively and achieve a higher aggregate performance. Nano data centers are, therefore, better suited for providing guaranteed quality to new emerging applications such as online gaming, interactive IPTV and VoD, and user generated content.
Anomalous events that affect the performance of networks are a fact of life. It is therefore not surprising that recent years have seen an explosion in research on network anomaly detection. What is quite surprising, however, is the lack of controlled evaluation of these detectors. In this paper we argue that there are numerous important questions regarding the effectiveness of anomaly detectors that cannot be answered by the evaluation techniques employed today. We present four central requirements of a rigorous evaluation that can only be met by simulating both the anomaly and its surrounding environment. While simulation is necessary, it is not sufficient. We therefore present an outline of an evaluation methodology that leverages both simulation and traces from operational networks.
This paper presents Ethane, a new network architecture for the enterprise. Ethane allows managers to define a single network-wide fine-grain policy, and then enforces it directly. Ethane couples extremely simple flow-based Ethernet switches with a centralized controller that manages the admittance and routing of flows. While radical, this design is backwards-compatible with existing hosts and switches.
We have implemented Ethane in both hardware and software, supporting both wired and wireless hosts. Our operational Ethane network has supported over 300 hosts for the past four months in Stanford University’s network, and this deployment experience has significantly affected Ethane’s design.
Localizing the sources of performance problems in large enterprise networks is extremely challenging. Dependencies are numerous, complex and inherently multi-level, spanning hardware and software components across the network and the computing infrastructure. To exploit these dependencies for fast, accurate problem localization, we introduce an Inference Graph model, which is well-adapted to user-perceptible problems rooted in conditions giving rise to both partial service degradation and hard faults. Further, we introduce the Sherlock system to discover Inference Graphs in the operational enterprise, infer critical attributes, and then leverage the result to automatically detect and localize problems. To illuminate strengths and limitations of the approach, we provide results from a prototype deployment in a large enterprise network, as well as from testbed emulations and simulations. In particular, we find that taking into account multi-level structure leads to a 30% improvement in fault localization, as compared to two-level approaches.
Modern enterprise networks are of sufficient complexity that even simple faults can be difficult to diagnose — let alone transient outages or service degradations. Nowhere is this problem more apparent than in the 802.11-based wireless access networks now ubiquitous in the enterprise. In addition to the myriad complexities of the wired network, wireless networks face the additional challenges of shared spectrum, user mobility and authentication management. Not surprisingly, few organizations have the expertise, data or tools to decompose the underlying problems and interactions responsible for transient outages or performance degradations. In this paper, we present a set of modeling techniques for automatically characterizing the source of such problems. In particular, we focus on data transfer delays unique to 802.11 networks—media access dynamics and mobility management latency. Through a combination of measurement, inference and modeling we reconstruct sources of delay —from the physical layer to the transport layer—as well as the interactions among them. We demonstrate our approach using comprehensive traces of wireless activity in the UCSD Computer Science building.