CCR Papers from 2015

  • Dina Papagiannaki

    Welcome to the last CCR issue of 2015. We have had quite an intensive year: throughout 2015 we received 105 submissions and published 19 papers (both technical and editorial). In August, we held our most successful “best of CCR” session ever, which drew tremendous attendance and very positive feedback. I am very excited about what CCR has achieved so far, and I thank all the authors for their interest in CCR.

    This last issue of 2015 features five papers: four technical contributions and one editorial. The technical papers cover topics such as secure DNS, OpenFlow, network analytics, and transparency in the Web. The editorial presents a position on what the authors call edge-centric computing. I hope you enjoy all the articles.

    Our industrial liaison board column features Dr. George Varghese. Dr. Varghese discusses his experience in moving academic knowledge to commercial products. He uses three different examples and clearly demonstrates that there is no “one size fits all” approach to technology transfer. I find his “lessons learnt” a useful guide to consider before embarking on such a journey.

    Finally, this issue sees the end of term for two of our associate editors. Prof. Phillipa Gill, from Stony Brook University, and Prof. Joel Sommers, from Colgate University, are ending their tenure at CCR. I would like to thank them both for having produced some of the most thought-provoking public reviews, and for always having provided considerate, constructive feedback on all the papers they handled over the past two years.

    Our farewell to Joel and Phillipa comes with our welcome to two new associate editors. It is my great pleasure to welcome to the CCR editorial board Prof. Fahad Dogar, from Tufts University, and Prof. David Choffnes, from Northeastern University. They both join with tremendous energy. I am delighted they will lend us their expertise for the next two years.

    Finally, it was a great pleasure to see some of you in London. I found SIGCOMM to be a stimulating, vibrant venue with ever-increasing reach. I left London with lots of new ideas, and a sense that our community has the potential to make a true difference in the world. Let’s keep it up!

    Dina Papagiannaki, CCR Editor

  • Hassan Metwalley, Stefano Traverso, Marco Mellia (Politecnico di Torino), Stanislav Miskovic (Symantec Corp.), Mario Baldi (Politecnico di Torino)

    Individuals lack proper means to supervise the services they contact and the information they exchange when surfing the web. This security task has become challenging due to the complexity of the modern web and of its data-delivery technology, and even to the adoption of encryption, which, while improving privacy, renders in-network security services ineffective. The implications are serious, from a person contacting undesired services or unwillingly exposing private information, to a company being unable to control the flow of its information to the outside world. To bring transparency and the ability to make informed choices to the web, we propose CROWDSURF, a system for comprehensive and collaborative auditing of data exchanged with Internet services. As in crowdsourced efforts, we enable users to contribute to building awareness, supported by the semi-automatic analysis of data offered by a cloud-based system. The result is the creation of “suggestions” that individuals can transform into enforceable “rules” to customize their web browsing policy. CROWDSURF provides the core infrastructure to let individuals and enterprises regain visibility into and control over their web activity. Preliminary results obtained by running a prototype implementation demonstrate the feasibility and potential of CROWDSURF.

    Joseph Camp
  • Roland van Rijswijk-Deij (University of Twente and SURFnet), Anna Sperotto, Aiko Pras (University of Twente)

    The Domain Name System Security Extensions (DNSSEC) add authenticity and integrity to the DNS, improving its security. Unfortunately, DNSSEC is not without problems. DNSSEC adds digital signatures to the DNS, significantly increasing the size of DNS responses. This makes DNSSEC more susceptible to packet fragmentation, and makes it an attractive vector to abuse in amplification-based denial-of-service attacks. Additionally, key management policies are often complex, which makes DNSSEC fragile and leads to operational failures. In this paper, we argue that the choice of RSA as the default cryptosystem in DNSSEC is a major factor in these three problems. Alternative cryptosystems, based on elliptic curve cryptography (ECDSA and EdDSA), exist but are rarely used in DNSSEC. We show that these are highly attractive for use in DNSSEC, although they also have disadvantages. To address these, we have initiated research that aims to investigate the viability of deploying ECC at a large scale in DNSSEC.

    Phillipa Gill
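
    As a back-of-the-envelope illustration of why the paper finds ECC attractive, the sketch below uses standard signature sizes for these algorithms; the three-RRset response is a hypothetical layout, not the paper's measurement methodology.

    ```python
    # Rough RRSIG size comparison: ECDSA/EdDSA signatures are a quarter the
    # size of RSA-2048 signatures, which directly shrinks DNSSEC responses
    # and their amplification potential.
    SIG_BYTES = {
        "RSA-1024": 128,    # signature length = modulus size
        "RSA-2048": 256,
        "ECDSA-P256": 64,   # two 32-byte integers (r, s)
        "Ed25519": 64,
    }

    answers = 3  # hypothetical response carrying three signed RRsets
    for alg, size in SIG_BYTES.items():
        print(f"{alg:>10}: {answers} signatures add {answers * size} bytes")
    ```
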
  • Dimitrios Sarlis, Nikolaos Papailiou, Ioannis Konstantinou (CSLAB, NTUA), Georgios Smaragdakis (MIT & TU Berlin), Nectarios Koziris (CSLAB, NTUA)

    The ever-increasing Internet traffic poses challenges to network operators and administrators, who must analyze large network datasets in a timely manner to make decisions regarding network routing, dimensioning, accountability, and security. Network datasets collected at large networks such as Internet Service Providers (ISPs) or Internet Exchange Points (IXPs) can be on the order of terabytes per hour. Unfortunately, most current network analysis approaches are ad-hoc and centralized, and thus not scalable. In this paper, we present Datix, a fully decentralized, open-source analytics system for network traffic data that relies on smart partitioning storage schemes to support fast join algorithms and efficient execution of filtering queries. We outline the architecture and design of Datix and present an evaluation of Datix using real traces from an operational IXP. Datix deals with an important problem at the intersection of data management and network monitoring while utilizing state-of-the-art distributed processing engines. In brief, Datix manages to answer queries within minutes, compared to more than 24 hours of processing when executing existing Python-based code in single-node setups. Datix also achieves nearly 70% speedup over baseline query implementations of popular big data analytics engines such as Hive and Shark.

    Marco Mellia
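
    A minimal sketch of the smart-partitioning idea behind such systems (illustrative only, not Datix's actual implementation): co-partition the flow records and the metadata table on the same key, so each piece of a join runs locally on one partition and no full shuffle is needed. All names and the /8 partitioning key are assumptions.

    ```python
    # Toy co-partitioned join: flows and an IP-to-AS table are partitioned by
    # the same function of the address, so the join never crosses partitions.
    def part_key(ip_int, bits=8):
        return ip_int >> (32 - bits)           # partition by the top /8

    def co_partition(rows, key_field):
        parts = {}
        for r in rows:
            parts.setdefault(part_key(r[key_field]), []).append(r)
        return parts

    def partitioned_join(flows, ip_to_as):
        fp, mp = co_partition(flows, "src_ip"), co_partition(ip_to_as, "ip")
        for pid, part in fp.items():           # each partition joins independently
            lookup = {m["ip"]: m["asn"] for m in mp.get(pid, [])}
            for f in part:
                yield {**f, "asn": lookup.get(f["src_ip"])}

    flows = [{"src_ip": 0x08080808, "bytes": 1200}]
    ip_to_as = [{"ip": 0x08080808, "asn": 15169}]
    print(list(partitioned_join(flows, ip_to_as)))
    ```
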
  • Sajad Shirali-Shahreza, Yashar Ganjali (University of Toronto)

    The ability to manage individual flows is a major benefit of Software-Defined Networking. The overheads of this fine-grained control, e.g., initial flow setup delay, can outweigh its benefits, for example when there are many time-sensitive short flows. Coarse-grained control of groups of flows, on the other hand, can be very complex: each packet may match multiple rules, which requires conflict resolution. In this paper, we present ReWiFlow, a restricted class of OpenFlow wildcard rules (the fundamental way to control groups of flows in OpenFlow), which allows managing groups of flows with flexibility and without loss of performance. We demonstrate how ReWiFlow can be used to implement applications such as dynamic proactive routing. We also present a generalization of ReWiFlow, called MultiReWiFlow, and show how it can be used to efficiently represent access control rules collected from Stanford's backbone network.

    Hitesh Ballani
  • Pedro Garcia Lopez (Universitat Rovira i Virgili), Alberto Montresor (University of Trento), Dick Epema (Delft University of Technology), Anwitaman Datta (Nanyang Technological University), Teruo Higashino (Osaka University), Adriana Iamnitchi (University of South Florida), Marinho Barcellos (Universidade do Vale do Rio dos Sinos), Pascal Felber, Etienne Riviere (University of Neuchatel)

    In many aspects of human activity, there has been a continuous struggle between the forces of centralization and decentralization. Computing exhibits the same phenomenon: we went from mainframes to PCs and local networks in the past, and over the last decade we have seen a centralization and consolidation of services and applications in data centers and clouds. We posit that a new shift is necessary. Technological advances such as powerful dedicated connection boxes deployed in most homes, high-capacity mobile end-user devices, and powerful wireless networks, along with growing user concerns about trust, privacy, and autonomy, require taking the control of computing applications, data, and services away from some central nodes (the “core”) to the other logical extreme (the “edge”) of the Internet. We also posit that this development can help blur the boundary between man and machine, and embrace social computing, in which humans are part of the computation and decision-making loop, resulting in a human-centered system design. We refer to this vision of human-centered, edge-device-based computing as Edge-centric Computing. In this position paper, we elaborate on this vision and present the research challenges associated with its implementation.

  • Aditya Akella (University of Wisconsin-Madison)

    Dear students: I hope you have been enjoying reading the column. This column is similar to the previous one: it attempts to address a few more of your questions. Thanks for the great questions; keep them coming! Again, many thanks to Brighten Godfrey (UIUC) and Vyas Sekar (CMU) for contributing their thoughts.

  • Bruce Davie, Christophe Diot, Lars Eggert, Nick McKeown, Venkat Padmanabhan, Renata Teixeira (SIGCOMM Industrial Liaison Board)

    As networking researchers, we love to work on ideas that improve the practice of networking. In the early pioneering days of the Internet, the link between networking researchers and practitioners was strong; the community was small and everyone knew each other. Not only did the research community produce many important ideas that affected the practice of networking, we were all very aware of them. Today, the networking industry is enormous and the practice of networking spans many network equipment vendors, operators, chip builders, the IETF, data centers, wireless and cellular networks, and so on. There continue to be many transfers of ideas, but there isn’t a forum to learn about them. The goal of this series is to create such a forum by presenting articles that shine a spotlight on specific examples: not only on the technology and ideas, but also on the path the ideas took to affect the practice. Sometimes a research paper was picked up by chance; more often, the researchers worked hand-in-hand with the standards community, the open-source community, or industry to develop the idea further and make it suitable for adoption. Their story is here. We are seeking CCR articles describing interesting cases of “research affecting the practice,” including ideas transferred from research labs (academic or industrial) that became:

    • Commercial products
    • Internet standards
    • Algorithms and ideas embedded into new or existing products
    • Widely used open-source software
    • Ideas deployed first by startups, by existing companies, or through distribution of free software
    • Communities built around a toolbox, language, or dataset

    We also welcome stories of negative experiences, or ideas that seemed promising but ended up not taking off. In this issue, George Varghese describes his experience with the technology transfer of network algorithms.

  • George Varghese (Microsoft Research)

    I agree with the editors of this series that technology transfer from academia occurs in a variety of ways with different tradeoffs. I describe, to the best of my recollection, some of my experiences with ideas that originated or were described in academic papers, from Deficit Round Robin to CONGA.

  • Paul Tune, Matthew Roughan

    Traffic matrices describe the volume of traffic between a set of sources and destinations within a network. These matrices are used in a variety of tasks in network planning and traffic engineering, such as the design of network topologies. Traffic matrices naturally possess complex spatiotemporal characteristics, but their proprietary nature means that little data about them is available publicly, and this situation is unlikely to change. Our goal is to develop techniques to synthesize traffic matrices for researchers who wish to test new network applications or protocols. The paucity of available data, and the desire to build a general synthesis framework that works in various settings, require a new look at this problem. We show how the principle of maximum entropy can be used to generate a wide variety of traffic matrices constrained by the needs of a particular task and the available information, while otherwise avoiding hidden assumptions about the data. We demonstrate how the framework encompasses existing models and measurements, and we apply it in a simple case study to illustrate its value.
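
    In the simplest setting the paper's principle reduces to a familiar result: if the only constraints are each node's total outgoing and incoming traffic, the maximum-entropy traffic matrix is the classic gravity model. A small sketch (made-up numbers, NumPy assumed):

    ```python
    # Max-entropy TM under row/column-sum constraints = gravity model:
    # X[i][j] = out_totals[i] * in_totals[j] / total.
    import numpy as np

    out_totals = np.array([30.0, 50.0, 20.0])   # hypothetical per-source volumes
    in_totals = np.array([40.0, 40.0, 20.0])    # hypothetical per-destination volumes
    total = out_totals.sum()

    tm = np.outer(out_totals, in_totals) / total
    print(tm)
    print(tm.sum(axis=1), tm.sum(axis=0))       # the marginals are reproduced exactly
    ```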

  • Arjun Roy, Hongyi Zeng, Jasmeet Bagga, George Porter, Alex C. Snoeren

    Large cloud service providers have invested in increasingly larger datacenters to house the computing infrastructure required to support their services. Accordingly, researchers and industry practitioners alike have focused a great deal of effort on designing network fabrics to interconnect and manage the traffic within these datacenters in performant yet efficient fashions. Unfortunately, datacenter operators are generally reluctant to share the actual requirements of their applications, making it challenging to evaluate the practicality of any particular design. Moreover, the limited large-scale workload information available in the literature has, for better or worse, heretofore largely been provided by a single datacenter operator, whose use cases may not be widespread. In this work, we report on the network traffic observed in some of Facebook's datacenters. While Facebook operates a number of traditional datacenter services like Hadoop, its core Web service and supporting cache infrastructure exhibit a number of behaviors that contrast with those reported in the literature. We report on the contrasting locality, stability, and predictability of network traffic in Facebook's datacenters, and comment on their implications for network architecture, traffic engineering, and switch design.

  • Liang Zheng, Carlee Joe-Wong, Chee Wei Tan, Mung Chiang, Xinyu Wang

    Amazon's Elastic Compute Cloud (EC2) uses auction-based spot pricing to sell spare capacity, allowing users to bid for cloud resources at a highly reduced rate. Amazon sets the spot price dynamically and accepts user bids above this price. Jobs with lower bids (including those already running) are interrupted and must wait for a lower spot price before resuming. Spot pricing thus raises two basic questions: how might the provider set the price, and what prices should users bid? Computing users' bidding strategies is particularly challenging: higher bid prices reduce the probability of, and thus the extra time to recover from, interruptions, but may increase users' cost. We address these questions in three steps: (1) modeling the cloud provider's setting of the spot price and matching the model to historically offered prices, (2) deriving optimal bidding strategies for different job requirements and interruption overheads, and (3) adapting these strategies to MapReduce jobs with master and slave nodes having different interruption overheads. We run our strategies on EC2 for a variety of job sizes and instance types, showing that spot pricing reduces user cost by 90%, with a modest increase in completion time, compared to on-demand pricing.
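
    A toy illustration of the bidding trade-off the abstract describes (a hand-rolled scoring loop with made-up prices and overheads, not the paper's model): higher bids are interrupted less often but pay more during price spikes.

    ```python
    # Score candidate bids against a historical spot-price trace.
    prices = [0.12, 0.10, 0.35, 0.11, 0.13, 0.50, 0.12, 0.11]  # made-up $/hour
    RESTART_OVERHEAD_H = 0.5   # hypothetical hours of work lost per interruption

    def expected_cost(bid, job_hours=10.0):
        running = [p for p in prices if p <= bid]
        if not running:
            return float("inf")                 # the job never runs
        # Fraction of hours priced above the bid, read as the per-hour chance
        # of an interruption that wastes RESTART_OVERHEAD_H hours of work.
        frac_interrupted = sum(p > bid for p in prices) / len(prices)
        avg_price = sum(running) / len(running)
        effective_hours = job_hours * (1 + frac_interrupted * RESTART_OVERHEAD_H)
        return avg_price * effective_hours

    for bid in (0.11, 0.15, 0.40, 0.60):
        print(f"bid ${bid:.2f}: expected cost ${expected_cost(bid):.2f}")
    ```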

  • Hirochika Asai, Yasuhiro Ohara

    The Internet of Things leads to routing table explosion: an inexpensive approach to IP routing table lookup is required to cope with the ever-growing size of the Internet. We contribute a fast and scalable software routing lookup algorithm based on a multiway trie, called Poptrie. Named after our approach to traversing the tree, it leverages the population count instruction on bit-vector indices for the descendant nodes to compress the data structure within the CPU cache. Poptrie outperforms the state-of-the-art technologies, Tree BitMap, DXR, and SAIL, in all of our evaluations using random and real destination queries on 35 routing tables, including the full-route routing table of a real global tier-1 ISP. Poptrie peaks between 174 and over 240 million lookups per second (Mlps) with a single core on tables with 500-800k routes, consistently 4-578% faster than all competing algorithms in all the tests we ran. We provide a comprehensive performance evaluation, including a CPU-cycle analysis. This paper shows the suitability of Poptrie for the future Internet, including IPv6, where larger routing tables with longer prefixes are expected.
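
    The core trick can be sketched compactly (simplified from the paper: real Poptrie keeps separate vectors for internal children and leaves, and uses the hardware popcnt instruction): a node covering a 6-bit stride stores a 64-bit vector of which chunks have children, and a child's slot in a dense array is the count of set bits below its chunk.

    ```python
    # Popcount-based child indexing over a 64-bit vector (one node, 6-bit stride).
    def popcount_index(vector, chunk):
        """Slot of `chunk`'s child among the node's packed children."""
        mask = (1 << chunk) - 1
        return bin(vector & mask).count("1")

    def chunk_at(addr, depth):
        """The depth-th 6-bit chunk of a 32-bit address (valid for depth 0..4)."""
        return (addr >> (32 - 6 * (depth + 1))) & 0x3F

    # Example: a node with children only for chunks 3, 17, and 42.
    vector = (1 << 3) | (1 << 17) | (1 << 42)
    for chunk in (3, 17, 42):
        print(chunk, "->", popcount_index(vector, chunk))   # slots 0, 1, 2
    ```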

  • Matthew K. Mukerjee, David Naylor, Junchen Jiang, Dongsu Han, Srinivasan Seshan, Hui Zhang
  • Brandon Schlinker, Radhika Niranjan Mysore, Sean Smith, Jeffrey C. Mogul, Amin Vahdat, Minlan Yu, Ethan Katz-Bassett, Michael Rubin

    The design space for large, multipath datacenter networks is large and complex, and no one design fits all purposes. Network architects must trade off many criteria to design cost-effective, reliable, and maintainable networks, and typically cannot explore much of the design space. We present Condor, our approach to enabling a rapid, efficient design cycle. Condor allows architects to express their requirements as constraints via a Topology Description Language (TDL), rather than having to directly specify network structures. Condor then uses constraint-based synthesis to rapidly generate candidate topologies, which can be analyzed against multiple criteria. We show that TDL supports concise descriptions of topologies such as fat-trees, BCube, and DCell; that we can generate known and novel variants of fat-trees with simple changes to a TDL file; and that we can synthesize large topologies in tens of seconds. We also show that Condor supports the daunting task of designing multi-phase network expansions that can be carried out on live networks.

  • Pan Hu, Pengyu Zhang, Deepak Ganesan

    Backscatter provides the dual benefits of energy harvesting and low-power communication, making it attractive to a broad class of wireless sensors. But the design of a protocol that enables extremely power-efficient radios for harvesting-based sensors as well as high-rate data transfer for data-rich sensors presents a conundrum. In this paper, we present a new fully asymmetric backscatter communication protocol where nodes blindly transmit data as and when they sense. This model enables fully flexible node designs, from extraordinarily power-efficient backscatter radios that consume barely a few micro-watts to high-throughput radios that can stream at hundreds of Kbps while consuming a paltry tens of micro-watts. The challenge, however, lies in decoding concurrent streams at the reader, which we achieve using a novel combination of time-domain separation of interleaved signal edges and phase-domain separation of colliding transmissions. We provide an implementation of our protocol, LF-Backscatter, and show that it can achieve an order of magnitude or more improvement in throughput, latency, and power over state-of-the-art alternatives.

  • Alok Kumar, Sushant Jain, Uday Naik, Anand Raghuraman, Nikhil Kasinadhuni, Enrique Cauich Zermeno, C. Stephen Gunn, Jing Ai, Björn Carlin, Mihai Amarandei-Stavila, Mathieu Robin, Aspi Siganporia, Stephen Stuart, Amin Vahdat

    WAN bandwidth remains a constrained resource that is economically infeasible to substantially overprovision. Hence, it is important to allocate capacity according to service priority and based on the incremental value of additional allocation. For example, it may be the highest priority for one service to receive 10Gb/s of bandwidth but upon reaching such an allocation, incremental priority may drop sharply favoring allocation to other services. Motivated by the observation that individual flows with fixed priority may not be the ideal basis for bandwidth allocation, we present the design and implementation of Bandwidth Enforcer (BwE), a global, hierarchical bandwidth allocation infrastructure. BwE supports: i) service-level bandwidth allocation following prioritized bandwidth functions where a service can represent an arbitrary collection of flows, ii) independent allocation and delegation policies according to user-defined hierarchy, all accounting for a global view of bandwidth and failure conditions, iii) multi-path forwarding common in traffic-engineered networks, and iv) a central administrative point to override (perhaps faulty) policy during exceptional conditions. BwE has delivered more service-efficient bandwidth utilization and simpler management in production for multiple years.
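
    A toy rendering of allocation driven by prioritized bandwidth functions (service names, shapes, and the proportional within-band split are all assumptions, not Google's policy): each service declares bandwidth increments at descending priorities, and capacity is granted band by band until the link is exhausted.

    ```python
    # Strict priority across bands; proportional sharing within a band.
    services = {
        "search": [(0, 10e9), (1, 5e9)],   # (priority band, desired bps increment)
        "backup": [(1, 8e9), (2, 20e9)],
    }

    def allocate(capacity):
        grants = {s: 0.0 for s in services}
        bands = sorted({b for incs in services.values() for b, _ in incs})
        for band in bands:
            requests = {s: dict(incs).get(band, 0.0) for s, incs in services.items()}
            demand = sum(requests.values())
            if demand == 0:
                continue
            share = min(1.0, capacity / demand)   # scale down if the band overflows
            for s, req in requests.items():
                grants[s] += req * share
            capacity = max(0.0, capacity - demand)
        return grants

    print(allocate(capacity=25e9))   # search gets 15 Gbps, backup 10 Gbps
    ```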

  • Keon Jang, Justine Sherry, Hitesh Ballani, Toby Moncaster

    Many cloud applications can benefit from guaranteed latency for their network messages; however, providing such predictability is hard, especially in multi-tenant datacenters. We identify three key requirements for such predictability: guaranteed network bandwidth, guaranteed packet delay, and guaranteed burst allowance. We present Silo, a system that offers these guarantees in multi-tenant datacenters. Silo leverages the tight coupling between bandwidth and delay: controlling tenant bandwidth leads to deterministic bounds on network queuing delay. Silo builds upon network calculus to place tenant VMs with competing requirements such that they can coexist. A novel hypervisor-based policing mechanism achieves packet pacing at sub-microsecond granularity, ensuring tenants do not exceed their allowances. We have implemented a Silo prototype comprising a VM placement manager and a Windows filter driver. Silo does not require any changes to applications, guest OSes, or network switches. We show that Silo can ensure predictable message latency for cloud applications while imposing low overhead.
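
    The bandwidth-delay coupling Silo builds on can be seen in a textbook network-calculus bound (a simplification, not the paper's full placement analysis): if every tenant i is paced to a token bucket with burst σ_i and rate ρ_i, and the shared FIFO link of capacity C is not oversubscribed, queuing delay is deterministically bounded.

    ```latex
    % Token-bucket arrival curves sigma_i + rho_i t through a FIFO link of rate C:
    \[
      \sum_i \rho_i \le C
      \quad\Longrightarrow\quad
      D_{\mathrm{queue}} \;\le\; \frac{\sum_i \sigma_i}{C}
    \]
    ```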

  • Mosharaf Chowdhury, Ion Stoica

    Inter-coflow scheduling improves application-level communication performance in data-parallel clusters. However, existing efficient schedulers require a priori coflow information and ignore cluster dynamics like pipelining, task failures, and speculative executions, which limits their applicability. Schedulers without prior knowledge compromise on performance to avoid head-of-line blocking. In this paper, we present Aalo, which strikes a balance and efficiently schedules coflows without prior knowledge. Aalo employs Discretized Coflow-Aware Least-Attained Service (D-CLAS) to separate coflows into a small number of priority queues based on how much they have already sent across the cluster. By performing prioritization across queues and by scheduling coflows in FIFO order within each queue, Aalo's non-clairvoyant scheduler reduces coflow completion times while guaranteeing starvation freedom. EC2 deployments and trace-driven simulations show that communication stages complete 1.93x faster on average and 3.59x faster at the 95th percentile using Aalo than with per-flow mechanisms. Aalo's performance is comparable to that of solutions using prior knowledge, and Aalo outperforms them in the presence of cluster dynamics.
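
    A minimal sketch of the D-CLAS idea (the thresholds and the two-coflow example are invented): coflows start in the highest-priority queue and are demoted as their total bytes sent grow, giving smaller coflows priority without any prior size information.

    ```python
    # Multi-level queues keyed on bytes already sent; strict priority across
    # queues, FIFO (arrival order) within a queue.
    import collections

    THRESHOLDS = [10e6, 100e6, 1e9]    # hypothetical demotion points (bytes)

    def queue_for(bytes_sent):
        for q, limit in enumerate(THRESHOLDS):
            if bytes_sent < limit:
                return q
        return len(THRESHOLDS)         # lowest-priority queue

    class NonClairvoyantScheduler:
        def __init__(self):
            self.sent = collections.defaultdict(float)

        def record(self, coflow, nbytes):
            self.sent[coflow] += nbytes

        def next_coflow(self, arrival_order):
            return min(arrival_order,
                       key=lambda c: (queue_for(self.sent[c]),
                                      arrival_order.index(c)),
                       default=None)

    s = NonClairvoyantScheduler()
    s.record("A", 50e6); s.record("B", 1e6)
    print(s.next_coflow(["A", "B"]))   # B: it has sent less, so higher priority
    ```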

  • Xiaoqi Ren, Ganesh Ananthanarayanan, Adam Wierman, Minlan Yu

    As clusters continue to grow in size and complexity, providing scalable and predictable performance is an increasingly important challenge. A crucial roadblock to achieving predictable performance is stragglers, i.e., tasks that take significantly longer than expected to run. At this point, speculative execution has been widely adopted to mitigate the impact of stragglers. However, speculation mechanisms are designed and operated independently of job scheduling when, in fact, scheduling a speculative copy of a task has a direct impact on the resources available for other jobs. In this work, we present Hopper, a job scheduler that is speculation-aware, i.e., that integrates the tradeoffs associated with speculation into job scheduling decisions. We implement both centralized and decentralized prototypes of the Hopper scheduler and show that 50% (66%) improvements over state-of-the-art centralized (decentralized) schedulers and speculation strategies can be achieved through the coordination of scheduling and speculation.

  • David Naylor, Kyle Schomp, Matteo Varvello, Ilias Leontiadis, Jeremy Blackburn, Diego R. López, Konstantina Papagiannaki, Pablo Rodriguez Rodriguez, Peter Steenkiste

    A significant fraction of Internet traffic is now encrypted and HTTPS will likely be the default in HTTP/2. However, Transport Layer Security (TLS), the standard protocol for encryption in the Internet, assumes that all functionality resides at the endpoints, making it impossible to use in-network services that optimize network resource usage, improve user experience, and protect clients and servers from security threats. Re-introducing in-network functionality into TLS sessions today is done through hacks, often weakening overall security. In this paper we introduce multi-context TLS (mcTLS), which extends TLS to support middleboxes. mcTLS breaks the current "all-or-nothing" security model by allowing endpoints and content providers to explicitly introduce middleboxes in secure end-to-end sessions while controlling which parts of the data they can read or write. We evaluate a prototype mcTLS implementation in both controlled and "live" experiments, showing that its benefits come at the cost of minimal overhead. More importantly, we show that mcTLS can be incrementally deployed and requires only small changes to client, server, and middlebox software.

  • Yibo Zhu, Nanxi Kang, Jiaxin Cao, Albert Greenberg, Guohan Lu, Ratul Mahajan, Dave Maltz, Lihua Yuan, Ming Zhang, Ben Y. Zhao, Haitao Zheng

    Debugging faults in complex networks often requires capturing and analyzing traffic at the packet level. In this task, datacenter networks (DCNs) present unique challenges with their scale, traffic volume, and diversity of faults. To troubleshoot faults in a timely manner, DCN administrators must a) identify affected packets inside a large volume of traffic; b) track them across multiple network components; c) analyze traffic traces for fault patterns; and d) test or confirm potential causes. To our knowledge, no tool today can achieve both the specificity and scale required for this task. We present Everflow, a packet-level network telemetry system for large DCNs. Everflow traces specific packets by implementing a powerful packet filter on top of the "match and mirror" functionality of commodity switches. It shuffles captured packets to multiple analysis servers using load balancers built on switch ASICs, and it sends "guided probes" to test or confirm potential faults. We present experiments that demonstrate Everflow's scalability, and share experiences of troubleshooting network faults gathered from running it for over six months in Microsoft's DCNs.

  • Yibo Zhu, Haggai Eran, Daniel Firestone, Chuanxiong Guo, Marina Lipshteyn, Yehonatan Liron, Jitendra Padhye, Shachar Raindel, Mohamad Haj Yahia, Ming Zhang

    Modern datacenter applications demand high throughput (40 Gbps) and ultra-low latency (< 10 μs per hop) from the network, with low CPU overhead. Standard TCP/IP stacks cannot meet these requirements, but Remote Direct Memory Access (RDMA) can. On IP-routed datacenter networks, RDMA is deployed using the RoCEv2 protocol, which relies on Priority-based Flow Control (PFC) to enable a drop-free network. However, PFC can lead to poor application performance due to problems like head-of-line blocking and unfairness. To alleviate these problems, we introduce DCQCN, an end-to-end congestion control scheme for RoCEv2. To optimize DCQCN performance, we build a fluid model and provide guidelines for tuning switch buffer thresholds and other protocol parameters. Using a 3-tier Clos network testbed, we show that DCQCN dramatically improves the throughput and fairness of RoCEv2 RDMA traffic. DCQCN is implemented in Mellanox NICs and is being deployed in Microsoft's datacenters.
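
    A simplified sender-side sketch following the shape of the rate-update rules described in the paper (the gain constant and the single recovery step are illustrative): congestion notification packets (CNPs) cut the rate by a factor tracking the recent marking level α, while quiet periods decay α and recover the rate toward the pre-cut target.

    ```python
    # DCQCN-style multiplicative decrease and recovery (simplified).
    G = 1.0 / 256.0                  # alpha gain (illustrative)

    class Sender:
        def __init__(self, line_rate):
            self.rc = line_rate      # current rate
            self.rt = line_rate      # target rate (pre-cut)
            self.alpha = 1.0

        def on_cnp(self):            # receiver reported ECN marks
            self.rt = self.rc
            self.rc *= 1 - self.alpha / 2
            self.alpha = (1 - G) * self.alpha + G

        def on_quiet_period(self):   # no CNP during the last timer period
            self.alpha = (1 - G) * self.alpha
            self.rc = (self.rt + self.rc) / 2    # fast recovery toward target

    s = Sender(40e9)
    s.on_cnp()
    for _ in range(5):
        s.on_quiet_period()
    print(f"{s.rc / 1e9:.2f} Gbps")  # back near line rate after recovery
    ```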

  • Sam Burnett, Nick Feamster

    Despite the pervasiveness of Internet censorship, we have scant data on its extent, mechanisms, and evolution. Measuring censorship is challenging: it requires continual measurement of reachability to many target sites from diverse vantage points. Amassing suitable vantage points for longitudinal measurement is difficult; existing systems have achieved only small, short-lived deployments. We observe, however, that most Internet users access content via Web browsers, and the very nature of Web site design allows browsers to make requests to domains with different origins than the main Web page. We present Encore, a system that harnesses cross-origin requests to measure Web filtering from a diverse set of vantage points without requiring users to install custom software, enabling longitudinal measurements from many vantage points. We explain how Encore induces Web clients to perform cross-origin requests that measure Web filtering, design a distributed platform for scheduling and collecting these measurements, show the feasibility of a global-scale deployment with a pilot study and an analysis of potentially censored Web content, identify several cases of filtering in six months of measurements, and discuss ethical concerns that would arise with widespread deployment.

  • Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, Bruno Sinopoli

    User-perceived quality-of-experience (QoE) is critical in Internet video applications as it impacts revenues for content providers and delivery systems. Given that there is little support in the network for optimizing such measures, bottlenecks could occur anywhere in the delivery system. Consequently, a robust bitrate adaptation algorithm in client-side players is critical to ensure good user experience. Previous studies have shown key limitations of state-of-the-art commercial solutions and proposed a range of heuristic fixes. Despite the emergence of several proposals, there is still a distinct lack of consensus on: (1) how best to design the client-side bitrate adaptation logic (e.g., using rate estimates vs. buffer occupancy); (2) how well specific classes of approaches perform under diverse operating regimes (e.g., high throughput variability); and (3) how they actually balance different QoE objectives (e.g., startup delay vs. rebuffering). To this end, this paper makes three key technical contributions. First, to bring some rigor to this space, we develop a principled control-theoretic model to reason about a broad spectrum of strategies. Second, we propose a novel model predictive control algorithm that can optimally combine throughput and buffer occupancy information to outperform traditional approaches. Third, we present a practical implementation in a reference video player to validate our approach using realistic trace-driven emulations.
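
    A tiny model-predictive sketch in the spirit of the paper's approach (the bitrate ladder, QoE weights, buffer model, and exhaustive horizon search are all placeholders): try bitrate sequences over a short horizon against predicted throughput and apply the first step of the best one.

    ```python
    import itertools

    BITRATES = [0.3, 0.75, 1.2, 2.4]            # Mbps ladder (illustrative)
    CHUNK_S = 4.0                               # seconds of video per chunk
    REBUF_PENALTY, SWITCH_PENALTY = 4.0, 1.0    # made-up QoE weights

    def qoe(seq, buf, tput):
        score, prev = 0.0, seq[0]
        for r in seq:
            download = r * CHUNK_S / tput       # seconds to fetch the chunk
            rebuf = max(0.0, download - buf)    # playback stalls if buffer runs dry
            buf = max(0.0, buf - download) + CHUNK_S
            score += r - REBUF_PENALTY * rebuf - SWITCH_PENALTY * abs(r - prev)
            prev = r
        return score

    def mpc_choose(buffer_s, predicted_tput, horizon=3):
        best = max(itertools.product(BITRATES, repeat=horizon),
                   key=lambda seq: qoe(seq, buffer_s, predicted_tput))
        return best[0]                          # apply only the first decision

    print(mpc_choose(buffer_s=8.0, predicted_tput=1.0))
    ```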

  • Manikanta Kotaru, Kiran Joshi, Dinesh Bharadia, Sachin Katti

    This paper presents the design and implementation of SpotFi, an accurate indoor localization system that can be deployed on commodity WiFi infrastructure. SpotFi only uses information that is already exposed by WiFi chips and does not require any hardware or firmware changes, yet achieves the same accuracy as state-of-the-art localization systems. SpotFi makes two key technical contributions. First, SpotFi incorporates super-resolution algorithms that can accurately compute the angle of arrival (AoA) of multipath components even when the access point (AP) has only three antennas. Second, it incorporates novel filtering and estimation techniques to identify the AoA of the direct path between the localization target and the AP, assigning each path a value that reflects how likely it is to be the direct path. Our experiments in a multipath-rich indoor environment show that SpotFi achieves a median accuracy of 40 cm and is robust to indoor hindrances such as obstacles and multipath.
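
    The geometric relation that AoA estimation on a linear antenna array builds on (standard array-processing background, not SpotFi's full super-resolution algorithm): a path arriving at angle θ reaches adjacent antennas, spaced d apart, with an extra path length d sin θ, and hence a measurable phase shift at wavelength λ.

    ```latex
    \[
      \Delta\phi = \frac{2\pi d \sin\theta}{\lambda}
      \qquad\Longrightarrow\qquad
      \theta = \arcsin\!\left(\frac{\lambda\,\Delta\phi}{2\pi d}\right)
    \]
    ```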

  • Virajith Jalaparti, Peter Bodik, Ishai Menache, Sriram Rao, Konstantin Makarychev, Matthew Caesar

    To reduce the impact of network congestion on big data jobs, cluster management frameworks use various heuristics to schedule compute tasks and/or network flows. Most of these schedulers consider the job input data fixed and greedily schedule the tasks and flows that are ready to run. However, a large fraction of production jobs are recurring with predictable characteristics, which allows us to plan ahead for them. Coordinating the placement of data and tasks of these jobs allows for significantly improving their network locality and freeing up bandwidth, which can be used by other jobs running on the cluster. With this intuition, we develop Corral, a scheduling framework that uses characteristics of future workloads to determine an offline schedule which (i) jointly places data and compute to achieve better data locality, and (ii) isolates jobs both spatially (by scheduling them in different parts of the cluster) and temporally, improving their performance. We implement Corral on Apache Yarn and evaluate it on a 210-machine cluster using production workloads. Compared to Yarn's capacity scheduler, Corral reduces the makespan of these workloads by up to 33% and the median completion time by up to 56%, with a 20-90% reduction in data transferred across racks.

  • Sanjit Biswas, John Bicket, Edmund Wong, Raluca Musaloiu-E, Apurv Bhartia, Dan Aguayo

    Meraki is a cloud-based network management system which provides centralized configuration, monitoring, and network troubleshooting tools across hundreds of thousands of sites worldwide. As part of its architecture, the Meraki system has built a database of time-series measurements of wireless link, client, and application behavior for monitoring and debugging purposes. This paper studies an anonymized subset of measurements, containing data from approximately ten thousand radio access points, tens of thousands of links, and 5.6 million clients from one-week periods in January 2014 and January 2015, to provide a deeper understanding of real-world network behavior. This paper observes the following phenomena: wireless network usage continues to grow quickly, driven mostly by growth in the number of devices connecting to each network. Intermediate link delivery rates are common indoors across a wide range of deployment environments. Typical access points share spectrum with dozens of nearby networks, but the presence of a network on a channel does not predict channel utilization. Most access points see 2.4 GHz channel utilization of 20% or more, with the top decile seeing greater than 50%, and the majority of the channel use contains decodable 802.11 headers.

  • Dinesh Bharadia, Kiran Raj Joshi, Manikanta Kotaru, Sachin Katti

    We present BackFi, a novel communication system that enables high-throughput, long-range communication between very low power backscatter IoT sensors and WiFi APs, using ambient WiFi transmissions as the excitation signal. Specifically, we show that it is possible to design IoT sensors and WiFi APs such that the WiFi AP, in the process of transmitting data to normal WiFi clients, can decode backscatter signals which the IoT sensors generate by modulating information onto the ambient WiFi transmission. We show via prototypes and experiments that it is possible to achieve communication rates of up to 5 Mbps at a range of 1 m and 1 Mbps at a range of 5 m. Such performance is one to three orders of magnitude better than the best known prior WiFi backscatter system [27, 25]. The BackFi design is energy-efficient, as it relies on backscattering alone and needs very little power, hence the energy consumed per bit is small.

  • Stevens Le Blond, David Choffnes, William Caldwell, Peter Druschel, Nicholas Merritt

    Effectively anonymizing Voice-over-IP (VoIP) calls requires a scalable anonymity network that is resilient to traffic analysis and has sufficiently low delay for high-quality voice calls. The popular Tor anonymity network, for instance, is not designed for the former and cannot typically achieve the latter. In this paper, we present the design, implementation, and experimental evaluation of Herd, an anonymity network where a set of dedicated, fully interconnected cloud-based proxies yield suitably low-delay circuits, while untrusted superpeers add scalability. Herd provides caller/callee anonymity among the clients within a trust zone (e.g., jurisdiction) and under a strong adversarial model. Simulations based on a trace of 370 million mobile phone calls among 10.8 million users indicate that Herd achieves anonymity among millions of clients with low bandwidth requirements, and that superpeers decrease the bandwidth and CPU requirements of the trusted infrastructure by an order of magnitude. Finally, experiments using a prototype deployment on Amazon EC2 show that Herd has a delay low enough for high-quality calls in most cases.

  • Paolo Costa, Hitesh Ballani, Kaveh Razavi, Ian Kash

    Rack-scale computers, comprising a large number of microservers connected by a direct-connect topology, are expected to replace servers as the building block in data centers. We focus on the problem of routing and congestion control across the rack's network, and find that high path diversity in rack topologies, in combination with workload diversity across it, means that traditional solutions are inadequate. We introduce R2C2, a network stack for rack-scale computers that provides flexible and efficient routing and congestion control. R2C2 leverages the fact that the scale of rack topologies allows for low-overhead broadcasting to ensure that all nodes in the rack are aware of all network flows. We thus achieve rate-based congestion control without any probing; each node independently determines the sending rate for its flows while respecting the provider's allocation policies. For routing, nodes dynamically choose the routing protocol for each flow in order to maximize overall utility. Through a prototype deployed across a rack emulation platform and a packet-level simulator, we show that R2C2 achieves very low queuing and high throughput for diverse and bursty workloads, and that routing flexibility can provide significant throughput gains.

  • Hitesh Ballani, Paolo Costa, Christos Gkantsidis, Matthew P. Grosvenor, Thomas Karagiannis, Lazaros Koromilas, Greg O'Shea

    Many network functions executed in modern datacenters, e.g., load balancing, application-level QoS, and congestion control, exhibit three common properties at the data plane: they need to access and modify state, to perform computations, and to access application semantics -- this is critical since many network functions are best expressed in terms of application-level messages. In this paper, we argue that the end hosts are a natural enforcement point for these functions and we present Eden, an architecture for implementing network functions at end hosts with minimal network support. Eden comprises three components, a centralized controller, an enclave at each end host, and Eden-compliant applications called stages. To implement network functions, the controller configures stages to classify their data into messages and the enclaves to apply action functions based on a packet's class. Our Eden prototype includes enclaves implemented both in the OS kernel and on programmable NICs. Through case studies, we show how application-level classification and the ability to run actual programs on the data-path allows Eden to efficiently support a broad range of network functions at the network's edge.

  • Maria Konte, Roberto Perdisci, Nick Feamster

    Bulletproof hosting Autonomous Systems (ASes)--malicious ASes fully dedicated to supporting cybercrime--provide freedom and resources for a cyber-criminal to operate. Their services include hosting a wide range of illegal content, botnet C&C servers, and other malicious resources. Thousands of new ASes are registered every year, many of which are often used exclusively to facilitate cybercrime. A natural approach to squelching bulletproof hosting ASes is to develop a reputation system that can identify them for takedown by law enforcement and as input to other attack detection systems (e.g., spam filters, botnet detection systems). Unfortunately, current AS reputation systems rely primarily on data-plane monitoring of malicious activity from IP addresses (and thus can only detect malicious ASes after attacks are underway), and are not able to distinguish between malicious and legitimate but abused ASes.

    As a complement to these systems, in this paper, we explore a fundamentally different approach to establishing AS reputation. We present ASwatch, a system that identifies malicious ASes using exclusively the control-plane (i.e., routing) behavior of ASes. ASwatch's design is based on the intuition that, in an attempt to evade possible detection and remediation efforts, malicious ASes exhibit "agile" control plane behavior (e.g., short-lived routes, aggressive re-wiring). We evaluate our system on known malicious ASes; our results show that ASwatch detects up to 93% of malicious ASes with a 5% false positive rate, which is reasonable to effectively complement existing defense systems.

  • Renaud Hartert, Stefano Vissicchio, Pierre Schaus, Olivier Bonaventure, Clarence Filsfils, Thomas Telkamp, Pierre Francois

    SDN simplifies network management by relying on declarativity (high-level interface) and expressiveness (network flexibility). We propose a solution to support those features while preserving the high robustness and scalability needed in carrier-grade networks. Our solution is based on (i) a two-layer architecture separating connectivity and optimization tasks; and (ii) a centralized optimizer called DEFO, which translates high-level goals expressed almost in natural language into compliant network configurations. Our evaluation on real and synthetic topologies shows that DEFO improves the state of the art by (i) achieving better trade-offs for classic goals covered by previous works, (ii) supporting a larger set of goals (refined traffic engineering and service chaining), and (iii) optimizing large ISP networks in a few seconds. We also quantify the gains of our implementation, running Segment Routing on top of IS-IS, over possible alternatives (RSVP-TE and OpenFlow).

  • Chuanxiong Guo, Lihua Yuan, Dong Xiang, Yingnong Dang, Ray Huang, Dave Maltz, Zhaoyi Liu, Vin Wang, Bin Pang, Hua Chen, Zhi-Wei Lin, Varugis Kurien

    Can we get network latency between any two servers at any time in large-scale data center networks? The collected latency data can then be used to address a series of challenges: telling if an application perceived latency issue is caused by the network or not, defining and tracking network service level agreement (SLA), and automatic network troubleshooting. We have developed the Pingmesh system for large-scale data center network latency measurement and analysis to answer the above question affirmatively. Pingmesh has been running in Microsoft data centers for more than four years, and it collects tens of terabytes of latency data per day. Pingmesh is widely used by not only network software developers and engineers, but also application and service developers and operators.

  • Stefano Vissicchio, Olivier Tilmans, Laurent Vanbever, Jennifer Rexford

    Centralizing routing decisions offers tremendous flexibility, but sacrifices the robustness of distributed protocols. In this paper, we present Fibbing, an architecture that achieves both flexibility and robustness through central control over distributed routing. Fibbing introduces fake nodes and links into an underlying link-state routing protocol, so that routers compute their own forwarding tables based on the augmented topology. Fibbing is expressive, and readily supports flexible load balancing, traffic engineering, and backup routes. Based on high-level forwarding requirements, the Fibbing controller computes a compact augmented topology and injects the fake components through standard routing-protocol messages. Fibbing works with any unmodified routers speaking OSPF. Our experiments also show that it can scale to large networks with many forwarding requirements, introduces minimal overhead, and quickly reacts to network and controller failures.

  • Yasir Zaki, Thomas Pötsch, Jay Chen, Lakshminarayanan Subramanian, Carmelita Görg

    Legacy congestion controls, including TCP and its variants, are known to perform poorly over cellular networks due to highly variable capacities over short time scales, self-inflicted packet delays, and packet losses unrelated to congestion. To cope with these challenges, we present Verus, an end-to-end congestion control protocol that uses delay measurements to react quickly to capacity changes in cellular networks without explicitly attempting to predict the cellular channel dynamics. The key idea of Verus is to continuously learn a delay profile that captures the relationship between end-to-end packet delay and outstanding window size over short epochs, and to use this relationship to increment or decrement the window size based on the observed short-term packet delay variations. While the delay-based control is primarily for congestion avoidance, Verus uses standard TCP features, including multiplicative decrease upon packet loss and slow start. Through a combination of simulations, empirical evaluations using cellular network traces, and real-world evaluations against standard TCP flavors and state-of-the-art protocols like Sprout, we show that Verus outperforms these protocols in cellular channels. In comparison to TCP Cubic, Verus achieves an order of magnitude (> 10x) reduction in delay over 3G and LTE networks while achieving comparable throughput (sometimes marginally higher). In comparison to Sprout, Verus achieves up to 30% higher throughput in rapidly changing cellular networks.
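
    An illustrative rendering of Verus's central loop (the EWMA constant, sample points, and target rule are invented): learn a profile mapping window size to observed delay, then pick the largest window whose learned delay stays under the current target.

    ```python
    # Delay-profile learning and window selection (simplified).
    class DelayProfile:
        def __init__(self):
            self.samples = {}                   # window size -> smoothed delay (ms)

        def update(self, window, delay, ewma=0.25):
            old = self.samples.get(window, delay)
            self.samples[window] = (1 - ewma) * old + ewma * delay

        def window_for(self, target_delay):
            fitting = [w for w, d in self.samples.items() if d <= target_delay]
            return max(fitting) if fitting else min(self.samples)

    profile = DelayProfile()
    for w, d in [(10, 30.0), (20, 45.0), (40, 90.0), (80, 250.0)]:  # made-up samples
        profile.update(w, d)

    # The target would tighten when delays rise and relax when they fall.
    print(profile.window_for(target_delay=100.0))   # -> 40
    ```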

  • Ramakrishnan Durairajan, Paul Barford, Joel Sommers, Walter Willinger

    The complexity and enormous costs of installing new long-haul fiber-optic infrastructure have led to a significant amount of infrastructure sharing in previously installed conduits. In this paper, we study the characteristics and implications of infrastructure sharing by analyzing the long-haul fiber-optic network in the US. We start by using fiber maps provided by tier-1 ISPs and major cable providers to construct a map of the long-haul US fiber-optic infrastructure. We also rely on previously underutilized data sources in the form of public records from federal, state, and municipal agencies to improve the fidelity of our map. We quantify the resulting map's connectivity characteristics and confirm a clear correspondence between long-haul fiber-optic, roadway, and railway infrastructures. Next, we examine the prevalence of high-risk links by mapping end-to-end paths resulting from large-scale traceroute campaigns onto our fiber-optic infrastructure map. We show how both risk and latency (i.e., propagation delay) can be reduced by deploying new links along previously unused transportation corridors and rights-of-way. In particular, focusing on a subset of high-risk links is sufficient to improve the overall robustness of the network to failures. Finally, we discuss the implications of our findings on issues related to performance, net neutrality, and policy decision-making.

  • Fangfei Chen, Ramesh K. Sitaraman, Marcelo Torres

    Content Delivery Networks (CDNs) deliver much of the world's web, video, and application content on the Internet today. A key component of a CDN is the mapping system that uses the DNS protocol to route each client's request to a "proximal" server that serves the requested content. While traditional mapping systems identify a client using the IP of its name server, we describe our experience in building and rolling out a novel system called end-user mapping that identifies the client directly by using a prefix of the client's IP address. Using measurements from Akamai's production network during the roll-out, we show that end-user mapping provides significant performance benefits for clients who use public resolvers, including an eight-fold decrease in mapping distance, a two-fold decrease in RTT and content download time, and a 30% improvement in the time-to-first-byte. We also quantify the scaling challenges in implementing end-user mapping, such as the 8-fold increase in DNS queries. Finally, we show that a CDN with a larger number of deployment locations is likely to benefit more from end-user mapping than a CDN with a smaller number of deployments.
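
    A sketch of the mapping decision (illustrative, not Akamai's system): with the EDNS0 client-subnet option, the resolver forwards a prefix of the client's address, so the CDN can key its choice on the client's location rather than the resolver's. The prefix-to-cluster table below is hypothetical.

    ```python
    import ipaddress

    CLUSTERS = {
        ipaddress.ip_network("203.0.113.0/24"): "cluster-frankfurt",
        ipaddress.ip_network("198.51.100.0/24"): "cluster-tokyo",
    }

    def choose_cluster(client_subnet, resolver_ip):
        net = ipaddress.ip_network(client_subnet, strict=False)
        for prefix, cluster in CLUSTERS.items():
            if net.subnet_of(prefix):
                return cluster                  # decision keyed on the client's prefix
        # legacy behavior: fall back to mapping by the resolver's address
        return f"fallback near resolver {resolver_ip}"

    print(choose_cluster("203.0.113.64/26", resolver_ip="8.8.8.8"))
    ```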

  • Justine Sherry, Peter Xiang Gao, Soumya Basu, Aurojit Panda, Arvind Krishnamurthy, Christian Maciocco, Maziar Manesh, João Martins, Sylvia Ratnasamy, Luigi Rizzo, Scott Shenker

    Network middleboxes must offer high availability, with automatic failover when a device fails. Achieving high availability is challenging because failover must correctly restore lost state (e.g., activity logs, port mappings) but must do so quickly (e.g., in less than typical transport timeout values to minimize disruption to applications) and with little overhead to failure-free operation (e.g., additional per-packet latencies of 10-100s of μs). No existing middlebox design provides failover that is correct, fast to recover, and imposes little increased latency on failure-free operations. We present a new design for fault-tolerance in middleboxes that achieves these three goals. Our system, FTMB (for Fault-Tolerant MiddleBox), adopts the classical approach of "rollback recovery" in which a system uses information logged during normal operation to correctly reconstruct state after a failure. However, traditional rollback recovery cannot maintain high throughput given the frequent output rate of middleboxes. Hence, we design a novel solution to record middlebox state which relies on two mechanisms: (1) 'ordered logging', which provides lightweight logging of the information needed after recovery, and (2) a 'parallel release' algorithm which, when coupled with ordered logging, ensures that recovery is always correct. We implement ordered logging and parallel release in Click and show that for our test applications our design adds only 30 μs of latency to median per-packet latencies. Our system introduces moderate throughput overheads (5-30%) and can reconstruct lost state in 40-275 ms for practical systems.
