Distributed consensus is fundamental to achieving fault tolerance in distributed systems. The Paxos algorithm has long dominated this domain, although it has recently been challenged by algorithms such as Raft and Viewstamped Replication Revisited. These algorithms rely on Paxos's original assumptions; unfortunately, these assumptions are now at odds with the reality of the modern Internet. Our insight is that current consensus algorithms have significant availability issues when deployed outside the well-defined context of the datacenter. To illustrate this problem, we developed Coracle, a tool for evaluating distributed consensus algorithms in settings that more accurately represent realistic deployments. We have used Coracle to test two example network configurations that contradict the liveness claims of the Raft algorithm. Through the process of exercising these algorithms under more realistic assumptions, we demonstrate wider availability issues faced by consensus algorithms when deployed on real-world networks.
In this paper we challenge the applicability of entropy-based approaches for detecting and diagnosing network traffic anomalies, and claim that full statistics (i.e., empirical probability distributions) should be applied to improve change-detection capabilities. We support our claim by detecting and diagnosing large-scale traffic anomalies in a real cellular network, caused by specific OTT (Over-The-Top) services and smartphone devices. Our results clearly suggest that anomaly detection and diagnosis based on entropy analysis are prone to errors and miss typical characteristics of traffic anomalies, particularly in the studied scenario.
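To make the claim concrete, here is a toy illustration (with hypothetical traffic shares, not data from the paper): Shannon entropy is invariant under permutations of a distribution, so an anomaly that merely reshuffles traffic shares across destinations leaves an entropy time series flat, while a comparison of the full empirical distributions flags it immediately.

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of an empirical probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def l1_distance(p, q):
    """L1 distance between two empirical distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical per-destination traffic shares.
baseline = [0.7, 0.2, 0.1]   # heavy hitter is destination 0
anomalous = [0.1, 0.2, 0.7]  # heavy hitter moved to destination 2

# Entropy is permutation-invariant, so the anomaly is invisible...
assert abs(entropy(baseline) - entropy(anomalous)) < 1e-9
# ...whereas the full distributions differ sharply.
assert l1_distance(baseline, anomalous) > 1.0
```

Any distribution-level distance (L1, Kullback-Leibler, etc.) exposes this class of change that a scalar entropy summary compresses away.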
We present Monocle, a system that systematically monitors the network data plane, and verifies that it corresponds to the view that the SDN controller builds and tries to enforce in the switches. Our evaluation shows that Monocle is capable of fine-grained per-rule monitoring for the majority of rules. In addition, it can help controllers to cope with switches that exhibit transient inconsistencies between their control plane and data plane states.
Many geo-distributed systems rely on replica selection algorithms to communicate with the closest set of replicas. Unfortunately, the bursty nature of Internet traffic and ever-changing network conditions make it difficult to identify the best choice of replicas. Suboptimal replica choices result in increased response latency and reduced system performance. In this work we present GeoPerf, a tool that aims to automate the testing of geo-distributed replica selection algorithms. We used GeoPerf to test Cassandra and MongoDB, two popular data stores, and found bugs in each of these systems.
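As a sketch of why bursty latency trips up replica selection, consider a simplified selector in the spirit of latency-aware snitches (illustrative code, not GeoPerf's or either data store's actual algorithm): each replica's latency samples are smoothed with an exponentially weighted moving average (EWMA), and the replica with the lowest estimate wins. A single early burst can keep a replica's estimate inflated long after it has recovered, so the selector sticks with a now-slower alternative.

```python
def ewma_update(old, sample, alpha=0.3):
    """Exponentially weighted moving average of latency samples (ms)."""
    return sample if old is None else alpha * sample + (1 - alpha) * old

def select_replica(estimates):
    """Pick the replica with the lowest smoothed latency estimate."""
    return min(estimates, key=estimates.get)

est = {"r1": None, "r2": None}
for s in [10, 12, 11]:        # r1: steadily ~11 ms
    est["r1"] = ewma_update(est["r1"], s)
for s in [20, 5, 5]:          # r2: one burst, then ~5 ms
    est["r2"] = ewma_update(est["r2"], s)

# The stale burst still dominates r2's estimate, so the selector
# prefers r1 even though r2 is currently the faster replica.
assert select_replica(est) == "r1"
```

Exercising exactly this kind of staleness and tie-breaking behaviour under controlled network conditions is what automated testing of replica selection targets.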
Increasing network speeds challenge the packet processing performance of networked systems. This can mainly be attributed to processing overhead caused by the split between the kernel-space network stack and user-space applications. To mitigate this overhead, we propose Santa, an application-agnostic kernel-level cache of frequent requests. By allowing user-space applications to offload frequent requests to the kernel-space, Santa offers drastic performance improvements and unlocks the speed of kernel-space networking for legacy server software without requiring extensive changes.
HTTP-based video streaming services have dominated global IP traffic over the last few years. Caching of video content reduces the load on the content servers. In the case of Dynamic Adaptive Streaming over HTTP (DASH), the server needs to host multiple representations of the same video file for every video. These individual representations are further broken down into smaller segments. Hence, for each video the server needs to host thousands of segments, of which the client downloads only a subset. Also, depending on network conditions, the adaptation scheme used at the client end might request a different set of video segments (varying in bitrate) for the same video. The caching of DASH videos therefore presents unique challenges. In order to maximize cache hits and minimize misses for DASH video streaming services, we propose an Adaptation-Aware Cache (AAC) framework to determine the segments that are to be prefetched and retained in the cache. In the proposed scheme, we use bandwidth estimates at the cache server and knowledge of the rate adaptation scheme used by the client to estimate the next segment requests, thus improving prefetching at the cache.
We present a flexible scheme and an optimization algorithm for request routing in Content Delivery Networks (CDN). Our online approach, which is based on Lyapunov theory, provides a stable quality of service to clients, while improving content delivery delays. It also reduces data transport costs for operators.
In this paper, we introduce MMPTCP, a novel transport protocol which aims at unifying the way data is transported in data centres. MMPTCP runs in two phases; initially, it randomly scatters packets in the network under a single congestion window exploiting all available paths. This is beneficial to latency-sensitive flows. During the second phase, MMPTCP runs in Multi-Path TCP (MPTCP) mode, which has been shown to be very efficient for long flows. Initial evaluation shows that our approach significantly improves short flow completion times while providing high throughput for long flows and high overall network utilisation.
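The two-phase idea can be sketched in a few lines (the byte threshold and the stand-in scheduler below are hypothetical choices for illustration, not values from the MMPTCP design):

```python
import random

PHASE_SWITCH_BYTES = 100 * 1024  # hypothetical cutover point

def choose_path(bytes_sent, paths, subflow_scheduler, rng):
    """Phase 1: scatter packets uniformly at random over all paths,
    which helps latency-sensitive short flows. Phase 2: defer to an
    MPTCP-style subflow scheduler, which suits long flows."""
    if bytes_sent < PHASE_SWITCH_BYTES:
        return rng.choice(paths)
    return subflow_scheduler(paths)

rng = random.Random(0)
paths = ["p0", "p1", "p2", "p3"]
first_subflow = lambda ps: ps[0]  # stand-in for a real MPTCP scheduler

assert choose_path(0, paths, first_subflow, rng) in paths          # scattered
assert choose_path(PHASE_SWITCH_BYTES, paths, first_subflow, rng) == "p0"
```

Short flows thus finish before ever leaving the scatter phase, while long flows spend most of their lifetime under MPTCP-style scheduling.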
Many emerging applications and the ubiquity of wireless signals have accelerated the development of Device-Free Localization (DFL) techniques, which can localize objects without requiring them to carry any wireless devices. Most traditional DFL methods share a main drawback: the pre-obtained Received Signal Strength (RSS) measurements (i.e., the fingerprint) from one area cannot be directly applied to a new area for localization, and the calibration process required for each area demands exhausting human effort. In this paper, we propose FALE, a fine-grained transferring DFL method that can adaptively work in different areas with little human effort and low energy consumption. FALE employs a rigorously designed transfer function to map the fingerprint into a projected space and reuse it across different areas, thus greatly reducing human effort. Furthermore, FALE reduces data volume and energy consumption by taking advantage of compressive sensing (CS) theory. Extensive real-world experimental results illustrate the effectiveness of FALE.
Authentication of smart objects is a major challenge for the Internet of Things (IoT), and has been left open in DTLS. Leveraging locally managed IPv6 addresses with identity-based cryptography (IBC), we propose an efficient end-to-end authentication that (a) assigns a robust and deployment-friendly federation scheme to gateways of IoT subnetworks, and (b) has been evaluated with modern twisted Edwards elliptic curve cryptography (ECC). Our early results demonstrate feasibility and promise efficiency after ongoing optimisations.
We propose eSDN, a practical approach for deploying new datacenter transports without requiring any changes to the switches. eSDN uses lightweight SDN controllers at the end-hosts to query network state. It obviates the need for statistics collection by a centralized controller, especially on short timescales. We show that eSDN scales well and allows a range of datacenter transports to be realized.
A common pattern in the architectures of modern interactive web services is that of large request fan-outs, where even a single end-user request (task) arriving at an application server triggers tens to thousands of data accesses (sub-tasks) to different stateful backend servers. The overall response time of each task is bottlenecked by the completion time of the slowest sub-task, making such workloads highly sensitive to the tail of the latency distribution of the backend tier. The large number of decentralized application servers and skewed workload patterns exacerbate this problem. We address these challenges through BetteR Batch (BRB). By carefully scheduling requests in a decentralized and task-aware manner, BRB enables low-latency distributed storage systems to deliver predictable performance in the presence of large request fan-outs. Our preliminary simulation results based on production workloads show that our proposed design comes within 38% of an ideal system model at the 99th-percentile latency, while improving latency over the state of the art by a factor of 2.
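The tail sensitivity follows from simple probability: a task is fast only if every one of its sub-tasks is, so the chance of dodging the per-server tail shrinks geometrically with fan-out. A back-of-the-envelope check with illustrative numbers (not figures from the paper):

```python
p_sub_fast = 0.99      # each backend replies before its p99 deadline
fanout = 100           # sub-tasks per end-user task

# All sub-tasks must be fast for the task to be fast.
p_task_fast = p_sub_fast ** fanout

# With 100-way fan-out, fewer than 37% of tasks avoid the tail,
# even though each individual server is slow only 1% of the time.
assert p_task_fast < 0.37
assert p_task_fast > 0.36
```

This is why scheduling that shapes the backend latency tail pays off far more at the task level than per-server averages suggest.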
Content-Centric Networking (CCN) has recently emerged as a clean-slate Future Internet architecture with a completely different communication pattern from existing IP networks. Since the World Wide Web has become one of the most popular and important applications on the Internet, effectively supporting the dominant browser- and server-based web applications is key to the success of CCN. However, existing web browsers and servers are mainly designed for the HTTP protocol over TCP/IP networks and cannot directly support CCN-based web applications. Existing research mainly focuses on plug-in or proxy/gateway approaches at the client and server sides, and these schemes seriously degrade service performance due to multiple protocol conversions. To address the above problems, we designed and implemented CCNBrowser, a CCN web browser, and CCNxTomcat, a CCN web server, both of which natively support the CCN protocol. To facilitate a smooth evolution from IP networks to CCN, CCNBrowser and CCNxTomcat also support the HTTP protocol besides CCN. Experimental results show that CCNBrowser and CCNxTomcat outperform existing implementations. Finally, a real CCN-based web application is deployed on a CCN experimental testbed, which validates the applicability of CCNBrowser and CCNxTomcat.
Many networking visionaries agree that 5G will be much more than an incremental improvement of 4G in terms of data rate. Beyond the mobile networks, 5G will fundamentally influence the core infrastructure as well. In our vision, realizing the challenging promises of 5G (e.g., extremely fast, low-overhead, low-delay access to mostly cloudified services and content) will require the massive use of multipathing equipped with low-overhead transport solutions tailored to fast, reliable, and secure data retrieval from cloud architectures. In this demo we present a prototype architecture supporting such services by making use of automatically configured multipath service chains implementing network-coding-based transport solutions over off-the-shelf software-defined networking (SDN) components.
The Resource Public Key Infrastructure (RPKI) stores attestation objects for Internet resources. In this demo, we present RPKI MIRO, an open source software framework to monitor and inspect these RPKI objects. RPKI MIRO provides resource owners, RPKI operators, researchers, and lecturers with intuitive access to the content of the deployed RPKI repositories. It helps to optimize the repository structure and to identify failures.
SDN opens a new chapter in network troubleshooting: besides misconfigurations and firmware/hardware errors, software bugs can occur all over the SDN stack. In answer to this challenge, the networking community has developed a wealth of piecemeal SDN troubleshooting tools, each aiming to track down misconfigurations or bugs of a specific nature (e.g., in a given SDN layer). In this demonstration we present EPOXIDE, an Emacs-based modular framework which can effectively combine existing network and software troubleshooting tools in a single platform, and which defines a possible way toward integrated SDN troubleshooting.
The demand-led growth of datacenter networks has meant that many constituent technologies are beyond the budget of the wider community. In order to make and validate timely and relevant new contributions, the wider community requires accessible evaluation, experimentation, and demonstration environments with specifications comparable to the subsystems of the most massive datacenter networks. We demonstrate NetFPGA, an open-source platform for rapid prototyping of networking devices with I/O capabilities up to 100Gbps. NetFPGA offers an integrated environment that enables networking research by users from a wide range of disciplines: from hardware-centric research to formal methods.
The need for fault tolerance and scalability is leading to the development of distributed SDN operating systems and applications. But how can you develop such systems and applications reliably without access to an expensive testbed? We continue to observe SDN development practices using full system virtualization or heavyweight containers, increasing complexity and overhead while decreasing usability. We demonstrate a simpler and more efficient approach: using Mininet's cluster mode to easily deploy a virtual testbed of lightweight containers on a single machine, an ad hoc cluster, or a dedicated hardware testbed. By adding an open source, distributed network operating system such as ONOS, we can create a flexible and scalable open source development platform for distributed SDN system and application software development.
This demo presents WiMAC, a general-purpose wireless testbed for researchers to quickly prototype a wide variety of real-time MAC protocols for wireless networks. As the interface between the link layer and the physical layer, MAC protocols are often tightly coupled with the underlying physical layer, and need to have extremely small latencies. Implementing a new MAC requires a long time. In fact, very few MACs have ever been implemented, even though dozens of new MAC protocols have been proposed. To enable quick prototyping, we employ the mechanism vs. policy separation to decompose the functionality in the MAC layer and the PHY layer. Built on the separation framework, WiMAC achieves the independence of the software from the hardware, offering a high degree of function reuse and design flexibility. Hence, our platform not only supports easy cross-layer design but also allows protocol changes on the fly. Following the 802.11-like reference design, we demonstrate that deploying a new MAC protocol is quick and simple on the proposed platform through the implementation of the CSMA/CA and CHAIN protocols.
Modern web pages are very complex; each consists of hundreds of objects that are linked from various servers all over the world. While mechanisms such as caching reduce the overall number of end-to-end requests, saving bandwidth and loading time, there is still a large portion of content that is re-fetched despite not having changed. In this demo, we present Extreme Cache, a web caching architecture that enhances the web browsing experience through a smart pre-fetching engine. Our extreme cache tries to predict the rate of change of web page objects in order to bring cacheable content closer to the user.
We present an Intelligent Training Exercise Environment (i-tee), a fully automated Cyber Defense Competition platform. The main features of i-tee are: automated attacks, automated scoring with immediate feedback via a scoreboard, and background traffic generation. The main advantage of the platform is easy integration into existing curricula and suitability for continuing education as well as on-site training at companies. The platform implements a modular approach called learning spaces for implementing different competitions and hands-on labs. The platform is highly automated, enabling a single person to run an exercise with up to 30 teams using a single server. The platform is publicly available under the MIT license.
The Resource Public Key Infrastructure (RPKI) allows BGP routers to verify the origin AS of an IP prefix. In this demo, we present a software extension which performs prefix origin validation in the web browser of end users. The browser extension shows the RPKI validation outcome of the web server infrastructure for the requested web domain. It follows the common plug-in concepts and does not require special modifications of the browser software. It operates on live data and helps end users as well as operators to gain better insight into the Internet security landscape.
We demonstrate an optical switch design that can scale up to a thousand ports with high per-port bandwidth (25 Gbps+) and low switching latency (40 ns). Our design uses a broadcast and select architecture, based on a passive star coupler and fast tunable transceivers. In addition we employ time division multiplexing to achieve very low switching latency. Our demo shows the feasibility of the switch data plane using a small testbed, comprising two transmitters and a receiver, connected through a star coupler.
Despite network monitoring and testing being critical for computer networks, current solutions are both extremely expensive and inflexible. This demo presents OSNT (www.osnt.org), a community-driven, high-performance, open-source traffic generator and capture system built on top of the NetFPGA-10G board which enables flexible network testing. The platform supports full line-rate traffic generation regardless of packet size across the four card ports, packet capture filtering and packet thinning in hardware, and sub-microsecond time precision in traffic generation and capture, corrected using an external GPS device. Furthermore, it provides software APIs to test the dataplane performance of multi-10G switches, providing a starting point for a number of different test cases. OSNT's flexibility is further demonstrated through the OFLOPS-turbo platform: an integration of OSNT with the OFLOPS OpenFlow switch performance evaluation platform, enabling control and data plane evaluation of 10G switches. This demo showcases the applicability of the OSNT platform to evaluate the performance of legacy and OpenFlow-enabled networking devices, and demonstrates it using commercial switches.
The quickly growing demand for wireless networks and the numerous application-specific requirements stand in stark contrast to today's inflexible management and operation of WiFi networks. In this paper, we present and evaluate OpenSDWN, a novel WiFi architecture based on an SDN/NFV approach. OpenSDWN exploits datapath programmability to enable service differentiation and fine-grained transmission control, facilitating the prioritization of critical applications. OpenSDWN implements per-client virtual access points and per-client virtual middleboxes to render network functions more flexible and to support mobility and seamless migration. OpenSDWN can also be used to outsource control over the home network to a participatory interface or to an Internet Service Provider.
We present Policy Graph Abstraction (PGA), which expresses network policies and service chain requirements graphically, as simply as drawing whiteboard diagrams. Different users independently draw policy graphs that can constrain each other. A PGA graph clearly captures user intents and invariants, and thus facilitates automatic composition of overlapping policies into a coherent policy.
Network Function Virtualization promises to reduce the cost of deploying and operating large networks by migrating various network functions from dedicated hardware appliances to software instances running on general-purpose networking and computing platforms. In this paper we demonstrate Scylla, a Programmable Network Fabric architecture for Enterprise WLANs. The framework supports basic Virtual Network Function lifecycle management functionalities such as instantiation, monitoring, and migration. We release the entire platform under a permissive license for academic use.
End-to-end service delivery often includes Network Functions (NFs) transparently inserted into the path. Flexible service chaining will require dynamic instantiation of both NFs and traffic forwarding overlays. Virtualization techniques in compute and networking, like cloud and Software Defined Networking (SDN), promise such flexibility for service providers. However, patching together existing cloud and network control mechanisms necessarily subordinates one to the other, e.g., OpenDaylight under an OpenStack controller. We designed and implemented a joint cloud and network resource virtualization and programming API. In this demonstration, we show that our abstraction is capable of flexible service chaining control across technology domains.
We present a demonstration of a real-time distributed MIMO system, DMIMO. DMIMO synchronizes transmissions from 4 distributed MIMO transmitters in time, frequency and phase, and performs distributed multi-user beamforming to independent clients. DMIMO is built on top of a Zynq hardware platform integrated with an FMCOMMS2 RF front end. The platform implements a custom 802.11n compatible MIMO PHY layer which is augmented with a lightweight distributed synchronization engine. The demonstration shows the received constellation points, channels, and effective data throughput at each client.
Welcome to the July issue of Computer Communications Review. Over the past months we have tried to make CCR a resource where our community publishes fresh new ideas, expresses positions on interesting new research directions, as well as reports back on community activities, like workshops and regional meetings. In addition, we have introduced the student advice column and the column edited by our industrial board aiming to provide a clearer bridge between scientific practice and technology in commercial products. I am really proud to see all these additions being embraced by the community.
This issue features three technical papers, two on cloud computing and one on TCP. Our sincerest thanks to all the authors that submit technical contributions to CCR. And my personal thanks to the tireless editorial board that always aims towards outstanding quality while providing constructive feedback for the continuous improvement of the submitted manuscripts.
The editorial zone features three papers as well. One reports on the workshop on Internet economics that took place late last year. The other two cover (i) research challenges in multi-domain network measurement and monitoring, and (ii) the lessons learnt from using RIPE’s ATLAS for measurement research. I hope that both editorials will provide useful insights to the CCR audience.
Our industrial column features a contribution from Akamai. The authors describe how research in our community has influenced the design of Akamai's content delivery network (CDN), as well as the practical realities they had to deal with to create that bridge between academic output and a scalable, robust content delivery network that serves trillions of requests per day.
This issue comes one month before the annual SIGCOMM conference, which will take place in London in August. For the past couple of years, SIGCOMM has been the venue where CCR presents its "best-of" papers from the previous four issues (July 2014 to April 2015). I am really happy to announce the following two "best of CCR" papers for the year 2014-2015.
Technical paper: "Programming Protocol-Independent Packet Processors", by P. Bosshart (Barefoot Networks), D. Daly (Intel), G. Gibb (Barefoot Networks), M. Izzard (Barefoot Networks), N. McKeown (Stanford University), J. Rexford (Princeton University), C. Schlesinger (Barefoot Networks), D. Talayco (Barefoot Networks), A. Vahdat (Google), G. Varghese (Microsoft), and D. Walker (Princeton University).
Editorial paper: "A Primer on IPv4 Scarcity", by P. Richter (TU Berlin/ICSI), M. Allman (ICSI), R. Bush (Internet Initiative Japan), and V. Paxson (UC Berkeley/ICSI).
Congratulations to the authors of the two best papers. I am really looking forward to seeing all of you in London in August.

Dina Papagiannaki
CCR Editor
Given a set of datacenters and groups of application clients, well-connected datacenters can be rented as traffic proxies to reduce client latency. Rental costs must be minimized while meeting application-specific latency needs. Here, we formally define the Cooperative Group Provisioning problem and show it is NP-hard to approximate within a constant factor. We introduce a novel greedy approach and demonstrate its promise through extensive simulation using real cloud network topology measurements and realistic client churn. We find that multi-cloud deployments dramatically increase the likelihood of meeting group latency thresholds with minimal cost increase compared to single-cloud deployments.
It is well known that cloud application performance can critically depend on the network. Over the last few years, several systems have been developed which provide the application with the illusion of a virtual cluster: a star-shaped virtual network topology connecting virtual machines to a logical switch with absolute bandwidth guarantees. In this paper, we debunk some of the myths around the virtual cluster embedding problem. First, we show that the virtual cluster embedding problem is not NP-hard, and present the fast and optimal embedding algorithm VC-ACE for arbitrary datacenter topologies. Second, we argue that resources may be wasted by enforcing star-topology embeddings, and alternatively promote a hose embedding approach. We discuss the computational complexity of hose embeddings and derive the HVC-ACE algorithm. Using simulations we substantiate the benefits of hose embeddings in terms of acceptance ratio and resource footprint.
This paper examines several TCP characteristics and their effect on existing passive RTT measurement techniques. In particular, using packet traces from three geographically distributed vantage points, we find relatively low use of TCP timestamps and significant presence of stretch acknowledgements. While the former simply affects the applicability of some measurement techniques, the latter may in principle affect the accuracy of RTT estimation. Using these insights, we quantify implications of common methodologies for passive RTT measurement. In particular, we show that, unlike delayed TCP acknowledgement, stretch acknowledgements do not distort RTT estimations.
The perfSONAR-based Multi-domain Network Performance Measurement and Monitoring Workshop was held on February 20-21, 2014 in Arlington, VA. The goal of the workshop was to review the state of the perfSONAR effort and catalyze future directions by cross-fertilizing ideas and distilling common themes among the diverse perfSONAR stakeholders, who include network operators and managers, end-users, and network researchers. The timing and organization of this second workshop are significant because an increasing number of groups within NSF-supported data-intensive computing and networking programs are dealing with measurement, monitoring, and troubleshooting of multi-domain issues. These groups are forming explicit measurement federations using perfSONAR to address a wide range of issues. In addition, new paradigms such as software-defined networking are emerging and being widely adopted to aid the traffic management needs of scientific communities and network operators. Consequently, there are new challenges that need to be addressed for extensible and programmable instrumentation, measurement data analysis, visualization, and middleware security features in perfSONAR. This report summarizes the workshop's efforts to bring together diverse groups for targeted short and long talks, to share the latest advances, and to identify gaps that exist in the community for solving end-to-end performance problems in an effective, scalable fashion.