Proceedings of the ACM on Networking: PACMNET: Vol. 1, No. CoNEXT3. 2023

PACMNET, V1, CoNEXT3, December 2023 Editorial

The Proceedings of the ACM on Networking (PACMNET) series presents the highest-quality research conducted in the areas of emerging computer networks and their applications. We encourage submissions that present new technologies, novel experimentation, creative use of networking technologies, and new insights made possible using analysis. The journal is strongly supported by the ACM Special Interest Group on Communications and Computer Networks (SIGCOMM) and involves top-level researchers on its Editorial Board.

This issue contains papers submitted for the June '23 deadline. 129 long papers and 19 short papers were submitted for this deadline. 24 of the long papers and 5 of the short papers were accepted after a rigorous review process organized with 50 Associate Editors and a few external reviewers. After review, each paper was conditionally accepted (with shepherding), allowed a "one-shot major" revision, or rejected. The review process was split into two rounds. All submissions received at least three reviews in the first round. The papers that advanced to the second round received at least two additional reviews. After an online discussion phase among the reviewers, papers were extensively discussed during an online Associate Editors' meeting where the final decisions were made. In case of major revision, the authors were invited to address the reviewers' comments and prepare a thoroughly revised version of their contribution. Revised papers were then thoroughly reviewed again before being either accepted or rejected. This issue collects papers that have been finally accepted for publication after going through the minor revision process.

The long papers published in this issue and in the two previous ones will be presented at the CoNEXT 2023 conference on December 5-8, in Paris, France together with the short papers that appear in the conference proceedings.

Amoeba: Circumventing ML-supported Network Censorship via Adversarial Reinforcement Learning

Embedding covert streams into a cover channel is a common approach to circumventing Internet censorship, due to censors' inability to examine encrypted information in otherwise permitted protocols (Skype, HTTPS, etc.). However, recent advances in machine learning (ML) enable detecting a range of anti-censorship systems by learning distinct statistical patterns hidden in traffic flows. Therefore, designing obfuscation solutions able to generate traffic that is statistically similar to innocuous network activity, in order to deceive ML-based classifiers at line speed, is difficult.

In this paper, we formulate a practical adversarial attack strategy against flow classifiers as a method for circumventing censorship. Specifically, we cast the problem of finding adversarial flows that will be misclassified as a sequence generation task, which we solve with Amoeba, a novel reinforcement learning algorithm that we design. Amoeba works by interacting with censoring classifiers without any knowledge of their model structure, but by crafting packets and observing the classifiers' decisions, in order to guide the sequence generation process. Our experiments using data collected from two popular anti-censorship systems demonstrate that Amoeba can effectively shape adversarial flows that have on average 94% attack success rate against a range of ML algorithms. In addition, we show that these adversarial flows are robust in different network environments and possess transferability across various ML models, meaning that once trained against one, our agent can subvert other censoring classifiers without retraining.

C2Store: C2 Server Profiles at Your Fingertips

How can we build a definitive capability for tracking C2 servers? Having a large-scale continuously updating capability would be essential for understanding the spatiotemporal behaviors of C2 servers and, ultimately, for helping contain botnet activities. Unfortunately, existing information from threat intelligence feeds and previous works is often limited to a specific set of botnet families or short-term data collections. Responding to this need, we present C2Store, an initiative to provide the most comprehensive information on C2 servers. Our work makes the following contributions: (a) we develop techniques to collect, verify, and combine C2 server addresses from five types of sources, including uncommon platforms, such as GitHub and Twitter; (b) we create an open-access annotated database of 335,967 C2 servers across 133 malware families, which supports semantically-rich and smart queries; (c) we identify surprising behaviors of C2 servers with respect to their spatiotemporal patterns and behaviors. First, we successfully mine Twitter and GitHub and identify C2 servers with a precision of 97% and 94%, respectively. Furthermore, we find that the threat feeds identify only 24% of the servers in our database, with Twitter and GitHub providing 32%. A surprising observation is the identification of 250 IP addresses, each of which hosts more than 5 C2 servers for different botnet families at the same time. Overall, we envision C2Store as an ongoing effort that will facilitate research by providing timely, historical, and comprehensive C2 server information by critically combining multiple sources of information.

Dances with Blues: Harnessing Multi-Frequency Carriers for Commodity Bluetooth Backscatter

We present DanBlue, a commodity Bluetooth backscatter system that can take multi-frequency signals as excitations. Unlike all prior systems, DanBlue leverages ambient Bluetooth signals of various frequencies to backscatter in the standard Bluetooth-hopping way. To do so, we first introduce an edge proxy to identify uncontrolled ambient Bluetooth signals. Then, we design a wideband channel hopping to enable fast frequency shifts for low-power tags, empowering backscatter hopping much like active Bluetooth hopping. We prototype the DanBlue tag using off-the-shelf FPGAs and the DanBlue edge with commercial off-the-shelf (COTS) chips. Through comprehensive field studies, we show that DanBlue supports hopping from any-frequency Bluetooth excitations to any-frequency Bluetooth channels. Furthermore, the accuracy of frequency identification is as high as 99% with less than 7.1 ms latency. For the first time, we demonstrate DanBlue can emulate the Bluetooth protocol stack and seamlessly build connectivity with multiple active Bluetooth radios. We believe DanBlue is taking a crucial step forward on fully functioning battery-free Bluetooth.

Data-driven Analysis of the Cost-Performance Trade-off of Reconfigurable Intelligent Surfaces in a Production Network

This paper presents a comprehensive study on the deployment of Reconfigurable Intelligent Surfaces (RIS) in urban environments with poor radio coverage. We focus on the city of London, a large metropolis where radio network planning presents unique challenges due to diverse geographical and structural features. Using crowd-sourced datasets, we analyze the Reference Signal Received Power (RSRP) from end-user devices to understand the existing radio coverage landscape of a major Mobile Network Operator (MNO). Our study identifies areas with poor coverage and proposes the deployment of RIS to enhance signal strength and coverage. We selected a set of potential sites for RIS deployment and, combining data from the MNO, data extracted from a real RIS prototype, and a ray-tracing tool, we analyzed the gains of this novel technology with respect to deploying more conventional technologies in terms of RSRP, coverage, and cost-efficiency.

To the best of our knowledge, this is the first data-driven analysis of the cost-efficiency of RIS technology in production urban networks. Our findings provide compelling evidence about the potential of RIS as a cost-efficient solution for enhancing radio coverage in complex urban mobile networks. More specifically, our results indicate that large-scale RIS technology, when applied in real-world urban mobile network scenarios, can achieve 72% of the coverage gains attainable by deploying additional cells with only 22% of their Total Cost of Ownership (TCO) over a 5-year timespan. Consequently, RIS technology offers around 3x higher cost-efficiency than other more conventional coverage-enhancing technologies.
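
The "around 3x" figure can be checked from the two percentages quoted above. The sketch below assumes that cost-efficiency means coverage gain per unit of TCO, with the additional-cells deployment normalized to 1.0 on both axes; that definition and the function name are our reading for illustration, not something the abstract states.

```python
# Back-of-the-envelope check of the "around 3x" claim.
# Assumption: cost-efficiency = coverage gain / TCO, with the
# additional-cells baseline normalized to 1.0 for both quantities.

def relative_cost_efficiency(coverage_fraction: float, tco_fraction: float) -> float:
    """Cost-efficiency relative to a baseline whose gain and cost are both 1.0."""
    if tco_fraction <= 0:
        raise ValueError("tco_fraction must be positive")
    return coverage_fraction / tco_fraction


# 72% of the coverage gain at 22% of the TCO, as quoted in the abstract.
ris_vs_cells = relative_cost_efficiency(coverage_fraction=0.72, tco_fraction=0.22)
print(f"RIS cost-efficiency relative to additional cells: {ris_vs_cells:.2f}x")
# prints: RIS cost-efficiency relative to additional cells: 3.27x
```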

DDoS2Vec: Flow-Level Characterisation of Volumetric DDoS Attacks at Scale

Volumetric Distributed Denial of Service (DDoS) attacks have been a severe threat to the Internet for more than two decades. Some success in mitigation has been achieved based on numerous defensive techniques created by the research community, implemented by the industry, and deployed by network operators. However, evolution is not a privilege of mitigations, and DDoS attackers have found better strategies and continue to cause harm. A key challenge in winning this race is understanding the various characteristics of DDoS attacks in network traffic at scale and in a realistic manner.

In this paper, we propose DDoS2Vec, a novel approach to characterise DDoS attacks in real-world Internet traffic using Natural Language Processing (NLP) techniques. DDoS2Vec is a domain-specific application of Latent Semantic Analysis that learns vector representations of potential DDoS attacks. We look into the link between natural language and computer network communication in a way that has not been previously studied. Our approach is evaluated on a large-scale dataset of flow samples collected from an Internet eXchange Point (IXP) over one year. We evaluate the performance of DDoS2Vec via multi-label classification in a Machine Learning (ML) scenario. DDoS2Vec characterises DDoS attacks more clearly than other baselines, including NLP-based approaches inspired by recent networks research and a basic non-NLP solution.

DINC: Toward Distributed In-Network Computing

In-network computing provides significant performance benefits, load reduction, and power savings. Still, an in-network service's functionality is strictly limited to a single hardware device. Research has focused on enabling on-device functionality, with limited consideration to distributed in-network computing. This paper explores the applicability of distributed computing to in-network computing. We present DINC, a framework enabling distributed in-network computing, generating deployment strategies, overcoming resource constraints and providing functionality guarantees across a network. It uses multi-objective optimization to provide a deployment strategy, slicing P4 programs accordingly. DINC was evaluated using seven different workloads on both data center and wide-area network topologies, demonstrating feasibility and scalability, providing efficient distribution plans within seconds.

Dissecting the Performance of Satellite Network Operators

The rapid growth of satellite network operators (SNOs) has revolutionized broadband communications, enabling global connectivity and bridging the digital divide. As these networks expand, it is important to evaluate their performance and efficiency. This paper presents the first comprehensive study of SNOs. We take an opportunistic approach and devise a methodology that allows us to identify public network measurements performed via SNOs. We apply this methodology to both M-Lab and RIPE public datasets, which allows us to characterize the low-level performance and footprint of up to 18 SNOs operating in different orbits. Finally, we identify and recruit paid testers on three popular SNOs (Starlink, HughesNet, and ViaSat) to evaluate the performance of popular applications like web browsing and video streaming.

Divided at the Edge - Measuring Performance and the Digital Divide of Cloud Edge Data Centers

Cloud providers are highly incentivized to reduce latency. One way they do this is by locating data centers as close to users as possible. These “cloud edge” data centers are placed in metropolitan areas and enable edge computing for residents of these cities. Therefore, which cities are selected to host edge data centers determines who has the fastest access to applications requiring edge compute — creating a digital divide between those closest and furthest from the edge. In this study we measure latency to the current and predicted cloud edge of three major cloud providers around the world. Our measurements use the RIPE Atlas platform targeting cloud regions, AWS Local Zones, and network optimization services that minimize the path to the cloud edge. An analysis of the digital divide shows rising inequality as the relative difference between users closest and farthest from cloud compute increases. We also find this inequality unfairly affects lower income census tracts in the US. This result is extended globally using remotely sensed night time lights as a proxy for wealth. Finally, we demonstrate that low earth orbit satellite internet can help to close this digital divide and provide more fair access to the cloud edge.

Empowerment of Atypical Viewers via Low-Effort Personalized Modeling of Video Streaming Quality

Quality of Experience (QoE) and QoE models are of increasing importance to networked systems. Traditional QoE modeling for video streaming applications builds a one-size-fits-all QoE model that underserves atypical viewers who perceive QoE differently. To address the problem of atypical viewers, this paper proposes iQoE (individualized QoE), a method that employs explicit, expressible, and actionable feedback from a viewer to construct a personalized QoE model for this viewer. The iterative iQoE design exercises active learning and combines a novel sampler with a modeler. The chief emphasis of our paper is on making iQoE sample-efficient and accurate. By leveraging the Microworkers crowdsourcing platform, we conduct studies with 120 subjects who provide 14,400 individual scores. According to the subjective studies, a session of about 22 minutes empowers a viewer to construct a personalized QoE model that, compared to the best of the 10 baseline models, delivers an average accuracy improvement of at least 42% for all viewers and at least 85% for the atypical viewers. The large-scale simulations based on a new technique of synthetic profiling expand the evaluation scope by exploring iQoE design choices, parameter sensitivity, and generalizability.

Enhancing the Unlinkability of Circuit-Based Anonymous Communications with k-Funnels

Anonymous communication systems are essential tools for preserving privacy and freedom of expression. However, traffic analysis attacks make it challenging to maintain unlinkability in circuit-based anonymity networks like Tor, enabling adversaries to deanonymize communications. To address this problem, we introduce k-funnel, a new security primitive that enhances the unlinkability of circuit-based anonymity networks, and we present BriK, a Tor pluggable transport that implements k-funnels. k-Funnels offer k-anonymity to a group of k clients by jointly tunneling their circuits' traffic through a bridge while ensuring that the client-generated flows are indistinguishable. BriK incorporates several defense mechanisms against traffic analysis attacks, including traffic shaping schemes, synchronization protocols, and approaches for monitoring exposure to statistical disclosure attacks. Our evaluation shows that BriK is able to support web browsing and video streaming while offering k-anonymity. We evaluate the security of BriK against traffic correlation attacks leveraging state-of-the-art deep learning classifiers without considering auxiliary information and find it highly resistant. Although k-funnels require the cooperation of mutually trusted clients, limiting their coordination, our work presents a new practical solution to strengthen unlinkability in circuit-based anonymity systems.

EXPLORA: AI/ML EXPLainability for the Open RAN

The Open Radio Access Network (RAN) paradigm is transforming cellular networks into a system of disaggregated, virtualized, and software-based components. These self-optimize the network through programmable, closed-loop control, leveraging Artificial Intelligence (AI) and Machine Learning (ML) routines. In this context, Deep Reinforcement Learning (DRL) has shown great potential in addressing complex resource allocation problems. However, DRL-based solutions are inherently hard to explain, which hinders their deployment and use in practice. In this paper, we propose EXPLORA, a framework that provides explainability of DRL-based control solutions for the Open RAN ecosystem. EXPLORA synthesizes network-oriented explanations based on an attributed graph that produces a link between the actions taken by a DRL agent (i.e., the nodes of the graph) and the input state space (i.e., the attributes of each node). This novel approach allows EXPLORA to explain models by providing information on the wireless context in which the DRL agent operates. EXPLORA is also designed to be lightweight for real-time operation. We prototype EXPLORA and test it experimentally on an O-RAN-compliant near-real-time RIC deployed on the Colosseum wireless network emulator. We evaluate EXPLORA for agents trained for different purposes and showcase how it generates clear network-oriented explanations. We also show how explanations can be used to perform informative and targeted intent-based action steering and achieve median transmission bitrate improvements of 4% and tail improvements of 10%.

Exploring the Benefits of Carbon-Aware Routing

Carbon emissions associated with fixed networks can be significant. However, accounting for these emissions is hard, requires changes to deployed equipment, and has contentious benefits. This work sheds light on the benefits of carbon-aware networks by exploring a set of potential carbon-related metrics and their use to define link cost in carbon-aware link-state routing algorithms. Using realistic network topologies, traffic patterns and grid carbon intensity, we identify useful metrics and limitations to carbon emissions reduction. Consequently, a new heuristic carbon-aware traffic engineering algorithm, CATE, is proposed. CATE takes advantage of carbon intensity and routers' dynamic power consumption, combined with powering down ports, to minimize carbon emissions. Our results show that there is no silver bullet to significant carbon reductions, yet there are promising directions without changes to existing routers' hardware.
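
To make the idea of a carbon-based link cost concrete, the sketch below runs a standard shortest-path computation (Dijkstra's algorithm, as used in link-state routing) over link costs that stand for carbon rather than delay or hop count. It is not CATE and not one of the paper's metrics: the topology, the node names, and the choice of grid carbon intensity as the per-link cost are all hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical topology: graph[u][v] is the carbon cost of link u -> v.
# Here the cost is the grid carbon intensity (gCO2/kWh) at the link's
# location, one of many possible metrics and chosen only for illustration.
Graph = Dict[str, Dict[str, float]]


def lowest_carbon_path(graph: Graph, src: str, dst: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm with non-negative, carbon-based link costs."""
    heap: List[Tuple[float, str, List[str]]] = [(0.0, src, [src])]
    settled: Dict[str, float] = {}
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in settled:
            continue  # already reached this node at an equal or lower cost
        settled[node] = cost
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(heap, (cost + link_cost, neighbor, path + [neighbor]))
    raise ValueError(f"no path from {src} to {dst}")


topology: Graph = {
    "A": {"B": 400.0, "C": 50.0},
    "B": {"D": 400.0},
    "C": {"D": 100.0},
    "D": {},
}
print(lowest_carbon_path(topology, "A", "D"))  # (150.0, ['A', 'C', 'D'])
```

Swapping the cost function is the only change relative to conventional link-state routing, which is why the choice of metric carries so much weight in this kind of study.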

FlexCP: A Scalable Multipath TCP Proxy for Cellular Networks

Research has shown that Multipath TCP (MPTCP) improves the quality of a TCP connection by exploiting multiple paths, but its adoption in the wide area network is still fledgling. While MPTCP-TCP proxying is often employed as a practical solution, the performance of a split-connection proxy is suboptimal – it wastes CPU cycles on content relaying between two connections while it does not efficiently leverage multiple CPU cores in packet processing.

We present FlexCP, a high-performance MPTCP-TCP proxy based on the following properties. First, FlexCP operates by translating the two protocols on a packet level. This approach not only avoids the overhead of flow reassembly and memory copying, but it greatly simplifies the implementation as the proxy stays away from reliable data transfer, socket buffer management, and per-hop congestion/flow control. Second, FlexCP maintains connection-to-core affinity for multiple subflows of the same MPTCP connection and its corresponding TCP connection by leveraging SmartNIC. This enables a lock-free implementation for packet processing, which significantly improves the performance. Our evaluation demonstrates that FlexCP achieves 281 Gbps of connection proxying on a single machine, outperforming existing proxies by up to 6.3× in terms of throughput while it incurs little extra latency over direct TCP/MPTCP connections.

Millions of Low-latency State Insertions on ASIC Switches

Key-value data structures are an essential component of today's stateful packet processors such as load balancers, packet schedulers, and more. Realizing key-value data structures entirely in the data plane of an ASIC switch would bring enormous energy savings. Yet, today's implementations are ill-suited for stateful packet processing as they support only a limited number of flow-state insertions per second into these data structures. In this paper, we present SWITCHAROO, a mechanism for realizing key-value data structures on programmable ASIC switches that supports both high-frequency insertions and fast lookups entirely in the data plane. We show that SWITCHAROO can be realized on ASICs and supports millions of flow-state insertions per second with only a limited amount of packet recirculation.

Modular Data Plane Verification for Compositional Networks

Modern networks are increasingly using layering and bridging to form a compositional architecture. Layering protocols like VXLAN create multiple overlay networks on top of a single underlay network infrastructure. This makes network configurations even more complex, and error-prone. To check the correctness of such compositional networks, one needs to model the dependency across multiple layers (underlay and overlay) and multiple domains (different VPNs/VPCs). Existing verifiers, which are optimized to scale in single-layer single-domain networks, exhibit scalability limitations when applied to compositional networks. This paper proposes MNV, a modular network verifier that scales to large compositional networks. At its core is a new verification method termed decompose-merge reasoning, which decomposes the network into self-contained modules, verifies each module independently, and merges the verification results. Our experiments show that for a typical data center network virtualized with VXLAN, to check reachability for more than 100 million pairs of subnets, MNV is at least 100x faster than state-of-the-art tools.

Packed to the Brim: Investigating the Impact of Highly Responsive Prefixes on Internet-wide Measurement Campaigns

Internet-wide scans are an important tool to evaluate the deployment of services. To enable large-scale application layer scans, a fast, stateless port scan (e.g., using ZMap) is often performed ahead of time to collect responsive targets. It is a common expectation that port scans on the entire IPv4 address space provide a relatively unbiased view as they cover the complete address space. Previous work, however, has found prefixes where all addresses share particular properties. In IPv6, aliased prefixes and fully responsive prefixes, i.e., prefixes where all addresses are responsive, are a well-known phenomenon. However, there is no such in-depth analysis for prefixes with these responsiveness patterns in IPv4.

This paper delves into the underlying factors of this phenomenon in the context of IPv4 and evaluates port scans on a total of 161 ports (142 TCP & 19 UDP ports) from three different vantage points. To account for packet loss and other scanning artifacts, we propose the notion of a new category of prefixes, which we call highly responsive prefixes (HRPs). Our findings show that the share of HRPs can make up 70% of responsive addresses on selected ports. Regarding specific ports, we observe that CDNs contribute to the largest fraction of HRPs on TCP/80 and TCP/443, while TCP proxies emerge as the primary cause of HRPs on other ports. Our analysis also reveals that application layer handshakes to targets outside HRPs are, depending on the chosen service, up to three times more likely to be successful compared to handshakes with targets located in HRPs. To improve future scanning campaigns conducted by the research community, we make our study's data publicly available and provide a tool for detecting HRPs. Furthermore, we propose an approach for a more efficient, ethical, and sustainable application layer target selection. We demonstrate that our approach has the potential to reduce the number of TLS handshakes by up to 75% during an Internet-wide scan while successfully obtaining 99% of all unique certificates.
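
As a rough illustration of what detecting highly responsive prefixes could look like, the sketch below groups responsive IPv4 addresses from a single port scan by prefix and flags prefixes in which nearly all addresses answered. The /24 granularity, the 90% threshold, and the function name are assumptions made for this example; they are not taken from the paper or its tool.

```python
import ipaddress
from collections import Counter
from typing import Iterable, List


def highly_responsive_prefixes(
    responsive: Iterable[str],
    prefix_len: int = 24,      # assumed prefix granularity
    min_share: float = 0.9,    # assumed threshold, tolerating some packet loss
) -> List[ipaddress.IPv4Network]:
    """Return prefixes whose share of responsive addresses is >= min_share."""
    counts: Counter = Counter()
    for addr in set(responsive):  # deduplicate repeated responses
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[net] += 1
    prefix_size = 2 ** (32 - prefix_len)
    return sorted(net for net, n in counts.items() if n / prefix_size >= min_share)


# 240 of 256 addresses respond in 192.0.2.0/24; only one in 198.51.100.0/24.
scan = [f"192.0.2.{i}" for i in range(240)] + ["198.51.100.7"]
print(highly_responsive_prefixes(scan))  # [IPv4Network('192.0.2.0/24')]
```

A threshold below 100% is what separates this notion from a fully responsive prefix: a few lost probes should not hide a prefix that answers everywhere.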

Practical Packet Deflection in Datacenters

Bursts, sudden surges in network utilization, are a significant root cause of packet loss and high latency in datacenters. Packet deflection, re-routing packets that arrive at a local hotspot to neighboring switches, is shown to be a potent countermeasure against bursts. Unfortunately, existing deflection techniques cannot be implemented in today's datacenter switches. This is because, to minimize packet drops and remain effective under extreme load, existing deflection techniques rely on certain hardware primitives (e.g., extracting packets from arbitrary locations in the queue) that datacenter switches do not support. In this paper, we address the implementability hurdles of packet deflection. This paper proposes heuristics for approximating state-of-the-art deflection techniques in programmable switches. We introduce Simple Deflection which deflects excess traffic to randomly selected, non-congested ports and Preemptive Deflection (PD) in which switches identify the packets likely to be selected for deflection and preemptively deflect them before they are enqueued. We implement and evaluate our techniques on a testbed with Intel Tofino switches. Our testbed evaluations show that Simple and Preemptive Deflection improve the 99th percentile response times by 8× and 425×, respectively, compared to a baseline drop-tail queue under 90% load. Using large-scale network simulations, we show that the performance of our algorithms is close to the deflection techniques that they intend to approximate, e.g., PD achieves 4% lower 99th percentile query completion times (QCT) than Vertigo, a recent deflection technique that cannot be implemented in off-the-shelf switches, and 2.5× lower QCT than ECMP under 95% load.
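
The Simple Deflection idea described above, deflecting excess traffic to a randomly selected non-congested port, can be sketched in a few lines. This is a software illustration only, not the authors' switch implementation: the per-port queue depths, the single `capacity` threshold used to decide congestion, and the function name are hypothetical.

```python
import random
from typing import Optional, Sequence


def simple_deflection(
    queue_depths: Sequence[int],
    out_port: int,
    capacity: int,
    rng: random.Random,
) -> Optional[int]:
    """Return the port to enqueue a packet on, or None to drop it.

    queue_depths[i] is the current occupancy of port i's queue. A port is
    treated as congested once its occupancy reaches `capacity`
    (a hypothetical threshold chosen for this sketch).
    """
    if queue_depths[out_port] < capacity:
        return out_port  # intended port has room: forward normally
    candidates = [
        port for port, depth in enumerate(queue_depths)
        if port != out_port and depth < capacity
    ]
    if not candidates:
        return None  # every port is congested: the packet is dropped
    return rng.choice(candidates)  # deflect to a random non-congested port


rng = random.Random(0)
depths = [8, 3, 8, 5]  # ports 0 and 2 are full when capacity is 8
print(simple_deflection(depths, out_port=0, capacity=8, rng=rng))  # port 1 or 3
```

Preemptive Deflection differs in that the decision is taken before a packet is enqueued, based on which packets are likely to be selected for deflection; that prediction step is not modeled here.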

Revocation Speedrun: How the WebPKI Copes with Fraudulent Certificates

The TLS ecosystem depends on certificates to bootstrap secure connections. Certificate Authorities (CAs) are trusted to issue these correctly. However, as a result of security breaches or attacks, certificates may be issued fraudulently and need to be revoked prematurely.

Revocation, as a reactive measure, is fundamentally damage control and, as such, time is critical. Therefore, measuring reaction delay is the first step to identifying how well the revocation system functions.

In this paper we attempt to characterize the current performance of the WebPKI in dealing with fraudulent certificates. We present measurements of each step in the revocation process: the detection of certificate issuance through Certificate Transparency (CT) monitoring, the administrative revocation process at popular CAs, and the revocation checking behavior of end-user clients, both in a controlled virtualized environment and in the wild. We perform two live measurements, in 2022 and 2023, respectively, to provide a longitudinal comparison.

We find that detection and revocation of fraudulent certificates are quick and efficient when leveraging CT and can be completed within 6.5 hours on average. Furthermore, CT is being increasingly enforced by some browsers. However, ∼83% of the clients we observed, across popular browsers, brands and OSes, completely disregard a certificate's status, while all of the studied browsers still display soft-fail behavior, making them vulnerable to attackers capable of interfering with the network. Of the clients that do check revocation, we find that 35% can be made to accept a revoked certificate through the use of OCSP Stapling. We expect this number to grow with client-side adoption of OCSP Stapling [RFC6961]. Current OCSP expiration times allow a revoked certificate to remain fully valid for up to 7 days for the majority of CAs, exposing clients to attacks.

SPADA: A Sparse Approximate Data Structure Representation for Data Plane Per-flow Monitoring

Accurate per-flow monitoring is critical for precise network diagnosis, performance analysis, and network operation and management in general. However, the limited amount of memory available on modern programmable devices and the large number of active flows force practitioners to monitor only the most relevant flows with approximate data structures, limiting their view of network traffic. We argue that, due to the skewed nature of network traffic, such data structures are, in practice, heavily underutilized, i.e., sparse, thus wasting a significant amount of memory.

This paper proposes a Sparse Approximate Data Structure (SPADA) representation that leverages sparsity to reduce the memory footprint of per-flow monitoring systems in the data plane while preserving their original accuracy. SPADA representation can be integrated into a generic per-flow monitoring system and is suitable for several measurement use cases. We prototype SPADA in P4 for a commercial FPGA target and test our approach with a custom simulator that we make publicly available, on four real network traces over three different monitoring tasks. Our results show that SPADA achieves 2× to 11× memory footprint reduction with respect to the state-of-the-art while maintaining the same accuracy, or even improving it.