Xenofontas Dimitropoulos

On the Interplay of Link-Flooding Attacks and Traffic Engineering

Dimitrios Gkounis, Vasileios Kotronis, Christos Liaskos, Xenofontas Dimitropoulos
Appears in: 
CCR April 2016

Link-flooding attacks have the potential to disconnect even entire countries from the Internet. Moreover, newly proposed indirect link-flooding attacks, such as “Crossfire”, are extremely hard to expose and, subsequently, mitigate effectively. Traffic Engineering (TE) is the network’s natural way of mitigating link overload events, balancing the load and restoring connectivity. This work poses the question: Do we need a new kind of TE to expose an attack as well?

Public Review By: 
Katerina Argyraki

In a crossfire attack, the attacker disrupts the victim’s communications without ever sending any traffic directly to the victim; instead, the attacker identifies critical links that connect the victim to the Internet and floods them by sending traffic to “decoy” destinations, which happen to be served by the same links. How can we detect an attack that never generates traffic to the intended victim? This is the first paper that addresses this question. It makes the key observation that the detection of crossfire attacks can be a by-product of classic traffic engineering (TE): TE continuously changes the network routes in response to link overload and, as a side effect, it forces the attacker to continuously change the decoys (in order to keep the critical links to the victim flooded); hence, simply by observing shifting traffic patterns, an administrator can eventually identify potential attack sources, victims, and decoys. The paper also proposes a new kind of “attack-aware TE,” which reduces the frequency of routing changes caused by the attack. The reviewers appreciated the paper’s careful theoretical analysis and promising experimental evaluation, but most importantly the idea that route diversity, which is a natural result of TE, benefits the detection of crossfire attacks.
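The observation above lends itself to a simple illustration. The sketch below (plain Python, with made-up round data and a hypothetical persistence threshold; it is not the paper's detection algorithm) shows how an administrator who records which sources load the congested links after each TE-triggered route change can single out the sources that persist across rounds, which is exactly the behavior a crossfire attacker is forced into.

    # Illustrative sketch (not the paper's algorithm): correlate traffic sources
    # that keep re-appearing on congested links across successive TE route changes.
    # Round data and the persistence threshold are hypothetical.

    from collections import Counter

    def suspect_sources(rounds, persistence_threshold=0.8):
        """rounds: list of sets; each set holds the source IPs observed on the
        congested (critical) links during one TE round after a route change.
        Sources that persist in a large fraction of rounds are flagged, since a
        crossfire attacker must keep shifting decoy flows onto the new critical
        links, while benign sources come and go with normal traffic dynamics."""
        counts = Counter()
        for sources in rounds:
            counts.update(sources)
        needed = persistence_threshold * len(rounds)
        return {src for src, seen in counts.items() if seen >= needed}

    # Example: three TE rounds; 10.0.0.1 and 10.0.0.2 persist and get flagged.
    rounds = [
        {"10.0.0.1", "10.0.0.2", "198.51.100.7"},
        {"10.0.0.1", "10.0.0.2", "203.0.113.9"},
        {"10.0.0.1", "10.0.0.2", "192.0.2.33"},
    ]
    print(suspect_sources(rounds))  # {'10.0.0.1', '10.0.0.2'}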

Estimating internet address space usage through passive measurements

Alberto Dainotti, Karyn Benson, Alistair King, kc claffy, Michael Kallitsis, Eduard Glatz, Xenofontas Dimitropoulos
Appears in: 
CCR January 2014

One challenge in understanding the evolution of Internet infrastructure is the lack of systematic mechanisms for monitoring the extent to which allocated IP addresses are actually used. Address utilization has been monitored via actively scanning the entire IPv4 address space. We evaluate
Public Review By: 
Renata Teixeira

This paper presents a novel approach for estimating the fraction of the IP address space that is actively used. The state of the art in this area, ISI's Census project, issues active probes to every address block in the IPv4 address space. Active probing suffers from high probing overhead. With the adoption of IPv6, any technique based solely on probing the entire address space may no longer work. The solution presented in this paper passively observes traffic to infer the fraction of used IPv4 address space. The authors consider an address block used if it is sending or receiving traffic. Passive measurements introduce no probing overhead and hence the technique can potentially scale for IPv6. The use of passive measurements, however, brings two challenges. First, a single vantage point cannot observe traffic from all active addresses. Second, spoofed addresses may cause the technique to infer that an address is active when it is not. The main contributions of this paper are: (i) to show empirically that passive measurements do observe a large fraction of the used address space; and (ii) a technique to filter spoofed addresses. All reviewers appreciated the well thought-out approach presented in this paper. Although the estimation technique is simple (i.e., observed addresses minus spoofed ones), reviewers particularly liked the techniques to filter out spoofed addresses in two types of datasets: NetFlow traces and packet traces collected at darknets. Reviewers also acknowledged the validation and evaluation effort in the paper. Reviewers did give a number of suggestions to improve the presentation of the paper, both to clarify explanations and to get the ideas across more concisely. For example, the comparison with the ISI
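As a rough illustration of the passive approach described above, the following sketch aggregates observed source addresses into /24 blocks and drops clearly non-routable (bogon) sources as a crude stand-in for spoofing filters; the paper's actual filters for NetFlow and darknet data are considerably more elaborate, and all names below are illustrative.

    # Minimal sketch of the general idea (observed source addresses, minus
    # addresses that are likely spoofed, aggregated into /24 blocks). The bogon
    # filter below is only an illustrative stand-in for real spoofing filters.

    import ipaddress

    BOGONS = [ipaddress.ip_network(n) for n in
              ("0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
               "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/3")]

    def used_slash24_blocks(observed_sources):
        """observed_sources: iterable of source IPv4 addresses (strings) seen in
        passive traces. Returns the set of /24 blocks considered 'used'."""
        blocks = set()
        for addr in observed_sources:
            ip = ipaddress.ip_address(addr)
            if any(ip in net for net in BOGONS):
                continue  # likely spoofed or non-routable, drop
            blocks.add(ipaddress.ip_network(f"{addr}/24", strict=False))
        return blocks

    sample = ["192.0.2.17", "192.0.2.200", "10.1.2.3", "198.51.100.4"]
    print(len(used_slash24_blocks(sample)))  # 2 distinct /24 blocks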

pcapIndex: an index for network packet traces with legacy compatibility

Francesco Fusco, Xenofontas Dimitropoulos, Michail Vlachos, Luca Deri
Appears in: 
CCR January 2012

Long-term historical analysis of captured network traffic is a topic of great interest in network monitoring and network security. A critical requirement is the support for fast discovery of packets that satisfy certain criteria within large-scale packet repositories. This work presents the first indexing scheme for network packet traces based on compressed bitmap indexing principles. Our approach supports very fast insertion rates and results in compact index sizes. The proposed indexing methodology builds upon libpcap, the de-facto reference library for accessing packet-trace repositories.

Public Review By: 
Philip Levis

The pace of technological research can be dizzying, and the past decade has been especially amazing for networking. We've seen Internet traffic swing from the web to BitTorrent to Netflix; we've seen the sources of worms and hosts of other attacks expand from individuals to organized crime; we've seen whole new classes of networks, such as data centers, emerge. In a world where technology changes so quickly, stable, well-designed tools whose utility has stood the test of time are critically important. Tools let us easily observe and understand networks even as they evolve. The longevity of tools such as tcpdump on one hand demonstrates their utility. On the other hand, it also shows their age: they were designed a long time ago. While networking has moved forward, so have many other fields, and applying new techniques to old problems can be a valuable contribution, deepening our understanding of networks. This paper is an excellent example of such a contribution. Tcpdump is a tool we've all used at some point. If you’ve ever tried to use tcpdump to search enormous packet traces, you know it can be tremendously slow. Francesco Fusco and his co-authors show how, by changing tcpdump’s underlying library, libpcap, to use state-of-the-art bitmap indexes, one can in some cases speed it up by three orders of magnitude. In the worst case they observe, the speedup is a factor of 1.9. The reviewers for the paper agree that using bitmap indexes for packet traces is not a revolutionary new idea: it takes existing techniques from the database community and applies them to a problem in networking. But they also agree that the performance gains are impressive, and that the result is of significant practical benefit to networking researchers and engineers. The contribution of this paper lies in identifying an otherwise overlooked but very real problem and designing a good solution with cutting-edge technology. Papers such as this one form the technical underpinnings of the tremendous gains we have seen in networks over the past decade and we hope to continue to see in the future.
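For readers unfamiliar with bitmap indexing, the toy sketch below shows the core idea applied to packet records: one bitmap per field value, with queries answered by bitwise AND, so only the matching packet offsets are touched. It is not pcapIndex itself, which uses compressed bitmap encodings and integrates with libpcap; the class and field names here are invented for illustration.

    # Toy illustration of the core idea (not pcapIndex itself): build a per-value
    # bitmap over packet records so that a filter query touches only the matching
    # packet offsets instead of scanning the whole trace. Plain Python integers
    # serve as uncompressed bitmaps here.

    from collections import defaultdict

    class BitmapIndex:
        def __init__(self):
            self.index = defaultdict(lambda: defaultdict(int))  # field -> value -> bitmap
            self.n = 0

        def add(self, packet):
            """packet: dict of header fields, e.g. {'src': ..., 'dport': ...}."""
            for field, value in packet.items():
                self.index[field][value] |= 1 << self.n  # set this packet's bit
            self.n += 1

        def query(self, **criteria):
            """Return offsets of packets matching all criteria (bitwise AND)."""
            result = (1 << self.n) - 1
            for field, value in criteria.items():
                result &= self.index[field].get(value, 0)
            return [i for i in range(self.n) if result >> i & 1]

    idx = BitmapIndex()
    idx.add({"src": "192.0.2.1", "dport": 80})
    idx.add({"src": "192.0.2.2", "dport": 53})
    idx.add({"src": "192.0.2.1", "dport": 53})
    print(idx.query(src="192.0.2.1", dport=53))  # [2]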

On Cycles in AS Relationships

Xenofontas Dimitropoulos, M. Ángeles Serrano, and Dmitri Krioukov
Appears in: 
CCR July 2008

Several users of our AS relationship inference data asked us why it contained AS relationship cycles, e.g., cases where AS A is a provider of AS B, B is a provider of C, and C is a provider of A, or other cycle types. Having been answering these questions in private communications, we have eventually decided to write down our answers here for future reference.
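For readers who want to check their own relationship datasets for such loops, the sketch below (a plain depth-first search over provider-to-customer edges, with made-up AS names; not taken from the paper) reports whether any provider cycle of the kind described above exists.

    # Hedged sketch: detect provider-loop cycles (A provides B, B provides C,
    # C provides A) in a set of inferred provider-to-customer edges, modeled
    # here as (provider, customer) pairs. AS names are invented for the example.

    def provider_cycles_exist(p2c_edges):
        graph = {}
        for provider, customer in p2c_edges:
            graph.setdefault(provider, []).append(customer)

        WHITE, GRAY, BLACK = 0, 1, 2
        color = {}

        def dfs(node):
            color[node] = GRAY
            for nxt in graph.get(node, []):
                state = color.get(nxt, WHITE)
                if state == GRAY:          # back edge: provider cycle found
                    return True
                if state == WHITE and dfs(nxt):
                    return True
            color[node] = BLACK
            return False

        return any(color.get(n, WHITE) == WHITE and dfs(n) for n in graph)

    # A -> B -> C -> A (as in the example above) is reported as a cycle.
    print(provider_cycles_exist([("A", "B"), ("B", "C"), ("C", "A")]))  # True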

Probabilistic Lossy Counting: An efficient algorithm for finding heavy hitters

Xenofontas Dimitropoulos, Paul Hurley, and Andreas Kind
Appears in: 
CCR January 2008

Knowledge of the largest traffic flows in a network is important for many network management applications. The problem of finding these flows is known as the heavy-hitter problem and has been the subject of many studies in the past years. One of the most efficient and well-known algorithms for finding heavy hitters is lossy counting [29].

Public Review By: 
Pablo Rodriguez

Imagine that you see a large number of individual transactions (such as Amazon book sales), and you want to calculate what are the top sellers today. Or imagine that you are monitoring network traffic and you want to know which hosts/subnets are responsible for most of the traffic. This is a problem of finding heavy hitters given a stream of elements.
A straightforward method to tackle this problem is to store each element identifier with a corresponding counter monitoring the number of occurrences of that element. Then you sort the elements according to their counters and you can easily get the most frequent elements. However, in many real scenarios, this simple solution is not efficient or computationally feasible. For instance, consider the case of tracking the pairs of IP addresses that generate the most traffic over some time period. You need 16,384 PBytes of memory and a lot of time to sort and scan that memory array, which often makes the problem computationally infeasible.
For these reasons, during recent years, techniques for computing heavy hitters using limited memory resources have been investigated. A characteristic of these techniques is that they cannot find exact heavy hitters, but instead approximate the heavy hitters of a data stream. The approximation typically means that the computed heavy hitters may include false positives.
Lossy counting is a well-known algorithm for finding heavy hitters using limited memory. In lossy counting, an important parameter for each distinct element identifier in the table is its error bound. This error bound reflects the potential error in the estimated frequency of an element. Its importance resides in the fact that elements with small error bounds are more likely to be removed from the lossy counting process than equal-frequency elements with a larger error bound.
This paper proposes to make the error bound substantially smaller than the deterministic error bound in existing lossy counting algorithms. By lowering the error bound, elements will stay in the lossy counting table for less time and the memory requirements will be lower. The authors call this scheme Probabilistic Lossy Counting. The reason it works is that there is often a large number of small flows with large error bounds, which require a lot of table entries. Decreasing the error bound of such small flows can drastically reduce the table size.
The authors show that, when data streams exhibit a Zipfian distribution, one can derive a tighter probabilistic error bound on the estimated frequency of an element. For such scenarios, the probabilistic lossy counting algorithm provides good results. However, if the distribution cannot be approximated by a Zipf model, then the error bound does not hold. The paper focuses on analyzing the system complexity in terms of the required memory size. However, more work is required to derive the required processing time, i.e., the number of memory accesses per packet, and to evaluate it. Overall, the evaluation experiments of the paper show that the probabilistic lossy counting algorithm exhibits substantially better memory consumption than lossy counting and multistage filters.
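To make the mechanics concrete, here is a minimal sketch of classic lossy counting with the per-element error bound (delta) discussed above; a comment marks the spot where Probabilistic Lossy Counting would substitute a smaller, probabilistically derived delta (the paper computes it from a Zipfian model, which this sketch does not reproduce). The parameter names are illustrative.

    # Sketch of classic lossy counting. Each table entry carries an estimated
    # count and an error bound (delta); entries whose count + delta falls below
    # the current bucket id are pruned at bucket boundaries.

    import math

    def lossy_counting(stream, epsilon=0.01):
        width = math.ceil(1 / epsilon)          # bucket width
        table = {}                              # item -> (count, delta)
        bucket = 1
        for i, item in enumerate(stream, start=1):
            if item in table:
                count, delta = table[item]
                table[item] = (count + 1, delta)
            else:
                # Deterministic bound: delta = bucket - 1. Probabilistic Lossy
                # Counting would assign a smaller probabilistic value here.
                table[item] = (1, bucket - 1)
            if i % width == 0:                  # end of bucket: prune
                table = {k: (c, d) for k, (c, d) in table.items()
                         if c + d > bucket}
                bucket += 1
        return table  # estimated counts with per-item error bounds

    counts = lossy_counting(["a", "b", "a", "c", "a", "a", "b"] * 50, epsilon=0.1)
    print(sorted(counts.items(), key=lambda kv: -kv[1][0])[:3])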

AS Relationships: Inference and Validation

Xenofontas Dimitropoulos, Dmitri Krioukov, Marina Fomenkov, Bradley Huffaker, Young Hyun, kc claffy, and George Riley
Appears in: 
CCR January 2007

Research on performance, robustness, and evolution of the global Internet is fundamentally handicapped without accurate and thorough knowledge of the nature and structure of the contractual relationships between Autonomous Systems (ASs). In this work we introduce novel heuristics for inferring AS relationships. Our heuristics improve upon previous works in several technical aspects, which we outline in detail and demonstrate with several examples. Seeking to increase the value and reliability of our inference results, we then focus on validation of inferred AS relationships.

Public Review By: 
Ernst Biersack

Inferring AS relationships using publicly available data is a difficult task for which various heuristics have been proposed. This paper revisits the problem, points out shortcomings of existing heuristics, and proposes improvements. The reviewers liked the paper for several reasons:

  • The paper does a nice job in reviewing the state of the art and improves on the existing heuristics.
  • The authors try to assess the quality of their inferences by contacting a small group of ASs, which they asked for an explicit validation of the results. However, the sample size may be too small to allow any definite conclusions.
  • The heuristics proposed are implemented and the results of the AS inference are made publicly available on a weekly basis. This should provide a basis on which further research can build and against which it can compare its results.

In summary, this paper combines existing and new heuristics for AS inference into a tool, the results of which are made available to the community.
