Christian Henke

Empirical Evaluation of Hash Functions for Multipoint Measurements

By: 
Christian Henke, Carsten Schmoll, and Tanja Zseby
Appears in: 
CCR July 2008

A broad spectrum of network measurement applications demand passive multipoint measurements in which data from multiple observation points has to be correlated. Examples are the passive measurement of one-way delay or the observation of the path that a packet takes through a network. Nevertheless, due to high data rates and the need for fine granular measurements, the resource consumption for passive measurements can be immense. Furthermore, the resource consumption depends on the traffic in the network, which usually is highly dynamic.

Public Review By: 
Chadi Barakat

Traffic sampling is a key component in a passive measurement architecture especially when it comes to monitoring high speed networks. It has the advantage of reducing the volume of collected traffic, an important requirement for scalability. The main question stays however how to achieve this traffic reduction without impacting the accuracy of measurements and without harming the applications behind. This question becomes more difficult to answer when it comes to measurements carried out at different points inside the network. Randomness in sampling is known to reduce the bias but there is also a need to make measurement points collaborate together so that the same (randomly selected) traffic is not sampled several times at different points inside the network. Here comes the importance of hash functions (over random methods) which are the main subject of this paper. Indeed, one can use hash functions to (deterministically) select a subset of the traffic to measure and tune the hash functions of the different measurement points so that the total accuracy is improved and the bias is reduced.
Diverse functions have been proposed in the literature (BOB recommended by PSAMP, OAAT, CRC, etc). The present paper overviews these functions and propose an evaluation framework based on real traces and statistical tools.
The methods are compared with respect to their bias and speed and recommendations are made at the end of the paper. We believe that this paper, even though it does not bring any new hash function or sampling method, is worth reading for operators and researchers that want to work in this area. Its strength is in the number of hash functions it covers and the thorough experimental and statistical evaluation it proposes.

Syndicate content