Differentially-Private Network Trace Analysis

Frank McSherry and Ratul Mahajan
Appears in: 
CCR October 2010

We consider the potential for network trace analysis while providing the guarantees of “differential privacy.” While differential privacy provably obscures the presence or absence of individual records in a dataset, it has two major limitations: analyses must (presently) be expressed in a higher level declarative language; and the analysis results are randomized before returning to the analyst.

We report on our experiences conducting a diverse set of analyses in a differentially private manner. We are able to express all of our target analyses, though for some of them an approximate expression is required to keep the error-level low. By running these analyses on real datasets, we find that the error introduced for the sake of privacy is often (but not always) low even at high levels of privacy. We factor our learning into a toolkit that will be likely useful for other analyses. Overall, we conclude that differential privacy shows promise for a broad class of network analyses.