SpreadSketch: Toward Invertible and Network-Wide Detection of Superspreaders

Introduction

Superspreaders (i.e., hosts with numerous distinct connections) remain severe threats to production networks. How to accurately detect superspreaders in real-time at scale remains a non-trivial yet challenging issue. We present SpreadSketch, an invertible sketch data structure for network-wide superspreader detection with the theoretical guarantees on memory space, performance, and accuracy. SpreadSketch tracks candidate super- spreaders and embeds estimated fan-outs in binary hash strings inside small and static memory space, such that multiple SpreadSketch instances can be merged to provide a network- wide measurement view for recovering superspreaders and their estimated fan-outs. We present formal theoretical analysis on SpreadSketch in terms of space and time complexities as well as error bounds. Trace-driven evaluation shows that SpreadSketch achieves higher accuracy and performance over state-of-the-art sketches. Furthermore, we prototype SpreadSketch in P4 and show its feasible deployment in commodity hardware switches.

Publication

Lu Tang, Qun Huang, and Patrick P. C. Lee
"SpreadSketch: Toward Invertible and Network-Wide Detection of Superspreaders"
IEEE International Conference on Computer Communications (INFOCOM) 27-30 April 2020 // Beijing, China
[pdf] [software]

Download

Version 1.2.0 (March 2022): spreadsketch-1.2.0.tar.gz (md5sum: 19948a3fabf1f78b83376347f67d61e1)
Version 1.1.0 (October 2021): spreadsketch-1.1.0.tar.gz (md5sum: 25787081e4315b19331f12e5785dbbfe)
Version 1.0.0 (January 2020): spreadsketch-1.0.0.tar.gz (md5sum: 3c4257f167dabcf979947559c3688d9e)
ChangeLog
README is included in the tarball.
Please also refer to the GitHub page: https://github.com/Grace-TL/SpreadSketch

Please contact Lu Tang if you have any questions.

License

The source code of SpreadSketch is released under the GNU/GPL license.