Information Leakage in Encrypted Deduplication via Frequency Analysis

Introduction

In this project, we study how frequency analysis practically affects information leakage in encrypted deduplication storage, from both attack and defense perspectives. On the attack side, we propose a locality-based attack that exploits chunk locality to increase the coverage of inferred chunks against encrypted deduplication. We also propose a distribution-based attack, which uses a statistical approach to model the relative frequency distributions of plaintexts and ciphertexts and improves the inference precision of the locality-based attack (i.e., it gives higher confidence that the inferred ciphertext-plaintext pairs are correct). On the defense side, we present two schemes, MinHash encryption and scrambling, which aim to disturb the frequency ranking or break the chunk locality of ciphertext workloads.
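As a point of reference for the attacks described above, the following is a minimal sketch of basic frequency analysis against deterministic encryption. It is not the locality-based or distribution-based attack from the toolkit. The names `toy_encrypt`, `rank_by_frequency`, and `basic_frequency_attack` are hypothetical, and a keyed SHA-256 hash stands in for deterministic encryption purely for illustration.

```python
import hashlib
from collections import Counter


def toy_encrypt(chunk: bytes) -> str:
    """Stand-in for deterministic encryption (NOT real encryption).

    Assumption for illustration: identical plaintext chunks always map to
    the same ciphertext, which is what makes deduplication possible.
    """
    return hashlib.sha256(b"demo-secret" + chunk).hexdigest()


def rank_by_frequency(items):
    """Return unique items ordered from most to least frequent."""
    counts = Counter(items)
    # Break ties by item value so the ordering is deterministic.
    return [item for item, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def basic_frequency_attack(ciphertexts, auxiliary_plaintexts):
    """Pair the i-th most frequent ciphertext with the i-th most frequent
    plaintext observed in auxiliary data (e.g., an older backup)."""
    return dict(zip(rank_by_frequency(ciphertexts), rank_by_frequency(auxiliary_plaintexts)))


# Toy workloads: the adversary knows `auxiliary` and observes only `ciphertexts`.
auxiliary = [b"A"] * 5 + [b"B"] * 3 + [b"C"] * 1
target = [b"A"] * 6 + [b"B"] * 3 + [b"C"] * 1
ciphertexts = [toy_encrypt(c) for c in target]

inferred = basic_frequency_attack(ciphertexts, auxiliary)
correct = sum(1 for c in set(target) if inferred.get(toy_encrypt(c)) == c)
print(f"correctly inferred {correct} of {len(set(target))} unique chunks")
```

In this toy data the frequency ranks of the two workloads line up exactly, so every unique chunk is recovered. Real backup workloads contain many chunks with tied or shifted ranks, so naive rank matching is far less reliable there.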

This website provides attack and defense toolkits for the FSL dataset to demonstrate how frequency analysis can be launched and defended against. It also provides a deduplication prototype to demonstrate the metadata access overhead of our combined defense scheme.
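One way to picture the MinHash encryption defense named in the introduction is the sketch below. It assumes, following the description in the publications listed on this page, that the encryption key for a group of adjacent chunks is derived from the minimum chunk fingerprint in that group, so the same plaintext chunk can map to different ciphertexts in different groups. Fixed-size segments, the helper names, and the hash-based `toy_encrypt` are simplifying assumptions and do not reflect the toolkit's actual code.

```python
import hashlib


def fingerprint(chunk: bytes) -> bytes:
    """Chunk fingerprint (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(chunk).digest()


def segment_key(segment) -> bytes:
    """Derive one key per segment from its minimum chunk fingerprint."""
    return hashlib.sha256(b"key:" + min(fingerprint(c) for c in segment)).digest()


def toy_encrypt(key: bytes, chunk: bytes) -> str:
    """Stand-in for symmetric encryption under `key` (NOT real encryption)."""
    return hashlib.sha256(key + chunk).hexdigest()


def minhash_encrypt(chunks, segment_size=4):
    """Encrypt each fixed-size segment of adjacent chunks under its own key."""
    ciphertexts = []
    for start in range(0, len(chunks), segment_size):
        segment = chunks[start:start + segment_size]
        key = segment_key(segment)
        ciphertexts.extend(toy_encrypt(key, c) for c in segment)
    return ciphertexts


# Chunk b"X" appears in three segments; the first two segments are identical.
stream = [b"X", b"a1", b"a2", b"a3",
          b"X", b"a1", b"a2", b"a3",
          b"X", b"b1", b"b2", b"b3"]
ciphertexts = minhash_encrypt(stream, segment_size=4)

# Identical segments share a key, so they still deduplicate.
print("segments 1 and 2 deduplicate:", ciphertexts[0:4] == ciphertexts[4:8])
# Whether X differs in segment 3 depends on which chunk has the minimum fingerprint.
print("X has the same ciphertext in segment 3:", ciphertexts[0] == ciphertexts[8])
```

Similar segments tend to share the same minimum fingerprint, so most duplicates are still removed, while a frequent chunk that appears in dissimilar segments is spread over several ciphertexts and its frequency rank is disturbed.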

Publication

Jingwei Li, Guoli Wei, Jiacheng Liang, Yanjing Ren, Patrick P. C. Lee, and Xiaosong Zhang.
"Revisiting Frequency Analysis against Encrypted Deduplication via Statistical Distribution."
Proceedings of IEEE International Conference on Computer Communications (INFOCOM 2022), May 2022.
(Acceptance rate: 225/1129 = 19.9%)
[pdf]

Jingwei Li, Patrick P. C. Lee, Chufeng Tan, Chuan Qin, and Xiaosong Zhang.
"Information Leakage in Encrypted Deduplication via Frequency Analysis: Attacks and Defenses."
ACM Transactions on Storage (TOS), 16(1), pp. 4:1-4:30, March 2020.
(An earlier version appeared in DSN 2017)
[pdf] [arXiv] [doi]

Jingwei Li, Chuan Qin, Patrick P. C. Lee, and Xiaosong Zhang.
"Information Leakage in Encrypted Deduplication via Frequency Analysis."
Proceedings of the 47th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2017) (Regular paper), Denver, Colorado, USA, June 2017.
[pdf] [pptx]

Download

See the README file included in each package.

Version 1.3 (December 2021) freqanalysis-1.3.zip (md5sum: 2acb402bb877c8afe24066a3c5bd20e5)
Version 1.2 (March 2019) freqanalysis-1.2.zip (md5sum: 57f74f218710599160858bffc2ac795a)
Version 1.1 (November 2017) freqanalysis-1.1.zip (md5sum: 7a9dc4bb5fb550cb972c7208170813c1)
Version 1.0 (June 2017) freqanalysis-1.0.zip (md5sum: 54c53a5092a5b57e6882258044615fac)
ChangeLog
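To check a downloaded package against the md5sum listed above, a small script along the following lines can be used. The function names `md5_of_file` and `verify_download` are hypothetical helpers written for this example, and the script assumes the zip file sits in the current directory.

```python
import hashlib
import os


def md5_of_file(path, block_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the published value."""
    return md5_of_file(path) == expected_md5.strip().lower()


# File name and checksum copied from the Version 1.3 entry above.
package = "freqanalysis-1.3.zip"
expected = "2acb402bb877c8afe24066a3c5bd20e5"
if os.path.exists(package):
    print("OK" if verify_download(package, expected) else "MISMATCH")
```

A matching MD5 digest only indicates that the file was not corrupted in transit; MD5 is not collision resistant and should not be treated as an authenticity guarantee.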

Demo Video

People

The software is developed by the Applied Distributed Systems Laboratory in the Department of Computer Science and Engineering at the Chinese University of Hong Kong (CUHK) and the University of Electronic Science and Technology of China (UESTC).

Jingwei Li
Chuan Qin
Patrick P. C. Lee
Guoli Wei
Jiacheng Liang
Yanjing Ren

License

The source code is released under the GNU General Public License (GPL).