Information Leakage in Encrypted Deduplication via Frequency Analysis


In this project, we study how frequency analysis practically affects information leakage in encrypted deduplication storage, from both attack and defense perspectives. On the attack side, we propose a locality-based attack that exploits chunk locality to increase the coverage of inferred chunks in encrypted deduplication. We also propose a distribution-based attack, which uses a statistical approach to model the relative frequency distributions of plaintexts and ciphertexts and improves the inference precision of the locality-based attack (i.e., it gives high confidence that the inferred ciphertext-plaintext pairs are correct). On the defense side, we present two schemes, namely MinHash encryption and scrambling, which aim to disturb the frequency rank or break the chunk locality of ciphertext workloads.
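For readers new to the topic, the following is a minimal sketch of classical frequency analysis, the baseline that the attacks above build on. It is not the locality-based or distribution-based attack from this project. It assumes the adversary holds an auxiliary plaintext workload similar to the target, and that encryption is deterministic so identical chunks yield identical ciphertexts. The function names and the toy "encryption" (a keyed hash used only as a stand-in) are hypothetical and chosen for illustration.

```python
from collections import Counter
import hashlib


def rank_by_frequency(chunks):
    """Return distinct chunks sorted by descending count (ties broken by value)."""
    counts = Counter(chunks)
    return [c for c, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def frequency_analysis(ciphertexts, aux_plaintexts):
    """Pair the i-th most frequent ciphertext with the i-th most frequent
    auxiliary plaintext. Returns a dict: ciphertext -> guessed plaintext."""
    c_rank = rank_by_frequency(ciphertexts)
    p_rank = rank_by_frequency(aux_plaintexts)
    return dict(zip(c_rank, p_rank))


def toy_deterministic_encrypt(chunk: bytes) -> bytes:
    """Stand-in for deterministic encryption (NOT real encryption):
    equal inputs map to equal outputs, which is what leaks frequencies."""
    return hashlib.sha256(b"toy-key" + chunk).digest()


# Auxiliary workload known to the adversary (e.g., an older backup).
aux = [b"A"] * 5 + [b"B"] * 3 + [b"C"] * 1
# Target workload; the adversary only observes its ciphertexts.
target_plain = [b"A"] * 6 + [b"B"] * 2 + [b"C"] * 1
target_cipher = [toy_deterministic_encrypt(c) for c in target_plain]

inferred = frequency_analysis(target_cipher, aux)

# Count how many distinct chunks were inferred correctly.
correct = sum(
    1 for p in set(target_plain)
    if inferred[toy_deterministic_encrypt(p)] == p
)
print(f"correctly inferred {correct} of {len(set(target_plain))} distinct chunks")
```

In this toy workload the frequency ranks of the auxiliary and target data agree, so every pair is inferred correctly. In real workloads many chunks have the same or similar frequencies, so rank matching alone is much less reliable.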

This website provides attack and defense toolkits that run against the FSL dataset to demonstrate how frequency analysis can be launched and defended against. It also provides a deduplication-based prototype to demonstrate the metadata access overhead of our combined defense scheme.



Please find the README file in the package.



The software is developed by the Applied Distributed Systems Laboratory in the Department of Computer Science and Engineering at the Chinese University of Hong Kong (CUHK) and the University of Electronic Science and Technology of China (UESTC).


The source code is released under the GNU General Public License (GPL).