EDP: An Implementation of Even Data Placement in Distributed Reliable Deduplication Storage Systems

Introduction

Modern distributed storage systems aggregate multiple storage nodes to provide scalable platforms for managing a tremendous amount of data. They often deploy deduplication to remove content-level redundancy to improve storage efficiency. However, deduplication inevitably leads to unbalanced data placement, thereby degrading read performance. This paper studies the load balance problem in reliable distributed deduplication storage systems. We argue that it is generally challenging to find a data placement that simultaneously achieves both read balance and storage balance objectives. To this end, we formulate a combinatorial optimization problem, and propose a greedy, polynomial-time Even Data Placement (EDP) algorithm, which identifies a data placement that effectively achieves read balance while maintaining storage balance. We further extend our EDP algorithm to heterogeneous environments. We demonstrate the effectiveness of our EDP algorithm under real-world workloads using both extensive simulations and prototype testbed experiments. In particular, our testbed experiments show that our EDP algorithm reduces the file read time by 37.41% compared to the baseline round-robin placement, and the reduction can further reach 52.11% in a heterogeneous setting.
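To make the read-balance problem concrete, the following Python sketch contrasts a round-robin placement of unique chunks with a simple greedy placement. It is an illustrative toy only, not the EDP algorithm from the paper: the function names (`round_robin_place`, `greedy_place`, `max_read_load`), the tie-breaking rule, and the assumptions of equal-sized chunks, homogeneous nodes, and no repeated chunks within a file are all hypothetical choices made for this example.

```python
from collections import Counter


def round_robin_place(files, num_nodes):
    """Baseline: assign each new unique chunk to nodes in round-robin order."""
    placement = {}  # chunk fingerprint -> node index
    next_node = 0
    for chunks in files:
        for c in chunks:
            if c not in placement:  # duplicates are deduplicated, not re-stored
                placement[c] = next_node % num_nodes
                next_node += 1
    return placement


def greedy_place(files, num_nodes):
    """Toy greedy placement (hypothetical, not the paper's EDP algorithm).

    Each new unique chunk goes to the node that currently holds the fewest
    chunks of the same file; ties are broken by the node's total storage.
    """
    placement = {}
    storage = [0] * num_nodes  # unique chunks stored per node
    for chunks in files:
        load = [0] * num_nodes  # chunks of this file held by each node
        # Chunks already stored by earlier files fix part of the read load.
        for c in chunks:
            if c in placement:
                load[placement[c]] += 1
        # Place the remaining unique chunks one by one.
        for c in chunks:
            if c in placement:
                continue
            node = min(range(num_nodes), key=lambda n: (load[n], storage[n]))
            placement[c] = node
            storage[node] += 1
            load[node] += 1
    return placement


def max_read_load(chunks, placement):
    """Largest number of a file's chunks that must be read from one node."""
    return max(Counter(placement[c] for c in chunks).values())


# Two files that share chunks "a" and "c", stored on two nodes.
files = [["a", "b", "c", "d"], ["a", "c", "e", "f"]]
rr = round_robin_place(files, 2)
gp = greedy_place(files, 2)
print(max_read_load(files[1], rr))  # 3: three chunks of file 2 sit on node 0
print(max_read_load(files[1], gp))  # 2: reads of file 2 are spread evenly
```

In this toy case the greedy choice evens out the read load of the second file, but it leaves node 1 with more stored chunks than node 0. This shows why read balance and storage balance are hard to achieve at the same time.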

Publication

Download

People

EDP is developed by the Advanced Network and System Research Laboratory in the Department of Computer Science and Engineering at the Chinese University of Hong Kong (CUHK). This project is also affiliated with the School of Computer Science and Technology at the University of Science and Technology of China (USTC).

Please contact Min Xu if you have any questions.

License

The source code of EDP is released under the GNU General Public License (GPL).

Acknowledgments

EDP uses the Jerasure (Revision 2.0) and GF-Complete libraries developed by Prof. James S. Plank.

This work is supported by grant GRF CUHK413813 from the Research Grants Council of Hong Kong.