To balance performance and storage efficiency, modern clustered file systems (CFSes) often first store data with random replication (i.e., distributing replicas across randomly selected nodes), followed by encoding the replicated data with erasure coding. We argue that random replication, while being commonly used, does not take into account erasure coding and hence will raise both performance and availability issues to the subsequent encoding operation. We propose encoding-aware replication, which carefully places the replicas so as to (i) avoid cross-rack downloads of data blocks during encoding, (ii) preserve availability without data relocation after encoding, and (iii) maintain load balancing as in random replication. We implement encoding-aware replication on HDFS, and show via testbed experiments that it achieves significant encoding throughput gains over random replication. We also show via discrete-event simulations that encoding-aware replication remains effective under various parameter choices in a large-scale setting. We further show that encoding-aware replication evenly distributes replicas as in random replication.
Encoding-Aware Replication is developed by the Advanced Network and System Research Laboratory in the Department of Computer Science and Engineering at the Chinese University of Hong Kong.