ParaRC: Embracing Sub-Packetization for Repair Parallelization in MSR-Coded Storage

Introduction

Minimum-storage regenerating (MSR) codes are provably optimal erasure codes that minimize the repair bandwidth (i.e., the amount of traffic being transferred during a repair operation), with the minimum storage redundancy, in distributed storage systems. However, the practical repair performance of MSR codes still has significant room to improve, as the mathematical structure of MSR codes make their repair operations difficult to parallelize. We present ParaRC, a parallel repair framework for MSR codes. ParaRC exploits the sub- packetization nature of MSR codes to parallelize the repair of sub-blocks and balance the repair load (i.e., the amount of traffic transferred at a node) across the available nodes. We show that there exists a trade-off between the repair band- width and the maximum repair load, and further propose a fast heuristic that approximately minimizes the maximum re- pair load with limited search time for large coding parameters. We prototype our heuristic in ParaRC and show that ParaRC reduces the degraded read and full-node recovery times over the traditional centralized repair approach in MSR codes by up to 59.3% and 39.2%, respectively.

Publication

Download

License

The source code of the prototype is released under the Apache license 2.0 and is restricted for academic purposes only. Commercial use of the source code is not allowed.