STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems

Introduction

Practical storage systems often adopt erasure codes to tolerate device failures and sector failures, both of which are prevalent in the field. However, traditional erasure codes employ device-level redundancy to protect against sector failures, and hence incur significant space overhead. Recent sector-disk (SD) codes are available only for limited configurations because they make a relatively strict assumption about the coverage of sector failures. By making a relaxed but practical assumption, we construct a general family of erasure codes called STAIR codes, which efficiently and provably tolerate both device and sector failures without any restriction on the size of a storage array or on the numbers of tolerable device failures and sector failures. We propose the upstairs encoding and downstairs encoding methods, which provide complementary performance advantages for different configurations. We conduct extensive experiments to justify the practicality of STAIR codes in terms of space saving, encoding/decoding speed, and update cost. We demonstrate that STAIR codes not only improve space efficiency over traditional erasure codes but, owing to our special code construction, also provide better computational efficiency than SD codes.
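To make the space-overhead argument concrete, the following is a minimal arithmetic sketch, not part of the STAIR source code. It assumes a stripe of n chunks (one per device) with r sectors each, tolerance of m whole-device failures, and sector failures confined to at most m_prime additional chunks with s failed sectors in total. All function and parameter names are hypothetical and chosen only for illustration. It compares covering the sector failures with whole parity chunks against covering them with individual parity sectors.

```python
# Illustrative redundancy count per stripe; names and parameters are
# assumptions for this sketch, not the API of the STAIR software.

def parity_sectors_device_level(r, m, m_prime):
    """Device-level redundancy: one whole parity chunk (r sectors) for each
    of the m device failures and each of the m_prime chunks that may
    contain sector failures."""
    return (m + m_prime) * r


def parity_sectors_sector_level(r, m, s):
    """Sector-level redundancy: m whole parity chunks for device failures,
    plus s individual parity sectors for the sector failures."""
    return m * r + s


n, r = 8, 16   # 8 devices, 16 sectors per chunk (assumed values)
m = 1          # tolerate one device failure
m_prime = 2    # sector failures may appear in up to 2 further chunks
s = 3          # at most 3 failed sectors in total across those chunks

device_level = parity_sectors_device_level(r, m, m_prime)  # 48 sectors
sector_level = parity_sectors_sector_level(r, m, s)        # 19 sectors

print("stripe size (sectors):", n * r)
print("device-level parity sectors:", device_level)
print("sector-level parity sectors:", sector_level)
print("sectors saved per stripe:", device_level - sector_level)
```

Under these assumed parameters the saving per stripe is m_prime * r - s sectors, which grows with the chunk size r.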

Publications

Download

License

The source code of STAIR codes is released under the GNU General Public License (GPL).

People

The software is developed by the Advanced Network and System Research Laboratory in the Department of Computer Science and Engineering at the Chinese University of Hong Kong.

Acknowledgments

The STAIR codes software uses the open-source libraries Jerasure (Revision 1.2A) and GF-Complete (Revision 0.1), developed by Prof. James S. Plank.

This work is supported by grants AoE/E-02/08 and ECS CUHK419212 from the University Grants Committee of Hong Kong.