What Is Erasure Coding and When Should It Be Used?

RAID has long been a mainstay of data protection. RAID protects against data loss from bad blocks or bad disks by either mirroring data on multiple disks in a storage array or adding parity blocks to the data, which allows for the recovery of failed blocks. Now another approach, erasure coding, is gaining traction.

Erasure coding uses sophisticated mathematics to break the data into multiple fragments, expand them with redundant pieces, and place the fragments in different locations. Locations can be disks within an array or, more likely, distributed nodes. If the fragments are spread over 16 nodes, for example, any ten can recover all the data. In other words, the data is protected even if six nodes fail. This also means that if one or more nodes fail, all the surviving nodes participate in rebuilding the lost fragments, which makes erasure coding less CPU-constrained than a RAID rebuild within a single array.
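To make the "any ten of 16" property concrete, here is a minimal Python sketch of a k-of-n erasure code built on polynomial interpolation, in the spirit of Reed-Solomon codes. It is an illustration only, not how any particular product works: the prime modulus 257 and the function names (lagrange_eval, encode, decode) are assumptions chosen for readability, and production systems typically use Galois-field arithmetic over bytes plus optimized libraries. It needs Python 3.8 or later for the modular inverse via pow.

```python
# Toy k-of-n erasure code using polynomial interpolation over a prime field.
# Assumption: P = 257 is chosen only so that every byte value (0..255) fits.
P = 257


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (x, y) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn k data symbols into n fragments (x, y).
    The first k fragments are the data itself; the remaining n - k are
    extra points on the same polynomial and act as redundancy."""
    k = len(data)
    points = [(i + 1, d) for i, d in enumerate(data)]
    parity = [(x, lagrange_eval(points, x)) for x in range(k + 1, n + 1)]
    return points + parity


def decode(fragments, k):
    """Recover the k original symbols from any k surviving fragments."""
    if len(fragments) < k:
        raise ValueError("not enough fragments to recover the data")
    survivors = fragments[:k]
    return [lagrange_eval(survivors, x) for x in range(1, k + 1)]


# Ten data symbols spread across 16 "nodes"; lose the first six nodes.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
fragments = encode(data, 16)
recovered = decode(fragments[6:], 10)
assert recovered == data
```

Any ten points determine a polynomial of degree below ten, so it does not matter which six fragments are lost. Note that decoding reads from every surviving fragment, which mirrors the point above that the remaining nodes share the rebuild work.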

Erasure coding is finding many applications. It is often used for object storage, for example, and vendors of file- and block-level storage are beginning to adopt the technology. Erasure coding has proven useful for scale-out storage, protecting petabytes of lukewarm, backup or archival data stored across multiple locations in the cloud. Of course, mirroring remains an option for these applications. That approach requires double the storage capacity, but it does eliminate rebuilds, because the surviving copy is already complete.
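The capacity trade-off between mirroring and erasure coding comes down to simple arithmetic. The short Python sketch below compares two-way mirroring with the 16-fragment, any-ten layout used as an example earlier. The function names and the 1,000 TB figure are illustrative assumptions, and real deployments add metadata and spare-capacity overhead that this ignores.

```python
def ec_profile(k, n):
    """Return (raw-to-usable capacity ratio, fragment losses tolerated)
    for a scheme in which any k of n fragments can recover the data."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return n / k, n - k


def raw_capacity(usable_tb, k, n):
    """Raw terabytes needed to hold usable_tb of data under a k-of-n scheme."""
    return usable_tb * n / k


# Two-way mirroring behaves like a 1-of-2 scheme: two full copies, one may be lost.
mirror = ec_profile(1, 2)
# The 16-fragment, any-ten layout from the example earlier in the article.
erasure = ec_profile(10, 16)

print("mirroring:", mirror, "raw TB for 1,000 TB:", raw_capacity(1000, 1, 2))
print("10-of-16: ", erasure, "raw TB for 1,000 TB:", raw_capacity(1000, 10, 16))
```

Under these assumptions, mirroring doubles the raw capacity and survives the loss of one copy, while the 10-of-16 layout needs 1.6 times the usable capacity and survives the loss of six fragments.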

RAID is still a strong strategy for safeguarding active, primary data. Data remains safe within the data center and rebuilds won’t tax available WAN bandwidth. To determine whether RAID or erasure coding is best for you, assess the impact each would have on your data protection needs.