*** Welcome to piglix ***

Tornado codes


In computer science, Tornado codes are a class of erasure codes that support error correction. Tornado codes require a constant C more redundant blocks than the more data-efficient Reed–Solomon erasure codes, but are much faster to generate and can fix erasures faster. Software-based implementations of tornado codes are about 100 times faster on small lengths and about 10,000 times faster on larger lengths than Reed–Solomon erasure codes. Since the introduction of Tornado codes, many other similar erasure codes have emerged, most notably Online codes, LT codes and Raptor codes.

Tornado codes use a layered approach. All layers except the last use an LDPC error correction code, which is fast but has a chance of failure. The final layer uses a Reed–Solomon correction code, which is slower but is optimal in terms of failure recovery. Tornado codes dictates how many levels, how many recovery blocks in each level, and the distribution used to generate blocks for the non-final layers.

The input data is divided into blocks. Blocks are sequences of bits that are all the same size. Recovery data uses the same block size as the input data. The erasure of a block (input or recovery) is detected by some other means. (For example, a block from disk does not pass a CRC check or a network packet with a given sequence number never arrived.)

The number of recovery blocks is given by the user. Then the number of levels is determined along with the number of blocks in each level. The number in each level is determined by a factor B which is less than one. If there are N input blocks, the first recovery level has B*N blocks, the second has B*B*N, the third has B*B*B*N, and so on.

All levels of recovery except the final one use an LDPC, which works by xor (exclusive-or). Xor operates on binary values, 1s and 0s. A xor B is 1 if A and B have different values and 0 if A and B have the same values. If you are given (A xor B) and A, you can determine the value for B. (A xor B xor A = B) Similarly, if you are given (A xor B) and B, you can determine the value for A. This extends to multiple values, so given (A xor B xor C xor D) and any 3 of the values, the missing value can be recovered.

So the recovery blocks in level one are just the xor of some set of input blocks. Similarly, the recovery blocks in level two are each the xor of some set of blocks in level one. The blocks used in the xor are chosen randomly, without repetition. However, the number of blocks xor'ed to make a recovery block is chosen from a very specific distribution for each level.


...
Wikipedia

...