Linux 6.2 kernel will include RAID5/6 improvements in BTRFS

BTRFS improvements proposed for inclusion in the Linux 6.2 kernel address the “write hole” problem in the RAID 5/6 implementation. The essence of the problem is that if a crash happens during a write, it is initially impossible to tell which block on which RAID device was written correctly and on which device the write did not complete. An attempt to rebuild the array in this state can destroy blocks that belong to the same stripe as the unwritten blocks, because the state of the RAID blocks is out of sync. The problem affects any RAID1/5/6 array that takes no special measures against it.
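The effect can be illustrated with a toy model. This is a minimal sketch, assuming a single-parity stripe of two data blocks with XOR parity; the `xor_blocks` helper and the block contents are made up for illustration and are not BTRFS code.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks byte by byte (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A consistent stripe: two data blocks and their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# Crash in the middle of an update: the new d0 reached the disk,
# but the matching parity update did not.
d0_on_disk = b"CCCC"
stale_parity = parity

# Later the device holding d1 fails, so d1 is rebuilt from d0 and parity.
rebuilt_d1 = xor_blocks(d0_on_disk, stale_parity)

print(rebuilt_d1 == d1)  # False: d1 was never rewritten, yet it is now lost
```

In this model the damaged block is one that the interrupted write never touched, which is what makes the write hole dangerous.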

In RAID implementations with mirroring, such as RAID1 in BTRFS, the problem is solved by keeping checksums for both copies: if the data does not match its checksum, it is simply restored from the second copy. This approach also works when a device starts returning incorrect data instead of failing completely.
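The mirrored case can be sketched as follows. This is an illustrative model only: `read_mirrored` is a hypothetical helper, and `zlib.crc32` stands in for whatever checksum the file system actually uses.

```python
import zlib

def read_mirrored(copies: list, expected_csum: int) -> bytes:
    """Return the first copy matching the checksum and repair the others."""
    for candidate in copies:
        if zlib.crc32(candidate) == expected_csum:
            # Overwrite every copy with the verified one (self-healing).
            for i in range(len(copies)):
                copies[i] = candidate
            return candidate
    raise IOError("no copy matches the stored checksum")

good = b"file data"
stored_csum = zlib.crc32(good)      # checksum kept separately from the data

mirrors = [b"file dat\x00", good]   # first device returns corrupted data
result = read_mirrored(mirrors, stored_csum)

print(result == good, mirrors[0] == good)
```

Because the checksum is stored apart from the data, the reader can tell which copy is valid without knowing which device misbehaved.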

In the case of RAID5/6, however, the file system does not store checksums for parity blocks. Normally the correctness of the data blocks is verified by the fact that each of them has a checksum, and the parity block can be recreated from the data. With a partial write, this approach may fail in certain situations, and when the array is rebuilt, the blocks affected by the incomplete write may be restored incorrectly.

In BTRFS, the problem matters most when a write is smaller than a stripe. In that case the file system must perform a read-modify-write (RMW) operation. If the stripe contains blocks with an incomplete write, the RMW operation can cause corruption that goes undetected regardless of checksums. The developers have changed the RMW operation so that it verifies the checksums of the blocks before proceeding and, if data recovery is needed, also verifies the checksums after the write. Unfortunately, for partial-stripe writes (RMW) this adds overhead for computing checksums, but it significantly increases reliability. For RAID6 this logic is not ready yet, but such a failure in RAID6 requires the write to fail on two devices at once, which is less likely.
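The idea of checking checksums before a partial-stripe update can be sketched as follows. This is a simplified model under stated assumptions, not the kernel implementation: `rmw_write` and `xor_blocks` are hypothetical helpers, the stripe has a single XOR parity block, and `zlib.crc32` stands in for the real data checksum.

```python
import zlib
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rmw_write(data, parity, csums, index, new_block):
    """Replace data[index], verifying the existing data blocks first.

    data and csums are updated in place; the new parity is returned.
    """
    # Step 1: verify every data block that was read from disk.
    bad = [i for i, blk in enumerate(data) if zlib.crc32(blk) != csums[i]]
    if len(bad) > 1:
        raise IOError("more damaged blocks than single parity can repair")
    if bad:
        # Step 2: rebuild the damaged block and check the result as well.
        i = bad[0]
        others = [blk for j, blk in enumerate(data) if j != i]
        rebuilt = xor_blocks(parity, *others)
        if zlib.crc32(rebuilt) != csums[i]:
            raise IOError("rebuilt block fails its checksum")
        data[i] = rebuilt
    # Step 3: apply the write and compute parity from verified data only.
    data[index] = new_block
    csums[index] = zlib.crc32(new_block)
    return xor_blocks(*data)

blocks = [b"AAAA", b"BBBB", b"CCCC"]
sums = [zlib.crc32(b) for b in blocks]
par = xor_blocks(*blocks)

blocks[1] = b"B?BB"                        # simulate a damaged block on disk
par = rmw_write(blocks, par, sums, 0, b"ZZZZ")
print(blocks[1])                           # the damaged block was repaired first
```

Without step 1, the new parity would be computed from the damaged block, and the corruption would be baked into the stripe with no way to detect it later from parity alone.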

/Media reports cited above.