Data replication is a commonly used data redundancy strategy for enterprise cloud platforms. It ensures that even if one or more replicas become abnormal, the storage system can restore them from a healthy replica.
However, mainstream data replication designs cannot avoid the risk of data loss during replica recovery. Until recovery fully completes, the number of replicas in the cluster remains below the expected level, i.e., the replica count is degraded. Consequently, if the healthy replica also fails or unintentionally goes offline during this period, the likelihood of data loss increases greatly.
Arcfra Enterprise Cloud Platform (AECP) supports multiple data redundancy strategies, including data replication and erasure coding (EC). It also enhances the replica strategy with a “temporary replica” mechanism, which prevents the replica count from degrading during replica recovery and ensures higher stability for core business services.
In AECP, the multi-replica strategy ensures that each piece of data has multiple identical replicas, distributed across different devices according to established rules. This strategy mitigates data damage or loss from hardware failures: if a hardware failure causes one or more replicas in the cluster to go offline or become damaged, the remaining healthy replica(s) continue to serve read and write requests while the system regenerates new replicas from the available healthy data. This approach safeguards data integrity and ensures continuous data availability.
Figure 1
Taking the two-replica strategy as an example (Figure 1), the storage volume is partitioned into multiple data blocks, each with two replicas. For instance, data block A’s replicas are stored on node 1 and node 2, respectively. This configuration ensures that even if one server (node) crashes or fails, at least one replica remains accessible.
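To make the placement rule concrete, here is a minimal sketch in Python of how a volume’s data blocks could each be mapped to two distinct nodes. The names (NODES, place_block) and the round-robin rule are illustrative assumptions, not AECP’s actual placement algorithm:

```python
# Illustrative two-replica placement: every data block is assigned to
# two distinct nodes, so a single node failure leaves one copy online.
NODES = ["node-1", "node-2", "node-3"]

def place_block(block_index: int, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick `replicas` distinct nodes for one block (simple round-robin)."""
    start = block_index % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

# A volume partitioned into blocks A, B, C, each stored on two nodes.
for index, name in enumerate("ABC"):
    print(f"block {name}: replicas on {place_block(index, NODES)}")
```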
In AECP, users have the flexibility to choose between two-replica and three-replica policies, which offer different levels of resilience against hardware damage.
In addition to hardware damage, AECP clusters may encounter issues such as hard disk misoperation, accidental server node restarts, and storage network disconnections. These problems can cause temporary hardware disconnections (the hardware can reconnect after a while) and degrade the replica count.
Figure 2
Under a two-replica policy (Figure 2), for example, data A is synchronously written to two replicas. If Replica 2 experiences an abnormal disconnection, it is removed, triggering data recovery to Replica 2'. During this phase, only Replica 1 (the healthy replica) can respond to I/O requests, so the replica count is temporarily degraded. Once Replica 2' is fully rebuilt, it can resume serving write I/O, and the number of available replicas returns to the expected level.
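The sketch below illustrates this write path and what degraded mode means. The Replica class and write_block function are hypothetical stand-ins, not AECP internals:

```python
# Sketch of a synchronous two-replica write path with degraded mode.
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.online = True
        self.data: dict[int, bytes] = {}

def write_block(offset: int, payload: bytes, replicas: list[Replica]) -> None:
    healthy = [r for r in replicas if r.online]
    if not healthy:
        raise IOError("no healthy replica: data is inaccessible")
    # The write acknowledges only after every online replica persists it;
    # with one replica offline it still succeeds, but runs degraded.
    for replica in healthy:
        replica.data[offset] = payload

r1, r2 = Replica("replica-1"), Replica("replica-2")
write_block(0, b"A", [r1, r2])  # normal: both replicas store offset 0
r2.online = False               # abnormal disconnection of Replica 2
write_block(1, b"B", [r1, r2])  # degraded: only replica-1 holds offset 1
```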
If the healthy replica is also damaged and disconnected during the replica recovery process, data loss becomes very likely.
Figure 3
For instance (Figure 3), if Replica 2 is abnormally disconnected, I/O can still be written to Replica 1 while Replica 2 is restored from Replica 1. If Replica 1 also becomes inaccessible during recovery due to hardware damage or other reasons, the restoration cannot proceed, as no healthy replica remains. In this scenario, changes to the data are not written to any replica; with all replicas damaged, the risk of data loss rises significantly.
To prevent the replica count from degrading, AECP introduces the “temporary replica” design. This strategy keeps the number of accessible replicas at the expected level throughout the replica recovery process. Even if the healthy replica also becomes abnormal during this period, data can be restored through a dedicated mechanism (supporting both complete and partial restoration), greatly enhancing data security.
1. Replica recovery with at least one healthy replica
Under the two-replica and three-replica policies, when a single replica becomes abnormal, a temporary replica will be assigned to handle write operations for the new data. Simultaneously, a new replica is created based on the healthy replica.
Figure 4
For example (Figure 4), if Replica 2 becomes disconnected and inaccessible, the system marks it as a failed replica, initiates replica recovery, and allocates a temporary replica. All data generated during recovery is synchronously written to Replica 1 and the temporary replica, so the replica count remains intact for the new data. Meanwhile, a new replica (Replica 2') is created by duplicating the data from Replica 1. Once recovery completes, Replica 2' becomes a new healthy replica, and the failed replica and the temporary replica are deleted from the system.
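The following sketch shows this write path under assumed, illustrative names: new writes land on both the healthy replica and the temporary replica, while a background copy rebuilds Replica 2' from Replica 1:

```python
# Sketch of the temporary-replica mechanism during recovery.
healthy = {0: b"A"}                # Replica 1: full data from before the failure
temporary: dict[int, bytes] = {}   # temporary replica: post-failure writes only
rebuilt: dict[int, bytes] = {}     # Replica 2': being rebuilt from Replica 1

def write_during_recovery(offset: int, payload: bytes) -> None:
    # Every new write keeps two copies, so the replica count is not degraded.
    healthy[offset] = payload
    temporary[offset] = payload

def rebuild_from_healthy() -> None:
    # Background copy: duplicate the healthy replica's data into Replica 2'.
    rebuilt.update(healthy)

write_during_recovery(1, b"B")  # arrives mid-recovery, hits both copies
rebuild_from_healthy()          # Replica 2' now mirrors Replica 1
temporary.clear()               # recovery done: temporary replica is discarded
```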
2. Replica recovery with no healthy replica
If, unfortunately, the last remaining healthy replica also becomes disconnected, the system can merge the failed replica with the temporary replica once the failed replica reconnects (for example, after the host reboots or the network is re-established), forming a complete replica. Note, however, that until this merge completes, the VM cannot respond to I/O requests.
Figure 5
For example (Figure 5), if Replica 2 experiences an abnormal disconnection, the system will automatically initiate data recovery and generate a temporary replica. New data will be written to both Replica 1 and the temporary replica, while Replica 2' is created based on the data from Replica 1.
If an additional fault occurs during replica recovery, no complete replica may be available, leaving data A entirely inaccessible. However, once the failed replica (Replica 2) reconnects, the system can merge its data with the temporary replica’s incremental data, forming a replica with complete data (Replica 3). At this point, data A is accessible again and can accept read and write requests.
The system then reinitiates replica recovery based on the data in Replica 3, producing Replica 1'. After recovery completes, data A once again has two healthy replicas (Replica 3 and Replica 1'). The entire process involves no degradation of the replica count.
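Here is a minimal sketch of the merge step, under the assumption (consistent with the description above, though the on-disk layout is illustrative) that the failed replica holds everything written before its disconnection and the temporary replica holds the incremental writes made during recovery, so their union is a complete replica:

```python
# Sketch of merging a reconnected failed replica with the temporary replica.
failed_replica = {0: b"A0", 1: b"A1"}  # Replica 2: data up to its disconnection
temp_replica = {1: b"B1", 2: b"B2"}    # incremental writes made during recovery

def merge(failed: dict[int, bytes], temp: dict[int, bytes]) -> dict[int, bytes]:
    """Combine pre-failure data with incremental writes into one full replica."""
    merged = dict(failed)
    merged.update(temp)  # incremental writes are newer, so they take precedence
    return merged

replica_3 = merge(failed_replica, temp_replica)  # complete replica of data A
print(replica_3)  # {0: b'A0', 1: b'B1', 2: b'B2'} -> data A is accessible again
```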
The temporary replica is mainly designed to address the replica-count degradation caused by short-term hardware disconnections. To cope with unrecoverable hardware failures, such as disk damage or multiple simultaneous hardware failures, it is recommended to use a higher-level replica strategy (e.g., a three-replica policy) for data protection.
Creating temporary replicas occupies additional storage space, which is automatically reclaimed once replica recovery completes.
Explore more AECP features and capabilities in our previous blogs:
Arcfra Virtual Machine High Availability Explained
Arcfra vs. VMware: VM Snapshot and I/O Performance Comparison
Arcfra vs. VMware: I/O Path Comparison and Performance Impact
For more information on AECP, please visit our website.
Arcfra is an IT innovator that simplifies on-premises enterprise cloud infrastructure with its full-stack, software-defined platform. In the cloud and AI era, we help enterprises effortlessly build robust on-premises cloud infrastructure from bare metal, offering computing, storage, networking, security, backup, disaster recovery, Kubernetes service, and more in one stack. Our streamlined design supports both virtual machines and containers, ensuring a future-proof infrastructure.
For more information, please visit www.arcfra.com.