High Availability, High Efficiency: Meet Arcfra File Storage

2024-12-19

Arcfra Team

As a full-stack software-defined infrastructure, Arcfra Enterprise Cloud Platform (AECP) provides integrated infrastructure resources with one unified platform. One of its components, Arcfra File Storage (AFS), offers stable, high-performance, and scalable distributed file storage services, to help enterprises efficiently store and manage unstructured data like text, images, and videos. With AFS, enterprises can meet diverse data storage needs with one AECP cluster, reducing the costs associated with legacy NAS procurement while boosting storage reliability and performance.

Product Architecture

AFS is deployed within the Arcfra Cloud Operating System (ACOS, AECP’s software foundation) cluster, operating as containers in the file controller. It utilizes Arcfra Block Storage (ABS) to provide underlying persistent block storage services, which can be managed uniformly through the Arcfra Operation Center.

Currently, AFS supports the NFS protocol and offers file storage services by mounting file systems on application systems and external clients, so users can store and manage unstructured data such as text, images, and videos simply and efficiently.
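As a rough illustration of the client side, the sketch below assembles a standard Linux `mount -t nfs` command for an exported file system. The access IP, export path, and mount point are hypothetical placeholders, and the helper name is invented for this example; AFS itself does not prescribe this script.

```python
import shlex


def build_nfs_mount_command(server, export_path, mount_point, options=None):
    """Return the argv for mounting an NFS export on a Linux client."""
    argv = ["mount", "-t", "nfs"]
    if options:
        # mount(8) takes options as one comma-separated string after -o.
        argv += ["-o", ",".join(options)]
    argv += [f"{server}:{export_path}", mount_point]
    return argv


# Hypothetical access IP and export path; substitute the values of your
# own file system.
command = build_nfs_mount_command("192.0.2.10", "/afs/share1", "/mnt/afs")
print(shlex.join(command))
```

The command is only printed here; running it requires root privileges and an NFS client package on the host.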

Key Features

01 Ensure Business Continuity with Data High Availability

With production-grade high availability features such as hard disk data checksum, multi-replica redundancy, erasure coding (EC), client access HA, and rack awareness, AFS can address a wide range of issues, including file controller failures, hard drive failures, node failures, and rack failures, ensuring business continuity and data reliability.

Data Redundancy Between Physical Nodes

AFS allows users to choose from replica and EC strategies for data redundancy protection.

1. Replica Strategy

Users can choose a two-replica or three-replica strategy, with data copies placed on different physical nodes to avoid data loss caused by hardware failure. If a node fails, the system automatically triggers data recovery for unwritten data blocks and rebuilds the data copies on healthy nodes, with no manual intervention and no additional space reserved for hot backup.

2. EC Strategy

EC saves storage space by calculating M parity blocks for every K data blocks, eliminating the need to store complete data copies. If no more than M blocks are lost or corrupted, the affected data can be reconstructed from any K of the remaining data and parity blocks.
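To make the reconstruction idea concrete, here is a minimal sketch of the simplest possible case: K = 3 data blocks protected by a single XOR parity block (M = 1). This is an illustration only. The article does not state which coding scheme AFS uses, and schemes with M > 1 need more elaborate codes than plain XOR.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block (M = 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, lost_index):
    """Rebuild the block at lost_index from the K surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"unst", b"ruct", b"ured"]   # K = 3 data blocks
stripe = encode(data)                # K + M = 4 blocks in total
recovered = reconstruct(stripe, lost_index=1)
assert recovered == data[1]
```

Any one of the four blocks, including the parity block itself, can be rebuilt from the other three.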

Compared with the replica strategy, EC can significantly reduce storage space consumption and improve storage efficiency at the same fault tolerance level. This makes it ideal for storing large volumes of data with relatively lower performance requirements, such as log servers and backup archives.
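The space saving can be worked out with simple arithmetic. The sketch below compares three-way replication with a hypothetical EC 4+2 layout; both tolerate two failures. The 4+2 parameters and the 100 TiB figure are chosen for illustration and are not taken from AFS documentation.

```python
def replica_profile(copies):
    """Raw-to-usable ratio and tolerated failures for N-way replication."""
    return {"raw_per_usable": float(copies), "tolerated_failures": copies - 1}


def ec_profile(k, m):
    """Raw-to-usable ratio and tolerated failures for K data + M parity blocks."""
    return {"raw_per_usable": (k + m) / k, "tolerated_failures": m}


usable_tib = 100  # hypothetical amount of user data
for label, profile in [("3 replicas", replica_profile(3)),
                       ("EC 4+2", ec_profile(4, 2))]:
    raw_tib = usable_tib * profile["raw_per_usable"]
    print(f"{label}: tolerates {profile['tolerated_failures']} failures, "
          f"needs {raw_tib:.0f} TiB raw for {usable_tib} TiB usable")
```

Under these assumptions, replication needs three times the usable capacity in raw space, while EC 4+2 needs one and a half times.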

In AECP, users can configure EC for ABS and AFS separately, and flexibly choose between replica or EC for data redundancy protection.

Client Access High Availability

When a file controller fails, HA protection first attempts to restart it on the current node. If the controller cannot be restarted on its physical server node, the corresponding access IP drifts to another healthy file controller so that application access to the file system is not interrupted. New data written by the client during this period triggers copy-on-write (COW). After the file controller is restored, the access IP is migrated back to the original controller node to rebalance the overall access load.
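The recovery order described above (local restart first, then IP drift, then failback) can be sketched as a small simulation. All class and function names are hypothetical, and choosing the least-loaded healthy controller as the drift target is an assumption made for this example, not documented AFS behavior.

```python
from dataclasses import dataclass, field


@dataclass
class FileController:
    name: str
    healthy: bool = True
    access_ips: set = field(default_factory=set)


def handle_failure(failed, controllers, restart_locally):
    """Try a local restart; otherwise drift access IPs to healthy peers."""
    failed.healthy = False
    if restart_locally(failed):
        failed.healthy = True
        return {}  # recovered in place, no IP moved
    healthy = [c for c in controllers if c.healthy]
    if not healthy:
        raise RuntimeError("no healthy file controller available")
    moved = {}
    for ip in sorted(failed.access_ips):
        # Assumption: pick the peer currently holding the fewest access IPs.
        target = min(healthy, key=lambda c: len(c.access_ips))
        target.access_ips.add(ip)
        moved[ip] = target
    failed.access_ips.clear()
    return moved


def fail_back(original, moved):
    """Return drifted IPs to the restored controller to rebalance load."""
    original.healthy = True
    for ip, holder in moved.items():
        holder.access_ips.discard(ip)
        original.access_ips.add(ip)


a = FileController("fc-a", access_ips={"10.0.0.11"})
b = FileController("fc-b", access_ips={"10.0.0.12"})
c = FileController("fc-c", access_ips={"10.0.0.13"})
moved = handle_failure(a, [a, b, c], restart_locally=lambda fc: False)
fail_back(a, moved)
```

Clients keep using the same access IP throughout, which is why the drift is transparent to mounted applications.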

02 Multi-Level Performance Optimization

AFS also enhances storage performance through a variety of optimization strategies along the write path from the client to the physical disk.

Data Access: Prioritizing Local File Controllers for File Placement to Reduce Lateral Traffic

When a file system is created, a virtual disk is created on each file controller. Files are written preferentially to the local virtual disk of the access point, which reduces lateral traffic and enhances file I/O performance.
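A local-first placement policy of this kind might look like the following sketch. The function name, the free-space check, and the fallback to the remote disk with the most free space are all assumptions for illustration; the article only states that the access point's local virtual disk is preferred.

```python
def choose_virtual_disk(access_point, free_bytes, file_size):
    """Pick the controller whose virtual disk should hold a new file.

    free_bytes maps controller name -> free space on its virtual disk.
    """
    # Prefer the virtual disk local to the controller the client is using,
    # so the write does not cross the network to another controller.
    if free_bytes.get(access_point, 0) >= file_size:
        return access_point
    # Assumed fallback: the remote disk with the most free space.
    candidates = {name: free for name, free in free_bytes.items()
                  if free >= file_size}
    if not candidates:
        raise RuntimeError("no virtual disk has enough free space")
    return max(candidates, key=candidates.get)


free = {"fc-a": 100, "fc-b": 500, "fc-c": 300}
local_choice = choose_virtual_disk("fc-a", free, file_size=50)
remote_choice = choose_virtual_disk("fc-a", free, file_size=200)
```

In this example the small file stays on `fc-a`, while the larger one is redirected because the local disk lacks space.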

Data Persistence: Optimizing I/O Path with I/O Locality and Boost Mode

With AECP’s I/O locality feature, data replicas are placed preferentially on the host where the file controller runs, reducing I/O request latency. With Boost Mode, the ABS Chunk can access the file controller’s Guest OS memory directly, bypassing the performance bottleneck caused by QEMU processing of I/O requests and significantly improving performance.

In a current performance test (1M sequential read/write of 12GB files, based on the two-replica strategy), AFS delivered 8 GiB/s sequential read and 5 GiB/s sequential write, which satisfies the access requirements of high-bandwidth file services.

03 Simple and Flexible O&M

Arcfra Operation Center (AOC) provides a unified GUI that allows users to perform one-click deployment, deletion, scaling-out, upgrading, and other lifecycle management operations for file storage clusters. Users can also manage clients’ and users’ access to file systems and files through NFS authentication, and achieve comprehensive and efficient file storage management through observability and alerting features.

Use Cases

01 Data Center Integration

AECP provides high-performance block storage and file storage with a single set of resource pools, satisfying the diverse storage needs of different business services and helping customers build a more streamlined IT infrastructure with lower TCO.

For applications requiring high performance and low latency, such as financial transaction databases and online payment systems, users can pin volumes in the cache so that key application data is always served from high-performance hardware. Applications handling unstructured data can mount a file system with a maximum capacity of 1 PiB, eliminating the need to use several volumes. Moreover, AFS can provide file storage services for container applications using the Kubernetes NFS CSI Driver, helping enterprises accelerate application containerization.
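For the container scenario, the sketch below generates a statically provisioned PersistentVolume manifest as JSON, which `kubectl apply -f` accepts. It assumes that "Kubernetes NFS CSI Driver" refers to the upstream csi-driver-nfs project, whose driver name is `nfs.csi.k8s.io` and which takes `server` and `share` volume attributes. The server address, share path, volume name, and capacity are hypothetical.

```python
import json


def nfs_csi_persistent_volume(name, server, share, capacity):
    """Build a PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "csi": {
                # Assumed driver: upstream csi-driver-nfs.
                "driver": "nfs.csi.k8s.io",
                # Must be unique per share within the cluster.
                "volumeHandle": f"{server}{share}#{name}",
                "volumeAttributes": {"server": server, "share": share},
            },
        },
    }


# Hypothetical access IP and export path.
pv = nfs_csi_persistent_volume("afs-share1", "192.0.2.10", "/afs/share1", "100Gi")
print(json.dumps(pv, indent=2))
```

A PersistentVolumeClaim bound to this volume can then be mounted by multiple pods at once, since NFS supports the `ReadWriteMany` access mode.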

02 PACS Image & Large Capacity Resource Pool

With AFS, AECP can provide large-capacity, high-performance, and stable distributed storage and computing resources for imaging, ticketing, audio, and video applications. For example, healthcare institutions can use AECP to support Picture Archiving and Communication System (PACS), providing high-performance storage for images needed for recent diagnosis and treatment, as well as storage with balanced performance and capacity for images stored for a long period of time.

Moreover, applications can flexibly allocate resources, utilizing GPU passthrough and sharing for intelligent analysis. Container management capabilities provided by Arcfra Kubernetes Engine (AKE) further facilitate users’ application containerization process. Overall, AECP can help customers maximize the cost-effectiveness of IT systems with a simpler architecture, lower procurement costs, and higher application performance.

03 Large and Medium Backup Resource Pool

No separate backup server hardware is needed. Backup management services and media servers can run directly on file-based, high-capacity hyper-converged nodes, which can also support other applications such as desensitization, verification, and emergency response. A single AECP node can provide over 1GiB of write performance and a recovery speed of more than 2TiB per hour.

04 Integrated Distributed Disaster Recovery

Arcfra also offers an integrated distributed disaster recovery solution in collaboration with CDP vendors. This solution provides data protection with an RPO measured in seconds for Arcfra virtual machines, VMware virtual machines, and physical nodes. It also provides the necessary computing power and storage space for backup and emergency recovery.

Overall Advantages

Reliable and scalable: Provides high availability through both block storage HA features and file controller HA protection. Supports scaling out at the node level and the file controller level.
Flexible and open: Works with mainstream operating systems and serves container applications through the Kubernetes NFS CSI Driver. Provides enterprise-grade, high-performance, and highly reliable file storage services for clients both inside and outside the ACOS cluster from a single storage resource pool.
Convenient O&M: Supports one-click deployment, upgrade, and scaling out. A file storage cluster can be deployed within 10 minutes. Visualization and alerting capabilities further enhance O&M efficiency.
Optimized investment: Eliminates the need for separate NAS storage. Scales online to match capacity and performance requirements.

To learn more about AFS, please visit our website.

About Arcfra

Arcfra simplifies enterprise cloud infrastructure with a full-stack, software-defined platform built for the AI era. We deliver computing, storage, networking, security, Kubernetes, and more — all in one streamlined solution. Supporting VMs, containers, and AI workloads, Arcfra offers future-proof infrastructure trusted by enterprises across e-commerce, finance, and manufacturing. Arcfra is recognized by Gartner as a Representative Vendor in full-stack hyperconverged infrastructure. Learn more at www.arcfra.com.