Low-latency market data feeds and trading systems have become essential for the core operations of securities institutions, including high-frequency trading (HFT), quantitative investment, and intelligent risk control. With only microsecond or even nanosecond-level latency, these systems enable institutions to capture opportunities and execute orders instantly during rapid market fluctuations. Consequently, they directly impact trading efficiency, execution pricing, and overall business profitability.
To achieve the necessary speed and performance, securities institutions typically deploy high-frequency physical servers, low-latency network interface cards (NICs), and low-latency switches. However, this high-end hardware is not only expensive but also often leaves computing and network resources underutilized. Furthermore, the significant physical footprint required in hosted data centers drives up the Total Cost of Ownership (TCO). Consequently, there is an urgent need to explore the feasibility of deploying low-latency systems on virtualized platforms, despite common concerns that such systems face significant performance hurdles on VMs due to added abstraction layers and resource sharing.
Recently, a securities institution partnered with Arcfra to validate the performance of the Arcfra Enterprise Cloud Platform (AECP) — powered by the Arcfra Virtualization Engine (AVE) — in supporting its low-latency workloads. The test focused on market data reception and trade execution. The results demonstrate that both systems’ performance meets rigorous business requirements. Notably, the AVE-based trading NIC latency achieved levels comparable to those of physical servers.
The securities institution uses low-latency systems for quantitative trading. Its production environment employs high-frequency physical servers and Solarflare low-latency NICs to meet business requirements. For market data reception, the institution uses UDP multicast to receive the upstream FPGA-based market data feed, enabling faster data decoding. For the trading link, OpenOnload is deployed for single-side (unilateral) acceleration.
OpenOnload is a high-performance, user-level network stack that accelerates TCP and UDP network I/O for applications using the BSD socket API. Because OpenOnload is built on kernel-bypass technology, KVM-based virtualization should, in theory, introduce little additional latency in the VM guest OS, making it possible to achieve high resource utilization without significant performance loss.
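To make the pattern concrete, below is a minimal sketch of a UDP multicast receiver written against the plain BSD socket API; the multicast group, port, and interface address are hypothetical placeholders. Because OpenOnload intercepts standard socket calls, a receiver like this can typically be moved onto the kernel-bypass path without source changes, for example by launching it through the onload wrapper.

```c
/* Minimal UDP multicast receiver sketch using the plain BSD socket API.
 * The group address, port, and interface IP below are hypothetical placeholders.
 * Because OpenOnload intercepts standard socket calls, the same binary can run
 * on the kernel stack or under kernel bypass without source changes. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    const char *group = "239.1.1.1";    /* hypothetical feed group    */
    const char *local = "192.168.10.2"; /* hypothetical NIC address   */
    const int   port  = 30001;          /* hypothetical feed port     */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int reuse = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind"); return 1;
    }

    /* Join the multicast group on the low-latency NIC. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr(group);
    mreq.imr_interface.s_addr = inet_addr(local);
    if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0) {
        perror("IP_ADD_MEMBERSHIP"); return 1;
    }

    char buf[2048];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0) { perror("recv"); break; }
        /* Decode the market data message here. */
    }
    close(fd);
    return 0;
}
```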
Accordingly, the institution planned to evaluate the performance of quantitative trading in a virtualized environment. A test environment built on “AMD CPUs + 10Gb Ethernet + AVE” was deployed to conduct a comparative analysis against the bare-metal production environment.
As both market data acquisition and trading are critical for low-latency systems, we conducted two tests, with the market data and trading networks connected to the same physical upstream links used by the production environment.


The test servers used AMD EPYC 9554 CPUs and were equipped with four Solarflare NICs for PCI passthrough, providing eight network ports in total. Each VM was configured with 24 dedicated cores, 128GB of memory, two PCI passthrough NICs, and one virtio NIC, and each host was planned to run four VMs. The hardware configuration of the test environment was not fully uniform; the specifics were as follows:
| Device | Low-latency NIC Configuration | Purpose |
| --- | --- | --- |
| x86 Server 1 (64-core × 2) | SolarFlare X2522 × 5 | Market data feed delivered via PCI passthrough NICs to 4 VMs; live testing of institutional trading operations |
| x86 Server 2 (64-core × 2) | SolarFlare X2522 × 3 | Market data feed delivered via SR-IOV NICs to multiple VMs; intended for live trading validation by the technical department |
| x86 Server 3 (24-core × 1) | Software-only | Meets the three-node cluster minimum |
| HCI software (ACOS, with vhost enabled) | — | AECP cluster deployment |
| Low-latency network | — | Intranet access for VMs |
| Market data system switch | — | FPGA market data access for VMs |
| Storage switch | — | Storage network access for the hyper-converged infrastructure |
The test environment deployed a three-node AECP cluster (based on AVE). Each test VM was configured with two PCI passthrough NICs or SR-IOV NICs to connect to two exchanges’ market data feeds. In addition, one SR-IOV NIC was configured for the trading path, and one virtual NIC was used for VM management. Two NIC passthrough modes, full PCI passthrough and SR-IOV, were evaluated in the tests, primarily to address different business requirements.
To further minimize latency, we enabled the NUMA affinity scheduling feature in AECP. For VMs with CPU pinning (dedicated cores), vCPUs are allocated according to a strict priority hierarchy: Same NUMA Node > Same Socket > Balanced distribution across a minimum number of sockets. Additionally, during VM boot-up, memory is prioritized to be allocated from the local NUMA node to mitigate the latency overhead associated with cross-node memory access.
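As an illustration of the same principle from inside a guest OS, the sketch below pins a latency-critical thread to one vCPU and allocates its working buffer from that vCPU’s NUMA node using libnuma. The CPU number is a hypothetical placeholder, and this is not part of AECP itself; it simply shows why local allocation avoids cross-node memory latency.

```c
/* Sketch of NUMA-aware placement inside a guest: pin the latency-critical
 * thread to one CPU and allocate its working buffer on that CPU's NUMA node.
 * The CPU number is a hypothetical placeholder; build with -lnuma. */
#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int cpu = 2;                        /* hypothetical dedicated vCPU     */
    int node = numa_node_of_cpu(cpu);   /* NUMA node backing that vCPU     */
    if (node < 0) {
        fprintf(stderr, "unknown NUMA node for CPU %d\n", cpu);
        return 1;
    }

    /* Pin the current thread to the chosen CPU. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* Allocate the hot-path buffer on the same node so accesses stay local. */
    size_t len = 64 * 1024 * 1024;
    void *buf = numa_alloc_onnode(len, node);
    if (!buf) {
        perror("numa_alloc_onnode");
        return 1;
    }
    printf("thread pinned to CPU %d, buffer allocated on NUMA node %d\n", cpu, node);

    numa_free(buf, len);
    return 0;
}
```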
To ensure the stable operation of production applications and maintain redundancy for extreme market volatility, market data reception was first tested using PCI passthrough. The securities institution deployed three low-latency VMs in the data center’s pilot environment for live trading trials. Continuous testing has been conducted since January 2025 to evaluate long-term throughput stability and latency performance. Over the past six months, the system has processed daily live FPGA-based market data feeds with zero packet loss.
After the low-latency market data system was adapted for SR-IOV, the institution conducted further testing using SR-IOV NICs, including 10x- and 20x-speed market data simulations. Based on a single-day stock market trading volume of 97.43 billion USD, the setup, in which four VMs shared a single NIC, demonstrated zero packet loss during the 10x-speed simulation (roughly equivalent to 860.98 billion USD in unilateral trading data), thereby exceeding the redundancy requirement of three times the historical peak.
Furthermore, during the 20x-speed simulation, market data reception remained stable with zero packet loss despite instantaneous traffic exceeding 1 Gb/s, indicating that the architecture effectively meets business requirements under peak load conditions.
* The primary objective of this test was to verify whether multiple VMs sharing a single physical NIC can sustain market data reception under peak traffic conditions, specifically meeting the redundancy requirement of three times the historical peak trading volume. On the source side, FPGA-based market data for the stock market was replayed at 10x and 20x speeds, while four client VMs used SR-IOV NICs to receive the FPGA market data concurrently.
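For context on how zero packet loss can be verified during such replays, the sketch below counts gaps in a monotonically increasing sequence number, assuming a hypothetical message layout in which the first 8 bytes of each datagram carry that sequence. This is one common receiver-side approach, not necessarily the exact method used in the test.

```c
/* Sketch of receiver-side loss accounting for a multicast feed replay,
 * assuming a hypothetical message layout whose first 8 bytes hold a
 * monotonically increasing sequence number in host byte order. Any gap
 * in the sequence is counted as lost packets. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct loss_stats {
    uint64_t expected_seq;  /* next sequence number we expect       */
    uint64_t received;      /* datagrams seen                       */
    uint64_t lost;          /* datagrams missing from the sequence  */
    int      started;       /* set after the first datagram is seen */
};

/* Call once per received datagram. Returns the gap detected (0 if none). */
static uint64_t loss_stats_update(struct loss_stats *s,
                                  const void *payload, size_t len) {
    if (len < sizeof(uint64_t))
        return 0;                        /* ignore runt packets */

    uint64_t seq;
    memcpy(&seq, payload, sizeof(seq));  /* hypothetical header field */
    s->received++;

    if (!s->started) {
        s->started = 1;
        s->expected_seq = seq + 1;
        return 0;
    }

    uint64_t gap = 0;
    if (seq > s->expected_seq)
        gap = seq - s->expected_seq;     /* these packets never arrived */
    s->lost += gap;
    s->expected_seq = seq + 1;
    return gap;
}

int main(void) {
    /* Simulated sequence with one gap: 100, 101, 103 (102 is missing). */
    struct loss_stats s = {0};
    uint64_t seqs[] = {100, 101, 103};
    for (size_t i = 0; i < sizeof(seqs) / sizeof(seqs[0]); i++)
        loss_stats_update(&s, &seqs[i], sizeof(seqs[i]));
    printf("received=%llu lost=%llu\n",
           (unsigned long long)s.received, (unsigned long long)s.lost);
    return 0;
}
```

In a real deployment this update routine would be called from the receive loop, with the counters exported to monitoring so that zero loss can be confirmed over a full trading day.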
In the latency evaluation, standard ping-pong measurements were conducted between two SR-IOV-enabled VMs connected via a single low-latency switch. The results demonstrated a round-trip latency of under 2 microseconds, achieving performance parity with bare-metal servers of the same hardware configuration.
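A ping-pong measurement of this kind can be approximated with a small TCP client that sends a fixed-size message, waits for the echo, and averages the round-trip time over many iterations. The sketch below illustrates the pattern; the peer address, port, message size, and iteration count are hypothetical placeholders, and an echoing peer is assumed on the other VM.

```c
/* Minimal TCP ping-pong client sketch: send a small message, wait for the
 * echo, and average the round-trip time with CLOCK_MONOTONIC. The server is
 * assumed to echo each message back; address, port, message size, and
 * iteration count are hypothetical placeholders. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    const char *server_ip = "192.168.20.2"; /* hypothetical peer VM */
    const int   port  = 40001;
    const int   iters = 100000;
    char msg[64] = {0};                      /* small latency-probe message */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int one = 1;                             /* avoid Nagle batching */
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, server_ip, &addr.sin_addr);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect"); return 1;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        if (send(fd, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg)) {
            perror("send"); return 1;
        }
        ssize_t got = 0;                     /* wait for the full echo */
        while (got < (ssize_t)sizeof(msg)) {
            ssize_t n = recv(fd, msg + got, sizeof(msg) - got, 0);
            if (n <= 0) { perror("recv"); return 1; }
            got += n;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double total_us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                      (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("average round-trip: %.3f us over %d iterations\n",
           total_us / iters, iters);
    close(fd);
    return 0;
}
```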
During the throughput evaluation, both TCP and UDP tests were conducted, with results fully satisfying all business requirements. In the iperf3 TCP traffic test, the system exhibited zero packet loss under a sustained load of 2 Gb/s. Given that client strategy machines prioritize latency performance over bandwidth capacity, a stable 2 Gb/s throughput with zero packet loss is more than sufficient for VM operations.
Through these comprehensive evaluations and six months of production trials, AECP has fully demonstrated its capability to support low-latency market data reception and trading systems. Notably, in the trading test, network latency in the virtualized environment was on par with that of bare-metal servers, effectively meeting all production requirements.
Unlike traditional back-office IT operations, O&M services for hosted quantitative trading are directly client-facing and mission-critical. Running low-latency workloads on VMs can significantly improve user experience by enabling T+0 post-market host deployment, reducing operational windows to minutes, and ensuring high-quality, consistent delivery through standardized VM templates.
Currently, hosted data centers operate almost entirely on physical hardware, where equipment deployment, decommissioning, and migration rely entirely on manual on-site intervention. After validating the performance of VM-based low-latency systems, securities institutions can implement a phased cloud transformation.
By utilizing low-latency VMs for client strategy machine deployment, the platform reduces both cost and power consumption by more than 50% compared with bare-metal servers.
Both hosted data centers of the securities institution encountered resource constraints in the past. Embracing the virtualized environment not only increases the scalability of the infrastructure, but also enables O&M personnel to rapidly deploy, modify, or decommission virtual client strategy machines and provide low-cost standby instances, allowing the institution to respond to external uncertainties more effectively.
For more information on AVE and related features:
Arcfra Virtualization Engine vs. VMware vSphere: Higher Availability with Optimized Performance
Arcfra simplifies enterprise cloud infrastructure with a full-stack, software-defined platform built for the AI era. We deliver computing, storage, networking, security, Kubernetes, and more — all in one streamlined solution. Supporting VMs, containers, and AI workloads, Arcfra offers future-proof infrastructure trusted by enterprises across e-commerce, finance, and manufacturing. Arcfra is recognized by Gartner as a Representative Vendor in full-stack hyperconverged infrastructure. Learn more at www.arcfra.com.