Low-latency market data feeds and trading systems have become essential for the core operations of securities institutions, including high-frequency trading (HFT), quantitative investment, and intelligent risk control. With only microsecond or even nanosecond-level latency, these systems enable institutions to capture opportunities and execute orders instantly during rapid market fluctuations. Consequently, they directly impact trading efficiency, execution pricing, and overall business profitability.
To achieve the necessary speed and performance, securities institutions typically deploy high-frequency physical servers, low-latency network interface cards (NICs), and low-latency switches. However, this high-end hardware is not only expensive but also often leaves computing and network resources underutilized. Furthermore, the significant physical footprint required in hosted data centers drives up the Total Cost of Ownership (TCO). Consequently, there is an urgent need to explore the feasibility of deploying low-latency systems on virtualized platforms, despite common concerns that such systems face significant performance hurdles on VMs due to added abstraction layers and resource sharing.
Recently, a securities institution partnered with Arcfra to validate the performance of the Arcfra Enterprise Cloud Platform (AECP) — powered by the Arcfra Virtualization Engine (AVE) — in supporting its low-latency workloads. The test focused on market data reception and trade execution. The results demonstrate that both systems’ performance meets rigorous business requirements. Notably, the AVE-based trading NIC latency achieved levels comparable to those of physical servers.
The securities institution uses low-latency systems for quantitative trading. Its production environment employs high-frequency physical servers and Solarflare low-latency NICs to meet business requirements. For market data reception, the institution uses UDP multicast to receive the upstream FPGA-based market data feed, enabling faster data decoding. For the trading link, OpenOnload is deployed for single-side (unilateral) acceleration.
OpenOnload is a high-performance, user-level network stack that accelerates TCP and UDP network I/O for applications using the BSD socket API. Because OpenOnload is built on kernel-bypass technology, KVM-based virtualization should, in theory, introduce little additional latency in the VM guest OS, making it possible to achieve high resource utilization without significant performance loss.
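To make the pattern concrete, below is a minimal sketch of a UDP multicast receiver written against the plain BSD socket API; the multicast group, port, and interface address are hypothetical placeholders. Because OpenOnload intercepts standard socket calls, a receiver like this can typically be moved onto the kernel-bypass path without source changes, for example by launching it through the onload wrapper.

```c
/* Minimal UDP multicast receiver sketch using the plain BSD socket API.
 * The group address, port, and interface IP below are hypothetical placeholders.
 * Because OpenOnload intercepts standard socket calls, the same binary can run
 * on the kernel stack or under kernel bypass without source changes. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    const char *group = "239.1.1.1";    /* hypothetical feed group    */
    const char *local = "192.168.10.2"; /* hypothetical NIC address   */
    const int   port  = 30001;          /* hypothetical feed port     */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int reuse = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind"); return 1;
    }

    /* Join the multicast group on the low-latency NIC. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr(group);
    mreq.imr_interface.s_addr = inet_addr(local);
    if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0) {
        perror("IP_ADD_MEMBERSHIP"); return 1;
    }

    char buf[2048];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0) { perror("recv"); break; }
        /* Decode the market data message here. */
    }
    close(fd);
    return 0;
}
```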
Accordingly, the institution planned to evaluate the performance of quantitative trading in a virtualized environment. A test environment built on “AMD CPUs + 10Gb Ethernet + AVE” was deployed to conduct a comparative analysis against the bare-metal production environment.
As both market data acquisition and trading are critical for low-latency systems, we conducted two tests, with the market data and trading networks connected to the same physical upstream links used by the production environment.


The test servers used AMD EPYC 9554 CPUs and were equipped with four Solarflare NICs for PCI passthrough, providing eight network ports in total. Each VM was configured with 24 dedicated cores, 128GB of memory, two PCI passthrough NICs, and one virtio NIC, and each host was planned to run four VMs. The hardware configuration of the test environment was not fully uniform; the specifics were as follows:
| Device | Low-latency NIC Configuration | Purpose |
| --- | --- | --- |
| x86 Server 1 (64-core × 2) | SolarFlare X2522 × 5 | Market data feed delivered via PCI passthrough NICs to 4 VMs; live testing of institutional trading operations |
| x86 Server 2 (64-core × 2) | SolarFlare X2522 × 3 | Market data feed delivered via SR-IOV NICs to multiple VMs; intended for live trading validation by the technical department |
| x86 Server 3 (24-core × 1) | Software-only | Meets the three-node cluster minimum |
| HCI software (ACOS, with vhost enabled) | — | AECP cluster deployment |
| Low-latency network | — | Intranet access for VMs |
| Market data system switch | — | FPGA market data access for VMs |
| Storage switch | — | Storage network access for the hyper-converged infrastructure |
The test environment deployed a three-node AECP cluster (based on AVE). Each test VM was configured with two PCI passthrough NICs or SR-IOV NICs to connect to two exchanges’ market data feeds. In addition, one SR-IOV NIC was configured for the trading path, and one virtual NIC was used for VM management. Two NIC passthrough modes, full PCI passthrough and SR-IOV, were evaluated in the tests, primarily to address different business requirements.
To further minimize latency, we enabled the NUMA affinity scheduling feature in AECP. For VMs with CPU pinning (dedicated cores), vCPUs are allocated according to a strict priority hierarchy: Same NUMA Node > Same Socket > Balanced distribution across a minimum number of sockets. Additionally, during VM boot-up, memory is prioritized to be allocated from the local NUMA node to mitigate the latency overhead associated with cross-node memory access.
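As an illustration of the same principle from inside a guest OS, the sketch below pins a latency-critical thread to one vCPU and allocates its working buffer from that vCPU’s NUMA node using libnuma. The CPU number is a hypothetical placeholder, and this is not part of AECP itself; it simply shows why local allocation avoids cross-node memory latency.

```c
/* Sketch of NUMA-aware placement inside a guest: pin the latency-critical
 * thread to one CPU and allocate its working buffer on that CPU's NUMA node.
 * The CPU number is a hypothetical placeholder; build with -lnuma. */
#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int cpu = 2;                        /* hypothetical dedicated vCPU     */
    int node = numa_node_of_cpu(cpu);   /* NUMA node backing that vCPU     */
    if (node < 0) {
        fprintf(stderr, "unknown NUMA node for CPU %d\n", cpu);
        return 1;
    }

    /* Pin the current thread to the chosen CPU. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* Allocate the hot-path buffer on the same node so accesses stay local. */
    size_t len = 64 * 1024 * 1024;
    void *buf = numa_alloc_onnode(len, node);
    if (!buf) {
        perror("numa_alloc_onnode");
        return 1;
    }
    printf("thread pinned to CPU %d, buffer allocated on NUMA node %d\n", cpu, node);

    numa_free(buf, len);
    return 0;
}
```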
To ensure the stable operation of production applications and maintain redundancy for extreme market volatility, market data reception was first tested using PCI passthrough. The securities institution deployed three low-latency VMs in the data center’s pilot environment for live trading trials. Continuous testing has been conducted since January 2025 to evaluate long-term throughput stability and latency performance. Over the past six months, the system has processed daily live FPGA-based market data feeds with zero packet loss.
After the low-latency market data system was adapted for SR-IOV, the institution conducted further testing using SR-IOV NICs, including 10x- and 20x-speed market data simulations. Based on a single-day stock market trading volume of 97.43 billion USD, the setup, in which four VMs shared a single NIC, demonstrated zero packet loss during the 10x-speed simulation (roughly equivalent to 860.98 billion USD in unilateral trading data), thereby exceeding the redundancy requirement of three times the historical peak.
Furthermore, during the 20x-speed simulation, market data reception remained stable with zero packet loss despite instantaneous traffic exceeding 1 Gb/s, indicating that the architecture effectively meets business requirements under peak load conditions.
* The primary objective of this test was to verify whether multiple VMs sharing a single physical NIC can sustain market data reception under peak traffic conditions, specifically meeting the redundancy requirement of three times the historical peak trading volume. On the source side, FPGA-based market data for the stock market was replayed at 10x and 20x speeds, while four client VMs used SR-IOV NICs to receive the FPGA market data concurrently.
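For context on how zero packet loss can be verified during such replays, the sketch below counts gaps in a monotonically increasing sequence number, assuming a hypothetical message layout in which the first 8 bytes of each datagram carry that sequence. This is one common receiver-side approach, not necessarily the exact method used in the test.

```c
/* Sketch of receiver-side loss accounting for a multicast feed replay,
 * assuming a hypothetical message layout whose first 8 bytes hold a
 * monotonically increasing sequence number in host byte order. Any gap
 * in the sequence is counted as lost packets. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct loss_stats {
    uint64_t expected_seq;  /* next sequence number we expect       */
    uint64_t received;      /* datagrams seen                       */
    uint64_t lost;          /* datagrams missing from the sequence  */
    int      started;       /* set after the first datagram is seen */
};

/* Call once per received datagram. Returns the gap detected (0 if none). */
static uint64_t loss_stats_update(struct loss_stats *s,
                                  const void *payload, size_t len) {
    if (len < sizeof(uint64_t))
        return 0;                        /* ignore runt packets */

    uint64_t seq;
    memcpy(&seq, payload, sizeof(seq));  /* hypothetical header field */
    s->received++;

    if (!s->started) {
        s->started = 1;
        s->expected_seq = seq + 1;
        return 0;
    }

    uint64_t gap = 0;
    if (seq > s->expected_seq)
        gap = seq - s->expected_seq;     /* these packets never arrived */
    s->lost += gap;
    s->expected_seq = seq + 1;
    return gap;
}

int main(void) {
    /* Simulated sequence with one gap: 100, 101, 103 (102 is missing). */
    struct loss_stats s = {0};
    uint64_t seqs[] = {100, 101, 103};
    for (size_t i = 0; i < sizeof(seqs) / sizeof(seqs[0]); i++)
        loss_stats_update(&s, &seqs[i], sizeof(seqs[i]));
    printf("received=%llu lost=%llu\n",
           (unsigned long long)s.received, (unsigned long long)s.lost);
    return 0;
}
```

In a real deployment this update routine would be called from the receive loop, with the counters exported to monitoring so that zero loss can be confirmed over a full trading day.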
In the latency evaluation, standard ping-pong measurements were conducted between two SR-IOV-enabled VMs connected via a single low-latency switch. The results demonstrated a round-trip latency of under 2 microseconds, achieving performance parity with bare-metal servers of the same hardware configuration.
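A ping-pong measurement of this kind can be approximated with a small TCP client that sends a fixed-size message, waits for the echo, and averages the round-trip time over many iterations. The sketch below illustrates the pattern; the peer address, port, message size, and iteration count are hypothetical placeholders, and an echoing peer is assumed on the other VM.

```c
/* Minimal TCP ping-pong client sketch: send a small message, wait for the
 * echo, and average the round-trip time with CLOCK_MONOTONIC. The server is
 * assumed to echo each message back; address, port, message size, and
 * iteration count are hypothetical placeholders. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    const char *server_ip = "192.168.20.2"; /* hypothetical peer VM */
    const int   port  = 40001;
    const int   iters = 100000;
    char msg[64] = {0};                      /* small latency-probe message */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int one = 1;                             /* avoid Nagle batching */
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, server_ip, &addr.sin_addr);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect"); return 1;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        if (send(fd, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg)) {
            perror("send"); return 1;
        }
        ssize_t got = 0;                     /* wait for the full echo */
        while (got < (ssize_t)sizeof(msg)) {
            ssize_t n = recv(fd, msg + got, sizeof(msg) - got, 0);
            if (n <= 0) { perror("recv"); return 1; }
            got += n;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double total_us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                      (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("average round-trip: %.3f us over %d iterations\n",
           total_us / iters, iters);
    close(fd);
    return 0;
}
```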
During the throughput evaluation, both TCP and UDP tests were conducted, with results fully satisfying all business requirements. In the iperf3 TCP traffic test, the system exhibited zero packet loss under a sustained load of 2 Gb/s. Given that client strategy machines prioritize latency performance over bandwidth capacity, a stable 2 Gb/s throughput with zero packet loss is more than sufficient for VM operations.
Through these comprehensive evaluations and six months of production trials, AECP has fully demonstrated its capability to support low-latency market data reception and trading systems. Notably, in the trading test, network latency in the virtualized environment was on par with that of bare-metal servers, effectively meeting all production requirements.
Unlike traditional back-office IT operations, O&M services for hosted quantitative trading are directly client-facing and mission-critical. Running low-latency workloads on VMs can significantly improve user experience by enabling T+0 post-market host deployment, reducing operational windows to minutes, and ensuring high-quality, consistent delivery through standardized VM templates.
Currently, hosted data centers operate almost entirely on physical hardware, where equipment deployment, decommissioning, and migration rely entirely on manual on-site intervention. After validating the performance of VM-based low-latency systems, securities institutions can implement a phased cloud transformation.
By utilizing low-latency VMs for client strategy machine deployment, the platform reduces both cost and power consumption by more than 50% compared with bare-metal servers.
Both hosted data centers of the securities institution encountered resource constraints in the past. Embracing the virtualized environment not only increases the scalability of the infrastructure, but also enables O&M personnel to rapidly deploy, modify, or decommission virtual client strategy machines and provide low-cost standby instances, allowing the institution to respond to external uncertainties more effectively.
For more information on AVE and related features:
Arcfra Virtualization Engine vs. VMware vSphere: Higher Availability with Optimized Performance
Arcfra simplifies enterprise cloud infrastructure with a full-stack, software-defined platform built for the AI era. We deliver computing, storage, networking, security, Kubernetes, and more — all in one streamlined solution. Supporting VMs, containers, and AI workloads, Arcfra offers future-proof infrastructure trusted by enterprises across e-commerce, finance, and manufacturing. Arcfra is recognized by Gartner as a Representative Vendor in full-stack hyperconverged infrastructure. Learn more at www.arcfra.com.