AI training rack scales

Technology: GPGPU-HPC
Partner: One Stop Systems

GPultima CI & OSS Ampere8 is a rack-scale AI training solution designed for organisations that need high-performance infrastructure for advanced model development. It combines robust server nodes with GPU expansion chassis to support large-scale deep learning training in datacenter environments. The composable architecture allows GPUs, storage, and network resources to be dynamically allocated across workloads for better utilisation and flexibility.

This makes the system well suited for autonomous systems, defense and surveillance analytics, scientific research, and other compute-intensive applications. By centralising AI training, the platform complements the rugged edge and mobile platforms on which trained models later run. The result is a streamlined workflow from data collection and model development to validation and field deployment.

Organisations looking to train advanced AI models can benefit from GPultima CI & OSS Ampere8, a combined rack-scale solution purpose-built for intensive AI training workloads. The platform integrates scalable server infrastructure with high-performance GPU expansion, enabling centralised deep learning training on massive datasets.

Its composable design allows GPU, NVMe storage, and network resources to be allocated dynamically across applications, helping teams adapt infrastructure to changing project demands. This makes the solution suitable for AI development pipelines that span datacenter training and deployment to rugged edge or mobile systems.

Range features

A high-level overview of what this range offers

  • High-density GPU acceleration: Supports up to 48 NVIDIA GPU accelerators per rack, with 8 GPUs per OSS Ampere8 chassis for large-scale parallel training.
  • Composable architecture: Dynamically shares GPUs, NVMe storage, and NIC resources between servers to improve utilisation and workload flexibility.
  • PCIe Gen4 expansion fabric: Uses a high-bandwidth, low-latency PCIe Gen4 interconnect with up to 512 Gbps aggregate bandwidth for fast data movement.
  • Integrated rack-scale solution: Combines servers, GPU expansion chassis, networking, and storage in a pre-configured deployment model.
  • Edge-to-core AI workflow: Acts as a central training hub for models that are later deployed to rugged edge and mobile platforms.
  • Power and cooling readiness: Engineered for dense compute environments with support for high rack power draw and continuous operation.
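
The headline figures in this list imply some simple sizing arithmetic. The sketch below is illustrative only. It takes the 48-GPU, 8-GPU-per-chassis, and 512 Gbps figures from the list, and it assumes (this is not stated in the source) that the 512 Gbps applies per chassis and is shared evenly when all eight GPUs exchange data with the host at once. The function names are hypothetical.

```python
# Illustrative sizing arithmetic based on the figures quoted in the feature list.
GPUS_PER_RACK = 48          # "up to 48 NVIDIA GPU accelerators per rack"
GPUS_PER_CHASSIS = 8        # "8 GPUs per OSS Ampere8 chassis"
FABRIC_GBPS = 512           # "up to 512 Gbps aggregate bandwidth"


def chassis_needed(gpus: int, per_chassis: int = GPUS_PER_CHASSIS) -> int:
    """Number of expansion chassis needed to host `gpus` GPUs (rounded up)."""
    return -(-gpus // per_chassis)  # ceiling division


def even_share_gbps(total_gbps: float, gpus: int) -> float:
    """Assumption: bandwidth is split evenly when every GPU transfers at once."""
    return total_gbps / gpus


if __name__ == "__main__":
    print(chassis_needed(GPUS_PER_RACK))                    # 6
    print(even_share_gbps(FABRIC_GBPS, GPUS_PER_CHASSIS))   # 64.0
```

Under these assumptions a fully populated rack uses six chassis. Real per-GPU throughput depends on the workload and on how much traffic stays between GPUs rather than crossing to the host.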

Downloads

for AI training rack scales

GPultima CI Rack-Scale System – Datasheet (PDF)

What’s in this range?

All the variants in the range and a comparison of what they offer

| Specification | GPultima CI (Rack-Scale Cluster) | OSS Ampere8 (GPU Expansion Chassis) |
| --- | --- | --- |
| Form Factor | 19-inch rack (42U), single or multi-rack configuration | 4U chassis, rack-mountable |
| Compute Nodes (CPU) | Up to 32 × dual Intel Xeon Scalable nodes per rack | Uses external host server via dual x16 PCIe links, no CPU on board |
| GPU Capacity | Up to 48 NVIDIA data center GPUs per rack, scalable to 128+ with multi-rack | Up to 8 NVIDIA GPUs in one chassis |
| Storage | Up to 96 NVMe SSD drives per rack | No internal drives; storage provided by host server or external system |
| Networking | Up to 32 × 100 Gb InfiniBand/Ethernet NICs per rack | Supports up to 2 high-bandwidth NIC cards for optional data I/O |
| Interconnect Fabric | 48-port PCIe Gen4 switch fabric for composable resource sharing | Dual PCIe 4.0 x16 host interfaces with 512 Gbps total bandwidth |
| GPU-to-GPU Links | NVLink/NVSwitch supported depending on GPU integration | NVIDIA NVLink 3rd Gen between GPUs with up to 600 GB/s communication |
| Power Supply | Rack power distribution supports up to ~52 kW | 4000 W redundant power supply built in |
| Management Software | Composable infrastructure management with API and GUI control | Managed as part of the GPultima cluster or via the host system |

FAQs

for AI training rack scales

A rack-scale AI training system is a high-performance computing platform that brings together servers, GPUs, storage, and networking across an entire rack to operate as a unified training cluster. It is used when AI workloads require more compute, memory, and throughput than a single server can deliver. This architecture helps reduce training time, supports larger datasets, and improves experimentation capacity for demanding deep learning projects.

GPultima CI uses a composable architecture in which GPUs, NVMe storage, and network resources are pooled instead of being permanently tied to one server. Through a high-speed PCIe fabric and management software, resources can be assigned dynamically to the servers that need them for a given workload. This improves utilisation, simplifies reconfiguration, and allows hardware to be matched to changing training requirements without physical rewiring.
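
The pooling idea can be illustrated with a small bookkeeping model. This is a minimal sketch, not the GPultima management API: `ResourcePool`, `attach`, and `release` are hypothetical names, and the model only tracks counts rather than real devices on the PCIe fabric.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical model of a pool of identical devices (e.g. GPUs)."""
    kind: str
    total: int
    assigned: dict = field(default_factory=dict)  # host name -> device count

    @property
    def free(self) -> int:
        """Devices not currently assigned to any host."""
        return self.total - sum(self.assigned.values())

    def attach(self, host: str, count: int) -> None:
        """Assign `count` devices to `host`, failing if the pool is exhausted."""
        if count <= 0:
            raise ValueError("count must be positive")
        if count > self.free:
            raise RuntimeError(f"only {self.free} {self.kind}(s) free")
        self.assigned[host] = self.assigned.get(host, 0) + count

    def release(self, host: str) -> int:
        """Return all of a host's devices to the pool; returns how many were freed."""
        return self.assigned.pop(host, 0)


if __name__ == "__main__":
    gpus = ResourcePool(kind="GPU", total=48)
    gpus.attach("node-a", 16)   # large training job
    gpus.attach("node-b", 4)    # smaller experiment
    print(gpus.free)            # 28
    gpus.release("node-a")      # job finished, GPUs return to the pool
    gpus.attach("node-b", 12)   # reassign without touching hardware
    print(gpus.assigned)        # {'node-b': 16}
```

The point of the sketch is that reassignment is a software operation on a shared pool, which is what allows utilisation to follow demand instead of the physical layout.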

The platform is designed around datacenter-class components. GPultima CI supports up to 48 NVIDIA GPUs in a single rack, with server nodes based on dual Intel Xeon Scalable processors. OSS Ampere8 is designed for NVIDIA Ampere-generation PCIe GPUs such as the A100 and connects to compatible host servers with PCIe Gen4 capability.

OSS Ampere8 connects to a host server through dual x16 PCIe Gen4 links using the appropriate host interface adapter. Once connected, the host server accesses the GPUs in the chassis as if they were locally installed. The two links together provide up to 512 Gbps of host bandwidth, which helps avoid I/O bottlenecks during GPU-intensive AI workloads.
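
For context, the bandwidth of a dual x16 PCIe Gen4 connection can be worked out from the standard PCIe 4.0 signalling rate (16 GT/s per lane) and its 128b/130b line encoding. The sketch below assumes that the 512 Gbps figure quoted in the specification table is the raw per-direction rate of the two links. That reading matches the arithmetic but is not confirmed by the source. Packet and protocol overhead, which the sketch ignores, lowers real throughput further.

```python
GT_PER_LANE = 16.0          # PCIe 4.0 signalling rate per lane, in GT/s
ENCODING = 128 / 130        # 128b/130b line encoding used by PCIe 3.0 and later


def link_gbps(lanes: int, links: int = 1, encoded: bool = True) -> float:
    """Per-direction bandwidth in Gbit/s for `links` PCIe 4.0 links of `lanes` lanes."""
    raw = GT_PER_LANE * lanes * links
    return raw * ENCODING if encoded else raw


if __name__ == "__main__":
    raw = link_gbps(16, links=2, encoded=False)
    usable = link_gbps(16, links=2)
    print(f"raw: {raw:.0f} Gbit/s")                 # raw: 512 Gbit/s
    print(f"after encoding: {usable:.1f} Gbit/s")   # after encoding: 504.1 Gbit/s
    print(f"in bytes: {usable / 8:.1f} GB/s")       # in bytes: 63.0 GB/s
```

On this reading, each x16 link contributes about 31.5 GB/s per direction before protocol overhead.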

The system provides a centralised training environment for AI models that will later run on rugged edge or mobile platforms. Data gathered in the field can be used to train or refine models in the datacenter, where far greater compute resources are available. Updated models can then be deployed back to the edge, creating a practical workflow for continuous AI improvement.

The system can be scaled and upgraded over time. The architecture is modular and designed for growth: organisations can begin with a smaller configuration and later add more GPU expansion chassis, more server nodes, additional storage, or even multiple interconnected racks. GPU and storage upgrades are also possible, provided that power, cooling, and compatibility requirements are met.

These systems typically use cluster management and composable infrastructure software to allocate hardware resources and monitor the environment. On the AI side, they support standard operating systems and common frameworks such as PyTorch and TensorFlow, along with NVIDIA CUDA and driver stacks. Distributed training tools and job schedulers can also be used to orchestrate workloads across multiple servers and GPUs.

A fully configured GPultima CI rack requires substantial facility support for both power delivery and cooling. Rack power distribution is specified for up to roughly 52 kW, so appropriate PDUs, electrical feeds, and airflow planning are essential. These systems are typically deployed in datacenter environments or specialised high-density computing spaces that can support continuous operation under heavy load.
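
A rough power and heat budget can be sketched from the figures in the specification table (about 52 kW of rack power distribution and a 4000 W power supply per OSS Ampere8 chassis). This is a planning illustration under stated assumptions, not vendor guidance: it treats the PSU rating as a worst-case draw per chassis, assumes six chassis per rack (48 GPUs at 8 per chassis), and uses the standard conversion of 1 W to about 3.412 BTU/h. The function names are hypothetical.

```python
RACK_BUDGET_W = 52_000       # "~52 kW" rack power distribution (specification table)
CHASSIS_PSU_W = 4_000        # OSS Ampere8 power supply rating (specification table)
BTU_PER_HOUR_PER_WATT = 3.412142


def remaining_budget_w(chassis: int, budget_w: int = RACK_BUDGET_W) -> int:
    """Power left for servers, switches, and storage if each chassis drew its
    full PSU rating. Assumption: the PSU rating is a worst-case upper bound."""
    used = chassis * CHASSIS_PSU_W
    if used > budget_w:
        raise ValueError("chassis alone exceed the rack budget")
    return budget_w - used


def heat_load_btu_per_hour(power_w: float) -> float:
    """Nearly all electrical power ends up as heat the facility must remove."""
    return power_w * BTU_PER_HOUR_PER_WATT


if __name__ == "__main__":
    print(remaining_budget_w(6))                          # 28000
    print(round(heat_load_btu_per_hour(RACK_BUDGET_W)))   # 177431
```

Actual draw depends on the GPUs and servers installed, so measured figures should replace these upper bounds when planning electrical feeds and cooling.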