NVIDIA A100/H100 GPU Servers for AI, ML, and HPC Workloads

Deploy high-performance GPU servers with NVIDIA A100 (80GB) or H100 GPUs for AI/ML model training, inference, and scientific computing. Scale from single GPUs to multi-node GPU clusters on Swiss infrastructure.

Key Features

What sets Xelon Cloud apart

Why choose Xelon Cloud?

Overview

Xelon GPU Servers provide on-demand access to NVIDIA A100 and H100 GPUs, designed for deep learning model training, large language models (LLMs), computer vision, scientific simulations, and rendering workloads.

Key Highlights:

  • NVIDIA A100 80GB PCIe/SXM4 GPUs with NVLink for multi-GPU training

  • NVIDIA H100 80GB SXM5 GPUs (up to 3x faster than A100 for LLM training)

  • Flexible configurations: 1-8 GPUs per server, AMD EPYC CPUs, up to 2 TB RAM

  • Pre-installed ML frameworks: PyTorch 2.2, TensorFlow 2.15, JAX, CUDA 12.3 (verified in the sketch after this list)

  • High-speed NVMe storage with up to 100 TB capacity per server

  • 100 Gbps InfiniBand networking for distributed training (multi-node GPU clusters)
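
The pre-installed stack can be sanity-checked from Python as soon as a server is provisioned. A minimal sketch, assuming the PyTorch image listed above:

    # Verify GPUs, CUDA support, and peer-to-peer (NVLink) access.
    import torch

    print("CUDA available:", torch.cuda.is_available())
    print("Built against CUDA:", torch.version.cuda)   # toolkit PyTorch was compiled with

    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.0f} GB")

    if torch.cuda.device_count() > 1:
        # True on NVLink-connected SXM boards (and P2P-capable PCIe setups)
        print("P2P 0<->1:", torch.cuda.can_device_access_peer(0, 1))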

Use Cases:

  • LLM Training: Train GPT, LLaMA, Mistral models with multi-GPU parallelism (DeepSpeed, FSDP; see the FSDP sketch after this list)

  • Computer Vision: Object detection, image segmentation, video analysis (YOLOv8, Stable Diffusion)

  • Scientific Computing: Molecular dynamics, CFD simulations, genomics (GROMACS, OpenFOAM)

  • Rendering: 3D rendering, video transcoding, ray tracing (Blender, FFmpeg with NVENC)
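
For the FSDP option named above, the moving parts are small: one process per GPU, an NCCL process group, and a wrapped model. A minimal sketch with a placeholder model and synthetic data, launched via torchrun:

    # Minimal FSDP sketch: placeholder model, synthetic batches.
    # Launch with: torchrun --nproc_per_node=4 train_fsdp.py
    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    dist.init_process_group("nccl")              # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Sequential(                 # stand-in for a real LLM
        torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
    ).cuda()
    model = FSDP(model)                          # shards params, grads, optimizer state

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for step in range(10):
        x = torch.randn(8, 4096, device="cuda")  # synthetic batch
        loss = model(x).pow(2).mean()            # dummy objective
        loss.backward()
        opt.step()
        opt.zero_grad()

    dist.destroy_process_group()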

GPU pricing starts at CHF 2.50 per hour for an A100 80GB (60-70% cheaper than an AWS p4d.24xlarge on a per-GPU basis).

Automated Backups Made Simple

Daily snapshots are included with Xelon Cloud instances by default.

Need longer retention? Flexible options offer 7-, 30-, or 365-day retention for compliance or business continuity.

Technical Specifications

GPU Options

  • NVIDIA A100 80GB PCIe: 6912 CUDA cores, 432 Tensor cores, 80 GB HBM2e memory, 2 TB/s memory bandwidth, 312 TFLOPS FP16, 624 TFLOPS with sparsity

  • NVIDIA A100 80GB SXM4: Same specs as PCIe, with NVLink 3.0 (600 GB/s GPU-to-GPU bandwidth for multi-GPU training)

  • NVIDIA H100 80GB SXM5: 16896 CUDA cores, 528 Tensor cores, 80 GB HBM3 memory, 3.35 TB/s memory bandwidth, 989 TFLOPS FP16, 1979 TFLOPS with sparsity, 3958 TFLOPS FP8 with sparsity (reduced-precision usage is sketched below)
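
The FP16/FP8 throughput figures above only apply when kernels actually run in reduced precision; in PyTorch the usual switch is autocast. A minimal BF16 sketch (FP8 on H100 typically requires an extra library such as NVIDIA Transformer Engine, which is not listed above and is only an assumption here):

    # Run a matmul in BF16 so it lands on the Tensor cores quoted above.
    import torch

    a = torch.randn(8192, 8192, device="cuda")
    b = torch.randn(8192, 8192, device="cuda")

    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        c = a @ b            # executed as a BF16 Tensor-core GEMM
    print(c.dtype)           # torch.bfloat16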

Server Configurations

  • Single GPU: 1x A100/H100, 16-32 vCPU (AMD EPYC 7543), 128-256 GB RAM, 2 TB NVMe SSD

  • Dual GPU: 2x A100/H100, 32-64 vCPU, 256-512 GB RAM, 4 TB NVMe SSD

  • Quad GPU: 4x A100/H100 SXM4/SXM5 with NVLink, 64-128 vCPU, 512 GB - 1 TB RAM, 8 TB NVMe SSD

  • Octa GPU: 8x A100/H100 SXM4/SXM5 with NVLink, 128 vCPU, 2 TB RAM, 16 TB NVMe SSD

CPU & Memory

  • CPU: AMD EPYC 7543 (32 cores), 7713 (64 cores), 9554 (64 cores, Zen 4)

  • RAM: 128 GB, 256 GB, 512 GB, 1 TB, or 2 TB ECC memory (DDR4-3200 on EPYC 7003-series hosts; EPYC 9004-series hosts use DDR5)

  • PCIe: PCIe Gen4 x16 per GPU for maximum bandwidth

Storage & Networking

  • Boot/Scratch Storage: 2-16 TB NVMe SSD RAID 0/1 for datasets and checkpoints

  • Shared Storage: Ceph RBD or NFS for multi-node training (100 Gbps RDMA networking)

  • Network: 25 Gbps Ethernet (standard), 100 Gbps InfiniBand HDR for multi-node GPU clusters
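
For multi-node jobs, NCCL normally detects the InfiniBand fabric on its own; when it needs steering, a few environment variables do the job. A sketch with commonly used NCCL settings; the adapter and interface names are hypothetical and depend on the actual hosts:

    # Point NCCL at the InfiniBand fabric before creating the process group.
    # "mlx5_0" and "eth0" are hypothetical names; check ibstat / ip link on the host.
    import os
    import torch.distributed as dist

    os.environ.setdefault("NCCL_IB_HCA", "mlx5_0")       # InfiniBand adapter to use
    os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth0")  # interface for bootstrap traffic
    os.environ.setdefault("NCCL_DEBUG", "INFO")          # logs which transport was chosen

    dist.init_process_group("nccl")  # look for "NET/IB" in the NCCL INFO output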

Software & Frameworks

  • OS: Ubuntu 22.04 LTS with NVIDIA GPU drivers 535.x, CUDA 12.3, cuDNN 8.9

  • Deep Learning Frameworks: PyTorch 2.2, TensorFlow 2.15, JAX 0.4.23, MXNet, PaddlePaddle

  • Distributed Training: DeepSpeed, FSDP (Fully Sharded Data Parallelism), Horovod, PyTorch DDP

  • Model Serving: vLLM, TensorRT-LLM, Triton Inference Server, TorchServe (vLLM sketched after this list)

  • Jupyter: JupyterLab with GPU support, pre-installed kernels for Python 3.10/3.11
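
Of the serving stacks listed, vLLM has the smallest Python surface, so it makes a compact example. A minimal offline-inference sketch; the model name is an illustrative Hugging Face identifier, not something bundled with the image:

    # Minimal vLLM offline inference; the model is fetched on first run.
    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2",  # example model id
              tensor_parallel_size=1)                      # raise for multi-GPU serving
    params = SamplingParams(temperature=0.7, max_tokens=128)

    outputs = llm.generate(["Explain NVLink in one paragraph."], params)
    print(outputs[0].outputs[0].text)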

Pricing Models

  • On-Demand: Per-hour billing (CHF 2.50/hour for A100 80GB, CHF 4.00/hour for H100 80GB)

  • Reserved Instances: 1-year or 3-year commitments with 30-50% discounts

  • Spot Instances: Up to 70% discount for interruptible workloads (preemptible GPUs)
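
To make the trade-offs concrete, here is the arithmetic for a hypothetical 100-hour run on four A100s, using only the listed rates and the upper ends of the quoted discount ranges:

    # Illustrative cost comparison from the listed rates (CHF per GPU-hour).
    A100_RATE = 2.50
    gpus, hours = 4, 100                   # hypothetical fine-tuning run

    on_demand = gpus * hours * A100_RATE   # 4 * 100 * 2.50 = CHF 1000
    reserved  = on_demand * (1 - 0.50)     # up to 50% off  -> CHF  500
    spot      = on_demand * (1 - 0.70)     # up to 70% off  -> CHF  300

    print(f"on-demand CHF {on_demand:.0f}, reserved CHF {reserved:.0f}, spot CHF {spot:.0f}")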

Optimized for Kubernetes & Cloud-Native Workloads

Our compute plans are designed to integrate seamlessly with CloudDeck and Xelon Kubernetes:


  • Native support for K8s node pools (see the pod sketch after this list)

  • Instant scaling

  • Multi-zone deployments

  • S3-compatible object storage for stateful workloads

  • SCION-secured networking for critical clusters
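
As a sketch of how a GPU node pool is consumed from the cluster side, the snippet below submits a pod that requests one GPU through the standard NVIDIA device-plugin resource name. Everything here is generic Kubernetes, assuming GPUs are exposed as nvidia.com/gpu; none of it is Xelon-specific API:

    # Submit a pod that claims one GPU via the standard device-plugin resource.
    # Assumes GPUs are exposed as "nvidia.com/gpu"; the image tag is an example.
    from kubernetes import client, config

    config.load_kube_config()   # or load_incluster_config() from inside the cluster

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.3.1-base-ubuntu22.04",
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"},   # schedules onto a GPU node
                ),
            )],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)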


Book an Appointment

Choose a time that works for you and connect with one of our cloud specialists for a personalised session — via Microsoft Teams or phone.

Request Meeting with our Solution Architect

Request Meeting with our Partner Manager

Get in touch

We’re here to help you with cloud strategy, technical questions, pricing, compliance, and tailored solutions for your organisation.

Take your cloud to the next level

Experience high-performance Swiss cloud infrastructure built for teams who want reliability, sovereignty, and simplicity.
