NVIDIA A100/H100 GPU Servers for AI, ML, and HPC Workloads
Deploy high-performance GPU servers with NVIDIA A100 (80GB) or H100 GPUs for AI/ML model training, inference, and scientific computing. Scale from single GPUs to multi-node GPU clusters on Swiss infrastructure.
Key Features
What sets Xelon Cloud apart
Why choose Xelon Cloud?
Overview
Xelon GPU Servers provide on-demand access to NVIDIA A100 and H100 GPUs, designed for deep learning model training, large language models (LLMs), computer vision, scientific simulations, and rendering workloads.
Key Highlights:
NVIDIA A100 80GB PCIe/SXM4 GPUs with NVLink for multi-GPU training
NVIDIA H100 80GB SXM5 GPUs (up to 3x faster than A100 for LLM training)
Flexible configurations: 1-8 GPUs per server, AMD EPYC CPUs, up to 2 TB RAM
Pre-installed ML frameworks: PyTorch 2.2, TensorFlow 2.15, JAX, CUDA 12.3
High-speed NVMe storage with up to 100 TB capacity per server
100 Gbps InfiniBand networking for distributed training (multi-node GPU clusters)
Use Cases:
LLM Training: Train GPT, LLaMA, Mistral models with multi-GPU parallelism (DeepSpeed, FSDP)
Computer Vision: Object detection, image segmentation, video analysis (YOLOv8, Stable Diffusion)
Scientific Computing: Molecular dynamics, CFD simulations, genomics (GROMACS, OpenFOAM)
Rendering: 3D rendering, video transcoding, ray tracing (Blender, FFmpeg with NVENC)
GPU pricing starts at CHF 2.50/hour for A100 80GB (60-70% cheaper than AWS p4d.24xlarge).
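For the LLM-training use case above, the first sizing question is whether the model's training state fits in GPU memory. As a rough aid (a common community heuristic, not an Xelon-specific figure), mixed-precision training with Adam needs on the order of 16 bytes per parameter before activations:

```python
import math

# Rough GPU-memory sizing for mixed-precision LLM training with Adam.
# ~16 bytes/param (fp16 weights + fp16 grads + fp32 master weights +
# two fp32 Adam moments) is a widely used rule of thumb, not a Xelon
# figure; activations and buffers add more on top.
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16

def min_gpus(params_billions: float, gpu_mem_gb: int = 80) -> int:
    """Minimum number of 80 GB GPUs needed just to hold training state."""
    total_gb = params_billions * 1e9 * BYTES_PER_PARAM / 1e9
    return math.ceil(total_gb / gpu_mem_gb)

print(min_gpus(7))   # 7B model: 112 GB of state -> 2 GPUs
print(min_gpus(70))  # 70B model: 1120 GB of state -> 14 GPUs
```

In practice sharded approaches such as FSDP or DeepSpeed ZeRO split this state across GPUs, which is why multi-GPU NVLink configurations matter for larger models.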
Automated Backups Made Simple
Daily snapshots are included with Xelon Cloud instances by default.
Need longer retention? Choose flexible options with 7, 30, or 365-day retention for compliance or business continuity.
Technical Specifications
GPU Options
NVIDIA A100 80GB PCIe: 6912 CUDA cores, 432 Tensor cores, 80 GB HBM2e memory, 2 TB/s memory bandwidth, 312 TFLOPS FP16, 624 TFLOPS with sparsity
NVIDIA A100 80GB SXM4: Same specs as PCIe, with NVLink 3.0 (600 GB/s GPU-to-GPU bandwidth for multi-GPU training)
NVIDIA H100 80GB SXM5: 16896 CUDA cores, 528 Tensor cores, 80 GB HBM3 memory, 3.35 TB/s memory bandwidth, 989 TFLOPS FP16, 1979 TFLOPS with sparsity, 3958 TFLOPS FP8 with sparsity
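The "up to 3x faster" claim can be sanity-checked directly from the spec figures above: the ratio of dense FP16 throughput sets the peak speedup, and the FP8 path and extra memory bandwidth push real LLM workloads further.

```python
# Peak-throughput ratios computed from the spec figures listed above.
a100_fp16_tflops = 312
h100_fp16_tflops = 989
h100_fp8_tflops = 1979                   # dense FP8; doubled with sparsity
a100_bw_tbps, h100_bw_tbps = 2.0, 3.35   # memory bandwidth, TB/s

print(round(h100_fp16_tflops / a100_fp16_tflops, 2))  # ~3.17x peak FP16
print(round(h100_fp8_tflops / a100_fp16_tflops, 2))   # ~6.34x if FP8 is usable
print(h100_bw_tbps / a100_bw_tbps)                    # ~1.7x memory bandwidth
```

Memory-bandwidth-bound workloads (such as LLM inference at small batch sizes) track the ~1.7x bandwidth ratio more closely than the compute ratio.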
Server Configurations
Single GPU: 1x A100/H100, 16-32 vCPU (AMD EPYC 7543), 128-256 GB RAM, 2 TB NVMe SSD
Dual GPU: 2x A100/H100, 32-64 vCPU, 256-512 GB RAM, 4 TB NVMe SSD
Quad GPU: 4x A100/H100 SXM4/SXM5 with NVLink, 64-128 vCPU, 512 GB - 1 TB RAM, 8 TB NVMe SSD
Octa GPU: 8x A100/H100 SXM4/SXM5 with NVLink, 128 vCPU, 2 TB RAM, 16 TB NVMe SSD
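The NVMe capacities above can be sanity-checked against checkpoint sizes. A full resumable training checkpoint (fp16 weights, fp32 master weights, and two fp32 Adam moments) is roughly 14 bytes per parameter; this is a common heuristic, not an Xelon figure.

```python
# Rough checkpoint sizing against the NVMe capacities listed above.
# ~14 bytes/param for a resumable checkpoint (2 fp16 weights + 4 fp32
# master weights + 4 + 4 for the two fp32 Adam moments) is a common
# rule of thumb; inference-only fp16 checkpoints need just 2 bytes/param.
def checkpoint_gb(params_billions: float, bytes_per_param: int = 14) -> float:
    return params_billions * bytes_per_param  # 1e9 params * bytes / 1e9 = GB

ckpt_70b = checkpoint_gb(70)        # ~980 GB per full checkpoint
fits = int(16_000 // ckpt_70b)      # retained copies on the 16 TB octa node
print(f"70B checkpoint ~{ckpt_70b:.0f} GB; ~{fits} fit on 16 TB NVMe")
```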
CPU & Memory
CPU: AMD EPYC 7543 (32 cores), 7713 (64 cores), 9554 (64 cores)
RAM: 128 GB, 256 GB, 512 GB, 1 TB, 2 TB DDR4-3200 ECC memory
PCIe: PCIe Gen4 x16 per GPU for maximum bandwidth
Storage & Networking
Boot/Scratch Storage: 2-16 TB NVMe SSD RAID 0/1 for datasets and checkpoints
Shared Storage: Ceph RBD or NFS for multi-node training (100 Gbps RDMA networking)
Network: 25 Gbps Ethernet (standard), 100 Gbps InfiniBand HDR for multi-node GPU clusters
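For planning data movement between the storage tiers above, line rate sets a lower bound on transfer time: at 100 Gbps a terabyte-scale dataset moves in minutes, while 25 Gbps takes four times as long.

```python
# Lower-bound transfer times at the listed line rates (protocol overhead
# and storage throughput make real transfers slower than this).
def transfer_seconds(size_tb: float, link_gbps: float) -> float:
    bits = size_tb * 1e12 * 8
    return bits / (link_gbps * 1e9)

print(transfer_seconds(1, 100))  # 1 TB over 100 Gbps InfiniBand: 80 s
print(transfer_seconds(1, 25))   # 1 TB over 25 Gbps Ethernet: 320 s
```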
Software & Frameworks
OS: Ubuntu 22.04 LTS with NVIDIA GPU drivers 535.x, CUDA 12.3, cuDNN 8.9
Deep Learning Frameworks: PyTorch 2.2, TensorFlow 2.15, JAX 0.4.23, MXNet, PaddlePaddle
Distributed Training: DeepSpeed, FSDP (Fully Sharded Data Parallel), Horovod, PyTorch DDP
Model Serving: vLLM, TensorRT-LLM, Triton Inference Server, TorchServe
Jupyter: JupyterLab with GPU support, pre-installed kernels for Python 3.10/3.11
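When scaling out with the distributed-training stacks above (DDP, FSDP, DeepSpeed), a widely used heuristic for keeping convergence behavior stable is the linear learning-rate scaling rule (Goyal et al., "Accurate, Large Minibatch SGD"); it is a general recipe, not anything Xelon-specific, and still needs warmup and tuning in practice.

```python
# Linear learning-rate scaling for data-parallel training: scale the
# learning rate in proportion to the global batch size relative to the
# batch size the recipe was originally tuned at. A sketch of a common
# heuristic, not a guarantee of convergence.
def scaled_lr(base_lr: float, base_batch: int,
              per_gpu_batch: int, num_gpus: int) -> float:
    global_batch = per_gpu_batch * num_gpus
    return base_lr * global_batch / base_batch

# Example: a recipe tuned at batch 256 with lr 3e-4, run on 8 GPUs x 64
print(scaled_lr(3e-4, 256, per_gpu_batch=64, num_gpus=8))  # 2x batch -> 2x lr
```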
Pricing Models
On-Demand: Per-hour billing (CHF 2.50/hour for A100 80GB, CHF 4.00/hour for H100 80GB)
Reserved Instances: 1-year or 3-year commitments with 30-50% discounts
Spot Instances: Up to 70% discount for interruptible workloads (preemptible GPUs)
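The three billing models above can be compared for a steady monthly workload. The discounts used here come from the ranges listed: 40% as the midpoint of the reserved 30-50% range, and the 70% spot figure as a best case.

```python
# Monthly cost of one A100 80GB under each billing model, using the
# on-demand rate listed above. Reserved discount taken at 40% (midpoint
# of the stated 30-50% range); spot at the stated maximum of 70%.
HOURS_PER_MONTH = 730
on_demand = 2.50 * HOURS_PER_MONTH   # CHF 1825.00
reserved = on_demand * (1 - 0.40)    # CHF 1095.00 at the midpoint discount
spot = on_demand * (1 - 0.70)        # CHF 547.50 best case, preemptible

print(f"on-demand CHF {on_demand:.2f}, "
      f"reserved CHF {reserved:.2f}, spot CHF {spot:.2f}")
```

Spot pricing only pays off if checkpointing is frequent enough that preemptions cost less than the discount saves.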
Optimized for Kubernetes & Cloud-Native Workloads
Our compute plans are designed to integrate seamlessly with CloudDeck and Xelon Kubernetes:
Native support for K8s node pools
Instant scaling
Multi-zone deployments
S3-compatible object storage for stateful workloads
SCION-secured networking for critical clusters
Book an Appointment
Choose a time that works for you and connect with one of our cloud specialists for a personalised session — via Microsoft Teams or phone.
Request Meeting with our Solution Architect
Request Meeting with our Partner Manager
Get in touch
We’re here to help you with cloud strategy, technical questions, pricing, compliance, and tailored solutions for your organisation.
Take your cloud to the next level
Experience high-performance Swiss cloud infrastructure built for teams who want reliability, sovereignty, and simplicity.