Soperator
AI Cloud
Nebius Slurm ML Training and Inference Demo
Distributed LLM fine-tuning and vLLM serving on a Soperator Slurm-on-Kubernetes cluster
About this project
An end-to-end demo that fine-tunes Qwen3-8B and Qwen3-32B across 16 NVIDIA H100 GPUs on a Nebius Cloud cluster, then serves the models with vLLM. The cluster is deployed with Terraform using the Soperator (Slurm-on-Kubernetes) template from the Nebius Solutions Library, and its GPU nodes communicate over InfiniBand with GPUDirect RDMA.
Technologies
finetuning
infra