Mockup for review. Tech-stack demonstration. Not affiliated with Nebius and not the live Builders Network.
REPO
Official
advanced · 35 min

ML Cookbook: pre-training DeepSeek-V3 with MXFP8 on a B200 cluster

A Nebius ml-cookbook recipe with Slurm job scripts for multi-node pre-training of DeepSeek-V3 (16B and 671B) on a 256-GPU NVIDIA B200 cluster, showing up to 41% faster throughput with MXFP8 mixed precision and DeepEP.
aicloud
soperator

The full write-up lives on the original source — use the link above to read it.
