About this project
ClawBench benchmarks a coding agent across open-weight models on Nebius Token Factory, scoring each output on quality, cost, latency, and structured-output reliability. It then exports a routing.md file that the agent reads natively to route each task to the best model.
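As a rough illustration of the score-then-route idea, the sketch below combines the four metrics into one weighted score per model and task, picks the top model for each task, and renders the result as a Markdown table. Everything here is an assumption made for the example, not ClawBench's actual code: the `Result` fields, the `WEIGHTS` values, the normalization, the placeholder model and task names, and the layout of the generated routing.md.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One benchmark measurement (hypothetical schema)."""
    model: str
    task: str
    quality: float        # 0..1, higher is better
    cost_usd: float       # lower is better
    latency_s: float      # lower is better
    structured_ok: float  # fraction of outputs that parsed, 0..1


# Assumed weights; a real benchmark would choose its own.
WEIGHTS = {"quality": 0.6, "cost": 0.1, "latency": 0.1, "structured": 0.2}


def score(r: Result, max_cost: float, max_latency: float) -> float:
    """Weighted sum where cost and latency are inverted so cheaper/faster scores higher."""
    cost_score = 1 - r.cost_usd / max_cost if max_cost else 1.0
    latency_score = 1 - r.latency_s / max_latency if max_latency else 1.0
    return (
        WEIGHTS["quality"] * r.quality
        + WEIGHTS["cost"] * cost_score
        + WEIGHTS["latency"] * latency_score
        + WEIGHTS["structured"] * r.structured_ok
    )


def best_model_per_task(results: list[Result]) -> dict[str, str]:
    """Return the highest-scoring model for each task."""
    max_cost = max(r.cost_usd for r in results)
    max_latency = max(r.latency_s for r in results)
    best: dict[str, tuple[str, float]] = {}
    for r in results:
        s = score(r, max_cost, max_latency)
        if r.task not in best or s > best[r.task][1]:
            best[r.task] = (r.model, s)
    return {task: model for task, (model, _) in best.items()}


def render_routing_md(routes: dict[str, str]) -> str:
    """Render the task-to-model mapping as a Markdown table (assumed layout)."""
    lines = ["# Routing", "", "| Task | Model |", "| --- | --- |"]
    for task in sorted(routes):
        lines.append(f"| {task} | {routes[task]} |")
    return "\n".join(lines) + "\n"


# Made-up sample data with placeholder model and task names.
SAMPLE = [
    Result("model-a", "refactor", 0.90, 0.04, 8.0, 0.95),
    Result("model-b", "refactor", 0.50, 0.01, 2.0, 0.99),
    Result("model-a", "json-extract", 0.80, 0.04, 8.0, 0.60),
    Result("model-b", "json-extract", 0.75, 0.01, 2.0, 0.99),
]

if __name__ == "__main__":
    print(render_routing_md(best_model_per_task(SAMPLE)))
```

With this sample data the slower, more expensive model still wins the refactoring task on quality, while the cheaper model wins the extraction task because its structured output is more reliable. Changing the weights shifts that trade-off.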
Technologies
agents
benchmark