About this project
A benchmarking harness that runs a chess tournament between large language models and the Stockfish engine, ranking each model's tactical play with Elo ratings. It is built with DSPy and requires access to models served through Nebius AI Studio.
★ 9 stars on GitHub
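For readers unfamiliar with Elo ratings, the sketch below shows the standard Elo update after a single game. It is only an illustration of the general formula, not this project's code: the function names `expected_score` and `update_elo`, the starting rating of 1500, and the K-factor of 32 are all assumptions, and the harness may compute its ratings differently.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; the project may use another value
) -> tuple[float, float]:
    """Return the new ratings of A and B after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, and 0.0 for a loss.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Whatever A gains, B loses, so the rating total is unchanged.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Hypothetical example: a model and the engine both start at 1500,
    # and the model wins one game.
    model_rating, engine_rating = update_elo(1500.0, 1500.0, score_a=1.0)
    print(model_rating, engine_rating)
```

With equal starting ratings the expected score is 0.5, so a win moves the winner up by half the K-factor and the loser down by the same amount.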