Add GPQA evaluation result

#241
opened by burtenshaw (HF Staff)

Evaluation Results

This PR adds structured evaluation results using the new .eval_results/ format.

What This Enables

  • Model Page: Results appear on the model page with benchmark links
  • Leaderboards: Scores are aggregated into benchmark dataset leaderboards
  • Verification: Support for cryptographic verification of evaluation runs
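The verification point could work along these lines. A minimal sketch, assuming the evaluation run publishes a SHA-256 digest of its results file; the actual verification scheme is not specified in this PR and may use cryptographic signatures rather than bare digests.

```python
import hashlib

def verify_digest(results_bytes: bytes, published_digest: str) -> bool:
    """Compare the SHA-256 digest of a results file against a published value.

    Illustrative only: the real eval-results verification mechanism is not
    described here; this just shows the general shape of a content check.
    """
    return hashlib.sha256(results_bytes).hexdigest() == published_digest

# Hypothetical usage with made-up file contents:
data = b"benchmark: gpqa\nscore: 0.42\n"
digest = hashlib.sha256(data).hexdigest()
print(verify_digest(data, digest))   # matching content verifies
print(verify_digest(b"tampered", digest))  # altered content does not
```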

Model Evaluation Results

Format Details

Results are stored as YAML files in the .eval_results/ folder. See the Eval Results Documentation for the full specification.
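For orientation, a result file might look something like the following. This is a hypothetical sketch; the field names are illustrative assumptions, not taken from the actual specification linked above.

```yaml
# Hypothetical .eval_results/ entry -- field names are illustrative only.
benchmark: gpqa          # benchmark dataset identifier (assumed field)
metric: accuracy         # metric name (assumed field)
value: 0.42              # score on the benchmark (made-up number)
generated_by: community-evals  # tool attribution (assumed field)
```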


Generated by community-evals

