Add GPQA evaluation result
#241
by
by burtenshaw (HF Staff)
Evaluation Results
This PR adds structured evaluation results using the new .eval_results/ format.
What This Enables
- Model Page: Results appear on the model page with benchmark links
- Leaderboards: Scores are aggregated into benchmark dataset leaderboards
- Verification: Support for cryptographic verification of evaluation runs
Format Details
Results are stored as YAML files in the .eval_results/ folder. See the Eval Results Documentation for the full specification.
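As a rough illustration of the kind of file this PR adds, a result entry might look like the sketch below. The field names and values here are illustrative assumptions only; the authoritative schema is in the Eval Results Documentation.

```yaml
# Hypothetical sketch of a file such as .eval_results/gpqa.yaml.
# Field names are assumptions for illustration, not the published spec.
benchmark: gpqa          # benchmark dataset identifier
metric: accuracy         # metric reported for this run
value: 0.42              # placeholder score, not a real result
source: community-evals  # tool that produced the run
```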
Generated by community-evals
