DeepSeek-V3.2/.eval_results/evasionbench.yaml
Add EvasionBench evaluation results
- dataset:
    id: FutureMa/EvasionBench
    task_id: evasion_bench
  value: 66.88
  date: "2026-02-10"
  source:
    url: https://arxiv.org/abs/2601.09142
    name: EvasionBench Paper