Article Scaling robotics datasets with video encoding +1 aliberts, cadene, mfarre • Aug 27, 2024 • 41
Article Understanding NDCG@k (Normalized Discounted Cumulative Gain) charchits7 • Dec 3, 2025 • 3
MiniCPM-o & MiniCPM-V Collection Multimodal models with leading performance. • 32 items • Updated 5 days ago • 83
SWE-bench Collection SWE-bench is a benchmark for evaluating Language Models and AI Systems on their ability to resolve real-world GitHub Issues. • 4 items • Updated Mar 8, 2025 • 10
Article Gotchas in Tokenizer Behavior Every Developer Should Know qgallouedec • Apr 18, 2025 • 72