RUT-Bench
Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions".
Miaow-Lab/RUT-Bench • Viewer • Updated 14 days ago • 1.64k • 79
Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions • Paper • 2606.03318 • Published 16 days ago
STT-Arena
Benchmark data, training data, and STT-Agent from our paper "STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics".
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics • Paper • 2605.18548 • Published May 18 • 1
Miaow-Lab/STT-Agent-SFT • 196k • Updated about 1 month ago • 18 • 1
Miaow-Lab/STT-Agent-RL • 196k • Updated about 1 month ago • 17 • 1
Miaow-Lab/STT-Arena • Preview • Updated about 1 month ago • 102 • 2