HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 8 days ago • 27
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 13 days ago • 18
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 26 days ago • 10