Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 7 days ago • 41
Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling Paper • 2604.27039 • Published Apr 29 • 26
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents Paper • 2604.23781 • Published Apr 26 • 33