Hao Peng

Wesleythu · h-peng17

AI & ML interests

None yet

Organizations

New activity in huggingface/InferenceSupport 10 months ago

THU-KEG/TULU3-VerIF

#3578 opened 10 months ago by

commented a paper 12 months ago

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Paper • 2506.09942 • Published Jun 11, 2025 • 5

New activity in huggingface/HuggingDiscussions 12 months ago

[FEEDBACK] Daily Papers

#32 opened almost 2 years ago by

commented 2 papers over 1 year ago

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Paper • 2502.19328 • Published Feb 26, 2025 • 23

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 18