arxiv:2504.16828
Muhammad Khalifa
mkhalifa
AI & ML interests
natural language generation, reinforcement learning
Recent Activity
liked a dataset about 17 hours ago
nvidia/Nemotron-Personas-Korea
updated a dataset 5 days ago
launch/thinkprm-1K-verification-cots
submitted a paper 3 months ago
Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation