Muhammad Khalifa

mkhalifa

https://mukhal.github.io/

AI & ML interests

natural language generation, reinforcement learning

Recent Activity

upvoted a paper about 1 month ago

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

liked a dataset about 2 months ago

nvidia/Nemotron-Personas-Korea

updated a dataset about 2 months ago

launch/thinkprm-1K-verification-cots

Papers 9

arxiv:2504.16828

arxiv:2412.04144

arxiv:2410.02899

arxiv:2405.16337

Models 21

mkhalifa/flan-t5-large-gsm8k

Text Generation • Updated Jan 7 • 15

mkhalifa/flan-t5-large-svamp

Text Generation • Updated Jan 7 • 3

mkhalifa/flan-t5-large-mathqa

Text Generation • Updated Jan 7 • 5

mkhalifa/ThinkPRM-gptoss-20B

Updated Aug 18, 2025 • 15

mkhalifa/r1_14b_discriminative_prm

Text Generation • 15B • Updated Mar 27, 2025 • 2

mkhalifa/r1_14b_longthought-1K

Text Generation • 15B • Updated Mar 25, 2025 • 2

mkhalifa/r1-1.5b-longthought-outcome-matching

Text Generation • 2B • Updated Mar 20, 2025 • 2

mkhalifa/r1-1.5b-longthought-1K

Text Generation • 2B • Updated Mar 10, 2025 • 2

mkhalifa/r1_14b_longthought-1K-outcome-only

Text Generation • 15B • Updated Mar 9, 2025 • 2

mkhalifa/r1-1.5b-longthought-v2

Text Generation • 2B • Updated Mar 9, 2025 • 5

Datasets 18

mkhalifa/agent

Updated Nov 26, 2025 • 3

mkhalifa/gpqa-diamond-physics

Viewer • Updated Mar 15, 2025 • 86 • 400

mkhalifa/short-to-long-5K

Viewer • Updated Feb 26, 2025 • 5k • 14

mkhalifa/CoGEX

Viewer • Updated Feb 13, 2025 • 51.8k • 391

mkhalifa/llama-3.1-8b-instruct-math-trajectories-64-sample-per-problem

Viewer • Updated Jan 29, 2025 • 736k • 32

mkhalifa/llama-3.1-8b-instruct-math-trajectories-48-sample-per-problem

Viewer • Updated Jan 29, 2025 • 552k • 17

mkhalifa/llama-3.1-8b-instruct-math-trajectories-32-sample-per-problem

Viewer • Updated Jan 29, 2025 • 368k • 248

mkhalifa/llama-3.1-8b-instruct-math-trajectories-16-sample-per-problem

Viewer • Updated Jan 29, 2025 • 184k • 8

mkhalifa/llama-3.1-8b-instruct-math-trajectories-8-sample-per-problem

Viewer • Updated Jan 29, 2025 • 92k • 6

mkhalifa/llama-3.1-70b-instruct-math-trajectories-8-sample-per-problem

Viewer • Updated Jan 29, 2025 • 92k • 5
