AI & ML interests
NLP, RLHF, IR
Models: 13
Makrrr/qwen3-8B-reasonmed-finetune-extreme
Text Generation • 8B • Updated • 2
Makrrr/qwen2.5-7B-reasonmed-finetune-extreme
Text Generation • 8B • Updated • 4
Makrrr/Qwen3-1.7B-GSM8K-GRPO-verl
Reinforcement Learning • 2B • Updated • 28 • 3
Makrrr/a2c-PandaReachDense-v3
Reinforcement Learning • Updated
Reinforcement Learning • Updated • 20
Makrrr/ppo-SnowballTarget
Reinforcement Learning • Updated • 13
Makrrr/Pixelcopter-PLE-v0
Reinforcement Learning • Updated
Reinforcement Learning • Updated
Makrrr/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning • Updated • 4
Reinforcement Learning • Updated