Anna4242/qwen25-7b-multihop-grpo-checkpoint-200
8B • Updated
Anna4242/qwen25-7b-singlehop-grpo-checkpoint-200
8B • Updated
Anna4242/qwen25-3b-instruct-grpo-merged
3B • Updated
• 1
Anna4242/qwen25-3b-base-grpo
Text Generation
• Updated
Anna4242/qwen25-7b-full-sft-multihop
8B • Updated
• 3
Anna4242/qwen25-3b-full-sft-multihop
3B • Updated
• 1
Anna4242/qwen25-7b-sft-grpo-checkpoint-200
Reinforcement Learning
• Updated
Anna4242/qwen25-3b-original-sft-ep1-grpo-checkpoint-200
Text Generation
• Updated
Anna4242/Qwen2.5-7B-Instruct-onlyrl-step-1000
8B • Updated
Anna4242/Qwen2.5-7B-Instruct-Singlehop-SFT
8B • Updated
Anna4242/Qwen2.5-3B-Instruct-Singlehop-SFT
3B • Updated
Anna4242/Qwen2.5-1.5B-Instruct-Singlehop-SFT
2B • Updated
Anna4242/Qwen2.5-instruct-rl-only
8B • Updated
Anna4242/Singlehop-Qwen3-8b-final
8B • Updated
• 2
Anna4242/Singlehop-Qwen3-8b-epoch1
8B • Updated
Anna4242/Singlehop-Qwen3-1.7b-final
2B • Updated
• 1
Anna4242/Singlehop-Qwen3-1.7b-epoch1
2B • Updated
Anna4242/Singlehop-Qwen3-1.7b-epoch2
Updated
Anna4242/Multihop-Qwen3-8b-epoch2
8B • Updated
• 1
Anna4242/Multihop-Qwen3-8b-epoch1
8B • Updated
• 1
Anna4242/Singlehop-Qwen3-4b-epoch1
4B • Updated
Anna4242/Multihop-Qwen3-1.7b-final
2B • Updated
• 2
Anna4242/Multihop-Qwen3-1.7b-epoch1
2B • Updated
Anna4242/Multihop-Qwen3-4b-final
4B • Updated
Anna4242/Multihop-Qwen3-4b-epoch1
4B • Updated