shivash/enhanced-hybrid-transformer-768d-trained-thinking • Text Generation • 0.1B • Updated Sep 24 • 1
TMLR-Group-HF/Majority-Voting-Llama-3.2-3B-Instruct-DAPO14k • Text Generation • 4B • Updated Oct 11 • 9
mradermacher/Self-Certainty-Qwen3-1.7B-Base-MATH-GGUF • Reinforcement Learning • 2B • Updated Oct 11 • 175 • 1