AI & ML interests
Reward models
Organizations
models 18
reciprocate/mistral-7b-gsm8k-code-rm
Text Classification
• 7B • Updated • 4
• 3
reciprocate/mistral-7b-rm
Text Classification
• Updated • 8
• 2
reciprocate/rm_beluga-7b_hh-full
Text Classification
• Updated • 4
reciprocate/rm-llama2-7b-gsm8k
Text Generation
• Updated • 3
reciprocate/llama2-7b-gsm8k
Text Generation
• Updated • 3
• 1
Text Generation
• Updated • 4
• 1
Text Generation
• Updated • 3
reciprocate/vicuna-13b_rm_oasst-hh
Text Classification
• Updated • 1
reciprocate/openllama-13b-rlhf-v0
Text Generation
• Updated • 4
reciprocate/openllama-13b_rm_oasst-hh
Text Classification
• Updated • 3
datasets 35
reciprocate/kaggle-lmarena-synth-50k
Viewer
• Updated • 50.7k • 3
reciprocate/ultra-annotated-200k
Viewer
• Updated • 208k • 5
reciprocate/dpo-objective-v0.2
Viewer
• Updated • 384 • 7
reciprocate/tinygsm_interpreter_1M
Viewer
• Updated • 1M • 15
Viewer
• Updated • 541 • 4
reciprocate/dpo_mix-zero-math-untoxic
Viewer
• Updated • 6.91k • 9
reciprocate/dpo_mix-7k_untoxic
Viewer
• Updated • 7.29k • 6
• 2
reciprocate/tinygsm_mixtral_12M
Viewer
• Updated • 12M • 46
• 1
reciprocate/dpo_ultra-capybara-code_filtered-best
Viewer
• Updated • 35.2k • 17
• 1
Viewer
• Updated • 6.17k • 25
• 2