Commit 309c6a8 (verified) by odelalleau · 1 Parent(s): aca6709

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -74,7 +74,7 @@ As of 29 May 2025, Qwen-3-Nemotron-32B-Reward has comparable scores on [JudgeBen
 
  ## Use Case
 
- Qwen-3-Nemotron-32B-Reward assigns a reward score to each LLM-generated response in a user–assistant dialogue.
+ Qwen-3-Nemotron-32B-Reward assigns a reward score to an LLM-generated response in a user–assistant dialogue.
 
  ## Release Date
 
@@ -150,7 +150,7 @@ If you find this model useful, please cite the following work:
 
  ```bibtex
  @misc{wang2025helpsteer3preferenceopenhumanannotatedpreference,
- title={HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages},
+ title={Help{S}teer3-{P}reference: Open Human-Annotated Preference Data across Diverse Tasks and Languages},
  author={Zhilin Wang and Jiaqi Zeng and Olivier Delalleau and Hoo-Chang Shin and Felipe Soares and Alexander Bukharin and Ellie Evans and Yi Dong and Oleksii Kuchaiev},
  year={2025},
  eprint={2505.11475},