launch/ThinkPRM-1.5B

Tags: Text Generation, generative reward model, process supervision, chain-of-thought, code verification, text-generation-inference

mkhalifa committed on Jun 25

Commit 021087f · verified · 1 Parent(s): 1e997c2

Update README.md

Files changed (1)

README.md +4 -0

@@ -20,6 +20,10 @@ ThinkPRM-1.5B is a generative Process Reward Model (PRM) based on the R1-Distill
 Here's an example of the model output:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/5f350fe67e5835433862161b/hN9K1zf-d2jZiyGoBxYbA.png)
 ## Model Details
 ### Model Description