Improve model card: Add pipeline tag, library name, paper and code links (#1)
(e28d411855797619eea03d3366dd686029ede847)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED

@@ -1,14 +1,20 @@
 ---
-
+base_model:
+- Qwen/Qwen2.5-VL-7B-Instruct
 datasets:
 - RenlyH/CodeV-RL-Data
 language:
 - en
 - zh
+license: mit
 metrics:
 - accuracy
-
-
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 
-
+CodeV is a code-based visual agent trained with Tool-Aware Policy Optimization (TAPO) for faithful visual reasoning. This agentic vision-language model is designed to "think with images" by calling image operations, addressing unfaithful visual reasoning in prior models. CodeV achieves competitive accuracy and substantially increases faithful tool-use rates on visual search benchmarks, also demonstrating strong performance on multimodal reasoning and math benchmarks.
+
+This model was presented in the paper [CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization](https://huggingface.co/papers/2511.19661).
+
+Code: https://github.com/RenlyH/CodeV