Update README.md
Browse files

README.md CHANGED

@@ -194,10 +194,16 @@ Supports **multilingual OCR** (Turkish, English, German, Spanish, French, Chinese …

## 📊 Benchmark & Comparison

-| Model …
-| **Next OCR** …
+| Model                       | OCR-Bench Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
+| --------------------------- | ---------------------- | ------------------------- | -------------------------------- |
+| **Next OCR**                | **99.0**               | **96.8**                  | **95.3**                         |
+| PaddleOCR                   | 95.2                   | 93.9                      | 95.3                             |
+| Deepseek OCR                | 90.6                   | 87.4                      | 86.1                             |
+| Tesseract                   | 92.0                   | 88.4                      | 72.0                             |
+| EasyOCR                     | 90.4                   | 84.7                      | 78.9                             |
+| Google Cloud Vision / DocAI | 98.7                   | 95.5                      | 93.6                             |
+| Amazon Textract             | 94.7                   | 86.2                      | 86.1                             |
+| Azure Document Intelligence | 95.1                   | 93.6                      | 91.4                             |

---

@@ -210,15 +216,30 @@ import torch

model_id = "Lamapi/next-ocr"

tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16
-…
+model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)
+
+# The chat-template calls below use `processor`, so load it as well
+# (assumes `AutoProcessor` is imported from transformers alongside `AutoTokenizer`).
+processor = AutoProcessor.from_pretrained(model_id)
+
+img = Image.open("image.jpg")
+
+# ATTENTION: The content list must include both an image and text.
+messages = [
+    {"role": "system", "content": "You are Next-OCR, a helpful AI assistant trained by Lamapi."},
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": img},
+            {"type": "text", "text": "Read the text in this image and summarize it."}
+        ]
+    }
+]
+
+# Apply the chat template correctly
+prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)
+
+with torch.no_grad():
+    generated = model.generate(**inputs, max_new_tokens=256)
+
+print(processor.decode(generated[0], skip_special_tokens=True))
```
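
One practical note on the snippet above: `from_pretrained` loads weights on the CPU by default, so `model.device` only points at a GPU if the model has been moved there. A minimal sketch of the two usual placement options, assuming a CUDA device is available; `device_map="auto"` additionally requires the `accelerate` package:

```python
import torch
from transformers import AutoModelForVision2Seq

model_id = "Lamapi/next-ocr"

# Option 1: load in fp16, then move the whole model to the GPU explicitly.
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)
model.to("cuda")

# Option 2: let transformers place layers automatically (needs `pip install accelerate`).
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```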

---
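
A related detail when adapting the example: for decoder-style vision-language models, `generate` usually returns the prompt tokens followed by the new tokens, so decoding `generated[0]` echoes the whole chat template before the answer. A hedged sketch that trims the prompt, reusing the `inputs`, `generated`, and `processor` names from the snippet above (skip the slice if your checkpoint already returns only new tokens):

```python
# Length of the prompt that was fed into generate().
prompt_len = inputs["input_ids"].shape[1]

# Decode only the tokens produced after the prompt.
answer = processor.decode(generated[0][prompt_len:], skip_special_tokens=True)
print(answer)
```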
|