Lamapi committed
Commit 6657632 · verified · 1 Parent(s): 0472c48

Update README.md

Files changed (1)
  1. README.md +34 -13
README.md CHANGED
@@ -194,10 +194,16 @@ Supports **multilingual OCR** (Turkish, English, German, Spanish, French, Chines
 
 ## 📊 Benchmark & Comparison
 
-| Model               | OCR Accuracy (%)         | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
-| ------------------- | ------------------------ | ------------------------- | -------------------------------- |
-| **Next OCR 8B**     | 98.9                     | 96.7                      | 94.4                             |
-| **DeepSeek‑OCR 3B** | 97 (at high compression) | 88–90                     | 85–87                            |
+| Model                       | OCR-Bench Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
+| --------------------------- | ---------------------- | ------------------------- | -------------------------------- |
+| **Next OCR**                | **99.0**               | **96.8**                  | **95.3**                         |
+| PaddleOCR                   | 95.2                   | 93.9                      | 95.3                             |
+| Deepseek OCR                | 90.6                   | 87.4                      | 86.1                             |
+| Tesseract                   | 92.0                   | 88.4                      | 72.0                             |
+| EasyOCR                     | 90.4                   | 84.7                      | 78.9                             |
+| Google Cloud Vision / DocAI | 98.7                   | 95.5                      | 93.6                             |
+| Amazon Textract             | 94.7                   | 86.2                      | 86.1                             |
+| Azure Document Intelligence | 95.1                   | 93.6                      | 91.4                             |
 
 ---
 
@@ -210,15 +216,30 @@ import torch
 model_id = "Lamapi/next-ocr"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
-
-image_path = "document.png"
-images = [image_path]
-
-inputs = tokenizer(images, return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=512)
-
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)
+
+img = Image.open("image.jpg")
+
+# ATTENTION: The content list must include both an image and text.
+messages = [
+    {"role": "system", "content": "You are Next-OCR, a helpful AI assistant trained by Lamapi."},
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": img},
+            {"type": "text", "text": "Read the text in this image and summarize it."}
+        ]
+    }
+]
+
+# Apply the chat template correctly
+prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)
+
+with torch.no_grad():
+    generated = model.generate(**inputs, max_new_tokens=256)
+
+print(processor.decode(generated[0], skip_special_tokens=True))
 ```
 
 ---
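
Note that the updated snippet calls `processor.apply_chat_template(...)` and `processor(...)`, while the visible hunk only instantiates a tokenizer, so `processor` is never defined in the lines shown. Below is a minimal, self-contained sketch of the same flow, assuming the repo ships a processor loadable via `AutoProcessor` and that the `PIL`/`transformers` imports sit above the hunk; both are assumptions, not something this diff confirms.

```python
# Hypothetical, self-contained version of the new usage example.
# ASSUMPTION: Lamapi/next-ocr exposes a processor via AutoProcessor whose chat
# template accepts interleaved image/text content blocks.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "Lamapi/next-ocr"

# The processor bundles the tokenizer and image preprocessor and provides
# apply_chat_template(); this is the object the README snippet calls `processor`.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

img = Image.open("image.jpg")  # any local document image

messages = [
    {"role": "system", "content": "You are Next-OCR, a helpful AI assistant trained by Lamapi."},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": img},
            {"type": "text", "text": "Read the text in this image and summarize it."},
        ],
    },
]

# Render the chat template to a prompt string, then encode text and image together.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=256)

print(processor.decode(generated[0], skip_special_tokens=True))
```

If `AutoProcessor.from_pretrained` does not work for this checkpoint, the chat template may instead live on the tokenizer; in that case `tokenizer.apply_chat_template` plus the model's own image preprocessing would be the fallback.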