No text output when running inference with ollama?

#8
by frankslin - opened

I'm using the example image from https://cdn.bigmodel.cn/static/logo/introduction.png.

Apple M2, macOS 15.7.3.

````
$ ollama -v
ollama version is 0.15.5-rc1
$ sha256sum introduction.png
b01139689cb0682b3a1d6f3f3eda3f481101571654e3a23a1de5bf5520f3c6f5  introduction.png
$ ollama run glm-ocr Text Recognition: ./introduction.png
Added image './introduction.png'
```markdown

```markdown

```markdown

```text
````

When I tried with another file, I got this error:

```
Error: an error was encountered while running the model: GGML_ASSERT(a->ne[2] * 4 == b->ne[0]) failed
WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
See: https://github.com/ggml-org/llama.cpp/pull/17869
0 ollama 0x0000000100e6d620 ggml_print_backtrace + 276
```


What am I missing?

I ran into the same problem. I hope the researchers will take this seriously and do the work properly.

I am having the same problem serving glm-ocr from ollama.

  • MacBook Air M2, macOS Tahoe 26.2
  • Ollama pre-release 0.15.5

Output for the first three pages (rendered as images) of a simple PDF (https://www.fiw.uni-bonn.de/de/forschung/demokratieforschung/team/prof-dr-rudolf-stichweh/papers/pdfs/81_stw_niklas-luhmann-blackwell-companion-to-major-social-theorists.pdf):

Page 1

"<table bordered, no table, but no table, but no table, but no table, no table, no table, no layout, but no layout, or any other content is present. No layout, or any other content is present. No text is present. No layout. No layout. No layout. No layout. No layout."


Page 2

"<table bordered, no table, but no table, but no table, but no table, no table, no layout, rows, columns, rows, columns, rows, or any other content within the image content is present. No text, or any other content is present. No text is present. No layout, just a single row, just a single row, just a single row, just a single row, just a single row, no layout. No layout. No layout. No layout. No layout. No layout. No layout. No layout."


Page 3

```
Error extracting page: an error was encountered while running the model: GGML_ASSERT(a->ne[2] * 4 == b->ne[0]) failed
WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
See: https://github.com/ggml-org/llama.cpp/pull/17869
0 ollama 0x0000000101b050c0 ggml_print_backtrace + 276
1 ollama 0x0000000101b052ac ggml_abort + 156
2 ollama 0x0000000101b0d8e8 ggml_rope + 300
3 ollama 0x0000000101b0db44 ggml_rope_multi + 20
4 ollama 0x0000000101aaf2b0 _cgo_7ebcd35a9797_Cfunc_ggml_rope_multi + 64
5 ollama 0x0000000100e6549c ollama + 513180 (status code: 500)
```

Either increase the model's context window or reduce the image size. Depending on your available VRAM, Ollama defaults to a context of only 4096 tokens, which is not enough for a large image like introduction.png. You can check the context in effect after loading the model with `ollama ps`.
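If you drive the model through Ollama's REST API instead of the CLI, you can also raise the context per request with the `num_ctx` option. A minimal sketch, assuming Ollama is serving on its default port 11434 (the prompt and context value here are just examples):

```shell
# Base64-encode the image (tr strips line wraps that some base64 builds add),
# then request OCR with an enlarged context window for this call only.
IMG_B64=$(base64 < introduction.png | tr -d '\n')

curl http://localhost:11434/api/generate -d '{
  "model": "glm-ocr",
  "prompt": "Text Recognition:",
  "images": ["'"$IMG_B64"'"],
  "options": { "num_ctx": 10240 }
}'
```

In an interactive `ollama run` session, the equivalent is `/set parameter num_ctx 10240` before sending the image.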

I can confirm that increasing the context size works. I am using a context size of 10240 and have successfully extracted text from images up to 12 megapixels. Here is the Modelfile I am using:

```
# Modelfile generated by "ollama show"
FROM glm-ocr:latest

# Increase context size
PARAMETER num_ctx 10240

TEMPLATE {{ .Prompt }}
RENDERER glm-ocr
PARSER glm-ocr
PARAMETER temperature 0
```
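To apply it, build a derived model from the Modelfile and run that instead; a sketch, with `glm-ocr-10k` as a stand-in name:

```shell
# Create a new model that inherits glm-ocr but carries the larger context,
# then query it the same way as before.
ollama create glm-ocr-10k -f Modelfile
ollama run glm-ocr-10k "Text Recognition: ./introduction.png"
```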

Yeah, you're right, mate.

Thank you all for the "increase context window" solution.
I did the same and went from gibberish output to the clean, exact text expected.

