error with ollama
Somehow I'm getting an error with:
ollama run hf.co/AaryanK/Solar-Open-100B-GGUF:Q4_K_M
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.attn_q.bias'
llama_model_load_from_file_impl: failed to load model
ollama --version
ollama version is 0.13.5
any ideas??
I've confirmed the issue, and it is definitely on Ollama's end.
The model uses a newer architecture configuration (attention_bias=False) that omits specific bias tensors to improve performance. The "missing tensor" error happens because the version of llama.cpp bundled inside your current Ollama installation is slightly behind and still expects those tensors to exist.
Since I can run it perfectly on the latest standalone llama.cpp, this is just a matter of waiting for Ollama to update their backend. You will need to wait for the next Ollama release or use llama.cpp directly.
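To make the failure mode concrete, here is an illustrative sketch (not actual Ollama or llama.cpp source; the function names are made up for this example) of why attention_bias=False breaks an older loader: the converter simply never writes the per-block attn_*.bias tensors, so a loader that unconditionally looks them up fails at blk.0.attn_q.bias.

```python
# Hypothetical sketch of the mismatch between a newer GGUF export
# (attention_bias=False, so no bias tensors) and an older loader
# that assumes bias tensors always exist.

def exported_attn_tensors(block: int, attention_bias: bool) -> set[str]:
    """Tensor names a converter would write for one attention block."""
    names = {f"blk.{block}.attn_{p}.weight" for p in ("q", "k", "v", "output")}
    if attention_bias:  # biases are only exported when the config enables them
        names |= {f"blk.{block}.attn_{p}.bias" for p in ("q", "k", "v")}
    return names

def old_loader_check(available: set[str], block: int = 0) -> None:
    """An out-of-date loader that unconditionally expects the q bias."""
    required = f"blk.{block}.attn_q.bias"
    if required not in available:
        raise RuntimeError(f"error loading model: missing tensor '{required}'")

tensors = exported_attn_tensors(0, attention_bias=False)
try:
    old_loader_check(tensors)
except RuntimeError as e:
    print(e)  # same shape of message as the Ollama error above
```

A fixed loader would instead treat the bias lookup as optional, which is presumably what the newer llama.cpp does.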
You can try the model using llama-cli in the meantime:
./llama-cli -m Solar-Open-100B.Q4_K_M.gguf \
    -c 8192 \
    --temp 0.8 \
    --top-p 0.95 \
    --top-k 50 \
    -p "User: Who are you?\nAssistant:" \
    -cnv
I will ping the Ollama team. Thanks!