Made a llama.cpp version

#39
by notune - opened

If anyone needs to use it with llama.cpp before the official implementation lands, I implemented it here: https://github.com/notune/llama.cpp/tree/model-glm-ocr
Testing against Ollama showed a slight speedup, but since I wrote this with AI assistance, it probably doesn't meet the quality standards of the llama.cpp repo.

It looks like there is an official implementation now: https://github.com/ggml-org/llama.cpp/pull/19677

notune changed discussion status to closed
