Accuracy decreases a lot in gguf conversion

#28 opened by fhaDL

Hi,

I have fine-tuned this model on a custom VQA dataset and the accuracy is great, reaching up to 95%. But when I convert it to GGUF format to run with llama.cpp, the accuracy drops drastically to 60%.

This is how I am converting the model:
python convert_hf_to_gguf.py /path/to/local/model/folder --outtype bf16 --outfile model-2-1B-bf16.gguf
python convert_hf_to_gguf.py /path/to/local/model/folder --mmproj --outtype bf16 --outfile mmproj-model-2-1B-bf16.gguf

Then I run it with llama.cpp using the following command:
./llama.cpp/build/bin/llama-server --host 0.0.0.0 --port 4183 -m "model-2-1B-bf16.gguf" --mmproj "mmproj-model-2-1B-f32.gguf" -c 8192 --temp 0.0 --top-k 0 --top-p 0.9 -ngl -1 --repeat-penalty 1.15 -n 1000
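One thing worth noting about the server command above: the sampling flags (--repeat-penalty 1.15, --top-p 0.9) differ from what a typical Hugging Face evaluation harness uses by default, so some of the gap may come from decoding settings rather than the conversion itself. As a toy illustration of how a repetition penalty alone can flip a greedy-decoding choice, here is a minimal sketch in the style of llama.cpp's penalty (function name and numbers are mine, for illustration only):

```python
def apply_repeat_penalty(logits, prev_tokens, penalty):
    # llama.cpp-style repetition penalty: for tokens already generated,
    # divide positive logits by the penalty, multiply negative ones by it
    out = list(logits)
    for t in set(prev_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [2.0, 1.9, -1.0]           # toy vocabulary of 3 tokens
penalized = apply_repeat_penalty(logits, [0], 1.15)  # token 0 seen before

greedy_before = max(range(3), key=lambda i: logits[i])     # argmax = 0
greedy_after = max(range(3), key=lambda i: penalized[i])   # argmax = 1
```

With penalty 1.15, token 0's logit drops from 2.0 to about 1.74, below token 1's 1.9, so the greedy pick changes. Running the GGUF model with the same sampling settings as the original evaluation (e.g. no repetition penalty, if the HF eval used none) would help isolate how much of the loss is due to quantization/conversion versus decoding.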

Any idea why the accuracy is decreasing to such an extent?

Thanks for considering!
