Accuracy decreases a lot in gguf conversion

#28 opened by fhaDL

Hi,

I have fine-tuned this model on a custom VQA dataset and the accuracy is great, reaching up to 95%. But when I convert it to GGUF format to run with llama.cpp, the accuracy drops drastically to 60%.

This is how I am converting the model:
python convert_hf_to_gguf.py /path/to/local/model/folder --outtype bf16 --outfile model-2-1B-bf16.gguf
python convert_hf_to_gguf.py /path/to/local/model/folder --mmproj --outtype bf16 --outfile mmproj-model-2-1B-bf16.gguf

Then I run it with llama.cpp using the following command:
./llama.cpp/build/bin/llama-server --host 0.0.0.0 --port 4183 -m "model-2-1B-bf16.gguf" --mmproj "mmproj-model-2-1B-f32.gguf" -c 8192 --temp 0.0 --top-k 0 --top-p 0.9 -ngl -1 --repeat-penalty 1.15 -n 1000
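One thing worth noting about the server command above: the sampling flags (--repeat-penalty 1.15, --top-p 0.9) differ from what a typical Hugging Face evaluation harness uses by default, so some of the gap may come from decoding settings rather than the conversion itself. As a toy illustration of how a repetition penalty alone can flip a greedy-decoding choice, here is a minimal sketch in the style of llama.cpp's penalty (function name and numbers are mine, for illustration only):

```python
def apply_repeat_penalty(logits, prev_tokens, penalty):
    # llama.cpp-style repetition penalty: for tokens already generated,
    # divide positive logits by the penalty, multiply negative ones by it
    out = list(logits)
    for t in set(prev_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [2.0, 1.9, -1.0]           # toy vocabulary of 3 tokens
penalized = apply_repeat_penalty(logits, [0], 1.15)  # token 0 seen before

greedy_before = max(range(3), key=lambda i: logits[i])     # argmax = 0
greedy_after = max(range(3), key=lambda i: penalized[i])   # argmax = 1
```

With penalty 1.15, token 0's logit drops from 2.0 to about 1.74, below token 1's 1.9, so the greedy pick changes. Running the GGUF model with the same sampling settings as the original evaluation (e.g. no repetition penalty, if the HF eval used none) would help isolate how much of the loss is due to quantization/conversion versus decoding.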

Any idea why the accuracy is decreasing to such an extent?

Thanks for considering!
