Accuracy drops sharply after GGUF conversion
Hi,
I have fine-tuned this model on a custom VQA dataset and the accuracy is great, reaching up to 95%. But after converting it to GGUF format to run with llama.cpp, the accuracy drops drastically to 60%.
This is how I am converting the model:
python convert_hf_to_gguf.py /path/to/local/model/folder --outtype bf16 --outfile model-2-1B-bf16.gguf
python convert_hf_to_gguf.py /path/to/local/model/folder --mmproj --outtype bf16 --outfile mmproj-model-2-1B-bf16.gguf
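For what it's worth, a bf16 `--outtype` should be close to lossless on its own: bf16 keeps fp32's full exponent range and only truncates the mantissa to 7 bits, so per-weight relative error stays below 2^-7 (~0.8%). This illustrative numpy sketch (not part of my pipeline) simulates that truncation to show the bound:

```python
import numpy as np

def to_bf16(x: np.ndarray) -> np.ndarray:
    """Simulate fp32 -> bf16 by truncating the low 16 mantissa bits."""
    u = x.astype(np.float32).view(np.uint32)
    return (u & 0xFFFF0000).view(np.float32)

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)  # stand-in for model weights

# Maximum relative rounding error; bounded by 2**-7 for bf16 truncation.
err = np.max(np.abs((to_bf16(w) - w) / w))
print(err)
```

Since a bf16 round-trip perturbs each weight by well under 1%, a 35-point accuracy drop seems unlikely to come from the conversion precision alone.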
Then I run it with llama.cpp using the following command:
./llama.cpp/build/bin/llama-server --host 0.0.0.0 --port 4183 -m "model-2-1B-bf16.gguf" --mmproj "mmproj-model-2-1B-bf16.gguf" -c 8192 --temp 0.0 --top-k 0 --top-p 0.9 -ngl -1 --repeat-penalty 1.15 -n 1000
Any idea why the accuracy degrades this much?
Thanks for considering!