"Not all quantized model perform good", serving framework ollama uses NVIDIA gpu, llama.cpp uses CPU with AVX & AMX
v1k
xbruce22
Recent Activity
liked a model about 8 hours ago: Qwen/Qwen2.5-VL-3B-Instruct
liked a model 4 days ago: mistralai/Ministral-3-3B-Instruct-2512
liked a model 5 days ago: Qwen/Qwen3.5-27B