enable_thinking does not work as expected when using vLLM

#16
by MRU4913 - opened

Here is my `generation_config.json`:

```json
{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.6,
  "top_k": 20,
  "top_p": 0.95,
  "chat_template_kwargs": {"enable_thinking": false},
  "transformers_version": "4.51.0"
}
```

I'm using vLLM for deployment, but the model still outputs `<think>`.
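
For what it's worth, vLLM does not (as far as I know) read `chat_template_kwargs` from `generation_config.json`; the flag has to be supplied per request. A minimal sketch against vLLM's OpenAI-compatible server, assuming a recent vLLM version that accepts `chat_template_kwargs` in the request body (the base URL, API key, and model name below are placeholders):

```python
from openai import OpenAI

# Placeholder endpoint/key for a locally served vLLM OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="your-served-model",  # placeholder: use the name vLLM serves
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.6,
    top_p=0.95,
    # chat_template_kwargs is not a standard OpenAI field, so it goes in
    # extra_body; vLLM forwards it to the chat template when rendering.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)
```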
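For offline inference, one workaround is to render the prompt yourself before handing it to vLLM. This sketch assumes the model ships a Qwen3-style chat template that understands an `enable_thinking` flag (extra keyword arguments to `apply_chat_template` are forwarded to the template); the model path is a placeholder:

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "your-model-path"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# Render the prompt with thinking disabled; enable_thinking is passed
# through to the Jinja chat template.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

llm = LLM(model=MODEL)
params = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, max_tokens=256)
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```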

MRU4913 changed discussion status to closed
