SGLang deploy commands

#2
by vvekthkr - opened

Could you please share the recommended SGLang deploy commands? I currently use an RTX 5090 and a Pro 6000. If all goes well, I might move from the 4B model to the 8B model with a data-parallel pipeline of 2.

Octen-Team org

I’m not sure whether SGLang supports deploying this model yet, but we’ve used vLLM and it works.

You can refer to this example for details: https://huggingface.co/Qwen/Qwen3-Embedding-8B#vllm-usage
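
For a rough idea of what that looks like, here is a minimal sketch modeled on the vLLM usage in the linked model card, not a tested deployment recipe. `MODEL_ID` is a hypothetical placeholder (substitute the actual repository id), and `embed_texts`, `cosine_similarity`, and `demo` are illustrative helper names. It assumes a vLLM version that accepts `task="embed"`; newer releases may expect a different argument for pooling models, so check the docs for your installed version.

```python
import math

# Hypothetical placeholder: replace with the actual model repository id.
MODEL_ID = "your-org/your-embedding-model"


def embed_texts(texts, model_id=MODEL_ID):
    """Embed a list of strings with vLLM's offline API (requires a GPU)."""
    # Imported lazily so the helpers below can be used without vLLM installed.
    from vllm import LLM

    # Assumption: task="embed" selects the embedding (pooling) mode,
    # as in the linked Qwen3-Embedding-8B example.
    llm = LLM(model=model_id, task="embed")
    outputs = llm.embed(texts)
    return [o.outputs.embedding for o in outputs]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def demo():
    """Embed one query and one document, then print their similarity."""
    query_vec, doc_vec = embed_texts([
        "What is the capital of China?",
        "The capital of China is Beijing.",
    ])
    print(cosine_similarity(query_vec, doc_vec))
```

Calling `demo()` on a machine with vLLM installed and enough GPU memory should load the model and print one similarity score. This only covers single-process offline inference; it does not set up data parallelism across the two GPUs.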
