nvidia/Kimi-K2.5-NVFP4

Text Generation

Model Optimizer

zhiyucheng committed 12 days ago

Commit 3f7790f (verified) · 1 parent: d4a2f75

Update README.md

Files changed (1)

README.md +1 -1

@@ -102,7 +102,7 @@ This model was obtained by converting and quantizing the weights and activations
 ## Usage
-To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you can start the docker `vllm/vllm-openai:latest` and run the sample command below:
+To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you can start the docker `vllm/vllm-openai:v0.15.0` and run the sample command below:
 ```sh
 python3 -m vllm.entrypoints.openai.api_server --model nvidia/Kimi-K2.5-NVFP4 --tensor-parallel-size 4 --tool-call-parser kimi_k2 --reasoning-parser kimi_k2 --trust-remote-code