Update README.md
README.md
CHANGED
@@ -102,7 +102,7 @@ This model was obtained by converting and quantizing the weights and activations
 ## Usage
 
 
-To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you can start the docker `vllm/vllm-openai:
+To serve this checkpoint with [vLLM](https://github.com/vllm-project/vllm), you can start the docker `vllm/vllm-openai:v0.15.0` and run the sample command below:
 
 ```sh
 python3 -m vllm.entrypoints.openai.api_server --model nvidia/Kimi-K2.5-NVFP4 --tensor-parallel-size 4 --tool-call-parser kimi_k2 --reasoning-parser kimi_k2 --trust-remote-code
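
The updated README line names the image tag but not how to launch the container. A minimal sketch of one way to combine the two steps is given here; it is not part of the commit. The helper name `vllm_docker_cmd`, the published host port, and the `--gpus all` / `--ipc=host` flags are assumptions; only the image tag and the `python3 -m ...` arguments come from the README. The function prints the command instead of running it, so it can be reviewed and adjusted first.

```sh
# Hypothetical helper (not from the README): prints a `docker run` command
# that starts the README's server command inside the named image.
vllm_docker_cmd() {
  image="vllm/vllm-openai:v0.15.0"   # image tag from the README
  model="nvidia/Kimi-K2.5-NVFP4"     # model name from the README
  port="${1:-8000}"                  # assumption: host port to publish

  # --entrypoint python3 makes the container run exactly the README command,
  # whatever the image's default entrypoint is. Container port 8000 is the
  # vLLM server's default listening port.
  printf '%s\n' "docker run --gpus all --ipc=host -p ${port}:8000 --entrypoint python3 ${image} -m vllm.entrypoints.openai.api_server --model ${model} --tensor-parallel-size 4 --tool-call-parser kimi_k2 --reasoning-parser kimi_k2 --trust-remote-code"
}

vllm_docker_cmd
```

To execute the printed command rather than just inspect it, pipe it to a shell: `vllm_docker_cmd | sh`. A different host port can be passed as the first argument, for example `vllm_docker_cmd 9000`.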