Update README.md
README.md
CHANGED
@@ -114,8 +114,8 @@ See [its documentation](https://docs.sglang.ai/get_started/install.html) for mor
 
 The following command can be used to create an API endpoint at `http://localhost:30000/v1` with maximum context length 256K tokens using tensor parallel on 4 GPUs.
 ```shell
-python -m sglang.launch_server --model Qwen/Qwen3-Coder-Next --tp-size 2 --tool-call-parser qwen3_coder```
-
+python -m sglang.launch_server --model Qwen/Qwen3-Coder-Next --port 30000 --tp-size 2 --tool-call-parser qwen3_coder```
+```
 
 > [!Note]
 > The default context length is 256K. Consider reducing the context length to a smaller value, e.g., `32768`, if the server fails to start.
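
The note about lowering the context length can be made concrete. The sketch below assumes SGLang's `--context-length` server flag is the way to cap the context window. `launch_cmd` is a hypothetical helper, not part of the README; it only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper (not in the README): print the launch command with a
# capped context length. Assumes --context-length is the SGLang flag for this.
launch_cmd() {
  ctx="${1:-32768}"  # context length in tokens; 32768 is the value the note suggests
  printf '%s\n' "python -m sglang.launch_server --model Qwen/Qwen3-Coder-Next --port 30000 --tp-size 2 --tool-call-parser qwen3_coder --context-length $ctx"
}

launch_cmd 32768
```

If the server starts cleanly with the printed command, the value can be raised step by step toward the 256K default.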