Update README.md
3. **BrowseComp (with Context Manager)**: When the effective context length exceeds a predefined threshold, the agent resets the context and restarts the agent loop. By contrast, Kimi K2.5 and DeepSeek-V3.2 used a "discard-all" strategy.
4. **Decoding Cost**: Estimates are based on a methodology similar to, but more accurate than, the approach described in https://arxiv.org/abs/2507.19427.
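The reset-and-restart strategy described in note 3 can be sketched as follows. All names, the whitespace-based token counting, and the threshold value are illustrative, not the actual Step 3.5 Flash implementation:

```python
def run_agent(task, step_fn, summarize_fn, max_steps=50, threshold=128_000):
    """Agent loop with a reset-on-overflow context manager.

    When the running token count exceeds `threshold`, the context is
    reset to the task plus a compact summary and the loop restarts.
    A "discard-all" strategy would instead keep only the original task.
    """
    context = [task]
    tokens = len(task.split())  # crude whitespace token count, for illustration
    for _ in range(max_steps):
        action, done = step_fn(context)
        if done:
            return action
        context.append(action)
        tokens += len(action.split())
        if tokens > threshold:
            summary = summarize_fn(context)          # compress, don't discard
            context = [task, summary]                # restart with fresh context
            tokens = len(task.split()) + len(summary.split())
    return None
```

Here `step_fn` stands in for one model/tool step and `summarize_fn` for whatever compression the context manager applies; both are hypothetical interfaces.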
### Recommended Inference Parameters

1. For the general chat domain, we suggest `temperature=0.6, top_p=0.95`.
2. For reasoning / agent scenarios, we recommend `temperature=1.0, top_p=0.95`.
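With an OpenAI-compatible endpoint, these recommendations map directly to the `temperature` and `top_p` request fields. A small illustrative helper (the preset table and function name are ours, not part of any SDK):

```python
# Recommended sampling presets per scenario, as given above.
SAMPLING_PRESETS = {
    "chat": {"temperature": 0.6, "top_p": 0.95},
    "reasoning": {"temperature": 1.0, "top_p": 0.95},
    "agent": {"temperature": 1.0, "top_p": 0.95},
}


def sampling_params(scenario: str) -> dict:
    """Return the recommended sampling parameters for a scenario.

    Unknown scenarios fall back to the general-chat preset.
    """
    return SAMPLING_PRESETS.get(scenario, SAMPLING_PRESETS["chat"])
```

The returned dict can be splatted into a chat-completion request, e.g. `client.chat.completions.create(model=..., messages=..., **sampling_params("agent"))`.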
## 4. Architecture Details

Step 3.5 Flash is built on a **Sparse Mixture-of-Experts (MoE)** transformer architecture, optimized for high throughput and low VRAM usage during inference.
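A minimal sketch of how a sparse MoE feed-forward layer routes tokens, assuming top-k softmax gating over the selected experts; the dimensions, expert count, and `top_k` below are illustrative and not the actual Step 3.5 Flash configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, top_k = 8, 16, 4, 2

# Router and expert weights (random for the sketch).
gate_w = rng.normal(size=(d_model, n_experts))
experts = [
    (rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model)))
    for _ in range(n_experts)
]


def moe_forward(x):
    """Route each token to its top-k experts; only those experts run,
    which is what keeps per-token compute and memory traffic low."""
    logits = x @ gate_w                               # (tokens, n_experts)
    idx = np.argsort(logits, axis=-1)[:, -top_k:]     # top-k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, idx[t]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()                      # softmax over chosen experts
        for w, e in zip(weights, idx[t]):
            w_in, w_out = experts[e]
            out[t] += w * (np.maximum(x[t] @ w_in, 0.0) @ w_out)  # ReLU FFN
    return out


y = moe_forward(rng.normal(size=(3, d_model)))
```

Because each token activates only `top_k` of the `n_experts` feed-forward blocks, the active parameter count per token is a fraction of the total, which is the property behind the throughput and VRAM claims above.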