+---
+license: other
+license_name: modified-mit
+library_name: transformers
+---
+<div align="center">
+  <picture>
+      <img src="figures/kimi-logo.png" width="30%" alt="Kimi K2: Open Agentic Intelligence">
+  </picture>
+</div>
+<hr>
+<div align="center" style="line-height:1">
+  <a href="https://www.kimi.com" target="_blank"><img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-Kimi%20K2-ff6b6b?color=1783ff&logoColor=white"/></a>
+  <a href="https://github.com/moonshotai/Kimi-K2"><img alt="github" src="https://img.shields.io/badge/🤖%20Github-Kimi%20K2-ff6b6b?color=1783ff&logoColor=white"/></a>
+  <a href="https://www.moonshot.ai" target="_blank"><img alt="Homepage" src="https://img.shields.io/badge/Homepage-Moonshot%20AI-white?logo=Kimi&logoColor=white"/></a>
+</div>
+<div align="center" style="line-height: 1;">
+  <a href="https://huggingface.co/moonshotai" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Moonshot%20AI-ffc107?color=ffc107&logoColor=white"/></a>
+  <a href="https://twitter.com/kimi_moonshot" target="_blank"><img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-Kimi.ai-white?logo=x&logoColor=white"/></a>
+    <a href="https://discord.gg/TYU2fdJykW" target="_blank"><img alt="Discord" src="https://img.shields.io/badge/Discord-Kimi.ai-white?logo=discord&logoColor=white"/></a>
+</div>
+<div align="center" style="line-height: 1;">
+  <a href="https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
+</div>
+<p align="center">
+<b>📰&nbsp;&nbsp;<a href="https://moonshotai.github.io/Kimi-K2/">Tech Blog</a></b> &nbsp;&nbsp;&nbsp; | &nbsp;&nbsp;&nbsp; <b>📄&nbsp;&nbsp;<a href="https://github.com/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf">Paper</a></b>
+</p>
+## 1. Model Introduction
+Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters.
+### Key Features
+- Enhanced agentic coding intelligence: Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.
+- Improved frontend coding experience: Kimi K2-Instruct-0905 offers advancements in both the aesthetics and practicality of frontend programming.
+- Extended context length: Kimi K2-Instruct-0905’s context window has been increased from 128k to 256k tokens, providing better support for long-horizon tasks.
+## 2. Model Summary
+<div align="center">
+| | |
+|:---:|:---:|
+| **Architecture** | Mixture-of-Experts (MoE) |
+| **Total Parameters** | 1T |
+| **Activated Parameters** | 32B |
+| **Number of Layers** (Dense layer included) | 61 |
+| **Number of Dense Layers** | 1 |
+| **Attention Hidden Dimension** | 7168 |
+| **MoE Hidden Dimension** (per Expert) | 2048 |
+| **Number of Attention Heads** | 64 |
+| **Number of Experts** | 384 |
+| **Selected Experts per Token** | 8 |
+| **Number of Shared Experts** | 1 |
+| **Vocabulary Size** | 160K |
+| **Context Length** | 256K |
+| **Attention Mechanism** | MLA |
+| **Activation Function** | SwiGLU |
+</div>
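
The table implies how sparse the model is per token. A quick back-of-the-envelope calculation using only the figures above (the variable names are illustrative, and `1T` / `32B` are treated as rounded values):

```python
# Figures copied from the Model Summary table; 1T and 32B are rounded.
TOTAL_PARAMS = 1e12
ACTIVATED_PARAMS = 32e9
NUM_EXPERTS = 384
SELECTED_EXPERTS = 8

# Share of routed experts that process any single token.
expert_fraction = SELECTED_EXPERTS / NUM_EXPERTS
# Share of all parameters that participate in one forward pass.
param_fraction = ACTIVATED_PARAMS / TOTAL_PARAMS

print(f"routed experts used per token: {expert_fraction:.2%}")   # 2.08%
print(f"parameters activated per token: {param_fraction:.2%}")   # 3.20%
```

The activated-parameter share is larger than the routed-expert share, presumably because components such as attention, the dense layer, and the shared expert run for every token.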
+## 3. Evaluation Results
+| Benchmark              | Metric | K2-Instruct-0905 | K2-Instruct-0711 | Qwen3-Coder-480B-A35B-Instruct    | GLM-4.5    | DeepSeek-V3.1 | Claude-Sonnet-4 | Claude-Opus-4 |
+|------------------------|--------|------------------|------------------|--------|--------|--------|-----------------|---------------|
+| SWE-Bench Verified     | ACC    | 69.2 ± 0.63      | 65.8             | 69.6*  | 64.2*  | 66.0*  | 72.7*            | 72.5*          |
+| SWE-Bench Multilingual | ACC    | 55.9 ± 0.72      | 47.3             | 54.7*  | 52.7   | 54.5*  | 53.3*           | -             |
+| Multi-SWE-Bench        | ACC    | 33.5 ± 0.28      | 31.3             | 32.7   | 31.7   | 29.0   | 35.7            | -             |
+| Terminal-Bench         | ACC    | 44.5 ± 2.03      | 37.5             | 37.5*  | 39.9*  | 31.3*  | 36.4*           | 43.2*         |
+| SWE-Dev                | ACC    | 66.6 ± 0.72      | 61.9             | 64.7   | 63.2   | 53.3   | 67.1            | -             |
+All K2-Instruct-0905 numbers are reported as mean ± std over five independent, full-test-set runs.
+Before each run we prune the repository so that every Git object unreachable from the target commit disappears; this guarantees the agent sees only the code that would legitimately be available at that point in history.
+Except for Terminal-Bench, which was run with Terminus-2, every result was produced with our in-house evaluation harness. The harness is derived from SWE-agent, but we clamp the context windows of the Bash and Edit tools and rewrite the system prompt to match the task semantics. All baseline figures marked with an asterisk (*) are taken directly from the respective official reports or public leaderboards; the remaining figures were evaluated by us under conditions identical to those used for K2-Instruct-0905.
+For SWE-Dev we go one step further: we overwrite the original repository files and delete any test file that exercises the functions the agent is expected to generate, eliminating any indirect hints about the desired implementation.
+## 4. Deployment
+> [!NOTE]
+> Kimi K2's API is available at https://platform.moonshot.ai, with both OpenAI-compatible and Anthropic-compatible endpoints.
+>
+> The Anthropic-compatible API maps temperature as `real_temperature = request_temperature * 0.6` for better compatibility with existing applications.
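
As a rough illustration of the mapping, the sketch below computes the temperature the model actually samples with, and the value to send when a specific effective temperature is wanted. The function names are hypothetical and not part of any Moonshot SDK; only the 0.6 factor comes from the note above.

```python
# Scaling factor stated in the note above for the Anthropic-compatible API.
ANTHROPIC_TEMPERATURE_SCALE = 0.6


def effective_temperature(request_temperature: float) -> float:
    """Temperature the model samples with for a given request value."""
    return request_temperature * ANTHROPIC_TEMPERATURE_SCALE


def request_temperature_for(target_temperature: float) -> float:
    """Request value needed to obtain a desired effective temperature."""
    return target_temperature / ANTHROPIC_TEMPERATURE_SCALE


print(effective_temperature(1.0))
print(request_temperature_for(0.6))
```

Under this mapping, sending `temperature=1.0` to the Anthropic-compatible endpoint corresponds to an effective temperature of 0.6.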
+Our model checkpoints are stored in block-fp8 format; you can find them on [Hugging Face](https://huggingface.co/moonshotai/Kimi-K2-Instruct).
+We currently recommend running Kimi-K2 on the following inference engines:
+* vLLM
+* SGLang
+* KTransformers
+* TensorRT-LLM
+Deployment examples for vLLM and SGLang can be found in the [Model Deployment Guide](docs/deploy_guidance.md).
+---
+## 5. Model Usage
+### Chat Completion
+Once the local inference service is up, you can interact with it through the chat endpoint:
+```python
+from openai import OpenAI
+
+
+def simple_chat(client: OpenAI, model_name: str):
+    # Default system prompt plus a single user turn
+    messages = [
+        {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
+        {"role": "user", "content": [{"type": "text", "text": "Please give a brief self-introduction."}]},
+    ]
+    # Non-streaming request at the recommended temperature
+    response = client.chat.completions.create(
+        model=model_name,
+        messages=messages,
+        stream=False,
+        temperature=0.6,
+        max_tokens=256
+    )
+    # Print the assistant's reply
+    print(response.choices[0].message.content)
+```
+> [!NOTE]
+> The recommended temperature for Kimi-K2-Instruct-0905 is `temperature = 0.6`.
+> If no special instructions are required, the system prompt above is a good default.
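
The function above expects an already constructed `OpenAI` client. A minimal sketch of wiring one up follows, assuming the local service exposes an OpenAI-compatible endpoint at `http://localhost:8000/v1` and that a dummy API key is accepted. Both values, along with the helper names `make_client` and `chat_request`, are placeholders rather than anything specified in this document; substitute whatever your deployment reports.

```python
# Assumed placeholders; replace with the values of your own deployment.
BASE_URL = "http://localhost:8000/v1"
API_KEY = "EMPTY"  # dummy value, assuming the local server does not check keys


def make_client(base_url: str = BASE_URL, api_key: str = API_KEY):
    """Create an OpenAI SDK client that talks to the local inference service."""
    from openai import OpenAI  # imported lazily so this module loads without the SDK
    return OpenAI(base_url=base_url, api_key=api_key)


def chat_request(model_name: str, user_text: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model_name,
        "messages": [
            {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.6,  # recommended value from the note above
        "max_tokens": 256,
        "stream": False,
    }
```

With a server running, `make_client().chat.completions.create(**chat_request(model_name, "Hello"))` sends the request, where `model_name` is the name your server registers the model under.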
+---
+### Tool Calling
+Kimi-K2-Instruct-0905 has strong tool-calling capabilities.
+To enable them, pass the list of available tools in each request; the model then decides autonomously when and how to invoke them.
+The following example demonstrates calling a weather tool end-to-end:
+```python
+import json
+
+from openai import OpenAI
+
+# Your tool implementation
+def get_weather(city: str) -> dict:
+    return {"weather": "Sunny"}
+# Tool schema definition
+tools = [{
+    "type": "function",
+    "function": {
+        "name": "get_weather",
+        "description": "Retrieve current weather information. Call this when the user asks about the weather.",
+        "parameters": {
+            "type": "object",
+            "required": ["city"],
+            "properties": {
+                "city": {
+                    "type": "string",
+                    "description": "Name of the city"
+                }
+            }
+        }
+    }
+}]
+# Map tool names to their implementations
+tool_map = {
+    "get_weather": get_weather
+}
+def tool_call_with_client(client: OpenAI, model_name: str):
+    messages = [
+        {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
+        {"role": "user", "content": "What's the weather like in Beijing today? Use the tool to check."}
+    ]
+    finish_reason = None
+    while finish_reason is None or finish_reason == "tool_calls":
+        completion = client.chat.completions.create(
+            model=model_name,
+            messages=messages,
+            temperature=0.6,
+            tools=tools,          # tool list defined above
+            tool_choice="auto"
+        )
+        choice = completion.choices[0]
+        finish_reason = choice.finish_reason
+        if finish_reason == "tool_calls":
+            messages.append(choice.message)
+            for tool_call in choice.message.tool_calls:
+                tool_call_name = tool_call.function.name
+                tool_call_arguments = json.loads(tool_call.function.arguments)
+                tool_function = tool_map[tool_call_name]
+                tool_result = tool_function(**tool_call_arguments)
+                print("tool_result:", tool_result)
+                messages.append({
+                    "role": "tool",
+                    "tool_call_id": tool_call.id,
+                    "name": tool_call_name,
+                    "content": json.dumps(tool_result)
+                })
+    print("-" * 100)
+    print(choice.message.content)
+```
+The `tool_call_with_client` function implements the pipeline from user query to tool execution.
+This pipeline requires the inference engine to support Kimi-K2’s native tool-parsing logic.
+For more information, see the [Tool Calling Guide](docs/tool_call_guidance.md).
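
The loop above assumes that every tool call names a known tool and carries valid JSON arguments. One possible way to make the dispatch step more defensive is sketched below; `dispatch_tool_call` is a hypothetical helper, not part of the Kimi-K2 tooling, and the error payload format is an assumption.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> dict:
    # Stub implementation, same as in the example above.
    return {"weather": "Sunny"}


def dispatch_tool_call(name: str, arguments_json: str,
                       tool_map: Dict[str, Callable[..., Any]]) -> str:
    """Run one tool call and return a JSON string for the "tool" message content."""
    tool = tool_map.get(name)
    if tool is None:
        # The model asked for a tool that was never registered.
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json)
        if not isinstance(arguments, dict):
            raise TypeError("arguments must be a JSON object")
        return json.dumps(tool(**arguments))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, a non-object payload, wrong or missing keyword
        # arguments, or a result that cannot be serialized to JSON.
        return json.dumps({"error": str(exc)})


tool_map = {"get_weather": get_weather}
print(dispatch_tool_call("get_weather", '{"city": "Beijing"}', tool_map))
print(dispatch_tool_call("get_time", "{}", tool_map))
```

Returning the error as the tool message content, rather than raising, lets the model see what went wrong and retry with corrected arguments.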
+---
+## 6. License
+Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).
+---
+## 7. Third Party Notices
+See [THIRD PARTY NOTICES](THIRD_PARTY_NOTICES.md).
+---
+## 8. Contact Us
+If you have any questions, please reach out at [support@moonshot.cn](mailto:support@moonshot.cn).