Kai-30B-Instruct

A 30B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks, powered by our ADS (Adaptive Dual-Search Distillation) technique. The largest model in the Kai family.

Model Details

| Attribute | Value |
|---|---|
| Model | Kai-30B-Instruct |
| Architecture | LlamaForCausalLM |
| Parameters | ~30B |
| Hidden size | 7168 |
| Intermediate size | 20480 |
| Layers | 60 |
| Attention heads | 56 (8 KV heads, GQA) |
| Head dim | 128 |
| Context length | 4096 |
| Precision | bfloat16 |
| Vocab size | 64,000 |
| Chat template | ChatML (`<|im_start|>` / `<|im_end|>`) |
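The GQA configuration above directly determines KV-cache memory. A minimal back-of-the-envelope sketch, using only the figures from the table (the 2-byte element size assumes the cache is kept in bfloat16):

```python
# Estimate KV-cache size from the configuration table above.
LAYERS, KV_HEADS, HEAD_DIM, CTX = 60, 8, 128, 4096
BYTES_BF16 = 2  # bfloat16 = 2 bytes per element

# Each layer caches one K and one V tensor per KV head.
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_BF16
kv_bytes_full_context = kv_bytes_per_token * CTX

print(kv_bytes_per_token)             # 245760 bytes (~240 KiB) per token
print(kv_bytes_full_context / 2**30)  # 0.9375 GiB at the full 4096 context
```

With all 56 query heads cached instead of 8 KV heads, the cache would be 7x larger (~6.6 GiB at full context), which is the saving grouped-query attention buys.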

Benchmark Results (5-shot, acc_norm)

| Benchmark | Kai-30B-Instruct | Llama-3 70B | Qwen2.5 32B | Yi-34B | Llama-3 8B | Mistral 7B | Llama-2 7B |
|---|---|---|---|---|---|---|---|
| ARC-C | 64.0 | 83.0 | 70.5 | 65.3 | 60.1 | 55.5 | 53.0 |
| HellaSwag | 74.4 | 89.0 | 85.2 | 83.1 | 78.6 | 81.3 | 78.6 |
| PIQA | 84.8 | 85.0 | 84.1 | 82.5 | 79.8 | 82.1 | 78.1 |
| Winogrande | 86.4 | 83.0 | 78.2 | 76.4 | 73.0 | 74.0 | 69.1 |
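Scores like these are conventionally produced with EleutherAI's lm-evaluation-harness. The command below is an assumption (the model card does not publish an evaluation script); task names and flags follow the harness's standard conventions:

```shell
pip install lm-eval
lm_eval --model hf \
  --model_args pretrained=NoesisLab/Kai-30B-Instruct,dtype=bfloat16 \
  --tasks arc_challenge,hellaswag,piqa,winogrande \
  --num_fewshot 5 \
  --batch_size auto
```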

Benchmark Comparison

(Figure: chart comparing the benchmark scores above; image not reproduced here.)

What is ADS?

Adaptive Dual-Search Distillation treats model fine-tuning as a constrained optimization problem inspired by Operations Research. The core mechanism is a dynamic loss function with a stateful dual penalty factor that adapts based on embedding space entropy — forcing the model to converge to high-confidence predictions at difficult reasoning points, without modifying the model architecture.
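ADS itself is not published in detail, so the following is only an illustrative sketch of the pattern the description suggests: a standard cross-entropy term plus a stateful dual penalty whose multiplier is updated by dual ascent against an entropy target. Every name and constant below is hypothetical, not the actual ADS implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def dual_penalty_loss(logits, targets, lam, target_entropy=1.0, dual_lr=0.1):
    """Hypothetical sketch of a dual-penalty loss in the spirit of ADS.

    logits:  (tokens, vocab) array of model outputs
    targets: (tokens,) array of gold token ids
    lam:     current dual penalty factor, carried across steps (stateful)
    """
    probs = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(probs[np.arange(n), targets] + 1e-12).mean()
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1).mean()
    # Dual ascent: raise the penalty while entropy exceeds the target,
    # pushing the model toward high-confidence predictions; never negative.
    new_lam = max(0.0, lam + dual_lr * (entropy - target_entropy))
    loss = ce + new_lam * max(0.0, entropy - target_entropy)
    return loss, new_lam
```

On high-entropy (uncertain) predictions the multiplier grows, strengthening the confidence constraint; on already-confident predictions it decays back toward zero, leaving plain cross-entropy.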

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "NoesisLab/Kai-30B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NoesisLab/Kai-30B-Instruct")

# Build a ChatML prompt via the tokenizer's chat template
messages = [{"role": "user", "content": "What is 25 * 4?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.6,
    top_p=0.8,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
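`apply_chat_template` handles prompt formatting for you; for reference, a ChatML prompt (the template named in Model Details) renders roughly as sketched below. This is a hand-rolled illustration, and the tokenizer's bundled template is authoritative:

```python
def chatml_prompt(messages, add_generation_prompt=True):
    """Render a message list in ChatML (<|im_start|> / <|im_end|>) form."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        out += "<|im_start|>assistant\n"
    return out

print(chatml_prompt([{"role": "user", "content": "What is 25 * 4?"}]))
```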

Citation

```bibtex
@misc{noesislab2026kai30b,
  title={Kai-30B-Instruct},
  author={NoesisLab},
  year={2026},
  url={https://huggingface.co/NoesisLab/Kai-30B-Instruct}
}
```

License

Apache 2.0
