# Zen4 Pro Max
Zen4 Pro Max is an 80B-parameter Mixture-of-Experts (MoE) language model with 3B active parameters per token, from the Zen4 family by Zen LM and Hanzo AI.

Positioned as the flagship consumer model in the family, it uses a hybrid Gated DeltaNet + Gated Attention + MoE architecture that runs at just 3B active parameters, and is built on abliterated (uncensored) weights from Qwen3-Next-80B-A3B-Instruct.
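The "3B active" figure comes from MoE routing: a router scores all experts per token and only the top-k participate in the forward pass. The sketch below illustrates top-k gating in plain Python; the expert count of 512 is from the table below, while `k=8` and the routing details are assumptions for illustration, not the actual Qwen3-Next router.

```python
# Illustrative top-k MoE routing (NOT the actual Qwen3-Next router;
# k=8 is an assumed value for demonstration).
import math
import random


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def route_top_k(router_logits, k):
    """Keep the k highest-scoring experts and renormalize their gates."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    z = sum(probs[i] for i in top)
    return {i: probs[i] / z for i in top}  # expert index -> mixing weight


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(512)]  # 512 experts, per the table
gates = route_top_k(logits, k=8)
print(len(gates), round(sum(gates.values()), 6))  # → 8 1.0
```

Only the selected experts' weights enter the computation for that token, which is why per-token cost tracks the 3B active parameters rather than the full 80B.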
## Model Details
| Property | Value |
|---|---|
| Parameters | 80B total, 3B active |
| Context | 256K tokens |
| Base | Qwen3-Next-80B-A3B-Instruct (abliterated) |
| Architecture | Hybrid Gated DeltaNet + Gated Attention + MoE, 512 experts |
| License | Apache 2.0 |
| Family | Zen4 |
| Creator | Zen LM / Hanzo AI |
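One practical implication of the 80B-total / 3B-active split: all expert weights must still fit in memory (or be offloaded), even though each token only touches a small fraction of them. A rough back-of-envelope at bf16 (2 bytes/parameter), ignoring KV cache and activations:

```python
# Rough weight-memory estimate for an MoE model at bf16.
# All 80B parameters must be resident; ~3B are used per token.
def weight_gib(params_billions, bytes_per_param=2):
    return params_billions * 1e9 * bytes_per_param / 2**30


total = weight_gib(80)   # full model weights
active = weight_gib(3)   # weights touched per token
print(f"resident ~{total:.0f} GiB, per-token compute ~{active:.1f} GiB")
# → resident ~149 GiB, per-token compute ~5.6 GiB
```

So inference speed behaves more like a ~3B dense model, while memory requirements are those of an 80B model (unless quantized or offloaded).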
## Zen4 Family
| Model | Params | Active | Context | HuggingFace |
|---|---|---|---|---|
| Zen4 Mini | 4B | 4B | 32K | zenlm/zen4-mini |
| Zen4 | 8B | 8B | 32K | zenlm/zen4 |
| Zen4 Pro | 14B | 14B | 32K | zenlm/zen4-pro |
| Zen4 Max | 30B MoE | 3B | 256K | zenlm/zen4-max |
| Zen4 Pro Max | 80B MoE | 3B | 256K | zenlm/zen4-pro-max |
| Zen4 Coder Flash | 31B MoE | 3B | 131K | zenlm/zen4-coder-flash |
| Zen4 Coder | 80B MoE | 3B | 256K | zenlm/zen4-coder |
| Zen4 Ultra | 1.04T MoE | 32B | 256K | zenlm/zen4-ultra |
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen4-pro-max",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen4-pro-max")

messages = [{"role": "user", "content": "Hello, who are you?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```