Zen Omni 30B Thinking

An advanced multimodal model from the Zen family that combines chain-of-thought reasoning ("thinking") with audio understanding.

Model Details

  • Architecture: Qwen2-based with multimodal extensions (the hosted config can be checked as sketched after this list)
  • Parameters: 31.7B
  • Context Length: 32,768 tokens
  • Modalities: Text, Audio, Thinking
  • Hidden Size: 5,120
  • Layers: 64
  • Attention Heads: 40
  • Developer: Hanzo AI
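
These dimensions can be checked against the hosted config. A minimal sketch, assuming a flat Qwen2-style config; multimodal configs may nest the text fields under a sub-config, and a custom architecture may need trust_remote_code=True:

from transformers import AutoConfig

# May require trust_remote_code=True if the architecture is custom.
config = AutoConfig.from_pretrained("zenlm/zen-omni-30b-thinking")
cfg = config.to_dict()

# Print the fields that correspond to the table above; a None here means the
# value likely lives in a nested sub-config (inspect cfg directly in that case).
for key in ("hidden_size", "num_hidden_layers",
            "num_attention_heads", "max_position_embeddings"):
    print(key, cfg.get(key))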

Features

  • Thinking Module: Chain-of-thought reasoning capabilities (see the parsing sketch after this list)
  • Audio Tower: Audio processing and understanding
  • Multimodal Integration: Unified handling of text and audio in a single model
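
The delimiters used by the thinking module are not documented in this card. The sketch below is hypothetical: it assumes the reasoning is wrapped in <think>...</think> tags, as in several other thinking-style models, and shows one way to split the reasoning trace from the final answer. Check the tokenizer's chat template and special tokens for the actual markers.

import re

# Hypothetical: assumes reasoning is emitted between <think> and </think> tags.
def split_thinking(text):
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return None, text.strip()           # no thinking block emitted
    reasoning = match.group(1).strip()      # chain-of-thought content
    answer = text[match.end():].strip()     # everything after the thinking block
    return reasoning, answer

reasoning, answer = split_thinking("<think>2 + 2 = 4.</think> The answer is 4.")
print(reasoning)  # 2 + 2 = 4.
print(answer)     # The answer is 4.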

Usage

PyTorch

from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" requires the accelerate package; torch_dtype="auto" keeps
# the checkpoint's BF16 weights instead of upcasting to FP32.
model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen-omni-30b-thinking",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-omni-30b-thinking")

# Generate text
prompt = "Explain quantum computing"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
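
For instruction-style prompts, the tokenizer's chat template is the usual entry point. The sketch below assumes the repo ships such a template and reuses the model and tokenizer loaded above:

# Sketch of chat-style generation; assumes the tokenizer ships a chat template.
messages = [{"role": "user", "content": "Explain quantum computing"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,      # append the assistant turn header
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))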

Available Formats

  • PyTorch: Default safetensors format (16 shards)
  • GGUF: Coming soon
  • MLX: Coming soon

Hardware Requirements

  • VRAM: ~64 GB for full BF16 precision (see the estimate after this list)
  • RAM: 128 GB of system memory recommended
  • Storage: ~60 GB for the model files
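
The VRAM figure is roughly the BF16 weight footprint alone; the KV cache and activations add overhead on top. A quick back-of-the-envelope check:

params = 31.7e9                  # parameter count listed above
bytes_per_param = 2              # BF16 stores each parameter in 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.1f} GB")    # -> 63.4 GB for the weights alone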

Training

Fine-tuning focused on:

  • Zen identity
  • Multimodal understanding
  • Chain-of-thought reasoning
  • Audio processing capabilities

Model Components

  • thinker.*: Thinking/reasoning module
  • audio_tower.*: Audio processing layers
  • Standard transformer layers for text generation (see the inspection sketch after this list)
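
A minimal sketch for confirming which tensors belong to each component without downloading the full weights, assuming the repo uses the standard sharded-safetensors index file name:

import json
from collections import Counter
from huggingface_hub import hf_hub_download

# Download only the shard index (a small JSON file), not the weights.
index_path = hf_hub_download("zenlm/zen-omni-30b-thinking",
                             "model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Count tensors per top-level prefix, e.g. thinker, audio_tower, ...
print(Counter(name.split(".")[0] for name in weight_map))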

License

Apache 2.0
