Models
All 14 Zen models -- capabilities, pricing, and recommended use cases
Available Models
List Models
GET https://api.hanzo.ai/v1/models
Returns all available Zen models.
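As a minimal sketch of calling this endpoint from Python, the helpers below build the GET request and pull model IDs out of the response. Only the URL comes from this page. The bearer-token `Authorization` header and the `{"data": [{"id": ...}]}` response shape are assumptions (a common convention for model-listing endpoints), and the function names are hypothetical, so check the API reference before relying on them.

```python
import json
import urllib.request

MODELS_URL = "https://api.hanzo.ai/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the model list.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    return urllib.request.Request(
        MODELS_URL,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Return model IDs from a decoded response body.

    Assumption: the body looks like {"data": [{"id": "zen4"}, ...]}.
    Entries without an "id" key are skipped.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(api_key: str) -> list[str]:
    """Fetch the model list over the network and return the model IDs."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_model_ids(json.load(response))
```

Calling `list_models("YOUR_API_KEY")` performs the network request. The two helper functions are kept separate so the parsing can be checked against a sample payload without network access.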
Zen4 Generation (9 models)
The latest generation. Flagship, reasoning, and code models.
| Model | Context | Architecture | Input $/1M | Output $/1M |
|---|---|---|---|---|
| zen4 | 202K | ~400B Dense | $3.00 | $9.60 |
| zen4-ultra | 202K | ~400B Dense + CoT | $3.00 | $9.60 |
| zen4-pro | 131K | 80B (3B active) MoE | $2.70 | $2.70 |
| zen4-max | 256K | 1.04T (32B active) MoE | $3.60 | $3.60 |
| zen4-mini | 40K | 8B Dense | $0.60 | $0.60 |
| zen4-thinking | 131K | 80B (3B active) MoE + CoT | $2.70 | $2.70 |
| zen4-coder | 262K | 480B (35B active) MoE | $3.60 | $3.60 |
| zen4-coder-pro | 262K | 480B Dense BF16 | $4.50 | $4.50 |
| zen4-coder-flash | 262K | Dense | $1.50 | $1.50 |
Zen3 Generation (5 models)
Multimodal, vision, safety, and embedding models.
| Model | Context | Architecture | Input $/1M | Output $/1M |
|---|---|---|---|---|
| zen3-omni | 202K | ~200B Dense Multimodal | $1.80 | $6.60 |
| zen3-vl | 131K | 30B (3B active) MoE VL | $0.45 | $1.80 |
| zen3-nano | 40K | 4B Dense | $0.30 | $0.30 |
| zen3-guard | 40K | 4B Dense | $0.30 | $0.30 |
| zen3-embedding | 8K | Embedding (3072 dim) | $0.39 | -- |
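The per-million-token prices in the tables above translate into a request cost as (input tokens × input price + output tokens × output price) / 1,000,000. The sketch below applies that arithmetic to a handful of models. The prices are copied from the tables, while the `PRICES` mapping and `estimate_cost` helper are hypothetical names for illustration. zen3-embedding is left out because it has no output price.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing tables.
PRICES = {
    "zen4": (Decimal("3.00"), Decimal("9.60")),
    "zen4-mini": (Decimal("0.60"), Decimal("0.60")),
    "zen4-coder": (Decimal("3.60"), Decimal("3.60")),
    "zen3-omni": (Decimal("1.80"), Decimal("6.60")),
    "zen3-vl": (Decimal("0.45"), Decimal("1.80")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


# 200K input tokens and 50K output tokens on zen4:
# 0.2 * $3.00 + 0.05 * $9.60 = $0.60 + $0.48 = $1.08
print(estimate_cost("zen4", 200_000, 50_000))
```

`Decimal` is used instead of floats so that prices such as $0.45 are represented exactly.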
Model Selection Guide
By Task
| Task | Recommended Model |
|---|---|
| General chat | zen4 |
| Maximum reasoning | zen4-ultra |
| Deep reasoning (CoT) | zen4-thinking |
| Code generation | zen4-coder |
| Fast code iteration | zen4-coder-flash |
| Premium code accuracy | zen4-coder-pro |
| Image understanding | zen3-vl |
| Multimodal (text+vision+audio) | zen3-omni |
| Content moderation | zen3-guard |
| Text embeddings | zen3-embedding |
| Edge / mobile | zen3-nano |
| Budget-friendly | zen4-mini |
| Extended context docs | zen4-max |
| High capability | zen4-pro |
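In application code, the task table above can be kept as a simple lookup. This is a sketch only: the model names come from the table, but the task keys and the `recommend_model` helper are hypothetical, and falling back to zen4 for unknown tasks is an assumption based on its listing under general chat.

```python
# Task key -> recommended model, mirroring the "By Task" table.
# The short task keys are invented for this example.
TASK_TO_MODEL = {
    "general-chat": "zen4",
    "maximum-reasoning": "zen4-ultra",
    "deep-reasoning": "zen4-thinking",
    "code-generation": "zen4-coder",
    "fast-code-iteration": "zen4-coder-flash",
    "premium-code-accuracy": "zen4-coder-pro",
    "image-understanding": "zen3-vl",
    "multimodal": "zen3-omni",
    "content-moderation": "zen3-guard",
    "text-embeddings": "zen3-embedding",
    "edge-mobile": "zen3-nano",
    "budget-friendly": "zen4-mini",
    "extended-context": "zen4-max",
    "high-capability": "zen4-pro",
}


def recommend_model(task: str, default: str = "zen4") -> str:
    """Return the recommended model for a task key, or the default if unknown."""
    return TASK_TO_MODEL.get(task.strip().lower(), default)


print(recommend_model("code-generation"))
print(recommend_model("something-else"))
```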
By Budget
| Budget | Recommended |
|---|---|
| Free tier ($5) | zen4-mini, zen3-nano |
| Low cost | zen4-mini, zen3-vl, zen4-coder-flash |
| Standard | zen4-pro, zen4-coder, zen4-thinking |
| Premium | zen4, zen4-ultra, zen4-coder-pro |
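The budget tiers can be combined with the context windows from the pricing tables to narrow the choice further. The sketch below filters a tier's models by a minimum context size. Tier membership and context sizes are taken from this page; the tier keys, the `candidates` helper, and the choice to compare context sizes in the "K" units the tables use are assumptions made for illustration.

```python
# Context window per model in thousands of tokens, as listed in the tables.
CONTEXT_K = {
    "zen4": 202,
    "zen4-ultra": 202,
    "zen4-pro": 131,
    "zen4-mini": 40,
    "zen4-thinking": 131,
    "zen4-coder": 262,
    "zen4-coder-pro": 262,
    "zen4-coder-flash": 262,
    "zen3-vl": 131,
    "zen3-nano": 40,
}

# Models per budget tier, mirroring the "By Budget" table.
BUDGET_TIERS = {
    "free": ["zen4-mini", "zen3-nano"],
    "low": ["zen4-mini", "zen3-vl", "zen4-coder-flash"],
    "standard": ["zen4-pro", "zen4-coder", "zen4-thinking"],
    "premium": ["zen4", "zen4-ultra", "zen4-coder-pro"],
}


def candidates(tier: str, min_context_k: int = 0) -> list[str]:
    """Return the tier's models whose context window is at least min_context_k."""
    return [m for m in BUDGET_TIERS[tier] if CONTEXT_K[m] >= min_context_k]


# Low-cost models that can hold a prompt of roughly 100K tokens.
print(candidates("low", min_context_k=100))
```

An empty result, such as the free tier with a 100K requirement, signals that the workload needs a higher tier or a shorter prompt.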
Open Weights
All Zen models are also available as open weights for self-hosting:
- HuggingFace: huggingface.co/zenlm
- Ollama: ollama run zen4
- Formats: SafeTensors, GGUF, MLX
The Hanzo Cloud API provides managed infrastructure, usage tracking, and pay-per-token billing, so you do not need to run your own GPUs.