Zen LM
Frontier AI models for code, reasoning, vision, video, audio, 3D, and agentic workflows
49 models across 10 modalities. 15 production API models from 4B to 1T+ parameters. Open weights on HuggingFace. OpenAI-compatible API. Built by Hanzo AI (Techstars '17).
Flagship Models
Three tiers — from efficient edge to trillion-parameter frontier scale
zen4-max
1.04T MoE · 32B active · 256K ctx
Trillion-parameter frontier model with deep reasoning. Our largest and most capable model.
zen4
744B MoE · 40B active · 202K ctx
Flagship MoE for complex reasoning and multi-domain tasks. Optimal intelligence-to-speed ratio.
zen4-coder
480B MoE · 35B active · 262K ctx
Code-specialized MoE model for generation, review, debugging, and agentic programming.
API Pricing
Pay-as-you-go. $5 free credit on signup. No minimum commitment.
Free credit on every new account
- All 15 API models
- OpenAI-compatible API
- 30-day expiry
Starting from — scale with usage
- Prepaid credits
- Real-time usage tracking
- No surprise bills
Volume pricing, SLAs, dedicated support
- Volume discounts
- Dedicated infrastructure
- SLA guarantees
Zen4 Generation — Production API
Zen3 Generation — Specialized Models
| Model | Architecture | Context | Tier | Input $/MTok | Output $/MTok |
|---|---|---|---|---|---|
| zen3-omni | ~200B Dense Multimodal | 202K | pro max | $1.80 | $6.60 |
| zen3-vl | 30B (3B active) MoE VL | 131K | pro max | $0.45 | $1.80 |
| zen3-nano | 4B Dense | 40K | pro | $0.30 | $0.30 |
| zen3-guard | 4B Dense | 40K | pro | $0.30 | $0.30 |
| zen3-embedding | Embedding (3072 dim) | 8K | pro max | $0.39 | — |
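Because the API is OpenAI-compatible, the per-million-token rates above can be combined with the usage object returned on each response to estimate spend. The sketch below assumes the response exposes standard prompt_tokens and completion_tokens fields (an assumption based on OpenAI compatibility); the rates are taken from the zen3 table.

# Rough per-request cost estimate, assuming the OpenAI-compatible response
# carries a standard usage object with prompt_tokens and completion_tokens.
# Rates are the zen3 per-million-token prices listed above (USD).
RATES = {
    "zen3-omni": (1.80, 6.60),
    "zen3-vl": (0.45, 1.80),
    "zen3-nano": (0.30, 0.30),
    "zen3-guard": (0.30, 0.30),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    input_rate, output_rate = RATES[model]
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000

# Example: 12,000 input tokens and 2,000 output tokens on zen3-omni
print(f"${estimate_cost('zen3-omni', 12_000, 2_000):.4f}")  # ≈ $0.0348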
Quick Start
Install the Hanzo SDK — supports OpenAI and Claude-style endpoints, plus 100+ providers
Python

pip install hanzoai

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

response = client.chat.completions.create(
    model="zen4",
    messages=[
        {"role": "user", "content": "Hello, Zen."}
    ],
)
print(response.choices[0].message.content)
TypeScript

npm install hanzoai

import Hanzo from "hanzoai";

const client = new Hanzo({
  apiKey: "hk-your-api-key",
});

const response = await client.chat.completions.create({
  model: "zen4-coder",
  messages: [
    { role: "user", content: "Write a React hook" }
  ],
});
console.log(response.choices[0].message.content);

cURL

curl https://api.hanzo.ai/v1/chat/completions \
-H "Authorization: Bearer hk-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "zen4",
"messages": [
{"role": "user", "content": "Hello, Zen."}
]
}'from hanzoai import Hanzo
Streaming (Python)

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

stream = client.chat.completions.create(
    model="zen4-max",
    messages=[
        {"role": "user", "content": "Explain MoE"}
    ],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

SDK Features
Full Model Library
49 models across 10 categories — text, code, vision, video, audio, 3D, safety, embeddings, agents, and math
Foundation
Core models from 0.6B edge to 1T+ frontier scale
Next-generation preview
Zen4 Generation
Latest generation — 10 production API models
Deep reasoning — chain-of-thought enabled
Code generation — 262K context for large codebases
Next-gen code generation — preview
Code
Specialized for generation, review, debugging, and agentic programming
Vision & Image
Image understanding, generation, editing, and design
Hypermodal — text, vision, audio, code in one model
High-resolution image generation, multiple styles
Edit-by-instruction, inpainting, outpainting
UI/UX design, graphics, mockups
Video
Video generation, understanding, and world modeling
Cinematic-quality text-to-video generation
Video analysis, frame-by-frame understanding
Animate images into video sequences
Spatial reasoning and world simulation
Audio & Speech
Music generation, voice synthesis, transcription, and translation
Multi-language transcription
100+ languages, context-aware translation
Multi-language voice dubbing and synthesis
Ultra-low latency real-time voice
Multi-instrument music composition
Text-to-SFX, foley art generation
Bidirectional real-time speech translation
3D & Spatial
3D asset generation and world simulation
Text-to-3D and image-to-3D asset generation
Spatial reasoning and environment simulation
Safety & Guardrails
Content moderation and safety classification
Safe generation with built-in guardrails
Low-latency streaming moderation
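zen3-guard is priced as a small chat model in the table above, so one plausible integration is to run it as a pre-filter through the same OpenAI-compatible chat endpoint. This is a minimal sketch: the prompt wording and the safe/unsafe verdict parsing are assumptions for illustration, not the documented zen3-guard schema.

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

def moderate(text: str) -> bool:
    """Classify user input with zen3-guard before passing it to a larger model.

    Assumption: the guard model replies with a short verdict such as "safe"
    or "unsafe"; adjust the parsing to the model's actual output format.
    """
    response = client.chat.completions.create(
        model="zen3-guard",
        messages=[{"role": "user",
                   "content": f"Classify the following content as safe or unsafe:\n{text}"}],
    )
    verdict = response.choices[0].message.content.strip().lower()
    return verdict.startswith("safe")

if moderate("How do I bake sourdough bread?"):
    print("Input passed moderation")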
Embedding & Retrieval
Text embeddings and search reranking
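The pricing table lists zen3-embedding as an 8K-context model producing 3072-dimension vectors. A minimal retrieval sketch, assuming the SDK exposes an OpenAI-style client.embeddings.create method (an assumption, not confirmed Hanzo documentation), looks like this:

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Assumed OpenAI-style embeddings call; zen3-embedding returns 3072-dim vectors.
docs = ["Zen models ship with open weights.", "The API is OpenAI-compatible."]
result = client.embeddings.create(model="zen3-embedding", input=docs)
vectors = [item.embedding for item in result.data]

def cosine(a, b):
    """Plain-Python cosine similarity, enough for ranking a handful of documents."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

query = client.embeddings.create(model="zen3-embedding", input=["open weights"]).data[0].embedding
best = max(range(len(docs)), key=lambda i: cosine(query, vectors[i]))
print(docs[best])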
Agents
Agentic AI with tool use and multi-step planning
Agentic AI — tool use, multi-step planning, autonomous execution
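In OpenAI-compatible APIs, tool use is usually expressed through a tools parameter on chat completions. The sketch below assumes Zen's endpoint accepts that same function-calling schema; the weather tool and the dispatch check are purely illustrative.

import json
from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# One illustrative tool, declared in the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="zen4",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# If the model decided to call the tool, its arguments arrive as a JSON string.
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))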
10 Modalities
One model family covering every AI capability
Text
14 models
Chat, reasoning, analysis
Code
9 models
Generation, review, debugging
Vision
5 models
Understanding, generation, editing
Video
4 models
Generation, understanding, I2V
Audio
7 models
Speech, music, translation
3D
2 models
Generation, world simulation
Safety
3 models
Moderation, guardrails
Embedding
2 models
Search, retrieval, RAG
Agents
1 model
Tool use, planning
Math
6 models
Reasoning, proof, computation
Architecture
Zen MoDE — curating the best open-source foundations and fusing them into a unified, high-performance family
Consumer Line
Dense and MoE models from 4B to 80B. Edge-deployable dense models and efficient MoE flagships with only 3B active parameters.
Coder Line
Code-specialized models trained on 8.47B tokens of real agentic programming data. From fast completions to full-precision code intelligence.
Ultra Line
Trillion-parameter MoE models for cloud deployment. 1.04T parameters with 32B active for frontier-scale reasoning.
Efficient MoE
Mixture-of-Experts delivers frontier performance with only 3B active parameters — runs on consumer hardware.
Long Context
Up to 262K context on code models, 256K on frontier models. Dense models support 32–40K for efficient local inference.
Zen MoDE
Mixture of Distilled Experts — curating the best open-source foundations and fusing them into a unified model family.
Zen Agentic Dataset
8.47 billion tokens of real-world agentic programming — not synthetic data
Git History
4.03B tokens
Agentic Debug Sessions
2.42B tokens
Architecture Discussions
1.14B tokens
Code Review Sessions
0.86B tokens
Open Weights
Download, self-host, and fine-tune — multiple formats for every platform
SafeTensors
Full precision for HuggingFace Transformers
GGUF
Quantized for llama.cpp / Ollama
MLX
Apple Silicon optimized
ONNX
Cross-platform inference
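For the SafeTensors releases, loading through HuggingFace Transformers follows the usual AutoModelForCausalLM pattern. The repository id below is a placeholder, not a confirmed model id; substitute the actual name from the Hanzo organization on HuggingFace.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; substitute the actual Zen model id on HuggingFace.
model_id = "hanzo-ai/zen3-nano"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, Zen.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))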
Hanzo Ecosystem
Zen models power the entire Hanzo AI platform
Hanzo Chat
Chat with all 14 Zen models plus 100+ third-party models. MCP tools, file uploads, and persistent conversations.
Try Hanzo Chat
Hanzo Cloud
Managed inference API. Console, billing, usage analytics, API key management.
Open Console
Hanzo MCP
260+ Model Context Protocol tools. Connect Zen models to your codebase, browser, filesystem, and APIs.
Explore MCP
Hanzo Dev
AI coding agent powered by Zen Coder. Code generation, debugging, and refactoring in your IDE.
Get Hanzo Dev
LLM Gateway
Unified proxy for 100+ LLM providers. Load balancing, caching, rate limiting, and observability.
View Docs
Hanzo Industries
Enterprise AI and defense. Custom model training, dedicated infrastructure, and compliance.
Learn More
Research
130+ technical papers across AI, cryptography, and consensus
Build with Zen LM
49 models. 10 modalities. Open weights. From $0.30/MTok.