Zen LM
Frontier AI models for code, reasoning, vision, video, audio, 3D, and agentic workflows
55 models across 10 modalities. Production API models from 4B to 1T+ parameters. Open weights on HuggingFace. OpenAI-compatible API. Built by Hanzo AI (Techstars '17).
Flagship Models
Three tiers — from efficient edge to trillion-parameter frontier scale
zen4-max
Maximum Intelligence · 1M ctx
Most capable model for complex reasoning, analysis, and agentic coding. 1M token context.
zen4
744B MoE · 40B active · 202K ctx
Flagship MoE for complex reasoning and multi-domain tasks. Optimal intelligence-to-speed ratio.
zen4-coder
480B MoE · 35B active · 262K ctx
Code-specialized MoE model for generation, review, debugging, and agentic programming.
API Pricing
Pay-as-you-go. $5 free credit on signup. No minimum commitment.
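Pay-as-you-go billing is simple arithmetic over token counts. A minimal sketch, assuming per-million-token (MTok) input/output rates; the $0.15/$0.60 rates below are placeholders for illustration, not published Zen LM prices:

```python
# Sketch: pay-as-you-go cost for a single request.
# The per-MTok rates passed in are placeholders, not actual Zen LM prices.
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_mtok: float, output_rate_per_mtok: float) -> float:
    """Return the USD cost of one request at the given per-million-token rates."""
    return (input_tokens * input_rate_per_mtok +
            output_tokens * output_rate_per_mtok) / 1_000_000

# Example: 12,000 input + 800 output tokens at hypothetical $0.15 / $0.60 per MTok.
print(request_cost(12_000, 800, 0.15, 0.60))  # 0.00228
```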
Quick Start
Install the Hanzo SDK — supports OpenAI and Claude-style endpoints, plus 100+ providers
Python

pip install hanzoai

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

response = client.chat.completions.create(
    model="zen4",
    messages=[
        {"role": "user", "content": "Hello, Zen."}
    ],
)
print(response.choices[0].message.content)

TypeScript

npm install hanzoai

import Hanzo from "hanzoai";

const client = new Hanzo({
  apiKey: "hk-your-api-key",
});

const response = await client.chat.completions.create({
  model: "zen4-coder",
  messages: [
    { role: "user", content: "Write a React hook" }
  ],
});
console.log(response.choices[0].message.content);

cURL

curl https://api.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer hk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen4",
    "messages": [
      {"role": "user", "content": "Hello, Zen."}
    ]
  }'

Streaming (Python)

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

stream = client.chat.completions.create(
    model="zen4-max",
    messages=[
        {"role": "user", "content": "Explain MoE"}
    ],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

SDK Features
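Agentic workflows run through tool calling. Assuming the API follows the OpenAI-compatible tools schema (the `get_weather` tool below is illustrative, not part of any Hanzo SDK), a request body can be built like this:

```python
import json

# Sketch of an OpenAI-compatible tool-calling request body.
# "get_weather" is a hypothetical tool used only for illustration.
def build_tool_request(model: str, user_message: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

body = build_tool_request("zen4", "What's the weather in Tokyo?")
print(json.dumps(body["tools"][0]["function"]["name"]))  # "get_weather"
```

The same dict can be passed to `client.chat.completions.create(**body)` or posted directly to `/v1/chat/completions`.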
Full Model Library
49 models across 10 categories — text, code, vision, video, audio, 3D, math, safety, embeddings, and agents
Zen 5
Next-generation agentic models with native chain-of-thought.
Next-generation agentic frontier model trained on 10B+ tokens of real-world tool use, multi-step reasoning, and production workflows. 1M+ token context with native chain-of-thought.
High-throughput agentic model for demanding production workloads. Trained on real-world development patterns with deep chain-of-thought reasoning.
Maximum context agentic model for document-scale analysis. Trained on 10B+ tokens of real-world workflows with extended chain-of-thought.
Deepest reasoning model in the Zen family. Multi-pass chain-of-thought with self-verification.
Efficient agentic model delivering zen5-class intelligence at a fraction of the cost.
Zen 4
Latest generation production models with MoDE architecture.
Most capable model for complex reasoning, analysis, and agentic tasks. 1M token context window.
High-performance 1M context model for long-document analysis, large codebase reasoning, and agentic workflows. Best balance of intelligence and cost at million-token scale.
Flagship MoE model for complex reasoning and multi-domain tasks.
Maximum reasoning capability with extended chain-of-thought on MoE architecture.
Efficient MoE model for demanding workloads with strong reasoning at production-grade cost.
Dedicated reasoning model with explicit chain-of-thought capabilities.
Ultra-fast lightweight model optimized for speed and cost efficiency. Ideal for free tier.
Code
Specialized models for code generation, review, and debugging.
Code-specialized MoE model for generation, review, debugging, and agentic programming.
Lightweight code model optimized for speed and inline completions.
Full-precision BF16 code model for maximum accuracy on complex codebases.
Zen 3 Multimodal
Vision, safety, and multimodal chat models.
Multimodal model supporting text, vision, audio, and structured output.
Vision-language model for image understanding and visual reasoning.
Ultra-lightweight model for edge deployment and low-latency tasks. Available on free tier.
Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.
Embedding & Retrieval
Text embeddings and search reranking via API.
High-quality text embeddings for RAG, search, and classification.
Lightweight embedding model for high-throughput, low-cost applications.
High-quality reranker for improving retrieval accuracy in RAG pipelines.
Balanced reranker for cost-effective retrieval quality improvement.
Lightweight reranker for high-throughput reranking at minimal cost.
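Embeddings and rerankers typically back a retrieve-then-rerank pipeline: embed the corpus, rank candidates by cosine similarity to the query embedding, then pass the top hits to a reranker. The first stage can be sketched in pure Python; the 3-dimensional vectors below are toy values, not real model output (a real pipeline would get them from the embeddings API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-dim "embeddings" standing in for embeddings-API output.
query = [0.9, 0.1, 0.0]
docs = {"doc_a": [0.8, 0.2, 0.1], "doc_b": [0.0, 0.9, 0.4]}

# Rank candidate documents by similarity to the query; a reranker
# model would then rescore this shortlist for higher precision.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # doc_a
```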
Image Generation
Text-to-image generation via API.
Audio & Speech
Speech-to-text, text-to-speech, and streaming ASR.
Real-time streaming speech recognition for live transcription and voice agents.
Low-latency text-to-speech for real-time voice agents and interactive applications.
Foundation
General-purpose open-weight models from 0.6B to 235B parameters.
Vision (Open Weights)
Vision-language and multimodal open-weight models.
Multi-modal vision-language model for image understanding.
Hypermodal model combining text, vision, audio, and code.
Safety
Content moderation and safety guardrail models.
Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.
10 Modalities
One model family covering every AI capability
Text
14 models
Chat, reasoning, analysis
Code
9 models
Generation, review, debugging
Vision
5 models
Understanding, generation, editing
Video
4 models
Generation, understanding, I2V
Audio
7 models
Speech, music, translation
3D
2 models
Generation, world simulation
Safety
3 models
Moderation, guardrails
Embedding
2 models
Search, retrieval, RAG
Agents
1 model
Tool use, planning
Math
6 models
Reasoning, proof, computation
Architecture
Zen MoDE — curating the best open-source foundations and fusing them into a unified, high-performance family
Consumer Line
Dense and MoE models from 4B to 80B. Edge-deployable dense models and efficient MoE flagships with only 3B active parameters.
Coder Line
Code-specialized models trained on 8.47B tokens of real agentic programming data. Fast completions to full-precision code intelligence.
Ultra Line
Trillion-parameter MoE models for cloud deployment. 1.04T parameters with 32B active for frontier-scale reasoning.
Efficient MoE
Mixture-of-Experts delivers frontier performance with only 3B active parameters — runs on consumer hardware.
Long Context
Up to 262K context on code models, 256K on frontier models. Dense models support 32–40K for efficient local inference.
Zen MoDE
Mixture of Distilled Experts — curating the best open-source foundations and fusing them into a unified model family.
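The efficiency claim rests on sparse routing: per token, only the shared parameters plus the top-k selected experts are active, so compute scales with the active count rather than total size. A back-of-envelope sketch, with illustrative expert counts and sizes rather than Zen's actual configuration:

```python
# Back-of-envelope MoE parameter accounting. All sizes in billions.
# The config below (1B shared, 64 experts of 0.25B, top-2 routing)
# is illustrative, not Zen's actual architecture.
def active_params(shared: float, expert_size: float, top_k: int) -> float:
    """Parameters touched per token in a top-k routed MoE."""
    return shared + top_k * expert_size

total = 1.0 + 64 * 0.25                 # 17.0B stored on disk / in memory
active = active_params(1.0, 0.25, 2)    # 1.5B active per token
print(total, active)  # 17.0 1.5
```

This is why a model can be large in total parameters yet cheap per token: inference cost tracks the active figure, not the stored one.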
Zen Agentic Dataset
8.47 billion tokens of real-world agentic programming — not synthetic data
Git History
4.03B tokens
Agentic Debug Sessions
2.42B tokens
Architecture Discussions
1.14B tokens
Code Review Sessions
0.86B tokens
Open Weights
Download, self-host, and fine-tune — multiple formats for every platform
SafeTensors
Full precision for HuggingFace Transformers
GGUF
Quantized for llama.cpp / Ollama
MLX
Apple Silicon optimized
ONNX
Cross-platform inference
Hanzo Ecosystem
Zen models power the entire Hanzo AI platform
Hanzo Chat
Chat with all 14 Zen models plus 100+ third-party models. MCP tools, file uploads, and persistent conversations.
Try Hanzo Chat
Hanzo Cloud
Managed inference API. Console, billing, usage analytics, API key management.
Open Console
Hanzo MCP
260+ Model Context Protocol tools. Connect Zen models to your codebase, browser, filesystem, and APIs.
Explore MCP
Hanzo Dev
AI coding agent powered by Zen Coder. Code generation, debugging, and refactoring in your IDE.
Get Hanzo Dev
LLM Gateway
Unified proxy for 100+ LLM providers. Load balancing, caching, rate limiting, and observability.
View Docs
Hanzo Industries
Enterprise AI and defense. Custom model training, dedicated infrastructure, and compliance.
Learn More
Research
130+ technical papers across AI, cryptography, and consensus
Build with Zen LM
49 models. 10 modalities. Open weights. From $0.15/MTok.