Open-source · Apache 2.0 · 55 models

Zen AI

A family of open-source AI models you can chat with for free — or download and run privately on your own machine.

No account needed to chat · Open weights on HuggingFace · Built by Hanzo AI

Chat with Zen — no signup required

Try any Zen model instantly on Hanzo Chat. Free tier includes access to zen4, zen4-mini, zen3-nano and more.

Run locally — 100% private

Download open-weight Zen models and run them on your own hardware. Works with Ollama, LM Studio, llama.cpp, and Transformers.

2. Run Zen Nano (with Ollama installed)

ollama run hf.co/zenlm/zen-nano-0.6b

Ollama will download the model from HuggingFace automatically on first run. Requires ~1 GB RAM.
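
Once the model has been pulled, Ollama also serves it over a local HTTP API, so the same model can be called from a script. The following is a minimal sketch, assuming Ollama's default address (http://localhost:11434) and its /api/chat endpoint; the helper names build_chat_request and chat are illustrative and not part of any Zen tooling.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address (assumed unchanged)
MODEL = "hf.co/zenlm/zen-nano-0.6b"             # same name as in the `ollama run` command


def build_chat_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming chat request for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["message"]["content"]
```

With the Ollama server running, print(chat("Hello!")) prints the model's reply. Nothing leaves the machine, since the request goes to localhost.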

Free to chat

Try every model instantly on Hanzo Chat. No credit card, no signup. Upgrade for higher limits and API access.

Run it yourself

Apache 2.0 licensed weights on HuggingFace. Download GGUF for Ollama and LM Studio, or full precision for training.

Build with the API

OpenAI-compatible API at api.hanzo.ai, from $0.15 per 1M tokens. Drop-in replacement for GPT-4 or Claude.

Open weight models

Download and run — Apache 2.0

Zen Nano

0.6B · Dense · 32K ctx

ollama run hf.co/zenlm/zen-nano-0.6b

Zen Eco

4B · Dense · 32K ctx

ollama run hf.co/zenlm/zen-eco-4b

Zen

8–32B · Dense · 32K ctx

ollama run hf.co/zenlm/zen-8b

Zen Pro

32B · Dense · 32K ctx

ollama run hf.co/zenlm/zen-pro-32b

Zen Max

235B (22B active) · MoE · 131K ctx

ollama run hf.co/zenlm/zen-max

Zen Coder

32B · Dense · 131K ctx

ollama run hf.co/zenlm/zen-coder
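
As a rough guide to which of these models fits on a given machine, weight memory is approximately parameter count times bytes per weight. This is a back-of-the-envelope sketch: the 4-bit and 16-bit widths are common precision levels chosen for illustration, not published Zen figures, and the estimate ignores the KV cache and runtime overhead.

```python
# Parameter counts (in billions) taken from the list above.
MODELS = {
    "zen-nano": 0.6,
    "zen-eco": 4,
    "zen (8B variant)": 8,
    "zen-pro": 32,
    "zen-max": 235,  # MoE: all 235B weights need to be stored, even though 22B are active per token
    "zen-coder": 32,
}


def weight_gb(params_billion, bits_per_weight):
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


for name, size in MODELS.items():
    print(f"{name:18s} 4-bit ~{weight_gb(size, 4):6.1f} GB   16-bit ~{weight_gb(size, 16):6.1f} GB")
```

For zen-nano this gives about 0.3 GB at 4-bit and 1.2 GB at 16-bit, the same order of magnitude as the ~1 GB RAM figure given earlier.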

Developer API

OpenAI-compatible. Drop-in replacement.

Python
pip install hanzoai

from hanzoai import Hanzo

client = Hanzo(api_key="hk-...")
r = client.chat.completions.create(
    model="zen4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(r.choices[0].message.content)

curl
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer hk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen4",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
Get API key · API reference · From $0.15 / 1M tokens
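
Because the endpoint is described as OpenAI-compatible, code written against the official openai Python package should in principle only need a different base URL and key. This is a hedged sketch: the base URL is inferred from the curl example, the HANZO_API_KEY environment variable name is an assumption made for illustration, and compatibility beyond chat completions is not confirmed here.

```python
import os

HANZO_BASE_URL = "https://api.hanzo.ai/v1"  # inferred from the curl URL, minus /chat/completions


def client_kwargs(env=None):
    """Settings that point an OpenAI-style client at the Hanzo endpoint."""
    env = os.environ if env is None else env
    return {
        "base_url": HANZO_BASE_URL,
        "api_key": env.get("HANZO_API_KEY", ""),  # hypothetical variable name
    }


def ask(prompt, model="zen4"):
    """Send one user message and return the reply text."""
    from openai import OpenAI  # third-party: pip install openai

    client = OpenAI(**client_kwargs())
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content
```

Keeping the connection settings in one helper means an existing application can switch providers by changing only the base URL and key, leaving its request code untouched.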

All 55 models

Text, code, vision, audio, image, 3D, safety, embeddings, and agents

Showing 53 models · 2 legacy/upcoming hidden

Zen 5

Next-generation agentic models with native chain-of-thought.

5 models
zen5 · EARLY ACCESS
MoDE + CoT · 1.0M ctx

Next-generation agentic frontier model trained on 10B+ tokens of real-world tool use, multi-step reasoning, and production workflows. 1M+ token context with native chain-of-thought.

zen5-pro · EARLY ACCESS
MoDE + CoT · 524K ctx

High-throughput agentic model for demanding production workloads. Trained on real-world development patterns with deep chain-of-thought reasoning.

zen5-max · EARLY ACCESS
MoDE + CoT · 2.1M ctx

Maximum context agentic model for document-scale analysis. Trained on 10B+ tokens of real-world workflows with extended chain-of-thought.

zen5-ultra · EARLY ACCESS
MoDE + Deep CoT · 1.0M ctx

Deepest reasoning model in the Zen family. Multi-pass chain-of-thought with self-verification.

zen5-mini · EARLY ACCESS
MoDE + CoT · 262K ctx

Efficient agentic model delivering zen5-class intelligence at a fraction of the cost.

Zen 4

Latest generation production models with MoDE architecture.

7 models
zen4-max
Dense · 1M ctx

Most capable model for complex reasoning, analysis, and agentic tasks. 1M token context window.

zen4.1
Dense · 1M ctx

High-performance 1M context model for long-document analysis, large codebase reasoning, and agentic workflows. Best balance of intelligence and cost at million-token scale.

zen4
744B (40B active) MoE · 202K ctx

Flagship MoE model for complex reasoning and multi-domain tasks.

zen4-ultra
744B (40B active) MoE + CoT · 262K ctx

Maximum reasoning capability with extended chain-of-thought on MoE architecture.

zen4-pro
80B (3B active) MoE · 131K ctx

Efficient MoE model for demanding workloads with strong reasoning at production-grade cost.

zen4-thinking
80B (3B active) MoE + CoT · 131K ctx

Dedicated reasoning model with explicit chain-of-thought capabilities.

zen4-mini
Dense · 128K ctx

Ultra-fast lightweight model optimized for speed and cost efficiency. Ideal for the free tier.
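
The "(N active)" figures describe mixture-of-experts models, where only a subset of the parameters is used for each token. The ratio gives a rough sense of per-token compute relative to a dense model of the same total size. Here is a small worked example using the numbers listed above; treating compute as proportional to active parameters is a simplification.

```python
# (total, active) parameter counts in billions, from the Zen 4 entries above.
MOE_MODELS = {
    "zen4": (744, 40),
    "zen4-ultra": (744, 40),
    "zen4-pro": (80, 3),
    "zen4-thinking": (80, 3),
}


def active_fraction(total_b, active_b):
    """Share of parameters used per token."""
    return active_b / total_b


for name, (total, active) in MOE_MODELS.items():
    print(f"{name:14s} {active}B of {total}B active = {active_fraction(total, active):.1%}")
```

By this measure zen4 uses roughly 5.4% of its parameters per token. The full parameter set still has to be held in memory, so the active count speaks to speed and cost per token rather than to hardware footprint.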

Code

Specialized models for code generation, review, and debugging.

6 models
zen4-coder
480B (35B active) MoE · 163K ctx

Code-specialized MoE model for generation, review, debugging, and agentic programming.

zen4-coder-flash
30B (3B active) MoE · 262K ctx

Lightweight code model optimized for speed and inline completions.

zen4-coder-pro
480B Dense BF16 · 131K ctx

Full-precision BF16 code model for maximum accuracy on complex codebases.

zen-coder
32B Dense · 131K ctx

Baseline code model for generation and completions.

zen-coder-flash
7B Dense · 32K ctx

Fast code model for inline completions and suggestions.

zen-code
14B Dense · 32K ctx

Legacy code model (superseded by Zen4 Coder series).

Zen 3

Previous generation API models — language, vision, multimodal, and safety.

4 models
zen3-omni
~200B Dense Multimodal · 202K ctx

Multimodal model supporting text, vision, audio, and structured output.

zen3-vl
30B (3B active) MoE Vision-Language · 262K ctx

Vision-language model for image understanding and visual reasoning.

zen3-nano
8B Dense · 128K ctx

Ultra-lightweight model for edge deployment and low-latency tasks. Available on free tier.

zen3-guard
4B Dense · 65K ctx

Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.

Embedding & Retrieval

Text embeddings and search reranking via API.

8 models
zen3-embedding
3072 dimensions Embedding · 8K ctx

High-quality text embeddings for RAG, search, and classification.

zen3-embedding-medium
4B Embedding · 40K ctx

Balanced embedding model for cost-effective retrieval workloads.

zen3-embedding-small
0.6B Embedding · 32K ctx

Lightweight embedding model for high-throughput, low-cost applications.

zen3-reranker
8B Reranker · 40K ctx

High-quality reranker for improving retrieval accuracy in RAG pipelines.

zen3-reranker-medium
4B Reranker · 40K ctx

Balanced reranker for cost-effective retrieval quality improvement.

zen3-reranker-small
0.6B Reranker · 40K ctx

Lightweight reranker for high-throughput reranking at minimal cost.

zen-embedding
3072 dimensions Embedding · 8K ctx

Foundation embedding model for search and retrieval.

zen-reranker
568M Reranker · 8K ctx

Cross-encoder reranker for search result quality.
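
Embedding models return one vector per text, and retrieval typically ranks documents by cosine similarity to the query vector; a reranker can then reorder the top hits. Below is a minimal sketch of that ranking step in plain Python. The 3-dimensional vectors are made-up stand-ins for real embedding output (the listing gives 3072 dimensions for zen3-embedding), and no API call is shown because the embeddings endpoint is not documented here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )


# Toy vectors, invented for illustration only.
docs = {
    "install-guide": [0.9, 0.1, 0.0],
    "pricing": [0.0, 0.2, 0.9],
    "quickstart": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]
print(rank(query, docs))
```

In a RAG pipeline the top few ids from rank would be the candidates passed on to a reranker or directly into the prompt.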

Image Generation

Text-to-image generation via API.

8 models
zen3-image
Diffusion

Best general-purpose image generation.

zen3-image-max
Diffusion

Maximum quality image generation for professional creative work.

zen3-image-dev
Diffusion

Development model for experimentation and iteration.

zen3-image-fast
Diffusion

Fastest image model for real-time generation.

zen3-image-sdxl
Diffusion

High-resolution image generation at 1024px.

zen3-image-playground
Diffusion

Aesthetic model for artistic image generation.

zen3-image-ssd
1B Diffusion

Fastest diffusion model for real-time generation.

zen3-image-jp
Diffusion

Japanese-specialized image generation model.

Audio & Speech

Speech-to-text, text-to-speech, and streaming ASR.

7 models
zen3-audio
1.5B ASR

Best quality speech-to-text transcription. 100+ languages.

zen3-audio-fast
809M ASR

Fastest speech-to-text transcription for high-throughput workloads.

zen3-asr
Streaming ASR

Real-time streaming speech recognition for live transcription and voice agents.

zen3-asr-v1
Streaming ASR

First-generation streaming ASR for legacy compatibility.

zen3-tts
82M TTS

High-quality text-to-speech with natural prosody. 40+ voices, 8 languages.

zen3-tts-hd
TTS HD

Maximum fidelity text-to-speech for broadcast-quality audio production.

zen3-tts-fast
82M TTS

Low-latency text-to-speech for real-time voice agents and interactive applications.

Foundation

General-purpose open-weight models from 0.6B to 235B parameters.

5 models
zen-nano
0.6B Dense · 32K ctx

Ultra-lightweight LLM for edge and mobile deployment.

zen-eco
4B Dense · 32K ctx

Efficient 4B model for general-purpose tasks.

zen
8–32B Dense · 32K ctx

Standard model available in 8B and 32B variants.

zen-pro
32B Dense · 32K ctx

Professional-grade 32B dense model for demanding workloads.

zen-max
235B (22B active) MoE · 131K ctx

High-capability MoE model with 235B parameters.

Vision (Open Weights)

Vision-language and multimodal open-weight models.

2 models
zen-vl
32B Dense Multimodal · 32K ctx

Multi-modal vision-language model for image understanding.

zen-omni
72B Dense Multimodal · 131K ctx

Hypermodal model combining text, vision, audio, and code.

Safety

Content moderation and safety guardrail models.

2 models
zen3-guard
4B Dense · 65K ctx

Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.

zen-guard
8B Dense · 32K ctx

Content safety and moderation classifier.

Ready to start?

Chat for free, download weights, or build with the API.

Research & Writing

Papers, technical reports, and updates from the Zen LM team