Zen MoDE — Mixture of Distilled Experts

Zen LM

Frontier AI models for code, reasoning, vision, video, audio, 3D, and agentic workflows

49 models across 10 modalities. 15 production API models from 4B to 1T+ parameters. Open weights on HuggingFace. OpenAI-compatible API. Built by Hanzo AI (Techstars '17).

  • 49 models
  • 1T+ max parameters
  • 262K max context
  • 10 modalities
  • From $0.30/MTok

Flagship Models

Three tiers — from efficient edge to trillion-parameter frontier scale

API Pricing

Pay-as-you-go. $5 free credit on signup. No minimum commitment.

FREE TIER
$5

Free credit on every new account

  • All 15 API models
  • OpenAI-compatible API
  • 30-day expiry
PAY AS YOU GO
$0.30/MTok

Starting from — scale with usage

  • Prepaid credits
  • Real-time usage tracking
  • No surprise bills
ENTERPRISE
Custom

Volume pricing, SLAs, dedicated support

  • Volume discounts
  • Dedicated infrastructure
  • SLA guarantees

Zen4 Generation — Production API

Model             Architecture                  Context  Tier       Input $/MTok  Output $/MTok
zen4-max          1.04T (32B active) MoE        256K     ultra max  $3.60         $3.60
zen4              744B (40B active) MoE         202K     ultra max  $3.00         $9.60
zen4-ultra        744B (40B active) MoE + CoT   202K     ultra max  $3.00         $9.60
zen4-pro          80B (3B active) MoE           131K     ultra      $2.70         $2.70
zen4-thinking     80B (3B active) MoE + CoT     131K     pro max    $2.70         $2.70
zen4-coder        480B (35B active) MoE         262K     ultra      $3.60         $3.60
zen4-coder-pro    480B Dense BF16               262K     ultra max  $4.50         $4.50
zen4-coder-flash  30B (3B active) MoE           262K     pro max    $1.50         $1.50
zen4-mini         8B Dense                      40K      pro        $0.60         $0.60

Zen3 Generation — Specialized Models

Model             Architecture                  Context  Tier       Input $/MTok  Output $/MTok
zen3-omni         ~200B Dense Multimodal        202K     pro max    $1.80         $6.60
zen3-vl           30B (3B active) MoE VL        131K     pro max    $0.45         $1.80
zen3-nano         4B Dense                      40K      pro        $0.30         $0.30
zen3-guard        4B Dense                      40K      pro        $0.30         $0.30
zen3-embedding    Embedding (3072 dim)          8K       pro max    $0.39         -

Example costs:

  • 100 chat messages (zen4-mini, 50K in + 50K out): $0.06
  • 1K code completions (zen4-coder-flash, 500K in + 500K out): $1.50
  • 10K embeddings (zen3-embedding, 1M input): $0.39
  • Heavy daily use (zen4, 1M in + 1M out): $12.60
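
The example costs follow directly from the per-million-token prices: cost = input tokens × input price + output tokens × output price, divided by one million. The sketch below reproduces the four figures. The `PRICES_PER_MTOK` table and the `estimate_cost` helper are illustrative names, not part of the Hanzo SDK, and the prices are copied from the tables above.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the pricing tables.
# zen3-embedding lists only an input price, so its output price is set to 0 here.
PRICES_PER_MTOK = {
    "zen4": (Decimal("3.00"), Decimal("9.60")),
    "zen4-mini": (Decimal("0.60"), Decimal("0.60")),
    "zen4-coder-flash": (Decimal("1.50"), Decimal("1.50")),
    "zen3-embedding": (Decimal("0.39"), Decimal("0")),
}

MTOK = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper, rounded to cents)."""
    in_price, out_price = PRICES_PER_MTOK[model]
    cost = (Decimal(input_tokens) * in_price + Decimal(output_tokens) * out_price) / MTOK
    return cost.quantize(Decimal("0.01"))


print(estimate_cost("zen4-mini", 50_000, 50_000))           # 100 chat messages
print(estimate_cost("zen4-coder-flash", 500_000, 500_000))  # 1K code completions
print(estimate_cost("zen3-embedding", 1_000_000))           # 10K embeddings
print(estimate_cost("zen4", 1_000_000, 1_000_000))          # heavy daily use
```

Using `Decimal` rather than floats keeps the cent-level rounding exact, which matters when many small requests are summed.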

Quick Start

Install the Hanzo SDK — supports OpenAI and Claude-style endpoints, plus 100+ providers

Python — Hanzo SDK
pip install hanzoai

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

response = client.chat.completions.create(
    model="zen4",
    messages=[
        {"role": "user", "content": "Hello, Zen."}
    ],
)
print(response.choices[0].message.content)
TypeScript — Hanzo SDK
npm install hanzoai

import Hanzo from "hanzoai";

const client = new Hanzo({
  apiKey: "hk-your-api-key",
});

const response = await client.chat.completions.create({
  model: "zen4-coder",
  messages: [
    { role: "user", content: "Write a React hook" }
  ],
});
console.log(response.choices[0].message.content);
curl
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer hk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen4",
    "messages": [
      {"role": "user", "content": "Hello, Zen."}
    ]
  }'
Streaming (Python)
from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

stream = client.chat.completions.create(
    model="zen4-max",
    messages=[
        {"role": "user", "content": "Explain MoE"}
    ],
    stream=True,
)
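for chunk in stream:
    # Skip chunks that carry no choices or no text delta
    if chunk.choices and chunk.choices[0].delta.content:
        # Flush so tokens appear as they arrive
        print(chunk.choices[0].delta.content, end="", flush=True)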

SDK Features

Multi-SDK — Python, TypeScript, Go, Rust
OpenAI Compatible — drop-in replacement
100+ Providers — Zen + Claude + GPT + more
Streaming — SSE, async, batch
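
Because the API is described as OpenAI-compatible, a request can in principle be built with any HTTP client, without the SDK. The following is a minimal sketch using only the Python standard library. It assumes the endpoint and header layout shown in the curl example and an OpenAI-style response body (`choices[0].message.content`). The function names `build_chat_request` and `send` are illustrative.

```python
import json
import urllib.request

# Endpoint taken from the curl example; the response shape is assumed to be OpenAI-style.
API_URL = "https://api.hanzo.ai/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant text (requires a valid key and network access)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


request = build_chat_request("hk-your-api-key", "zen4", "Hello, Zen.")
print(request.get_method(), request.full_url)
# To actually call the API, replace the placeholder key and run: print(send(request))
```

Separating request construction from sending makes the payload easy to inspect or log before any tokens are billed.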

Full Model Library

49 models across 10 categories — text, code, vision, video, audio, 3D, safety, embeddings, and agents

Foundation

Core models from 0.6B edge to 1T+ frontier scale

6 models
zen-nano
0.6B Dense · 32K ctx

Ultra-lightweight edge model — 44K tokens/sec, fits in 0.4GB

HuggingFace
zen-eco
4B Dense · 32K ctx

Efficient general-purpose — 33K tokens/sec, 2–8GB memory

HuggingFace
zen
8–32B Dense · 32K ctx

Standard foundation — versatile for most tasks

HuggingFace
zen-pro
32B Dense · 32K ctx

Professional grade — 19K tokens/sec, high-quality outputs

HuggingFace
zen-max
1.04T MoE · 256K ctx

Maximum scale — same model as zen4-max, open weights

HuggingFace
zen-next (PREVIEW)
TBD Dense · 256K ctx

Next-generation preview

Zen4 Generation

Latest generation — 10 production API models

10 models
zen4-max
1.04T (32B active) MoE · 256K ctx

Frontier scale — deepest reasoning capabilities

HuggingFace
zen4
744B (40B active) MoE · 202K ctx

Flagship intelligence — multi-domain reasoning

HuggingFace
zen4-ultra
744B (40B active) MoE + CoT · 202K ctx

Maximum reasoning with chain-of-thought

HuggingFace
zen4-pro
80B (3B active) MoE · 131K ctx

High capability — efficient MoE architecture

HuggingFace
zen4-thinking
80B (3B active) MoE + CoT · 131K ctx

Deep reasoning — chain-of-thought enabled

zen4-mini
8B Dense · 40K ctx

Ultra-fast inference — cost-effective

HuggingFace
zen4-coder
480B (35B active) MoE · 262K ctx

Code generation — 262K context for large codebases

HuggingFace
zen4-coder-pro
480B Dense BF16 · 262K ctx

Premium code — full-precision dense model

HuggingFace
zen4-coder-flash
30B (3B active) MoE · 262K ctx

Fast code — low-latency completions

HuggingFace
zen4-coder-next (PREVIEW)
TBD MoE · 262K ctx

Next-gen code generation — preview

Code

Specialized for generation, review, debugging, and agentic programming

3 models
zen-coder
32B Dense · 131K ctx

Multi-language code generation and understanding

HuggingFace
zen-coder-flash
7B Dense · 32K ctx

Low-latency code completions

HuggingFace
zen-code
14B Dense · 32K ctx

Legacy code model — still available on HuggingFace

HuggingFace

Vision & Image

Image understanding, generation, editing, and design

5 models
zen-vl
32B Dense Multimodal · 32K ctx

Vision-language understanding — image analysis

HuggingFace
zen-omni
72B Dense Multimodal · 131K ctx

Hypermodal — text, vision, audio, code in one model

HuggingFace
zen-artist
Image Generation

High-resolution image generation, multiple styles

zen-artist-edit
Image Editing

Edit-by-instruction, inpainting, outpainting

zen-designer (SOON)
Design Generation

UI/UX design, graphics, mockups

Video

Video generation, understanding, and world modeling

4 models
zen-director (SOON)
Text-to-Video

Cinematic-quality text-to-video generation

zen-video (SOON)
Video Understanding

Video analysis, frame-by-frame understanding

zen-video-i2v (SOON)
Image-to-Video

Animate images into video sequences

zen-voyager (SOON)
World Model

Spatial reasoning and world simulation

Audio & Speech

Music generation, voice synthesis, transcription, and translation

7 models
zen-scribe
Speech-to-Text

Multi-language transcription

zen-translator
Translation

100+ languages, context-aware translation

zen-dub (SOON)
Voice Synthesis

Multi-language voice dubbing and synthesis

zen-dub-live (SOON)
Real-time Voice

Ultra-low latency real-time voice

zen-musician (SOON)
Music Generation

Multi-instrument music composition

zen-foley (SOON)
Sound Effects

Text-to-SFX, foley art generation

zen-live (SOON)
Real-time Translation

Bidirectional real-time speech translation

3D & Spatial

3D asset generation and world simulation

2 models
zen-3d (SOON)
3D Generation

Text-to-3D and image-to-3D asset generation

zen-world (SOON)
World Simulation

Spatial reasoning and environment simulation

Safety & Guardrails

Content moderation and safety classification

3 models
zen-guard
8B Dense · 32K ctx

Content moderation and safety classification

HuggingFace
zen-guard-gen
8B Dense · 32K ctx

Safe generation with built-in guardrails

zen-guard-stream
4B Dense · 8K ctx

Low-latency streaming moderation

Embedding & Retrieval

Text embeddings and search reranking

2 models
zen-embedding
3072 dim · 8K ctx

High-dimensional text embeddings

HuggingFace
zen-reranker
568M Dense · 8K ctx

Cross-encoder search reranking

HuggingFace

Agents

Agentic AI with tool use and multi-step planning

1 model
zen-agent (PREVIEW)
32B Dense · 131K ctx

Agentic AI — tool use, multi-step planning, autonomous execution

10 Modalities

One model family covering every AI capability

Text

14 models

Chat, reasoning, analysis

Code

9 models

Generation, review, debugging

Vision

5 models

Understanding, generation, editing

Video

4 models

Generation, understanding, I2V

Audio

7 models

Speech, music, translation

3D

2 models

Generation, world simulation

Safety

3 models

Moderation, guardrails

Embedding

2 models

Search, retrieval, RAG

Agents

1 model

Tool use, planning

Math

6 models

Reasoning, proof, computation

Architecture

Zen MoDE: curating the best open-source foundations and fusing them into a unified, high-performance family

Consumer Line

Dense and MoE models from 4B to 80B. Edge-deployable dense models and efficient MoE flagships with only 3B active parameters.

Coder Line

Code-specialized models trained on 8.47B tokens of real agentic programming data. Fast completions to full-precision code intelligence.

Ultra Line

Trillion-parameter MoE models for cloud deployment. 1.04T parameters with 32B active for frontier-scale reasoning.

Efficient MoE

Mixture-of-Experts delivers frontier performance with only 3B active parameters — runs on consumer hardware.
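
To make the "active parameters" idea concrete, the sketch below computes what fraction of each MoE model's weights is used per token, with the total and active sizes taken from the Zen4 pricing table. The names `MOE_MODELS` and `active_fraction` are illustrative. Per-token compute scales roughly with the active count, while memory for the weights generally still depends on the total count unless experts are offloaded or quantized.

```python
# (total parameters, active parameters) in billions, from the Zen4 pricing table.
MOE_MODELS = {
    "zen4-max": (1040, 32),
    "zen4": (744, 40),
    "zen4-pro": (80, 3),
    "zen4-coder": (480, 35),
    "zen4-coder-flash": (30, 3),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's parameters that participate in each forward pass."""
    return active_b / total_b


for name, (total, active) in MOE_MODELS.items():
    print(f"{name:18s} {active_fraction(total, active):6.2%} of weights active per token")
```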

Long Context

Up to 262K context on code models, 256K on frontier models. Dense models support 32–40K for efficient local inference.

Zen MoDE

Mixture of Distilled Experts: curating the best open-source foundations and fusing them into a unified model family.

Zen Agentic Dataset

8.47 billion tokens of real-world agentic programming — not synthetic data

  • 8.47B training tokens
  • 3.35M training samples
  • 1,452 repositories
  • 15 years of history (2010–2025)

Composition by source:

  • Git History: 48% (4.03B tokens)
  • Agentic Debug Sessions: 29% (2.42B tokens)
  • Architecture Discussions: 13% (1.14B tokens)
  • Code Review Sessions: 10% (0.86B tokens)
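
As a quick consistency check, the percentage shares can be recomputed from the per-source token counts and the 8.47B total. This is only an arithmetic sketch and `share_percent` is an illustrative helper. The four listed counts sum to 8.45B, so the small gap to 8.47B is presumably due to rounding of the individual figures.

```python
# Token counts in billions, as listed for each source.
TOKENS_B = {
    "Git History": 4.03,
    "Agentic Debug Sessions": 2.42,
    "Architecture Discussions": 1.14,
    "Code Review Sessions": 0.86,
}
TOTAL_B = 8.47  # stated dataset size in billions of tokens


def share_percent(tokens_b: float, total_b: float = TOTAL_B) -> int:
    """Share of the dataset as a whole-number percentage."""
    return round(100 * tokens_b / total_b)


for source, tokens in TOKENS_B.items():
    print(f"{source}: {share_percent(tokens)}% ({tokens}B tokens)")

print(f"Sum of listed sources: {sum(TOKENS_B.values()):.2f}B")
```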

Open Weights

Download, self-host, and fine-tune — multiple formats for every platform

SafeTensors

Full precision for HuggingFace Transformers

GGUF

Quantized for llama.cpp / Ollama

MLX

Apple Silicon optimized

ONNX

Cross-platform inference

Hanzo Ecosystem

Zen models power the entire Hanzo AI platform

Build with Zen LM

49 models. 10 modalities. Open weights. From $0.30/MTok.
