Zen MoDE — Mixture of Distilled Experts

Zen LM

Frontier AI models for code, reasoning, vision, video, audio, 3D, and agentic workflows

55 models across 10 modalities. Production API models from 4B to 1T+ parameters. Open weights on HuggingFace. OpenAI-compatible API. Built by Hanzo AI (Techstars '17).

55 Models · 1T+ Max Parameters · 2M Max Context · 10 Modalities · From $0.15/MTok

Flagship Models

Three tiers — from efficient edge to trillion-parameter frontier scale

API Pricing

Pay-as-you-go. $5 free credit on signup. No minimum commitment.
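At per-token rates, cost is simple arithmetic. A quick sketch in Python, assuming the listed $0.15/MTok entry price (per-model rates vary; see the pricing table):

```python
def estimate_cost(tokens: int, price_per_mtok: float = 0.15) -> float:
    """Estimate pay-as-you-go cost in dollars for a given token count."""
    return tokens / 1_000_000 * price_per_mtok

# 2M tokens at the $0.15/MTok entry rate
print(f"${estimate_cost(2_000_000):.2f}")  # → $0.30
```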


Quick Start

Install the Hanzo SDK — supports OpenAI and Claude-style endpoints, plus 100+ providers

Python — Hanzo SDK
pip install hanzoai

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

response = client.chat.completions.create(
    model="zen4",
    messages=[
        {"role": "user", "content": "Hello, Zen."}
    ],
)
print(response.choices[0].message.content)
TypeScript — Hanzo SDK
npm install hanzoai

import Hanzo from "hanzoai";

const client = new Hanzo({
  apiKey: "hk-your-api-key",
});

const response = await client.chat.completions.create({
  model: "zen4-coder",
  messages: [
    { role: "user", content: "Write a React hook" }
  ],
});
console.log(response.choices[0].message.content);
curl
curl https://api.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer hk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen4",
    "messages": [
      {"role": "user", "content": "Hello, Zen."}
    ]
  }'
Streaming (Python)
from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

stream = client.chat.completions.create(
    model="zen4-max",
    messages=[
        {"role": "user", "content": "Explain MoE"}
    ],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

SDK Features

Multi-SDK — Python, TypeScript, Go, Rust
OpenAI Compatible — drop-in replacement
100+ Providers — Zen + Claude + GPT + more
Streaming — SSE, async, batch
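The SSE wire format behind the streaming example above can be parsed with nothing but the standard library. A minimal sketch, assuming the usual OpenAI-style chunk payload (text arrives in `choices[0].delta.content`, and the stream ends with `data: [DONE]`):

```python
import json

def collect_sse_content(lines):
    """Accumulate assistant text from OpenAI-style SSE chunk lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments and keep-alives
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Example frames as they would arrive over HTTP
frames = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello, "}}]}',
    'data: {"choices": [{"delta": {"content": "Zen."}}]}',
    "data: [DONE]",
]
print(collect_sse_content(frames))  # → Hello, Zen.
```

In practice the SDK's `stream=True` mode does this parsing for you, as in the streaming example above.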

Full Model Library

49 models across 10 categories — text, code, vision, video, audio, 3D, safety, embeddings, and agents


Zen 5

Next-generation agentic models with native chain-of-thought.

5 models
zen5 · EARLY ACCESS
MoDE + CoT · 1.0M ctx

Next-generation agentic frontier model trained on 10B+ tokens of real-world tool use, multi-step reasoning, and production workflows. 1M+ token context with native chain-of-thought.

zen5-pro · EARLY ACCESS
MoDE + CoT · 524K ctx

High-throughput agentic model for demanding production workloads. Trained on real-world development patterns with deep chain-of-thought reasoning.

zen5-max · EARLY ACCESS
MoDE + CoT · 2.1M ctx

Maximum context agentic model for document-scale analysis. Trained on 10B+ tokens of real-world workflows with extended chain-of-thought.

zen5-ultra · EARLY ACCESS
MoDE + Deep CoT · 1.0M ctx

Deepest reasoning model in the Zen family. Multi-pass chain-of-thought with self-verification.

zen5-mini · EARLY ACCESS
MoDE + CoT · 262K ctx

Efficient agentic model delivering zen5-class intelligence at a fraction of the cost.

Zen 4

Latest generation production models with MoDE architecture.

7 models
Dense · 1M ctx

Most capable model for complex reasoning, analysis, and agentic tasks. 1M token context window.

Dense · 1M ctx

High-performance 1M context model for long-document analysis, large codebase reasoning, and agentic workflows. Best balance of intelligence and cost at million-token scale.

744B (40B active) MoE · 202K ctx

Flagship MoE model for complex reasoning and multi-domain tasks.

HuggingFace
744B (40B active) MoE + CoT · 262K ctx

Maximum reasoning capability with extended chain-of-thought on MoE architecture.

HuggingFace
80B (3B active) MoE · 131K ctx

Efficient MoE model for demanding workloads with strong reasoning at production-grade cost.

HuggingFace
80B (3B active) MoE + CoT · 131K ctx

Dedicated reasoning model with explicit chain-of-thought capabilities.

Dense · 128K ctx

Ultra-fast lightweight model optimized for speed and cost efficiency. Ideal for free tier.

HuggingFace

Code

Specialized models for code generation, review, and debugging.

6 models
480B (35B active) MoE · 163K ctx

Code-specialized MoE model for generation, review, debugging, and agentic programming.

HuggingFace
30B (3B active) MoE · 262K ctx

Lightweight code model optimized for speed and inline completions.

HuggingFace
480B Dense BF16 · 131K ctx

Full-precision BF16 code model for maximum accuracy on complex codebases.

HuggingFace
32B Dense · 131K ctx

Baseline code model for generation and completions.

HuggingFace
7B Dense · 32K ctx

Fast code model for inline completions and suggestions.

HuggingFace
14B Dense · 32K ctx

Legacy code model (superseded by Zen4 Coder series).

HuggingFace

Zen 3 Multimodal

Vision, safety, and multimodal chat models.

4 models
~200B Dense Multimodal · 202K ctx

Multimodal model supporting text, vision, audio, and structured output.

30B (3B active) MoE Vision-Language · 262K ctx

Vision-language model for image understanding and visual reasoning.

8B Dense · 128K ctx

Ultra-lightweight model for edge deployment and low-latency tasks. Available on free tier.

4B Dense · 65K ctx

Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.

Embedding & Retrieval

Text embeddings and search reranking via API.

8 models
3072 dimensions Embedding · 8K ctx

High-quality text embeddings for RAG, search, and classification.

4B Embedding · 40K ctx

Balanced embedding model for cost-effective retrieval workloads.

HuggingFace
0.6B Embedding · 32K ctx

Lightweight embedding model for high-throughput, low-cost applications.

HuggingFace
8B Reranker · 40K ctx

High-quality reranker for improving retrieval accuracy in RAG pipelines.

HuggingFace
4B Reranker · 40K ctx

Balanced reranker for cost-effective retrieval quality improvement.

HuggingFace
0.6B Reranker · 40K ctx

Lightweight reranker for high-throughput reranking at minimal cost.

HuggingFace
3072 dimensions Embedding · 8K ctx

Foundation embedding model for search and retrieval.

HuggingFace
568M Reranker · 8K ctx

Cross-encoder reranker for search result quality.

HuggingFace
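A typical retrieval pipeline pairs an embedding model (fast approximate recall by vector similarity) with a reranker (precise cross-encoder rescoring of the shortlist). The first stage is plain cosine similarity; a toy sketch with hand-made 3-d vectors standing in for real embedding-model output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vectors standing in for embedding-model output
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [1.0, 0.1, 0.9],   # close to the query
    "doc_b": [0.0, 1.0, 0.0],   # orthogonal to it
}
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked)  # → ['doc_a', 'doc_b']
```

With a real deployment, the embedding model produces the vectors (3072 dimensions for the flagship embedder above) and the reranker then rescores the top candidates before they reach the prompt.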

Image Generation

Text-to-image generation via API.

8 models
Diffusion

Best general-purpose image generation.

Diffusion

Maximum quality image generation for professional creative work.

Diffusion

Development model for experimentation and iteration.

Diffusion

Fastest image model for real-time generation.

Diffusion

High-resolution image generation at 1024px.

Diffusion

Aesthetic model for artistic image generation.

1B Diffusion

Fastest diffusion model for real-time generation.

Diffusion

Japanese-specialized image generation model.

Audio & Speech

Speech-to-text, text-to-speech, and streaming ASR.

7 models
1.5B ASR

Best quality speech-to-text transcription. 100+ languages.

809M ASR

Fastest speech-to-text transcription for high-throughput workloads.

Streaming ASR

Real-time streaming speech recognition for live transcription and voice agents.

Streaming ASR

First-generation streaming ASR for legacy compatibility.

82M TTS

High-quality text-to-speech with natural prosody. 40+ voices, 8 languages.

TTS HD

Maximum fidelity text-to-speech for broadcast-quality audio production.

82M TTS

Low-latency text-to-speech for real-time voice agents and interactive applications.

Foundation

General-purpose open-weight models from 0.6B to 235B parameters.

5 models
0.6B Dense · 32K ctx

Ultra-lightweight LLM for edge and mobile deployment.

HuggingFace
4B Dense · 32K ctx

Efficient 4B model for general-purpose tasks.

HuggingFace
8–32B Dense · 32K ctx

Standard model available in 8B and 32B variants.

HuggingFace
32B Dense · 32K ctx

Professional-grade 32B dense model for demanding workloads.

HuggingFace
235B (22B active) MoE · 131K ctx

High-capability MoE model with 235B parameters.

HuggingFace

Vision (Open Weights)

Vision-language and multimodal open-weight models.

2 models
32B Dense Multimodal · 32K ctx

Multimodal vision-language model for image understanding.

HuggingFace
72B Dense Multimodal · 131K ctx

Hypermodal model combining text, vision, audio, and code.

HuggingFace

Safety

Content moderation and safety guardrail models.

2 models
4B Dense · 65K ctx

Content safety classifier for moderation and guardrails. 9 safety categories, 119 languages.

8B Dense · 32K ctx

Content safety and moderation classifier.

HuggingFace

10 Modalities

One model family covering every AI capability

Text · 14 models · Chat, reasoning, analysis
Code · 9 models · Generation, review, debugging
Vision · 5 models · Understanding, generation, editing
Video · 4 models · Generation, understanding, I2V
Audio · 7 models · Speech, music, translation
3D · 2 models · Generation, world simulation
Safety · 3 models · Moderation, guardrails
Embedding · 2 models · Search, retrieval, RAG
Agents · 1 model · Tool use, planning
Math · 6 models · Reasoning, proof, computation

Architecture

Zen MoDE — curating the best open-source foundations and fusing them into a unified, high-performance family

Consumer Line

Dense and MoE models from 4B to 80B. Edge-deployable dense models and efficient MoE flagships with only 3B active parameters.

Coder Line

Code-specialized models trained on 8.47B tokens of real agentic programming data. Fast completions to full-precision code intelligence.

Ultra Line

Trillion-parameter MoE models for cloud deployment. 1.04T parameters with 32B active for frontier-scale reasoning.

Efficient MoE

Mixture-of-Experts delivers frontier performance with only 3B active parameters — runs on consumer hardware.

Long Context

Up to 262K context on code models, 256K on frontier models. Dense models support 32–40K for efficient local inference.

Zen MoDE

Mixture of Distilled Experts — curating the best open-source foundations and fusing them into a unified model family.
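The "3B active parameters" figures above come from sparse Mixture-of-Experts routing: a gate scores every expert for each token, but only the top-k actually execute. A minimal top-2 gating sketch in plain Python (illustrative only, not the actual Zen router):

```python
import math

def top_k_gate(logits, k=2):
    """Select the k highest-scoring experts; softmax-normalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# 8 experts; only the 2 highest-gated run for this token
gate_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.2]
active = top_k_gate(gate_logits)
print(active)  # experts 1 and 4, with weights summing to 1
```

Because only the selected experts' weights touch compute, an 80B-parameter model with 3B active runs with roughly the per-token cost of a 3B dense model.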

Zen Agentic Dataset

8.47 billion tokens of real-world agentic programming — not synthetic data

8.47B Training Tokens · 3.35M Training Samples · 1,452 Repositories · 15yr History (2010–2025)

48% · Git History · 4.03B tokens
29% · Agentic Debug Sessions · 2.42B tokens
13% · Architecture Discussions · 1.14B tokens
10% · Code Review Sessions · 0.86B tokens
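The category percentages follow directly from the token counts (the four listed categories sum to 8.45B against the 8.47B total; the small gap is presumably rounding). A quick check:

```python
total_btok = 8.47  # dataset total, in billions of tokens
mix = {
    "Git History": 4.03,
    "Agentic Debug Sessions": 2.42,
    "Architecture Discussions": 1.14,
    "Code Review Sessions": 0.86,
}
# Share of the dataset per category, rounded to whole percent
shares = {name: round(btok / total_btok * 100) for name, btok in mix.items()}
print(shares)  # Git History 48%, Debug 29%, Architecture 13%, Review 10%
```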

Open Weights

Download, self-host, and fine-tune — multiple formats for every platform

SafeTensors

Full precision for HuggingFace Transformers

GGUF

Quantized for llama.cpp / Ollama

MLX

Apple Silicon optimized

ONNX

Cross-platform inference

Hanzo Ecosystem

Zen models power the entire Hanzo AI platform

Build with Zen LM

49 models. 10 modalities. Open weights. From $0.15/MTok.