Zen Model Catalog
95 open foundation models across Zen3, Zen4, and Zen5
Chat, code, vision-language, web agentic, embeddings, rerankers, image generation, streaming ASR, and TTS. From edge-class Zen5 Nano 0.8B to the Zen5 Max frontier MoE. 8K - 1M context. OpenAI- and Anthropic-compatible API.
Zen 5 - Next-Generation Agentic
Native chain-of-thought, large-scale RL on 200K+ environments, OpenAI + Anthropic API.
Zen5 Nano 0.8B
AvailableEdge / on-device tier. Raspberry Pi, phone, browser WASM class.
Zen5 Nano 2B
AvailableLow-end multimodal dense at 2B parameters. Low iGPU / 8 GB RAM laptop class.
Zen5 Nano 4B
AvailableMid multimodal dense at 5B. 16 GB RAM laptop / mobile NPU class.
Zen5 Nano 9B
AvailableUpper-nano multimodal dense. 24 GB+ unified RAM / consumer GPU class.
Zen5 Flash
AvailableSmallest and cheapest text-only Zen5 chat tier. For high-volume routing and simple agent loops.
Zen5 Mini
AvailableFrontier agentic at the lowest $/token in the family. 230B MoE / 10B active, trained on 200K+ real environments via large-scale RL.
Zen5
AvailableCanonical Zen5 default. 35B frontier MoE (3B active per token). The everyday Zen5 chat model. OpenAI + Anthropic API.
Zen5 Coder
AvailableCode-specialized Zen5 tier. Sparse MoE tuned for repo-scale code understanding, agentic refactoring, and tool-use coding loops.
Zen5 Pro
AvailableZen Flash IQ2_XXS-imatrix weights (81 GB GGUF). 284B total / 37B active. Fits a single 128 GB Apple Silicon / DGX Spark / H100 80 GB.
Zen5 Max
Cloud OnlyTop quality tier in the family. Requires Mac Studio M3 Ultra 512 GB or 8x H100/H200 class GPU pool.
Zen 5 Embedding
Three-SKU embedding lineup served on /v1/embeddings.
Zen5 Embedding 0.6B
AvailableLightweight embedding model for high-throughput RAG and search.
Zen5 Embedding 4B
AvailableBalanced embedding model for production RAG.
Zen5 Embedding 8B
AvailableHigh-quality embeddings for production RAG, semantic search, and classification.
Zen 4 - Production Chat
The everyday Zen production line: MoE flagships, thinking models, and long-context.
Zen4 Mini
AvailableUltra-fast lightweight model optimized for speed and cost efficiency. Ideal for free tier.
Zen4
AvailableFlagship MoE model for complex reasoning and multi-domain tasks.
Zen4 Pro
AvailableEfficient MoE model for demanding workloads with strong reasoning at production-grade cost.
Zen4 Thinking
AvailableDedicated reasoning model with explicit chain-of-thought capabilities.
Zen4 Ultra
AvailableMaximum reasoning capability with extended chain-of-thought on MoE architecture.
Zen4.1
AvailableHigh-performance 1M context model for long-document analysis, large codebase reasoning, and agentic workflows.
Zen4 Max
AvailableMost capable model for complex reasoning, analysis, and agentic tasks. 1M token context window.
Zen 4 Coder
Code-specialized MoE and dense models tuned for generation, review, debugging, and agentic programming.
Zen4 Coder Flash
AvailableLightweight code model optimized for speed and inline completions.
Zen4 Coder
AvailableCode-specialized MoE model for generation, review, debugging, and agentic programming.
Zen4 Coder Pro
AvailableFull-precision BF16 code model for maximum accuracy on complex codebases.
Zen 3 - Multimodal & Specialty
Vision, audio, web agentic, safety, and edge.
Zen3 Omni
AvailableHypermodal model supporting text, vision, audio, and structured output.
Zen3 VL
AvailableVision-language model for image understanding and visual reasoning. Default 30B-A3B MoE plus 2B, 8B, 32B, and frontier 235B-A22B variants.
Zen3 Web
AvailableWeb-agentic models for browser automation, scraping, and on-page reasoning. Three tiers from edge to top-end.
Zen3 Nano
AvailableUltra-lightweight model for edge deployment and low-latency tasks. Available on free tier.
Zen3 Guard
AvailableContent safety classifier for moderation and guardrails. 9 safety categories, 119 languages.
Zen 3 Embedding & Reranker
Text and multimodal embeddings plus rerankers for retrieval pipelines.
Zen3 Embedding
AvailableHigh-quality text embeddings for RAG, search, and classification. OpenAI-compatible endpoint available.
Zen3 Reranker
AvailableHigh-quality rerankers for improving retrieval accuracy in RAG pipelines.
Zen3 VL Embedding
AvailableMultimodal embeddings (text + image) for vision-aware retrieval and semantic search.
Zen3 VL Reranker
AvailableVision-language rerankers for multimodal RAG. Reranks (query, image+text) pairs.
Zen 3 Image Generation
Eight image-generation SKUs from fast diffusion to broadcast-quality.
Zen3 Image
AvailableBest general-purpose image generation.
Zen3 Image Max
AvailableMaximum quality image generation for professional creative work.
Zen3 Image Fast
AvailableFastest image model for real-time generation.
Zen3 Image SDXL / Dev / Playground / SSD / JP
AvailableSpecialized image models: SDXL (1024px), Dev (experimentation), Playground (aesthetic), SSD (fastest diffusion), JP (Japanese-specialized).
Zen 3 Audio & Speech
Speech-to-text, text-to-speech, streaming ASR, voice cloning, and forced alignment.
Zen3 Audio (STT)
AvailableHigh-quality and fast speech-to-text transcription. 100+ languages.
Zen3 ASR (Streaming)
AvailableReal-time streaming ASR for voice agents. Edge variant (0.6B) for on-device, aligner for word-level timestamps.
Zen3 TTS
AvailableHigh-quality text-to-speech with natural prosody. Four tiers from edge to broadcast-grade HD.
Zen3 TTS Voice Design & Custom Voice
AvailablePremium TTS with prompt-driven voice design and few-shot voice cloning from a short audio sample.
Full Zen Catalog Summary
Live catalog from the Zen API. Pricing fetched at runtime.
| Generation | Family | SKUs | Endpoint(s) |
|---|---|---|---|
| Zen 5 | Chat ladder | 10 (nano 0.8B / 2B / 4B / 9B, flash, mini, default, coder, pro, max) | /v1/chat/completions |
| Zen 5 | Embedding | 3 (0.6B / 4B / 8B) | /v1/embeddings |
| Zen 4 | Chat | 7 (mini, default, pro, thinking, ultra, 4.1, max) | /v1/chat/completions |
| Zen 4 | Coder | 3 (flash, coder, pro) | /v1/chat/completions |
| Zen 3 | Chat & VL | 10+ (omni, nano, guard, vl x5, web x3) | /v1/chat/completions |
| Zen 3 | Embedding & Reranker | 11 (text + multimodal embeddings, rerankers) | /v1/embeddings, /v1/rerank |
| Zen 3 | Image | 8 (image, max, dev, fast, sdxl, playground, ssd, jp) | /v1/images/generations |
| Zen 3 | Audio | 6 STT/ASR + 6 TTS | /v1/audio/transcriptions, /v1/audio/speech |
The Complete Catalog
All 95 open Zen models — every one linked to its weights on HuggingFace and its paper.