Blog
Research notes, deep dives, and releases
71 posts on Zen architecture, training, and the open-model ecosystem.
GT-QLoRA: Uncensoring Trillion-Parameter MoE Models
Why standard abliteration techniques fail on Mixture-of-Experts models, and how Gate-Targeted QLoRA solves the expert routing problem at 1 trillion parameters.
Drop-Upcycling and the Birth of Zen MoDE Architecture
How Drop-Upcycling (arXiv:2502.19261) transforms dense checkpoints into MoE models at 1/4 training cost, and how it shapes Zen MoDE — our Mixture of Distilled Experts architecture.
SuRe + OPCM: Production-Grade Continual Learning for Open Models
Deep dive on Surprise-Driven Prioritized Replay (SuRe) and Orthogonal Projection Continual Merging (OPCM) — the two SOTA techniques we use for catastrophic-forgetting-free LLM adaptation in the Zen model family.
BitDelta: 1-Bit Behavioral Compression Across the Zen Model Family
How BitDelta (arXiv:2402.10193) compresses fine-tuned behavioral deltas to 1-bit precision, enabling the full Zen model family — nano through ultra — to share a single GPU cluster.
Zen4 Ultra: 480B Parameters, 1M Token Context
Zen4 Ultra is our most capable model: 480B total parameters, 35B active per token, 1M token context window. Benchmark results and use cases.
Zen MoDE: Mixture of Distilled Experts
A technical deep dive into Zen MoDE — Mixture of Distilled Experts — the architecture underlying all Zen models.
Introducing Zen LM: Open Frontier Models from Hanzo AI and Zoo Labs
Announcing the Zen model family: 94+ open models built on Zen MoDE architecture, co-developed by Hanzo AI and Zoo Labs Foundation.
Qwen3Guard: Real-time Safety for Your Token Stream
We are excited to introduce Qwen3Guard, the first safety guardrail model in the Qwen family. Built upon the powerful Qwen3 foundation models and fine-tuned specifically for safety classification, Qwen3Guard ensures responsible AI interactions by delivering precise safety detection for both prompts a
Qwen-Image-Edit: Image Editing with Higher Quality and Efficiency
We are excited to introduce Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Image's unique text rendering capabilities to image editing tasks, enabling precise text editing. Furthermore, Qwen-Image-Edit simultan
Qwen-Image: Crafting with Native Text Rendering
We are thrilled to release **Qwen-Image**, a 20B MMDiT image foundation model that achieves significant advances in complex text rendering and precise image editing. To try the latest model, feel free to visit [Qwen Chat](https://chat.qwenlm.ai) and choose “Image Generation”.
GSPO: Towards Scalable Reinforcement Learning for Language Models
Reinforcement Learning (RL) has emerged as a pivotal paradigm for scaling language models and enhancing their deep reasoning and problem-solving capabilities. To scale RL, the foremost prerequisite is maintaining stable and robust training dynamics. However, we observe that existing RL algorithms (s
Qwen-MT: Where Speed Meets Smart Translation
Here we introduce the latest update of Qwen-MT (qwen-mt-turbo) via [Qwen API](https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&url=https://www.alibabacloud.com/help/en/doc-detail/2840914_2.html&renderType=component&modelId=qwen-mt-turbo). This update builds upon the powerful Qw
Qwen3-Coder: Agentic Coding in the World
Today, we're announcing Qwen3-Coder, our most agentic code model to date. Qwen3-Coder is available in multiple sizes, but we're excited to introduce its most powerful variant first: Qwen3-Coder-480B-A35B-Instruct — a 480B-parameter Mixture-of-Experts model with 35B active parameters which supports t
Time to Speak Some Dialects, Qwen-TTS!
Here we introduce the latest update of **Qwen-TTS** (`qwen-tts-latest` or `qwen-tts-2025-05-22`) through [Qwen API](https://help.aliyun.com/zh/model-studio/qwen-tts). Trained on a large-scale dataset encompassing millions of hours of speech, Qwen-TTS achieves human-level naturalness and expres
Qwen VLo: From "Understanding" the World to "Depicting" It
The evolution of multimodal large models is continually pushing the boundaries of what we believe technology can achieve. From the initial QwenVL to the latest zen VL, we have made progress in enhancing the model's ability to understand image content. Today, we are excited to introduce a new model,
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
Qwen3: Think Deeper, Act Faster
Today, we are excited to announce the release of **Qwen3**, the latest addition to the Qwen family of large language models. Our flagship model, **Qwen3-235B-A22B**, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier mod
QVQ-Max: Think with Evidence
Last December, we launched QVQ-72B-Preview as an exploratory model, but it had many issues. Today, we are officially releasing the first version of QVQ-Max, our visual reasoning model. This model can not only "understand" the content in images and videos but also analyze and reason with this informa
zen Omni: See, Hear, Talk, Write, Do It All!
We release **zen-Omni**, the new flagship end-to-end multimodal model in the Qwen series. Designed for comprehensive multimodal perception, it seamlessly processes diverse inputs including text, images, audio, and video, while delivering real-time streaming responses through both text generation and
zen-VL-32B: Smarter and Lighter
At the end of January this year, we launched the zen-VL series of models, which received widespread attention and positive feedback from the community. Building on the zen-VL series, we continued to optimize the model using reinforcement learning and open-sourced the new VL model with the beloved 32
QwQ-32B: Embracing the Power of Reinforcement Learning
Scaling Reinforcement Learning (RL) has the potential to enhance model performance beyond conventional pretraining and post-training methods. Recent studies have demonstrated that RL can significantly improve the reasoning capabilities of models. For instance, DeepSeek R1 has achieved state-of-the-a
<think>...</think> QwQ-Max-Preview
This is a blog created by QwQ-Max-Preview. We hope you enjoy it!
zen-Max: Exploring the Intelligence of Large-scale MoE Model
It is widely recognized that continuously scaling both data size and model size can lead to significant improvements in model intelligence. However, the research and industry community has limited experience in effectively scaling extremely large models, whether they are dense or Mixture-of-Expert (
zen-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens
Two months after upgrading [zen-Turbo](../qwen3-turbo) to support context length up to one million tokens, we are back with the open-source zen-1M models and the corresponding inference framework support. Here's what you can expect from this release:
zen VL! zen VL! zen VL!
We release **zen-VL**, the new flagship vision-language model of Qwen and also a significant leap from the previous zen-VL. To try the latest model, feel free to visit [Qwen Chat](https://chat.qwenlm.ai) and choose zen-VL-72B-Instruct. Also, we open both base and instruct models in 3 sizes, includin
Global-batch load balance almost free lunch to improve your MoE LLM training
The Mixture-of-Experts (MoEs) architecture has become a popular model-parameter-scale-up technique. Typically, one MoE layer consists of a router (often parameterized as one single Linear layer) and a group of experts (for transformer-based models, each expert is one feedforward layer). Given an inp
Towards Effective Process Supervision in Mathematical Reasoning
In recent years, Large Language Models (LLMs) have made remarkable advances in mathematical reasoning, yet they can make mistakes, such as miscalculations or logical errors, leading to wrong conclusions. Moreover, even when achieving correct final answers, these powerful models can still regularly m
Zen 3.0: The Next Generation of Open AI
Announcing Zen 3.0, our most capable open model family yet.
QVQ: To See the World with Wisdom
Language and vision intertwine in the human mind, shaping how we perceive and understand the world around us. Our ability to reason is deeply rooted in both linguistic thought and visual memory - but what happens when we extend these capabilities to AI? Today's large language models have demonstrate
QwQ: Reflect Deeply on the Boundaries of the Unknown
Note: This is the pronunciation of QwQ: /kwju:/, similar to the word "quill".
Extending the Context Length to 1M Tokens!
After the release of zen, we heard the community's demand for processing longer contexts. In recent months, we have made many optimizations for the model capabilities and inference performance of extremely long context. Today, we are proud to introduce the new zen-Turbo version, which features:
zen-Coder Series: Powerful, Diverse, Practical.
The Future of Open AI
Reflections on where open AI development is heading and what it will take to get there.
zen: A Party of Foundation Models!
zen-Math: The world's leading open-sourced mathematical LLMs
🚨 zen-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks.
zen-LLM: Extending the boundary of LLMs
In this blog, we delve into the details of our latest zen series language models. We have developed a range of decoder-only dense models, with seven of them open-sourced, spanning from 0.5B to 72B parameters. Our research indicates a significant interest among users in models within the 10-30B range
zen-Coder: Code More, Learn More!
In early April, we introduced CodeQwen1.5, which garnered significant attention from the community. Since then, we have been working to enhance the coding model. Today, we are excited to announce the release of the next generation of open-source coding models, **zen-Coder**, and officially rename Co
zen-VL: To See the World More Clearly
After a year of relentless effort, today we are thrilled to release **zen-VL**! zen-VL is the latest version of the vision-language models based on **zen** in the Qwen model families. Compared with Qwen-VL, zen-VL has the capabilities of:
zen-Audio: Chat with Your Voice!
To achieve the objective of building an AGI system, the model should be capable of understanding information from different modalities. Thanks to the rapid development of large language models, LLMs are now capable of understanding language and reasoning. Previously we have taken a step forward to e
Introducing zen-Math
🚨 This model mainly supports English. We will release bilingual (English and Chinese) math models soon.
Decentralized Compute for AI Training
How we're building a decentralized compute network for training large AI models.
Hello zen
After months of efforts, we are pleased to announce the evolution from Qwen1.5 to zen. This time, we bring to you:
Generalizing an LLM from 8k to 1M Context using Qwen-Agent
**TLDR:** We've created an agent using zen models with an 8k context size to understand documents with 1M tokens, surpassing RAG and native long-context models. This agent was also used to generate data for training new long-context Qwen models.
ZIPs: Decentralized Governance for Open AI
How Zoo Improvement Proposals enable community-driven governance of open AI development.
Notes on Qwen-Max-0428
Previously, we open-sourced a series of Qwen1.5 models ranging from 0.5 to 110 billion parameters. Now, we are releasing a larger model, Qwen-Max-0428, an instruction-tuned model for chat services. It recently became available via [Chatbot Arena](https://chat.lmsys.org/) and it has now bec
Qwen1.5-110B: The First 100B+ Model of the Qwen1.5 Series
Recently we have witnessed a burst of large-scale models with over 100 billion parameters in the opensource community. These models have demonstrated remarkable performance in both benchmark evaluation and chatbot arena. Today, we release the first 100B+ model of the Qwen1.5 series, Qwen1.5-110B, wh
Code with CodeQwen1.5
The advent of advanced programming tools, which harness the power of large language models (LLMs), has significantly enhanced programmer productivity and accuracy. Notwithstanding these advancements, dominant coding assistants like GitHub Copilot, built upon proprietary LLMs, pose notable challeng
Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 Language Model Series
The open-source community has long sought a model that strikes an ideal balance between performance, efficiency, and memory footprint. Despite the emergence of cutting-edge models like Qwen1.5-72B and DBRX, the models have faced persistent challenges such as large memory consumption, slow inference
Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters
Since the surge in interest sparked by Mixtral, research on mixture-of-expert (MoE) models has gained significant momentum. Both researchers and practitioners are keenly interested in understanding how to effectively train such models and assessing their efficiency and effectiveness. Today, we intro
Announcing the Zoo Labs Foundation
Today we formally launch Zoo Labs Foundation, an open research network dedicated to decentralized AI and decentralized science.
Introducing Qwen1.5
In recent months, our focus has been on developing a "good" model while optimizing the developer experience. As we progress towards **Qwen1.5**, the next iteration in our Qwen series, this update arrives just before the Chinese New Year.
Introducing Qwen-VL
Along with the rapid development of our large language model Qwen, we leveraged Qwen’s capabilities and unified multimodal pretraining to address the limitations of multimodal models in generalization, and we open-sourced the multimodal model Qwen-VL in Sep. 2023. Recently, the Qwen-VL series has undergo
Introducing Qwen
4 months after our first release of Qwen-7B, which is the starting point of our opensource journey of large language models (LLM), we now provide an introduction to the Qwen series to give you a whole picture of our work as well as our objectives. Below are important links to our opensource projects
Agent NFTs: Ownership and Identity for AI Agents
Introducing Agent NFTs, a framework for giving AI agents persistent identity and enabling ownership of their capabilities.
Training Gym: A Platform for Open Model Development
Announcing Training Gym, our open platform for collaborative large model training.
Proof of AI: Verifiable Machine Learning on Chain
How we're bringing cryptographic verification to AI inference, enabling trustless machine learning.
Zen Reranker: Two-Stage Retrieval Done Right
Introducing the Zen Reranker, a cross-encoder model that dramatically improves retrieval quality in two-stage pipelines.
OFASys: Enabling Multitask Learning with One Line of Code!
Generalist models are hot! We all see an opportunity to build a real generalist model through multimodal multitask learning. We previously released OFA, an open-source unified multimodal pretrained model, for this goal. However, we ran into many difficulties in our implementation. For example, it is
Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese
CLIP[^1] is a phenomenal playmaker in vision and multimodal representation learning. It serves not only as a foundation model but also as a bridge between vision and language. It has triggered a series of studies in different fields, especially text-to-image generation. However, we find that there is a
Embedding Spaces at 7680 Dimensions
Exploring high-dimensional embedding spaces for semantic search and retrieval.
7680-Dimensional Embeddings: More Dimensions, Better Retrieval
Why we trained embedding models with 7680 dimensions and what we learned about the relationship between dimensionality and retrieval quality.
OFA: Towards Building a One-For-All Model
2022 is a year of generalist models! With the bloom of multimodal pretraining, especially the unified model, we have witnessed the opportunity to build a generalist model that is capable of processing tasks of different modalities or multi-modalities! Thus, we propose OFA[^1], namely One-For-All,
GRPO: Group Relative Policy Optimization
Introducing GRPO, a new approach to reinforcement learning from human feedback that improves sample efficiency and alignment stability.
GRPO: Group Relative Policy Optimization
A companion post to our GRPO paper, explaining group relative policy optimization for language model alignment.
Federated Learning Without Compromise
Privacy-preserving machine learning that maintains model quality through novel aggregation protocols.
Federated Learning for Open AI
How federated learning enables collaborative model training while preserving data privacy.
Experience Ledgers: Persistent Memory for AI Agents
Introducing experience ledgers, a framework for giving AI agents persistent, verifiable memory.
The Case for Decentralized Science
Why scientific research needs decentralization, and how blockchain can help.
The Case for Decentralized Science
A manifesto for decentralized science (DeSci) and its application to AI research.
Training LLMs on Collective Intelligence
How we're approaching training data curation to capture humanity's collective intelligence.
Introducing Zen: Open AI for the Open Web
We're launching Zen, an open research initiative to build AI that serves everyone.