
Introduction

In early April, we introduced CodeQwen1.5, which garnered significant attention from the community. Since then, we have been working to further enhance the coding model. Today, we are excited to announce the release of the next generation of our open-source coding models, zen-Coder, and to officially rename CodeQwen to zen-Coder. We think “Coder” is more human-like and agile, reflecting our vision of it becoming a true coding partner in the future. zen-Coder is part of the zen series and comes in three model sizes: 1.5B, 7B, and a 32B version (coming soon).

This update focuses on two main improvements: scaling up the code training data and enhancing coding capabilities while maintaining strong performance in other core areas like math and general tasks.

💻 Code More: zen-Coder builds on the strong zen base model and continues pretraining on a larger scale of code data, including source code, text-code grounding data, and synthetic data, totaling 5.5 trillion tokens. This leads to significant improvements on code-related tasks.

📚 Learn More: While enhancing coding abilities, we also aimed to retain the base model’s strengths in math and general capabilities. zen-Coder therefore incorporates additional data on mathematics and general abilities, providing a comprehensive foundation for real-world applications such as code agents.

zen-Coder: Base Models

zen-Coder supports up to 128K tokens of context, covers 92 programming languages, and achieves remarkable improvements across various code-related evaluation tasks, including code generation, multi-programming-language code generation, code completion, and code repair. Notably, the open-source 7B version of zen-Coder even outperforms larger models such as DeepSeek-Coder-V2-Lite and Codestral-22B, making it one of the most powerful base code models available. Beyond code tasks, zen-Coder also demonstrates competitive math capabilities on evaluations such as GSM8K and MATH. For general tasks, evaluations on MMLU and ARC show that zen-Coder retains the general-ability performance of zen.
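Code-generation benchmarks like the ones above typically score models by functional correctness: a sampled completion counts as a pass only if it survives the task’s unit tests. The sketch below illustrates that grading idea in minimal form; the task, candidate completions, and tests are invented for illustration and are not drawn from any actual benchmark or harness.

```python
# Minimal functional-correctness check in the style of code-generation
# benchmarks: a completion passes iff its unit tests run without error.
# The candidates and tests below are illustrative stand-ins.

def passes(candidate_src: str, test_src: str) -> bool:
    """Execute a candidate solution and its assert-based tests in a fresh namespace."""
    env: dict = {}
    try:
        exec(candidate_src, env)   # define the candidate function
        exec(test_src, env)        # run the unit tests against it
        return True
    except Exception:
        return False

def pass_at_1(candidates: list[str], test_src: str) -> float:
    """Fraction of sampled completions that pass their tests."""
    results = [passes(c, test_src) for c in candidates]
    return sum(results) / len(results)

if __name__ == "__main__":
    tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
    samples = [
        "def add(a, b):\n    return a + b",   # correct completion
        "def add(a, b):\n    return a - b",   # buggy completion
    ]
    print(pass_at_1(samples, tests))  # 0.5
```

Real harnesses sandbox the execution (separate process, timeouts, restricted builtins) since generated code is untrusted; this sketch omits that for brevity.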

zen-Coder-Instruct: Instruction-Tuned Models

We fine-tuned zen-Coder on instruction data to create zen-Coder-Instruct. The instruction-tuned model not only further improves task performance but also demonstrates exceptional generalization across a range of benchmarks.

zen-Coder-Instruct excels in several key areas:

  1. Outstanding Multi-programming Expert: We expanded the multi-language evaluations using McEval, covering more than 40 programming languages. The results show that zen-Coder-Instruct performs remarkably well across many languages, including niche ones.
  2. Code Reasoning: We believe code reasoning is closely tied to general reasoning skills. Using CRUXEval as the benchmark, the results show that zen-Coder-Instruct excels at code reasoning tasks. Interestingly, as code reasoning improves, the model’s ability to follow complex instructions also improves, encouraging us to further explore how code can enhance general skills.
  3. Math Reasoning: Math and code are often discussed together: math is the foundation of code, and code is a key tool for math. zen-Coder-Instruct shines in both code and math tasks, proving itself a “science student”.
| Model | MATH | GSM8K | GaoKao2023En | OlympiadBench | CollegeMath | AIME24 |
|---|---|---|---|---|---|---|
| DeepSeek-Coder-V2-Lite-Instruct | 61.0 | 87.6 | 56.1 | 26.4 | 39.8 | 6.7 |
| zen-Coder-7B-Instruct | 66.8 | 86.7 | 60.5 | 29.8 | 43.5 | 10.0 |
  4. Basic Capabilities: We also assessed general capabilities, and the results indicate that zen-Coder-Instruct maintains the advantages of zen in terms of general abilities.
| Model | AMC23 | MMLU-Pro | MMLU | IFEval | CEval | GPQA |
|---|---|---|---|---|---|---|
| DeepSeek-Coder-V2-Lite-Instruct | 40.4 | 42.5 | 60.6 | 38.6 | 60.1 | 27.6 |
| zen-Coder-7B-Instruct | 42.5 | 45.6 | 68.7 | 58.6 | 61.4 | 35.6 |
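CRUXEval-style code reasoning asks a model to predict a program’s output rather than write new code, so grading reduces to executing the reference function on the given input and comparing against the model’s prediction. A minimal sketch under that assumption follows; the sample function and the two predictions are invented for illustration, and real benchmark items and harnesses differ in detail.

```python
# Sketch of grading a CRUXEval-style output-prediction item: run the
# reference function on the benchmark input and compare the true result
# with the model's predicted output. Function and predictions are
# illustrative, not taken from the actual benchmark.

from typing import Any

def grade_output_prediction(fn_src: str, fn_name: str,
                            args: tuple, predicted: Any) -> bool:
    """Return True iff `predicted` equals the function's actual return value."""
    env: dict = {}
    exec(fn_src, env)               # define the reference function
    actual = env[fn_name](*args)    # run it on the benchmark input
    return actual == predicted

if __name__ == "__main__":
    src = "def f(s):\n    return s[::-1].upper()\n"
    # A model that reasons through the reversal and casing correctly:
    print(grade_output_prediction(src, "f", ("abc",), "CBA"))   # True
    # A model that forgets the reversal step:
    print(grade_output_prediction(src, "f", ("abc",), "ABC"))   # False
```

As with code-generation grading, a production harness would sandbox the execution; the point here is only that the scoring signal is exact-match against ground-truth execution.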

License

zen-Coder is released under the Apache 2.0 license. We hope this increased openness will accelerate its application in code intelligence.

What’s Next for zen-Coder?

We are preparing the 32B version of zen-Coder, aiming to challenge proprietary models. Stay tuned—it’s coming soon! Additionally, we’re exploring powerful code-centric reasoning models to push the boundaries of code intelligence.

Citation

@article{hui2024qwen2,
  title={zen-Coder Technical Report},
  author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
}
@article{yang2024qwen2,
  title={zen technical report},
  author={Yang, An and Yang, Baosong and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Zhou, Chang and Li, Chengpeng and Li, Chengyuan and Liu, Dayiheng and Huang, Fei and others},
  journal={arXiv preprint arXiv:2407.10671},
  year={2024}
}