Zhenru Zhang / Zhen-Ru Zhang

I am a researcher in the Qwen Team, Alibaba Group. My current research focuses on enhancing the intelligence of Large Language Models, particularly their reasoning and agent capabilities. I have contributed to the development and research of the Qwen series models, working primarily on post-training and agents.

My prior research includes pre-training and fine-tuning language models for NLU tasks, as well as weakly supervised machine learning, which laid the groundwork for my subsequent work on robust language model development and data-efficient training.

Please feel free to contact me (zhangzhenru.zzr@alibaba-inc.com) if you are interested in the Qwen Team or my work.

News

  • [05/2025] 3 papers are accepted to ACL 2025
  • [05/2025] Release the WorldPM paper on scaling human preference modeling [paper]
  • [04/2025] Release the Qwen3 series foundation models [blog] [model] [paper]
  • [03/2025] Release the QwQ-32B reasoning model [blog] [model]
  • [01/2025] Release the Qwen2.5-Math-PRM models for process supervision in mathematical reasoning [paper] [model]
  • [12/2024] Release the ProcessBench benchmark for process supervision in mathematical reasoning [paper] [repo] [data]
  • [11/2024] Release the QwQ-32B-Preview reasoning model [blog] [model]
  • [09/2024] Release the Qwen2.5 series foundation models [paper] [blog]
  • [09/2024] Release the Qwen2.5-Math series mathematical models [paper] [blog] and Qwen2.5-Math-RM reward model [model]
  • [08/2024] Release the Qwen2-Math series mathematical models [blog] and Qwen2-Math-RM reward model [model]
  • [06/2024] Release the Qwen2 series foundation models [paper] [blog]
  • [02/2024] Release the Qwen1.5 series foundation models [blog]
  • [09/2023] Release the Qwen series foundation models [paper] [blog] and the benchmark for Code Interpreter [code]
  • [05/2023] 1 paper is accepted to ACL 2023
  • [03/2023] Release the cross-lingual pre-trained model VECO 2.0, which ranked 1st on the Google XTREME leaderboard [paper]
  • [10/2022] 2 papers are accepted to EMNLP 2022
  • [12/2020] 1 paper is accepted to AAAI 2021

Selected Publications (full publication list on Google Scholar)

  • Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
    S Wang, L Yu, C Gao, C Zheng, S Liu, R Lu, K Dang, X Chen, J Yang, …
    arXiv preprint arXiv:2506.01939, 2025

  • WorldPM: Scaling Human Preference Modeling
    B Wang, R Lin, K Lu, L Yu, Z Zhang, F Huang, C Zheng, K Dang, Y Fan, …
    arXiv preprint arXiv:2505.10527, 2025

  • Qwen3 Technical Report
    A Yang, A Li, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Gao, C Huang, …
    arXiv preprint arXiv:2505.09388, 2025

  • START: Self-taught Reasoner with Tools
    C Li, M Xue, Z Zhang, J Yang, B Zhang, X Wang, B Yu, B Hui, J Lin, D Liu
    arXiv preprint arXiv:2503.04625, 2025

  • The Lessons of Developing Process Reward Models in Mathematical Reasoning
    Z Zhang, C Zheng, Y Wu, B Zhang, R Lin, B Yu, D Liu, J Zhou, J Lin
    ACL 2025 Findings

  • Disentangling Reasoning Tokens and Boilerplate Tokens for Language Model Fine-tuning
    Z Ye, Z Zhang, Y Zhang, J Ma, J Lin, F Feng
    ACL 2025 Findings

  • ProcessBench: Identifying Process Errors in Mathematical Reasoning
    C Zheng, Z Zhang, B Zhang, R Lin, K Lu, B Yu, D Liu, J Zhou, J Lin
    ACL 2025

  • Qwen2.5 Technical Report
    A Yang, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Li, D Liu, F Huang, …
    arXiv preprint arXiv:2412.15115, 2024

  • Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
    A Yang, B Zhang, B Hui, B Gao, B Yu, C Li, D Liu, J Tu, J Zhou, J Lin, K Lu, …
    arXiv preprint arXiv:2409.12122, 2024

  • Qwen2 Technical Report
    A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, …
    arXiv preprint arXiv:2407.10671, 2024

  • Qwen Technical Report
    J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng, Y Fan, W Ge, Y Han, F Huang, …
    arXiv preprint arXiv:2309.16609, 2023

  • Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-Tuning
    ZR Zhang, C Tan, H Xu, C Wang, J Huang, S Huang
    ACL 2023

  • VECO 2.0: Cross-lingual Language Model Pre-training with Multi-Granularity Contrastive Learning
    ZR Zhang, C Tan, S Huang, F Huang
    arXiv preprint arXiv:2304.08205, 2023

  • Contrastive Demonstration Tuning for Pre-trained Language Models
    X Liang, N Zhang, S Cheng, Z Zhang, C Tan, H Chen
    EMNLP 2022 Findings

  • DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
    N Zhang, X Xu, L Tao, H Yu, H Ye, S Qiao, X Xie, X Chen, Z Li, L Li, …
    EMNLP 2022 System Demonstrations

  • Exploiting Unlabeled Data via Partial Label Assignment for Multi-Class Semi-Supervised Learning
    ZR Zhang, QW Zhang, Y Cao, ML Zhang
    AAAI 2021