Zhenru Zhang / Zhen-Ru Zhang

I am a researcher in the Qwen Team, Alibaba Group. My current research focuses on enhancing the intelligence of Large Language Models, particularly their reasoning and agent capabilities. I have contributed to the development and research of the Qwen series models, working primarily on post-training and agents.

My prior research includes pre-training and fine-tuning language models for NLU tasks, as well as weakly supervised machine learning, which laid the groundwork for my subsequent work on robust language model development and data-efficient training.

Please feel free to contact me (zhangzhenru.zzr@alibaba-inc.com) if you are interested in the Qwen Team or my work.

News

  • [05/2025] 3 papers are accepted to ACL 2025
  • [05/2025] Release the WorldPM paper on scaling human preference modeling [paper]
  • [04/2025] Release the Qwen3 series foundation models [blog] [model] [paper]
  • [03/2025] Release the QwQ-32B reasoning model [blog] [model]
  • [01/2025] Release the Qwen2.5-Math-PRM models for process supervision in mathematical reasoning [paper] [model]
  • [12/2024] Release the ProcessBench benchmark for process supervision in mathematical reasoning [paper] [repo] [data]
  • [11/2024] Release the QwQ-32B-Preview reasoning model [blog] [model]
  • [09/2024] Release the Qwen2.5 series foundation models [paper] [blog]
  • [09/2024] Release the Qwen2.5-Math series mathematical models [paper] [blog] and Qwen2.5-Math-RM reward model [model]
  • [08/2024] Release the Qwen2-Math series mathematical models [blog] and Qwen2-Math-RM reward model [model]
  • [06/2024] Release the Qwen2 series foundation models [paper] [blog]
  • [02/2024] Release the Qwen1.5 series foundation models [blog]
  • [09/2023] Release the Qwen series foundation models [paper] [blog] and the benchmark for Code Interpreter [code]
  • [05/2023] 1 paper is accepted to ACL 2023
  • [03/2023] Release the cross-lingual pre-trained model VECO 2.0, which ranked 1st on the Google XTREME leaderboard [paper]
  • [10/2022] 2 papers are accepted to EMNLP 2022
  • [12/2020] 1 paper is accepted to AAAI 2021

Selected Publications (full publication list on Google Scholar)

  • Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
    S Wang, L Yu, C Gao, C Zheng, S Liu, R Lu, K Dang, X Chen, J Yang, …
    arXiv preprint arXiv:2506.01939, 2025

  • WorldPM: Scaling Human Preference Modeling
    B Wang, R Lin, K Lu, L Yu, Z Zhang, F Huang, C Zheng, K Dang, Y Fan, …
    arXiv preprint arXiv:2505.10527, 2025

  • Qwen3 Technical Report
    A Yang, A Li, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Gao, C Huang, …
    arXiv preprint arXiv:2505.09388, 2025

  • START: Self-taught Reasoner with Tools
    C Li, M Xue, Z Zhang, J Yang, B Zhang, X Wang, B Yu, B Hui, J Lin, D Liu
    arXiv preprint arXiv:2503.04625, 2025

  • The Lessons of Developing Process Reward Models in Mathematical Reasoning
    Z Zhang, C Zheng, Y Wu, B Zhang, R Lin, B Yu, D Liu, J Zhou, J Lin
    ACL 2025 Findings

  • Disentangling Reasoning Tokens and Boilerplate Tokens for Language Model Fine-tuning
    Z Ye, Z Zhang, Y Zhang, J Ma, J Lin, F Feng
    ACL 2025 Findings

  • ProcessBench: Identifying Process Errors in Mathematical Reasoning
    C Zheng, Z Zhang, B Zhang, R Lin, K Lu, B Yu, D Liu, J Zhou, J Lin
    ACL 2025

  • Qwen2.5 Technical Report
    A Yang, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Li, D Liu, F Huang, …
    arXiv preprint arXiv:2412.15115, 2024

  • Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
    A Yang, B Zhang, B Hui, B Gao, B Yu, C Li, D Liu, J Tu, J Zhou, J Lin, K Lu, …
    arXiv preprint arXiv:2409.12122, 2024

  • Qwen2 Technical Report
    A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, …
    arXiv preprint arXiv:2407.10671, 2024

  • Qwen Technical Report
    J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng, Y Fan, W Ge, Y Han, F Huang, …
    arXiv preprint arXiv:2309.16609, 2023

  • Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-Tuning
    ZR Zhang, C Tan, H Xu, C Wang, J Huang, S Huang
    ACL 2023

  • VECO 2.0: Cross-lingual Language Model Pre-training with Multi-Granularity Contrastive Learning
    ZR Zhang, C Tan, S Huang, F Huang
    arXiv preprint arXiv:2304.08205, 2023

  • Contrastive Demonstration Tuning for Pre-trained Language Models
    X Liang, N Zhang, S Cheng, Z Zhang, C Tan, H Chen
    EMNLP 2022 Findings

  • DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
    N Zhang, X Xu, L Tao, H Yu, H Ye, S Qiao, X Xie, X Chen, Z Li, L Li, …
    EMNLP 2022 System Demonstrations

  • Exploiting Unlabeled Data via Partial Label Assignment for Multi-Class Semi-Supervised Learning
    ZR Zhang, QW Zhang, Y Cao, ML Zhang
    AAAI 2021