Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post is dated in the future but will show up by default. To stop future-dated posts from being published, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Active Prompting with Chain-of-Thought for Large Language Models

Published in ACL, 2024

This paper proposes a new method, Active-Prompt, to adapt LLMs to different tasks with task-specific example prompts (annotated with human-designed CoT reasoning).

Recommended citation: Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, and Tong Zhang. 2024. Active Prompting with Chain-of-Thought for Large Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1330–1350, Bangkok, Thailand. Association for Computational Linguistics.
Download Paper

Decoding cortical folding patterns in marmosets using machine learning and large language model

Published in NeuroImage, 2025

This paper identifies genes with transcriptomic differences between concave and convex cortical patterns in marmosets, using machine learning and a large language model.

Recommended citation: Yue Wu, Xuesong Gao, Zhengliang Liu, Pengcheng Wang, Zihao Wu, Yiwei Li, Tuo Zhang, Tianming Liu, Tao Liu, Xiao Li, Decoding cortical folding patterns in marmosets using machine learning and large language model, NeuroImage, Volume 308, 2025
Download Paper

Entropy-Regularized Process Reward Model

Published in TMLR, 2025

This paper proposes an Entropy-Regularized Process Reward Model (ER-PRM) to improve mathematical reasoning in large language models. The key novelty is formulating multi-step reasoning as an entropy-regularized Markov Decision Process, which balances reward optimization against keeping the policy close to its initial distribution. The method derives process reward scores with an aggregation rule based on KL-regularized optimization: each reward is computed as the logarithm of the expected exponentiated reward over completion trajectories sampled from the initial policy. This approach offers two theoretical advantages: a dual formulation (soft-max when sampling from the initial policy, soft-min when sampling from the optimal policy) and independence from the optimal policy during reward computation.
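
The aggregation described above, "the logarithm of expected exponentiated rewards", can be illustrated with a small sketch. This is not the paper's implementation: the function name, the regularization strength `eta`, the division by `eta`, and the use of a plain sample average over completion rewards are all assumptions made here for illustration.

```python
import math
from typing import Sequence


def entropy_regularized_reward(rewards: Sequence[float], eta: float) -> float:
    """Monte Carlo estimate of (1/eta) * log E[exp(eta * r)].

    `rewards` are the outcome rewards of completions sampled from the
    initial policy for one partial reasoning step (hypothetical inputs).
    `eta` is an assumed regularization strength and must be non-zero.
    """
    if not rewards:
        raise ValueError("need at least one sampled completion reward")
    if eta == 0:
        raise ValueError("eta must be non-zero")

    scaled = [eta * r for r in rewards]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    mean_exp = sum(math.exp(s - m) for s in scaled) / len(scaled)
    return (m + math.log(mean_exp)) / eta


# Two sampled completions: one wrong (reward 0), one correct (reward 1).
score = entropy_regularized_reward([0.0, 1.0], eta=1.0)
print(round(score, 4))
```

With a positive `eta`, this score lies between the mean and the maximum of the sampled rewards: a small `eta` moves it toward the plain average, and a large `eta` moves it toward the best completion.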

Recommended citation: Hanning Zhang*, Pengcheng Wang*, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, & Tong Zhang (2025). Entropy-Regularized Process Reward Model. Transactions on Machine Learning Research.
Download Paper

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.