Entropy-Regularized Process Reward Model
Hanning Zhang*, Pengcheng Wang*, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, & Tong Zhang (2025). Entropy-Regularized Process Reward Model. Transactions on Machine Learning Research.