Trust Region-Guided Proximal Policy Optimization

Yuhui Wang, Hao He, Xiaoyang Tan, and Yaozhong Gan. NeurIPS 2019. (Poster PDF · Paper)