Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Erwan Lecarpentier and Emmanuel Rachelson. NeurIPS 2019. (Poster PDF · Paper)