Model-augmented Prioritized Experience Replay

Experience replay is an essential component of off-policy model-free reinforcement learning (MfRL). Owing to its effectiveness, various methods for computing priority scores over experiences have been proposed for sampling. Since critic networks are crucial to policy learning, the TD-error, which is directly correlated with Q-values, is one of the most frequently used features for computing these scores. However, critic networks often under- or overestimate Q-values, so learning to predict Q-values from experiences sampled heavily by TD-error is often ineffective. It is therefore valuable to find auxiliary features that complement the TD-error when calculating priority scores for efficient sampling. Motivated by this, we propose a novel experience replay method, which we call model-augmented prioritized experience replay (MaPER), that employs new learnable features derived from components of model-based RL (MbRL) to calculate the scores on experiences. The proposed MaPER brings the effect of curriculum learning, helping the critic network predict Q-values better, with negligible memory and computational overhead compared to vanilla PER. Indeed, our experimental results on various tasks demonstrate that MaPER can significantly improve the performance of state-of-the-art off-policy MfRL, as well as MbRL algorithms that include off-policy MfRL in their policy optimization procedure. One-sentence Summary: We propose a novel experience replay method that employs additional auxiliary learnable features, as well as TD-errors, for prioritizing experiences.
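The abstract describes prioritizing experiences by a score that augments the TD-error with learnable model-based features. The paper's exact scoring formula is not given here, so the following is only a minimal sketch of the general idea: a prioritized replay buffer whose priority mixes the absolute TD-error with a model-prediction error. The mixing weight `eta` and the buffer class itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


class ModelAugmentedReplayBuffer:
    """Sketch of a PER-style buffer whose priority combines TD-error
    with a model-prediction error (hypothetical weighting, not MaPER's
    exact formula)."""

    def __init__(self, capacity, alpha=0.6, eta=0.5, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha  # priority exponent, as in vanilla PER
        self.eta = eta      # assumed mixing weight between the two errors
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []

    def add(self, transition, td_error, model_error):
        # Combined score: TD-error augmented by the model's prediction error.
        score = (1.0 - self.eta) * abs(td_error) + self.eta * abs(model_error)
        priority = (score + self.eps) ** self.alpha
        if len(self.data) >= self.capacity:
            # Drop the oldest experience, FIFO-style.
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample indices in proportion to priority, as in standard PER.
        probs = np.asarray(self.priorities)
        probs = probs / probs.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return [self.data[i] for i in idx]
```

In this sketch, experiences whose transitions the learned dynamics model predicts poorly receive a higher score even when their TD-error is small, which is one way an auxiliary model-based signal can compensate for under- or overestimated Q-values in the critic.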
Publisher
International Conference on Learning Representations, ICLR
Issue Date
2022-04-25
Language
English
Citation

10th International Conference on Learning Representations, ICLR 2022

URI
http://hdl.handle.net/10203/301656
Appears in Collection
AI-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.
