DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Kang Hoon | ko |
dc.contributor.author | Kim, Geon-Hyeong | ko |
dc.contributor.author | Ortega, Pedro | ko |
dc.contributor.author | Lee, Daniel D. | ko |
dc.contributor.author | Kim, Kee-Eung | ko |
dc.date.accessioned | 2019-06-24T01:30:17Z | - |
dc.date.available | 2019-06-24T01:30:17Z | - |
dc.date.created | 2019-03-08 | - |
dc.date.issued | 2019-05 | - |
dc.identifier.citation | MACHINE LEARNING, v.108, no.5, pp.765 - 783 | - |
dc.identifier.issn | 0885-6125 | - |
dc.identifier.uri | http://hdl.handle.net/10203/262791 | - |
dc.description.abstract | We consider a Bayesian approach to model-based reinforcement learning, where the agent uses a distribution of environment models to find the action that optimally trades off exploration and exploitation. Unfortunately, it is intractable to find the Bayes-optimal solution to the problem except for restricted cases. In this paper, we present BOKLE, a simple algorithm that uses Kullback–Leibler divergence to constrain the set of plausible models for guiding the exploration. We provide a formal analysis that this algorithm is near Bayes-optimal with high probability. We also show an asymptotic relation between the solution pursued by BOKLE and a well-known algorithm called Bayesian exploration bonus. Finally, we show experimental results that clearly demonstrate the exploration efficiency of the algorithm. | - |
dc.language | English | - |
dc.publisher | SPRINGER | - |
dc.title | Bayesian Optimistic Kullback-Leibler Exploration | - |
dc.type | Article | - |
dc.identifier.wosid | 000470185100004 | - |
dc.identifier.scopusid | 2-s2.0-85058968448 | - |
dc.type.rims | ART | - |
dc.citation.volume | 108 | - |
dc.citation.issue | 5 | - |
dc.citation.beginningpage | 765 | - |
dc.citation.endingpage | 783 | - |
dc.citation.publicationname | MACHINE LEARNING | - |
dc.identifier.doi | 10.1007/s10994-018-5767-4 | - |
dc.contributor.localauthor | Kim, Kee-Eung | - |
dc.contributor.nonIdAuthor | Ortega, Pedro | - |
dc.contributor.nonIdAuthor | Lee, Daniel D. | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article; Proceedings Paper | - |
dc.subject.keywordAuthor | Model-based Bayesian reinforcement learning | - |
dc.subject.keywordAuthor | Bayes-adaptive Markov decision process | - |
dc.subject.keywordAuthor | PAC-BAMDP | - |