Multi-armed bandit problem with intra- and inter-correlations (Korean title: A study on the multi-slot-machine problem with given internal and external correlations among slot machines)

We address a multi-slot-machine problem in which the expected reward of each machine is hidden. The objective is to select the arm with the highest expected reward, which is called the optimal arm. This problem is usually called the multi-armed bandit (MAB) problem, a well-known reinforcement learning problem concerning the tradeoff between exploration and exploitation. To extend the MAB problem, we consider a unimodal correlation between arms together with additionally observable conditions. In the proposed framework, we prove the asymptotic fundamental limit and propose an algorithm that achieves this limit.
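As a rough illustration of the basic MAB setting described in the abstract, the following is a minimal sketch of the standard UCB1 index policy on Bernoulli arms. It is not the algorithm proposed in the thesis and it does not exploit the unimodal structure. The function name `ucb1`, the mean vector, and the horizon are hypothetical choices made only for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run the standard UCB1 policy on Bernoulli arms.

    arm_means: hidden expected rewards (assumed values, for simulation only).
    Returns (pull counts per arm, empirical mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Exploration bootstrap: pull every arm once.
            arm = t - 1
        else:
            # Exploitation plus exploration bonus: empirical mean + confidence radius.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Bernoulli reward drawn from the hidden mean of the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    means = [sums[a] / counts[a] for a in range(n_arms)]
    return counts, means


if __name__ == "__main__":
    # Hypothetical unimodal mean vector: rewards rise to a single peak, then fall.
    counts, means = ucb1([0.1, 0.3, 0.8, 0.4, 0.2], horizon=5000)
    print("pulls per arm:", counts)
    print("empirical means:", [round(m, 3) for m in means])
```

The mean vector in the usage example is unimodal only to echo the structure the abstract mentions. An algorithm that uses that structure could restrict exploration to neighbors of the current best arm, which plain UCB1 does not do.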
Advisors
Yi, Yung (이융)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2017.8, [iv, 30 p.]

Keywords

Reinforcement learning; Multi-armed bandit problem; Exploration-exploitation tradeoff; Sequential decision problem; Unimodal condition; Unimodal correlation

URI
http://hdl.handle.net/10203/243351
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=718683&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
