Improving Thompson sampling via information relaxation for budgeted multi-armed bandits

We consider a Bayesian budgeted multi-armed bandit problem, in which each arm consumes a different amount of resources when selected and there is a budget constraint on the total amount of resources that can be used. Budgeted Thompson Sampling (BTS) offers a very effective heuristic for this problem, but its arm-selection rule does not take the remaining budget into account. We adopt the Information Relaxation Sampling framework, which generalizes Thompson Sampling for classical K-armed bandit problems, and propose a series of algorithms that are randomized like BTS but optimize their decisions more carefully with respect to the budget constraint. In one-to-one correspondence with these algorithms, we also suggest a series of performance benchmarks that improve upon the conventional benchmark. Our theoretical analysis and simulation results show that our algorithms (and our benchmarks) make incremental improvements over BTS (respectively, the conventional benchmark) across various settings, including a real-world example.
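The baseline the abstract improves upon, Budgeted Thompson Sampling, can be sketched as follows. This is a minimal illustration, not the thesis's proposed algorithm: it assumes Bernoulli rewards and Bernoulli costs with independent Beta(1, 1) priors, and plays the arm with the highest posterior-sampled reward-to-cost ratio until the budget runs out (note the sampled ratio ignores how much budget remains, which is the shortcoming the thesis addresses).

```python
import random


def budgeted_thompson_sampling(arms, budget, rng=None):
    """Sketch of Budgeted Thompson Sampling (BTS).

    `arms` is a list of (reward_prob, cost_prob) pairs defining a toy
    environment with Bernoulli rewards and Bernoulli costs; each pull
    consumes its realized cost from `budget`.  Beta(1, 1) priors are
    maintained for both the reward rate and the cost rate of every arm.
    Returns the total reward collected before the budget is exhausted.
    """
    rng = rng or random.Random(0)
    k = len(arms)
    # Beta posterior parameters [alpha, beta] = [successes + 1, failures + 1].
    r_post = [[1, 1] for _ in range(k)]  # reward-rate posteriors
    c_post = [[1, 1] for _ in range(k)]  # cost-rate posteriors
    total_reward = 0

    while budget > 0:
        # Sample a plausible reward/cost ratio for every arm and play
        # the arm whose sampled ratio is highest.  Note: the remaining
        # budget does not enter this arm-selection rule.
        best_arm, best_score = 0, -1.0
        for i in range(k):
            theta = rng.betavariate(*r_post[i])  # sampled reward rate
            gamma = rng.betavariate(*c_post[i])  # sampled cost rate
            score = theta / max(gamma, 1e-9)     # guard against gamma ~ 0
            if score > best_score:
                best_arm, best_score = i, score

        # Pull the chosen arm: observe a Bernoulli reward and cost.
        reward = 1 if rng.random() < arms[best_arm][0] else 0
        cost = 1 if rng.random() < arms[best_arm][1] else 0
        total_reward += reward
        budget -= cost

        # Conjugate Beta updates for the pulled arm.
        r_post[best_arm][0 if reward else 1] += 1
        c_post[best_arm][0 if cost else 1] += 1

    return total_reward
```

For instance, `budgeted_thompson_sampling([(0.8, 0.5), (0.3, 0.2)], budget=50)` runs the heuristic on a two-arm toy instance with a budget of 50 cost units.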
Advisors
민승기 (Seungki Min)
Description
Korea Advanced Institute of Science and Technology : Department of Industrial and Systems Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology : Department of Industrial and Systems Engineering, 2023.8, [v, 42 p.]

Keywords

Multi-armed bandit; Bayesian; Budget constraint; Thompson sampling; Information relaxation

URI
http://hdl.handle.net/10203/320641
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045864&flag=dissertation
Appears in Collection
IE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
