Sequential decision making with only return and action
(Korean title: 보상반환값과 행동만이 주어진 상황에서의 순차적 의사결정)

As transformer architectures have recently shown superior performance in sequence modeling, several approaches have been proposed to apply transformers in various fields, including sequential decision making and reinforcement learning, as in the prior work on Decision Transformers. However, the Markov Decision Process (MDP), the standard problem setting in sequential decision making and reinforcement learning, requires information on the transition sequence of states, actions, and rewards. This information is not always available in real-world problems. In this paper, we propose a new problem setting for decision making, a relaxation of the MDP that requires fewer conditions and is therefore easier to apply in many real-world situations, such as robotic control or experimental design. By extending the approach used in Decision Transformers, we suggest a decision-making method that leverages the sequence modeling power of transformers in this new problem setting. Additionally, we propose a framework that could enable goal-oriented active learning in this new problem setting, using uncertainty modeling and sequence generation.
Advisors
황성주
Description
Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Kim Jaechul Graduate School of AI, 2023.8, [i, 17 p.]

Keywords

Sequential decision making; Reinforcement learning; Decision transformer; Transformer architecture; GPT architecture; Self-supervised learning; Uncertainty modeling; Active learning; Experimental design

URI
http://hdl.handle.net/10203/320552
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045740&flag=dissertation
Appears in Collection
AI-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
