DSpace at KOASAS: A Discrete-Time Switching System Analysis of Q-learning

A Discrete-Time Switching System Analysis of Q-learning

DC Field	Value	Language
dc.contributor.author	Lee, Donghwan	ko
dc.contributor.author	Hu, Jianghai	ko
dc.contributor.author	He, Niao	ko
dc.date.accessioned	2023-08-14T02:00:09Z	-
dc.date.available	2023-08-14T02:00:09Z	-
dc.date.created	2022-11-14	-
dc.date.issued	2023-08	-
dc.identifier.citation	SIAM JOURNAL ON CONTROL AND OPTIMIZATION, v.61, no.3, pp.1861 - 1880	-
dc.identifier.issn	0363-0129	-
dc.identifier.uri	http://hdl.handle.net/10203/311457	-
dc.description.abstract	This paper develops a novel control-theoretic framework to analyze the non-asymptotic convergence of Q-learning. We show that the dynamics of asynchronous Q-learning with a constant step-size can be naturally formulated as a discrete-time stochastic affine switching system. Moreover, the evolution of the Q-learning estimation error is over- and underestimated by trajectories of two simpler dynamical systems. Based on these two systems, we derive a new finite-time error bound of asynchronous Q-learning when a constant stepsize is used. Our analysis also sheds light on the overestimation phenomenon of Q-learning. We further illustrate and validate the analysis through numerical simulations.	-
dc.language	English	-
dc.publisher	SIAM PUBLICATIONS	-
dc.title	A Discrete-Time Switching System Analysis of Q-learning	-
dc.type	Article	-
dc.identifier.wosid	001031998600033	-
dc.identifier.scopusid	2-s2.0-85165535477	-
dc.type.rims	ART	-
dc.citation.volume	61	-
dc.citation.issue	3	-
dc.citation.beginningpage	1861	-
dc.citation.endingpage	1880	-
dc.citation.publicationname	SIAM JOURNAL ON CONTROL AND OPTIMIZATION	-
dc.identifier.doi	10.48550/arXiv.2102.08583	-
dc.contributor.localauthor	Lee, Donghwan	-
dc.contributor.nonIdAuthor	Hu, Jianghai	-
dc.contributor.nonIdAuthor	He, Niao	-
dc.description.isOpenAccess	N	-
dc.type.journalArticle	Article	-
dc.subject.keywordAuthor	Q-learning	-
dc.subject.keywordAuthor	switched linear system	-
dc.subject.keywordAuthor	stochastic approximation	-
dc.subject.keywordPlus	STOCHASTIC-APPROXIMATION	-
dc.subject.keywordPlus	CONVERGENCE	-
dc.subject.keywordPlus	RATES	-
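
The abstract's setting, asynchronous Q-learning with a constant step-size, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the small random MDP, the uniform i.i.d. sampling of state-action pairs, the parameter values, and the helper names make_mdp, optimal_q, async_q_learning, and max_error. It only shows the update rule and the gap to Q*; it does not reproduce the paper's switching-system bounds or its simulations.

```python
import random


def make_mdp(n_states=3, n_actions=2, seed=0):
    """Build a small random MDP (hypothetical example, not from the paper)."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        p_row, r_row = [], []
        for _ in range(n_actions):
            weights = [rng.random() + 1e-3 for _ in range(n_states)]
            total = sum(weights)
            p_row.append([w / total for w in weights])  # transition probabilities
            r_row.append(rng.random())                  # deterministic reward in [0, 1)
        P.append(p_row)
        R.append(r_row)
    return P, R


def optimal_q(P, R, gamma, iters=2000):
    """Approximate Q* by repeated Bellman optimality backups (value iteration)."""
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(n_a)] for s in range(n_s)]
    return Q


def async_q_learning(P, R, gamma, alpha, steps, seed=1):
    """Asynchronous Q-learning: one (s, a) entry is updated per step.

    Assumption: (s, a) is drawn i.i.d. uniformly at each step, and the
    step-size alpha stays constant throughout.
    """
    rng = random.Random(seed)
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(steps):
        s = rng.randrange(n_s)
        a = rng.randrange(n_a)
        s_next = rng.choices(range(n_s), weights=P[s][a])[0]
        td_error = R[s][a] + gamma * max(Q[s_next]) - Q[s][a]
        Q[s][a] += alpha * td_error
    return Q


def max_error(Q, Q_star):
    """Largest absolute entrywise difference between two Q-tables."""
    return max(abs(q - qs) for row, row_s in zip(Q, Q_star)
               for q, qs in zip(row, row_s))


if __name__ == "__main__":
    P, R = make_mdp()
    q_star = optimal_q(P, R, gamma=0.9)
    q = async_q_learning(P, R, gamma=0.9, alpha=0.05, steps=20000)
    print("max |Q - Q*| =", max_error(q, q_star))
```

With a constant step-size the iterate is generally expected to keep fluctuating around Q* rather than converge exactly, which is why the abstract speaks of a finite-time error bound.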

Appears in Collection: EE-Journal Papers(저널논문)

Files in This Item: There are no files associated with this item.

This item is cited by 1 document in WoS.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.

KOASAS

KOASAS

Browse

A Discrete-Time Switching System Analysis of Q-learning

This item is cited by other documents in WoS

KOASAS

Communities & Collections