New Versions of Gradient Temporal-Difference Learning

Sutton, Szepesvári, and Maei introduced the first gradient temporal-difference (GTD) learning algorithms compatible with both linear function approximation and off-policy training. The goals of this paper are (a) to propose several variants of the GTD algorithms, together with an extensive comparative analysis, and (b) to establish new theoretical analysis frameworks for the GTD algorithms. The variants are based on convex-concave saddle-point interpretations of the GTD algorithms, which unify all of them in a single framework and yield a simple stability analysis based on recent results on primal-dual gradient dynamics. Finally, a numerical comparative analysis is given to evaluate the new approaches.
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2023-08
Language
English
Article Type
Article
Citation
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, v.68, no.8, pp.5006-5013
ISSN
0018-9286
DOI
10.1109/TAC.2022.3213763
URI
http://hdl.handle.net/10203/311840
Appears in Collection
EE-Journal Papers
