Deep reinforcement learning based finite-horizon optimal tracking control for nonlinear system

Cited 11 times in Web of Science; cited 0 times in Scopus
Reinforcement learning (RL) can be used to obtain an approximate numerical solution to the Hamilton-Jacobi-Bellman (HJB) equation. Recent advances in the machine learning community enable deep neural networks (DNNs) to accurately approximate high-dimensional nonlinear functions, such as those arising in RL, without any domain knowledge. In the standard RL setting, both the system and cost structures are unknown, and the amount of data needed to obtain an accurate approximation can be impractically large. When these structures are known, however, they can be exploited to solve the HJB equation efficiently. Herein, a model-based globalized dual heuristic programming (GDHP) method is proposed, in which the HJB equation is decomposed into value, costate, and policy functions. The particular class of interest in this work is the finite-horizon optimal tracking control (FHOC) problem. Additional issues that arise in FHOC, such as time-varying functions, terminal constraints, and the delta-input formulation, are addressed. A DNN structure and training algorithm suitable for FHOC are presented, and a benchmark continuous reactor example illustrates the proposed approach.
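The backward-in-time structure that GDHP approximates with networks can be illustrated on a scalar linear-quadratic tracking problem. This is a minimal sketch only, not the paper's DNN-based method: the dynamics, costs, and reference below are hypothetical, and exact dynamic-programming recursions replace the value, costate, and policy networks so the time-varying value function, its gradient (costate), and the terminal-cost boundary condition are visible in closed form.

```python
import numpy as np

# Hypothetical scalar system x_{k+1} = a*x_k + b*u_k tracking reference r_k
# over finite horizon N, with stage cost q*(x-r)^2 + s*u^2 and terminal
# cost q*(x_N - r_N)^2. The backward recursion computes the time-varying
# value V_k(x) = p_k*x^2 + 2*g_k*x + const, its gradient (the costate),
# and the optimal policy -- the three maps GDHP fits with separate DNNs.
a, b = 0.9, 0.5
q, s = 1.0, 0.1
N = 20
r = np.ones(N + 1)            # constant reference (illustrative choice)

p = np.zeros(N + 1)
g = np.zeros(N + 1)
p[N], g[N] = q, -q * r[N]     # terminal constraint sets the boundary values
K = np.zeros(N)               # feedback gains
k_ff = np.zeros(N)            # feedforward terms from the tracking target
for k in range(N - 1, -1, -1):
    denom = s + b * b * p[k + 1]
    K[k] = a * b * p[k + 1] / denom
    k_ff[k] = b * g[k + 1] / denom
    p[k] = q + a * a * p[k + 1] - (a * b * p[k + 1]) ** 2 / denom
    g[k] = -q * r[k] + a * s * g[k + 1] / denom

def policy(k, x):
    """Time-varying optimal control u_k(x)."""
    return -K[k] * x - k_ff[k]

def costate(k, x):
    """Costate lambda_k(x) = dV_k/dx = 2*(p_k*x + g_k)."""
    return 2.0 * (p[k] * x + g[k])

# Simulate from x0 = 0; the state is driven toward the reference.
x = 0.0
for k in range(N):
    x = a * x + b * policy(k, x)
```

The key point the sketch carries over to the DNN setting is that every quantity is indexed by time `k` and pinned down at `k = N` by the terminal cost, which is why FHOC requires time-varying function approximators rather than a single stationary value function.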
Publisher
Elsevier BV
Issue Date
2018-09
Language
English
Citation

Joint Meeting of the 2nd IFAC Workshop on Linear Parameter Varying Systems (LPVS) / 9th IFAC Symposium on Robust Control Design (ROCOND), pp.257 - 262

ISSN
2405-8963
DOI
10.1016/j.ifacol.2018.11.115
URI
http://hdl.handle.net/10203/312166
Appears in Collection
CBE-Conference Papers(학술회의논문)
Files in This Item
There are no files associated with this item.
