Incremental Receptive Field Weighted Actor-Critic

Cited 8 times in Web of Science; cited 0 times in Scopus
DC Field | Value | Language
dc.contributor.author | Lee, Dong-Hyun | ko
dc.contributor.author | Lee, Ju-Jang | ko
dc.date.accessioned | 2013-03-13T04:28:11Z | -
dc.date.available | 2013-03-13T04:28:11Z | -
dc.date.created | 2012-12-03 | -
dc.date.issued | 2013-02 | -
dc.identifier.citation | IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, v.9, no.1, pp.62 - 71 | -
dc.identifier.issn | 1551-3203 | -
dc.identifier.uri | http://hdl.handle.net/10203/104462 | -
dc.description.abstract | In this paper, a novel actor-critic method using an incrementally constructed radial basis function network is developed to deal with continuous state and action problems. There exists one local model for each basis function, and the number of local models is increased as the basis function network grows. The normalized weighted sum of their outputs is used to estimate the value function for the critic, and the models are updated with the local temporal difference error in the receptive field of the corresponding basis function. A Gaussian policy is used for continuous action, and it is parameterized by the mean and the standard deviation. The parameters are determined by the normalized weighted sum of the corresponding sub-parameters assigned to the basis functions, and the regular policy gradient method is used for their update process. A new error is introduced for the online shape adaptation of the basis functions. Reducing this error prevents some of the basis functions from dominating the value function approximation and the policy, and improves the performance when the incrementally constructed basis function network is used. Simulation results for three benchmark problems show the performance and effectiveness of the proposed method in comparison to other methods. | -
dc.language | English | -
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.subject | REINFORCEMENT | -
dc.subject | ALGORITHMS | -
dc.title | Incremental Receptive Field Weighted Actor-Critic | -
dc.type | Article | -
dc.identifier.wosid | 000312839600006 | -
dc.identifier.scopusid | 2-s2.0-84871996847 | -
dc.type.rims | ART | -
dc.citation.volume | 9 | -
dc.citation.issue | 1 | -
dc.citation.beginningpage | 62 | -
dc.citation.endingpage | 71 | -
dc.citation.publicationname | IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | -
dc.identifier.doi | 10.1109/TII.2012.2209660 | -
dc.embargo.liftdate | 9999-12-31 | -
dc.embargo.terms | 9999-12-31 | -
dc.contributor.localauthor | Lee, Ju-Jang | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Actor-critic | -
dc.subject.keywordAuthor | local model | -
dc.subject.keywordAuthor | policy gradient | -
dc.subject.keywordAuthor | receptive field weighted regression | -
dc.subject.keywordAuthor | reinforcement learning | -
dc.subject.keywordPlus | REINFORCEMENT | -
dc.subject.keywordPlus | ALGORITHMS | -
Appears in Collection
EE-Journal Papers (Journal Papers)
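
Illustrative sketch

The abstract describes two quantities built the same way: a critic value formed as the normalized weighted sum of local model outputs, and Gaussian policy parameters (mean and standard deviation) formed as the normalized weighted sum of per-basis sub-parameters. The following is a minimal sketch of that idea only, not the authors' implementation. The isotropic Gaussian basis functions, the linear form of the local models, the scalar action, and all function and parameter names are assumptions made for illustration. The incremental network growth, the update rules, and the shape-adaptation error are not shown because the record gives no formulas for them.

```python
import numpy as np


def normalized_activations(x, centers, widths):
    """Normalized Gaussian basis activations (assumed isotropic); they sum to 1."""
    # Scaled squared distance from the state x to every basis center.
    d2 = np.sum((centers - x) ** 2, axis=1) / (2.0 * widths ** 2)
    # Subtracting the minimum does not change the normalized result,
    # it only avoids underflow when all distances are large.
    phi = np.exp(-(d2 - d2.min()))
    return phi / phi.sum()


def critic_value(x, centers, widths, local_w, local_b):
    """Value estimate as the normalized weighted sum of local model outputs.

    Each local model is assumed linear around its own center:
    v_k(x) = local_w[k] . (x - centers[k]) + local_b[k]
    """
    rho = normalized_activations(x, centers, widths)
    local_out = np.sum(local_w * (x - centers), axis=1) + local_b
    return float(rho @ local_out)


def gaussian_policy(x, centers, widths, mu_k, sigma_k, rng):
    """Sample a scalar action from a Gaussian whose mean and standard
    deviation are normalized weighted sums of per-basis sub-parameters.

    sigma_k is assumed to hold strictly positive values.
    """
    rho = normalized_activations(x, centers, widths)
    mean = float(rho @ mu_k)
    sigma = float(rho @ sigma_k)
    action = rng.normal(mean, sigma)
    return action, mean, sigma
```

Because the activations are normalized, a state that lies midway between two equally wide basis functions weights both local models equally, so the value estimate and the policy parameters are simple averages of the two local contributions in that case.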