Reinforcement Learning Based Optimal Control of Batch Processes Using Monte-Carlo Deep Deterministic Policy Gradient with Phase Segmentation

Cited 63 times in Web of Science · Cited 33 times in Scopus
  • Hit: 318
  • Download: 818
DC Field | Value | Language
dc.contributor.author | Yoo, Haeun | ko
dc.contributor.author | Kim, Boeun | ko
dc.contributor.author | Kim, Jong Woo | ko
dc.contributor.author | Lee, Jay Hyung | ko
dc.date.accessioned | 2021-01-28T05:52:31Z | -
dc.date.available | 2021-01-28T05:52:31Z | -
dc.date.created | 2020-11-04 | -
dc.date.issued | 2021-01 | -
dc.identifier.citation | COMPUTERS & CHEMICAL ENGINEERING, v.144, pp.107133 | -
dc.identifier.issn | 0098-1354 | -
dc.identifier.uri | http://hdl.handle.net/10203/280027 | -
dc.description.abstract | Batch process control represents a challenge given its dynamic operation over a large operating envelope. Nonlinear model predictive control (NMPC) is the current standard for optimal control of batch processes. The performance of conventional NMPC can be unsatisfactory in the presence of uncertainties. Reinforcement learning (RL), which can utilize simulation or real operation data, is a viable alternative for such problems. To apply RL to batch process control effectively, however, choices such as the reward function design and value update method must be made carefully. This study proposes a phase segmentation approach for the reward function design and value/policy function representation. In addition, the deep deterministic policy gradient (DDPG) algorithm is modified with Monte-Carlo learning to ensure more stable and efficient learning behavior. A case study of a batch polymerization process producing polyols is used to demonstrate the improvement brought by the proposed approach and to highlight further issues. | -
dc.language | English | -
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | -
dc.title | Reinforcement Learning Based Optimal Control of Batch Processes Using Monte-Carlo Deep Deterministic Policy Gradient with Phase Segmentation | -
dc.type | Article | -
dc.identifier.wosid | 000598170500004 | -
dc.identifier.scopusid | 2-s2.0-85096193107 | -
dc.type.rims | ART | -
dc.citation.volume | 144 | -
dc.citation.beginningpage | 107133 | -
dc.citation.publicationname | COMPUTERS & CHEMICAL ENGINEERING | -
dc.identifier.doi | 10.1016/j.compchemeng.2020.107133 | -
dc.contributor.localauthor | Lee, Jay Hyung | -
dc.contributor.nonIdAuthor | Kim, Boeun | -
dc.contributor.nonIdAuthor | Kim, Jong Woo | -
dc.description.isOpenAccess | Y | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Batch process | -
dc.subject.keywordAuthor | Reinforcement learning | -
dc.subject.keywordAuthor | Optimal control | -
dc.subject.keywordAuthor | Actor-Critic | -
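The abstract above names two algorithmic ingredients: replacing the one-step bootstrapped critic target of standard DDPG with full-episode Monte-Carlo returns, and segmenting the batch run into phases for reward design and value/policy representation. The sketch below is a minimal illustration of those two ideas, not the authors' implementation; the function names, the two-phase split, the sparse terminal reward, and the discount factor are all assumptions made for illustration only.

```python
# Illustrative sketch (not the paper's code): Monte-Carlo critic targets
# and phase assignment for a DDPG-style batch-process controller.
import numpy as np

def monte_carlo_returns(rewards, gamma=1.0):
    """Full-episode Monte-Carlo returns G_t = sum_{k>=t} gamma^(k-t) * r_k.

    Standard DDPG bootstraps with the one-step TD target
    r_t + gamma * Q(s_{t+1}, mu(s_{t+1})); using the observed return of the
    finite-horizon batch run instead avoids propagating early critic errors
    through bootstrapping.
    """
    returns = np.zeros(len(rewards))
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns

def segment_episode(times, phase_boundaries):
    """Map each time step to a phase index given boundary times.

    Phase segmentation lets each phase carry its own reward shaping and its
    own critic/actor networks over a smaller, more homogeneous state region.
    """
    return np.searchsorted(phase_boundaries, times, side="right")

# Usage: a 5-step episode with a sparse terminal reward, split into two
# phases at t = 2.5 (hypothetical numbers).
rewards = np.array([0.0, 0.0, 0.0, 0.0, 10.0])
times = np.arange(5, dtype=float)
targets = monte_carlo_returns(rewards, gamma=0.99)
phases = segment_episode(times, phase_boundaries=[2.5])
for t, (g, p) in enumerate(zip(targets, phases)):
    print(f"t={t}  phase={p}  MC critic target={g:.3f}")
```

In this sketch the critic for each phase would be regressed directly onto its Monte-Carlo targets, which is consistent with the abstract's claim that the Monte-Carlo modification yields more stable and efficient learning for finite-horizon batch episodes.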