Embedding active learning in batch-to-batch optimization using reinforcement learning

DC Field | Value | Language
dc.contributor.author | Byun, Ha-Eun | ko
dc.contributor.author | Kim, Boeun | ko
dc.contributor.author | Lee, Jay H. | ko
dc.date.accessioned | 2023-10-17T05:00:15Z | -
dc.date.available | 2023-10-17T05:00:15Z | -
dc.date.created | 2023-10-16 | -
dc.date.issued | 2023-11 | -
dc.identifier.citation | AUTOMATICA, v.157 | -
dc.identifier.issn | 0005-1098 | -
dc.identifier.uri | http://hdl.handle.net/10203/313424 | -
dc.description.abstract | Batch-to-batch (B2B) or run-to-run (R2R) optimization refers to the strategy of updating the operating parameters of a batch run based on the results of previous runs, exploiting the repetitive nature of batch process operation. Although B2B optimization uses feedback from previous batch runs to learn about model uncertainty and improve the operation of future runs, the standard techniques suffer from passive learning and are myopic in making adjustments. This work proposes a novel way to use reinforcement learning to embed an active learning feature into B2B optimization. To this end, the B2B optimization problem is formulated as the maximization of the long-term performance of repeated batch runs, which are modeled as a stochastic process with uncertain parameters. To solve the resulting Bayes-Adaptive Markov decision process (BAMDP) problem in a near-optimal manner, a policy gradient reinforcement learning algorithm is employed. Through case studies, the behavior and effectiveness of the proposed B2B optimization method are examined by comparing it with the traditional certainty-equivalence-based B2B optimization method with passive learning. | -
dc.language | English | -
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | -
dc.title | Embedding active learning in batch-to-batch optimization using reinforcement learning | -
dc.type | Article | -
dc.identifier.wosid | 001072594200001 | -
dc.identifier.scopusid | 2-s2.0-85170571318 | -
dc.type.rims | ART | -
dc.citation.volume | 157 | -
dc.citation.publicationname | AUTOMATICA | -
dc.identifier.doi | 10.1016/j.automatica.2023.111260 | -
dc.contributor.localauthor | Lee, Jay H. | -
dc.contributor.nonIdAuthor | Kim, Boeun | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Batch process optimization | -
dc.subject.keywordAuthor | Batch-to-batch optimization | -
dc.subject.keywordAuthor | Reinforcement learning | -
dc.subject.keywordAuthor | Active learning | -
dc.subject.keywordAuthor | Hyper-state | -
dc.subject.keywordAuthor | Optimization under uncertainty | -
dc.subject.keywordAuthor | Model-plant mismatch | -
dc.subject.keywordAuthor | Bayes-Adaptive Markov decision process | -
dc.subject.keywordPlus | MODEL-PREDICTIVE CONTROL | -
dc.subject.keywordPlus | DYNAMIC OPTIMIZATION | -
Appears in Collection
CBE-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
