Multi-step lookahead Bayesian optimization with active learning using reinforcement learning and its application to data-driven batch-to-batch optimization

Cited 4 times in Web of Science; cited 0 times in Scopus
This study presents a novel multi-step lookahead Bayesian optimization method that strives for optimal active learning by balancing exploration and exploitation over multiple future sampling-evaluation trials. The approach adopts a Gaussian process (GP) model to represent the underlying function; the model is updated after each sampling and evaluation. A reinforcement learning method, Proximal Policy Optimization (PPO), is then used to locate the next optimal sampling point while accounting for multiple such future trials, with the current GP model serving as a fictitious environment. The approach is applied to batch-to-batch (B2B) optimization, in which an optimal batch recipe is sought without any prior process knowledge. The B2B optimization is formulated as a partially observable Markov decision process (POMDP) problem, and GP model learning and policy learning through PPO are performed iteratively to suggest the next batch recipe. The effectiveness of the approach in the B2B optimization problem is demonstrated through two case studies.
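
The iterative loop described in the abstract (refit a GP surrogate after every batch, then choose the next recipe by simulating several future sampling-evaluation trials on the GP as a fictitious environment) can be sketched compactly. The sketch below is illustrative only: it replaces the paper's PPO policy learning with plain Monte Carlo rollouts over a candidate grid, and black_box is a hypothetical stand-in for the unknown batch objective.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def black_box(x):
    # Hypothetical unknown batch objective (e.g., product quality vs. recipe).
    return -np.sin(3 * x) - x**2 + 0.7 * x

def rollout_value(gp, x0, horizon=3, n_rollouts=10):
    # Estimate the multi-step value of sampling at x0: simulate `horizon`
    # future sampling-evaluation trials on the GP posterior (the fictitious
    # environment) and return the best simulated outcome, averaged over
    # rollouts. This is a crude stand-in for the PPO-learned policy.
    values = []
    for _ in range(n_rollouts):
        xs, best = [x0], -np.inf
        for _ in range(horizon):
            mu, sd = gp.predict(np.array(xs[-1]).reshape(1, -1), return_std=True)
            y_sim = rng.normal(mu[0], sd[0])  # draw from the GP posterior
            best = max(best, y_sim)
            # Crude next-step proposal in place of a learned policy.
            xs.append(np.clip(xs[-1] + rng.normal(0, 0.2), -1, 2))
        values.append(best)
    return np.mean(values)

# Batch-to-batch loop: run a batch, refit the GP, pick the next recipe
# by multi-step lookahead on the surrogate.
X = rng.uniform(-1, 2, size=(3, 1))
y = black_box(X).ravel()
candidates = np.linspace(-1, 2, 50).reshape(-1, 1)

for batch in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-3)
    gp.fit(X, y)
    scores = [rollout_value(gp, c) for c in candidates]
    x_next = candidates[int(np.argmax(scores))].reshape(1, -1)
    y_next = black_box(x_next).ravel()  # evaluate the real batch
    X, y = np.vstack([X, x_next]), np.concatenate([y, y_next])
    print(f"batch {batch}: recipe={x_next.ravel()[0]:.3f}, outcome={y_next[0]:.3f}")
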
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
Issue Date
2022-11
Language
English
Article Type
Article
Citation
COMPUTERS & CHEMICAL ENGINEERING, v.167
ISSN
0098-1354
DOI
10.1016/j.compchemeng.2022.107987
URI
http://hdl.handle.net/10203/298766
Appears in Collection
CBE-Journal Papers (Journal Papers)