Reinforcement Learning Based Optimal Control of Batch Processes Using Monte-Carlo Deep Deterministic Policy Gradient with Phase Segmentation

Batch process control is challenging because batch processes operate dynamically over a large operating envelope. Nonlinear model predictive control (NMPC) is the current standard for optimal control of batch processes, but the performance of conventional NMPC can be unsatisfactory in the presence of uncertainties. Reinforcement learning (RL), which can utilize simulation or real operation data, is a viable alternative for such problems. To apply RL to batch process control effectively, however, choices such as the reward function design and the value update method must be made carefully. This study proposes a phase segmentation approach for the reward function design and the value/policy function representation. In addition, the deep deterministic policy gradient (DDPG) algorithm is modified with Monte-Carlo learning to ensure more stable and efficient learning behavior. A case study of a batch polymerization process producing polyols is used to demonstrate the improvement brought by the proposed approach and to highlight further issues.
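The abstract does not state the update equations, so the following is only a generic sketch of what "Monte-Carlo learning" usually means for a critic: the regression target at each step is the full discounted return observed over a finished batch, rather than a bootstrapped one-step estimate. The function name, the discount factor, and the reward sequence are illustrative assumptions, not taken from the paper.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1} for every step of one finished episode.

    Assumption: one episode corresponds to one completed batch run.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Sweep backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical batch with a single reward at the end of the run.
episode_rewards = [0.0, 0.0, 1.0]
targets = monte_carlo_returns(episode_rewards, gamma=0.5)
print(targets)  # [0.25, 0.5, 1.0]
```

In a DDPG-style setup, values like `targets` would stand in for the bootstrapped critic targets. How the paper combines them with phase segmentation is not described in the abstract.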
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
Issue Date
2021-01
Language
English
Article Type
Article
Citation
COMPUTERS & CHEMICAL ENGINEERING, v.144, pp.107133
ISSN
0098-1354
DOI
10.1016/j.compchemeng.2020.107133
URI
http://hdl.handle.net/10203/280027
Appears in Collection
CBE-Journal Papers