Comparison of online and offline deep reinforcement learning with model predictive control for thermal energy management

Journal Article


Abstract


  • This paper compares online and offline Deep Reinforcement Learning (DRL) formulations with a Model Predictive Control (MPC) architecture for the energy management of a cold-water buffer tank linking an office building and a chiller subject to time-varying energy prices, with the objective of minimizing operating costs. The intrinsically model-free nature of DRL is generally lost in common energy-management implementations, as agents are usually pre-trained offline and require a surrogate model for that purpose. Simulation results showed that the online-trained DRL agent, while requiring an initial four-week adjustment period during which it performed relatively poorly (160% higher cost), converged to a control policy almost as effective as the model-based strategies (3.6% higher cost in the last month). This suggests that an online-trained DRL agent may be a promising way to overcome the barrier posed by the modelling requirements of MPC and of offline-trained DRL approaches.
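
To make the online-learning idea concrete, here is a minimal sketch: a tabular Q-learning agent (a deliberate simplification; the paper uses deep RL) that learns to operate a toy chiller/buffer-tank system purely from interaction, with no surrogate model, under an assumed two-level time-of-use tariff. All dynamics, parameters, and names below are illustrative assumptions, not the authors' implementation.

```python
# Sketch of online, model-free RL for a toy cold-water buffer tank.
# Tabular Q-learning stands in for the paper's deep RL agent; every
# model and constant here is an illustrative assumption.
import random

import numpy as np

N_TEMP_BINS, N_PRICE_BINS, N_ACTIONS = 10, 2, 2    # discretized state/action space
Q = np.zeros((N_TEMP_BINS, N_PRICE_BINS, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.95, 0.1                 # learning rate, discount, exploration

def price(hour):
    """Assumed two-level time-of-use tariff (currency/kWh)."""
    return 0.30 if 8 <= hour % 24 < 20 else 0.10

def step(temp, hour, action):
    """Toy tank dynamics: the chiller (action=1) cools the tank, the load warms it."""
    cooling = 2.0 if action else 0.0                # cooling delivered this hour (kWh)
    load = 1.0                                      # building load, assumed constant
    temp = min(max(temp + 0.5 * (load - cooling), 5.0), 15.0)
    cost = cooling * price(hour)                    # cost of running the chiller
    penalty = 10.0 if temp > 12.0 else 0.0          # tank too warm to serve the load
    return temp, -(cost + penalty)                  # reward = negative operating cost

def discretize(temp, hour):
    """Map continuous tank temperature and current tariff level to table indices."""
    return int((temp - 5.0) / 10.0 * (N_TEMP_BINS - 1)), int(price(hour) > 0.2)

temp = 10.0
for t in range(4 * 7 * 24):                         # ~4 weeks of hourly interaction
    s = discretize(temp, t)
    a = random.randrange(N_ACTIONS) if random.random() < eps else int(np.argmax(Q[s]))
    temp, r = step(temp, t, a)
    s2 = discretize(temp, t + 1)
    # Online Q-learning update: learning purely from experience, no surrogate model.
    Q[s + (a,)] += alpha * (r + gamma * np.max(Q[s2]) - Q[s + (a,)])
```

The early weeks of such a loop are spent exploring (hence the initial period of poor performance reported in the abstract); as the value table converges, the agent learns to shift chiller operation toward low-tariff hours while keeping the tank cold enough to serve the building load.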

Publication Date


  • 2022

Citation


  • Brandi, S., Fiorentini, M., & Capozzoli, A. (2022). Comparison of online and offline deep reinforcement learning with model predictive control for thermal energy management. Automation in Construction, 135. doi:10.1016/j.autcon.2022.104128

Scopus Eid


  • 2-s2.0-85122507811

Volume


  • 135
