Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Deep reinforcement learning strategy integrating environmental perception and semantic masking for real-time operational optimization of building cluster microgrids

Jiang, Ben, Zhang, Chengyu, Li, Yu, Rezgui, Yacine ORCID: https://orcid.org/0000-0002-5711-8400, Luo, Zhiwen, Ghoroghi, Ali, Wang, Peng and Zhao, Tianyi 2026. Deep reinforcement learning strategy integrating environmental perception and semantic masking for real-time operational optimization of building cluster microgrids. Energy 349, 140676. 10.1016/j.energy.2026.140676

Full text not available from this repository.

Abstract

As building functions diversify and electric vehicles become widespread, community microgrids that integrate renewable energy and energy storage systems have become significantly harder to operate and regulate. This study combines real data, public datasets, and a scenario-based data generation framework for charging piles to construct a virtual community-level microgrid system. To increase the number of effective samples in reinforcement learning training, a semantic masking mechanism with environmental perception capability is introduced, enabling real-time optimal regulation of the microgrid under dynamic electricity price scenarios. A systematic analysis is conducted under three action step-size settings and a 30-day hourly optimization scenario, comparing multiple baseline models across multidimensional evaluation metrics. The results indicate that the generated charging pile power data accurately reflects both the consistency of group charging behavior and the variation among individual users. Compared to the baseline models, the Semantic mask DQN policy reduces total electricity consumption by 1.21%-3.73% on average while also achieving operational cost savings of 1.76%-5.86%. The strategy improves the training stability of the reinforcement learning models and significantly reduces how often the energy storage systems hit their operating boundaries. Under this framework, the microgrid is also better able to cope with short-term power outage scenarios. The findings provide intelligent optimization approaches and theoretical support for the efficient, low-carbon operation of building cluster microgrids with charging regions.
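
The abstract does not give implementation details of the masking mechanism. As a rough illustration only, the general idea of action masking in a DQN-style policy can be sketched as follows: actions that would push the energy storage state of charge (SOC) outside its limits are excluded before the greedy choice is made. All names, the SOC limits, the action set, and the Q-values here are hypothetical assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def feasible_action_mask(soc, action_powers_kw, capacity_kwh,
                         soc_min=0.1, soc_max=0.9, dt_h=1.0):
    """Return a boolean mask that is True for actions keeping SOC in bounds.

    Hypothetical convention: positive power charges the battery,
    negative power discharges it. Losses are ignored in this sketch.
    """
    delta_soc = np.asarray(action_powers_kw, dtype=float) * dt_h / capacity_kwh
    next_soc = soc + delta_soc
    return (next_soc >= soc_min) & (next_soc <= soc_max)


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among those the mask allows."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("no feasible action under the current mask")
    # Infeasible actions get -inf so argmax can never select them.
    masked_q = np.where(mask, np.asarray(q_values, dtype=float), -np.inf)
    return int(np.argmax(masked_q))


# Illustrative numbers only (assumed, not from the study).
action_powers_kw = np.array([-50.0, -25.0, 0.0, 25.0, 50.0])
q_values = np.array([2.0, 1.0, 0.5, 0.2, 0.1])  # stand-in for network output

mask = feasible_action_mask(soc=0.15, action_powers_kw=action_powers_kw,
                            capacity_kwh=200.0)
action = masked_greedy_action(q_values, mask)
```

In this example the battery is close to its assumed lower limit, so both discharge actions are masked out even though they carry the highest Q-values, and the agent falls back to the best remaining action. Filtering actions this way is one plausible route to fewer boundary triggers and fewer wasted training samples, though the paper's semantic mask may be constructed differently.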

Item Type: Article
Date Type: Publication
Status: Published
Schools: Schools > Engineering
Publisher: Elsevier
ISSN: 0360-5442
Date of Acceptance: 6 March 2026
Last Modified: 16 Mar 2026 12:15
URI: https://orca.cardiff.ac.uk/id/eprint/185781
