Deep reinforcement learning for scheduling of a steel plant in the electricity spot market

Shah, Margi, Zhou, Yue, Wu, Jianzhong and Mowbray, Max 2026. Deep reinforcement learning for scheduling of a steel plant in the electricity spot market. Engineering 10.1016/j.eng.2025.12.038

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://doi.org/10.1016/j.eng.2025.12.038

Abstract

The steel industry, characterized by its substantial energy consumption, is grappling with rising energy costs and the imperative to decarbonize. Scheduling a steel plant is challenging, however, because its processes are complex and interdependent and are subject to various uncertainties. This study introduces a deep reinforcement learning (DRL) methodology designed to optimize scheduling under the exogenous uncertainties introduced by electricity prices and on-site renewable generation. The scheduling problem is formulated as a partially observable Markov decision process (POMDP), which enables decision-making when the state is not fully observable. An attention mechanism abstracts a representation of a window of observations, on which decisions are conditioned. The control space is defined by heuristic rules informed by domain knowledge, and evolutionary search is used for policy optimization. The case study considers an electric arc furnace (EAF)-based steel plant with various problem sizes and processing times for steelmaking tasks. The proposed method is compared with a traditional mixed-integer linear programming (MILP) approach and with proximal policy optimization (PPO), a policy gradient method, and is evaluated under uncertainty in market prices and on-site renewable energy sources. Case study results show that the proposed DRL strategy integrates these uncertainties into real-time decision-making, achieving a desirable level of performance at minimal online computational cost.
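The pipeline described in the abstract (attend over a window of observations, pick a heuristic rule, tune the policy by evolutionary search) can be pictured with the toy sketch below. It is not the authors' implementation. The two rules, the single-query attention pooling, the elite-averaging search loop, the synthetic price curve, and every function and parameter name are assumptions made purely for illustration.

```python
import math
import random

WINDOW = 4      # number of recent observations the policy attends over (assumed)
N_RULES = 2     # hypothetical rules: 0 = "wait", 1 = "start the task now"
N_PARAMS = 2 + 3 * N_RULES  # attention query (2) + per-rule weights (2) and bias (1)


def attention_pool(window, query):
    """Softmax-weighted average of the observation window (single-query attention)."""
    scores = [sum(q * x for q, x in zip(query, obs)) for obs in window]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * obs[i] for w, obs in zip(weights, window)) / total
            for i in range(len(window[0]))]


def choose_rule(params, window):
    """Map the pooled context to a rule index via argmax over linear scores."""
    query, rest = params[:2], params[2:]
    context = attention_pool(window, query)
    logits = []
    for k in range(N_RULES):
        w0, w1, b = rest[3 * k: 3 * k + 3]
        logits.append(w0 * context[0] + w1 * context[1] + b)
    return max(range(N_RULES), key=lambda k: logits[k])


def episode_cost(params, prices, duration=3):
    """Energy cost of one task of `duration` steps, started when the policy says so."""
    latest_start = len(prices) - duration
    for t in range(WINDOW - 1, latest_start + 1):
        # Observation = (scaled price, time fraction); future prices stay hidden.
        window = [(prices[s] / 100.0, s / len(prices))
                  for s in range(t - WINDOW + 1, t + 1)]
        if choose_rule(params, window) == 1 or t == latest_start:
            return sum(prices[t:t + duration])  # forced start at the deadline
    raise ValueError("price series too short for the window and task duration")


def sample_prices(rng, horizon=24):
    """Hypothetical noisy daily price curve (synthetic, not market data)."""
    return [50 + 30 * math.sin(2 * math.pi * t / horizon) + rng.gauss(0, 5)
            for t in range(horizon)]


def evolve(generations=30, pop_size=20, n_elite=5, sigma=0.5, seed=0):
    """Simple evolutionary search: perturb the mean, keep the elite, re-average."""
    rng = random.Random(seed)
    scenarios = [sample_prices(rng) for _ in range(10)]

    def fitness(p):
        return sum(episode_cost(p, s) for s in scenarios) / len(scenarios)

    mean = [0.0] * N_PARAMS
    baseline = fitness(mean)  # all-zero policy always waits until the deadline
    best, best_cost = mean, baseline
    for _ in range(generations):
        population = [[m + sigma * rng.gauss(0, 1) for m in mean]
                      for _ in range(pop_size)]
        population.sort(key=fitness)
        elite = population[:n_elite]
        mean = [sum(col) / n_elite for col in zip(*elite)]
        elite_cost = fitness(elite[0])
        if elite_cost < best_cost:
            best, best_cost = elite[0], elite_cost
    return best, best_cost, baseline


if __name__ == "__main__":
    _, cost, base = evolve()
    print(f"baseline cost {base:.1f}, evolved policy cost {cost:.1f}")
```

Because the search needs only episode costs and no gradients, the non-differentiable argmax over discrete rules poses no difficulty here. Once the parameters are fixed, each online decision is a single pass through `choose_rule`.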

Item Type:	Article
Date Type:	Published Online
Status:	In Press
Schools:	Schools > Engineering
Publisher:	Elsevier
ISSN:	2095-8099
Date of First Compliant Deposit:	23 February 2026
Date of Acceptance:	23 December 2025
Last Modified:	23 Feb 2026 12:45
URI:	https://orca.cardiff.ac.uk/id/eprint/185116
