
Part III: Maintenance optimisation of multi-state components

…main degradation and maintenance strategies for sewer rehabilitation planning. F. Taillandier and Bennabi, 2020 proposes the AGORA method, based on multicriteria decision analysis, which incorporates uncertainties and enables the comparison of different management strategies. Khurelbaatar, Al Marzuqi, Van Afferden, et al., 2021 employs the ALLOWS method, which enables the comparison of different management scenarios and lets stakeholders select the most cost-effective one. Ramos-Salgado, Muñuzuri, Aparicio-Ruiz, et al., 2022 proposes a five-step framework for long-term infrastructure asset management and planning. Assaf and Assaad, 2023 adopts an agent-based approach combined with Monte Carlo analysis to determine optimal preventive maintenance, repair, and replacement policies.

Reinforcement Learning

The integration of Reinforcement Learning (RL) into sewer asset management remains largely unexplored; existing research concentrates mainly on real-time control for smart infrastructure that adapts to environmental changes such as storms. Mullapudi, Lewis, Gruden, et al., 2020 utilises DRL to control stormwater system valves by simulating varied storm scenarios. Z. Yin, Leon, Sharifi, et al., n.d. employ RL for near real-time control to minimise sewer overflows. Meanwhile, Z. Zhang, Tian, and Liao, 2023 and Tian, Liao, Zhi, et al., 2022 both address the robustness of urban drainage systems, the former through decentralised multi-agent RL and the latter via Multi-RL, with Tian, Fu, Xin, et al., 2024 further improving model interpretability using DRL. Additionally, Kerkkamp, Bukhsh, Y. Zhang, et al., 2022 investigates sewer network MPO by combining DRL with Graph Neural Networks to optimise the grouping of maintenance actions.
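MPO problems of this kind are typically cast as Markov decision processes over discrete condition grades. As a minimal, purely illustrative sketch (the state space, costs, and degradation probability below are invented for this example and are not taken from any of the cited works), tabular Q-learning can recover a sensible replacement policy on a toy multi-state degradation model:

```python
import random

# Toy degradation MDP (illustrative assumptions, not from any cited study).
# States 0..4 are condition grades (0 = as new, 4 = failed).
# Actions: 0 = do nothing, 1 = replace (resets the asset to grade 0).
N_STATES, ACTIONS = 5, (0, 1)
P_DEGRADE = 0.3        # assumed per-step probability of degrading one grade
COST_REPLACE = 50.0    # assumed intervention cost
COST_FAILURE = 200.0   # assumed per-step penalty while in the failed state

def step(state, action, rng):
    """Apply the chosen action, then stochastic degradation; return (next_state, cost)."""
    cost = 0.0
    if action == 1:
        state, cost = 0, COST_REPLACE
    if state == N_STATES - 1:
        cost += COST_FAILURE            # failed asset keeps incurring a penalty
    elif rng.random() < P_DEGRADE:
        state += 1                      # stochastic one-grade degradation
    return state, cost

def q_learning(episodes=2000, horizon=100, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Cost-minimising tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = min(ACTIONS, key=lambda x: q[s][x])   # greedy = lowest expected cost
            s2, cost = step(s, a, rng)
            q[s][a] += alpha * (cost + gamma * min(q[s2]) - q[s][a])
            s = s2
    return q

if __name__ == "__main__":
    q = q_learning()
    policy = [min(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]
    print("greedy action per condition grade:", policy)
```

On this toy model the learned policy leaves a new asset alone and replaces a failed one; the interesting question, which the DRL works above tackle at scale, is where between those extremes intervention becomes worthwhile.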
Jeung, Jang, Yoon, et al., 2023 proposes a DRL-based data assimilation methodology that enhances the accuracy of stormwater and water-quality simulations by integrating observational data with simulation outcomes. Based on recent review papers, we offer an overview of the general advantages of employing DRL for MPO tasks:

- The RL paradigm offers a unified framework for problem formulation that integrates both condition-based and predictive maintenance objectives with maintenance optimisation goals (Ogunfowora and Najjaran, 2023).

- DRL excels in complex, dynamic environments and adapts to uncertain conditions through its trial-and-error learning approach, which requires neither pre-collected data nor prior knowledge. This makes it well suited to MPO challenges (Real Torres, Andreiana, Ojeda Roldán, et al., 2022).

- The proven effectiveness of DRL in MPO tasks within infrastructure systems, such as pavement (Yao, Dong, Jiang, et al., 2020), highlights its potential to improve maintenance strategies across other infrastructure systems (Marugán, 2023).

- DRL methods offer time-efficient and cost-effective solutions compared to traditional approaches by minimising maintenance costs and risks, and balancing

RkJQdWJsaXNoZXIy MjY0ODMw