Chapter 7. Maintenance Strategies for Sewer Pipes with Multi-State Deterioration and Deep Reinforcement Learning

7.1 Introduction

Sewer network systems, crucial for public health, population well-being, and environmental protection, require maintenance to ensure their reliability and availability (M.A. Cardoso and Silva, 2016). This maintenance is challenged by limited budgets, environmental changes, ageing infrastructure, and hard-to-predict system deterioration (Tscheikner-Gratl, Caradot, Cherqui, et al., 2019). Optimising maintenance policies for sewer networks requires methodologies that can efficiently explore a broad solution space while adapting to the system’s dynamic constraints and complexities. Maintenance Policy Optimisation (MPO) addresses these needs by developing and analysing mathematical models to derive maintenance strategies (de Jonge and Scarf, 2020) that reduce maintenance costs, extend asset life, maximize availability, and ensure workplace safety (Ogunfowora and Najjaran, 2023).

This research explores the potential of Deep Reinforcement Learning (DRL) for MPO of sewer networks, first focusing on a component-level (i.e., pipe-level) analysis. DRL is a framework that merges neural network representation learning capabilities with Reinforcement Learning (RL), a branch of Machine Learning known for its effectiveness in sequential decision-making problems. RL is increasingly recognised for its role in developing cost-effective policies in MPO across diverse domains such as transportation, manufacturing, civil infrastructure and energy systems. It is emerging as a prominent paradigm in the search for optimal maintenance policies (Marugán, 2023).
This chapter aims to achieve two primary objectives: first, to present a comprehensive model for pipe-level MPO analysis facilitated by DRL, considering deterioration over the pipe length and employing Inhomogeneous Time Markov Chain models to simulate the non-linear stochastic behaviour associated with sewer main deterioration; second, to assess the efficacy of the model’s policy through a case study of a large-scale sewer network in the Netherlands, comparing it with heuristics, including condition-based, scheduled, and reactive maintenance. We acknowledge two limitations of our approach: it assumes a fully observable state space, which means that inspection actions are not necessary, and the analysis is restricted to the component level. Future research will aim to broaden this scope to include partially observable state spaces and system-level analysis.

Contributions. This work’s primary contributions include:

(i) We propose a framework to carry out maintenance policy optimisation for sewer mains considering the deterioration along the pipe length. This framework integrates Multi-State Deterioration Models (MSDM) and Deep Reinforcement Learning (DRL).