
(ii) Our framework introduces a novel approach: it encodes the predictions of the MSDM into the state space, so that the agent can harness prognostics describing the deterioration pattern of sewer mains.
(iii) We demonstrate that DRL has the potential to devise strategic maintenance policies that adapt to varying conditions, such as pipe age.
(iv) We provide our framework in Python, together with all data used in this study, at zenodo.org/records/11258904.

Chapter outline. Section 7.2 presents the technical background. Section 7.3 outlines our research methodology. Section 7.4 formulates the MSDM. Section 7.5 details the framework for maintenance policy optimisation via DRL. Section 7.6 presents our experimental setup. Section 7.7 analyses the results. Section 7.8 discusses the findings, concludes, and suggests future research.

7.2 Technical background

Refer to the following sections for the technical background of this chapter: Markov chains (Section II.4.1, page 95); Multi-State Deterioration Models (MSDMs) (Section 6.2.1, page 126); and Markov Decision Processes (Section III.4.1, page 145). We use Deep Reinforcement Learning (DRL) (Section III.4.2, page 145) to train agents in virtual environments whose degradation patterns are determined by the MSDM, as described in Section 7.5. For this, we apply Proximal Policy Optimisation (PPO) (Section III.4.4, page 148), a policy-based DRL method.

7.3 Methodology

Our methodology, illustrated in Figure 7.1, comprises six steps, detailed below.
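Before walking through the steps, the idea of encoding MSDM predictions into the state space (contribution (ii) and Section 7.2) can be illustrated with a minimal sketch. It is not the implementation released with this chapter: it assumes a homogeneous discrete-time Markov chain as a stand-in for the MSDM, and the four severity states, the transition probabilities, the forecast horizon, and the names `predict_state_probs` and `build_observation` are all hypothetical. The MSDM actually used is formulated in Section 7.4 and may differ.

```python
import numpy as np

# Hypothetical 4-state severity scale (0 = best ... 3 = worst) with made-up
# yearly transition probabilities; neither is taken from the chapter.
P = np.array([
    [0.90, 0.08, 0.02, 0.00],
    [0.00, 0.85, 0.12, 0.03],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
])


def predict_state_probs(p0, P, horizon):
    """Propagate a state-probability row vector `horizon` steps ahead."""
    return p0 @ np.linalg.matrix_power(P, horizon)


def build_observation(state, age, P, horizon=5, max_age=100.0):
    """Concatenate one-hot state, normalised age, and the forecast.

    The forecast is the predicted state distribution `horizon` steps ahead,
    which is the "prognostic" part of the observation in this sketch.
    """
    one_hot = np.eye(P.shape[0])[state]
    forecast = predict_state_probs(one_hot, P, horizon)
    return np.concatenate([one_hot, [age / max_age], forecast])


# Example: a 30-year-old pipe currently in severity state 1.
obs = build_observation(state=1, age=30.0, P=P)
print(obs.shape)  # (9,)
```

In this sketch an agent would observe the current condition and age of the pipe together with a short-term forecast of its condition, rather than the current condition alone. A vector of this kind could serve as the observation of a training environment for a policy-based method such as PPO.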
[Figure 7.1 panel labels: Step 1 - Data handling, cohort selection, and calibration of MSDMs; Step 2 - Define training environments; Step 3 - Hyper-parameter tuning; Step 4 - Model selection; Step 5 - Testing selected agents in testing environment; Step 6 - Maintenance policy analysis; Maintenance policy implementation. Other labels: Sk(t), Agent A, Agent B.]

Figure 7.1: Methodology overview for sewer main maintenance policy optimisation using Deep Reinforcement Learning and Multi-State Deterioration Models.

Step 1. Handle the historical inspection records, select subsets (cohorts) of interest, and calibrate the MSDM on these data. This step
