
Chapter 7. Maintenance Strategies for Sewer Pipes with Multi-State Deterioration and Deep Reinforcement Learning

• If $a_t = 1$: the agent decides to "perform maintenance," and all damage points with severity levels $k \in \{3, 4, 5\}$ are moved to $k = 2$. This action does not affect damage points with severity levels $k \in \{1, 2, F\}$. The new state becomes $s^{a=1}_{t=30.5}$:

$$s^{a=1}_{t=30.5} = \langle 30.5,\ 0.60,\ 0.40,\ 0.0,\ 0.0,\ 0.0,\ 0.0,\ 0.47,\ 0.439,\ 0.071,\ 0.010,\ 0.05,\ 0.05 \rangle$$

Notice that the pipe age increased to 30.5, and $\mathbb{1}_k = [24, 16, 0, 0, 0, 0]$. The probabilities $p_k(t)$, however, are updated by evaluating $t = 30.5$, the same as when $a_t = 0$.

• If $a_t = 2$: the agent decides to "replace" the pipe, resetting its condition to as-good-as-new. The new state is $s^{a=2}_{t=0.0}$:

$$s^{a=2}_{t=0.0} = \langle 0.0,\ 1.0,\ 0.0,\ 0.0,\ 0.0,\ 0.0,\ 0.0,\ 0.986,\ 0.014,\ 0.0,\ 0.0,\ 0.0,\ 0.0 \rangle$$

The pipe age is reset to 0.0, with $\mathbb{1}_k = [40, 0, 0, 0, 0, 0]$, and $p_k(t)$ is updated for $t = 0.0$.

7.5.4 Reward function R

Our reward function $R(s_t, a_t, s_{t+1})$ assigns a reward $r_t$ at every decision point $t$, determined by the current state $s_t$ and action $a_t$. This function integrates the costs of maintenance ($C_M$), replacement ($C_R$), and failures ($C_F$). $R$ is sparse because it issues a non-zero value only when failures occur or interventions are undertaken.

Maintenance cost $C_M$ is calculated with Eq. 7.6, which combines a variable cost that depends on severity $k$ with a fixed logistics cost of €500 covering the expenses related to maintenance:

$$C_M = \sum_{k} \left( \mathbb{1}_k \cdot c^k_M \right) - 500 \tag{7.6}$$

The variable costs $c^k_M$ depend on the severity level $k$, as detailed in Table 7.2. Note that no maintenance cost is associated with $k = F$ because maintenance cannot be performed on a segment that has already failed. In this case, the agent must replace the pipe.

Table 7.2: Maintenance costs per severity $k$ per segment ($c^k_M$)

             k = 1    k = 2    k = 3    k = 4    k = 5    k = F
c^k_M =      0        0        -€500    -€700    -€900    N.A.

Replacement cost ($C_R$) is computed with Eq. 7.7:

$$C_R = -\left( 450 + 0.66D + 0.0008D^2 \right) L \tag{7.7}$$
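As a rough illustration of how Eqs. 7.6 and 7.7 could be evaluated, the sketch below computes both costs in Python. It assumes Eq. 7.6 is read as the sum over severity levels of (number of damage points × unit cost from Table 7.2) minus the fixed €500. The function names, the dictionary layout, and the example counts, diameter, and length are hypothetical and are not taken from the authors' implementation. The units of D and L are not stated in this section and are left unspecified.

```python
# Illustrative sketch only; all names and example inputs are hypothetical.

# Variable maintenance cost per damage point by severity (Table 7.2), in euros.
# Severity F has no entry: a failed segment cannot be maintained.
MAINTENANCE_COST = {1: 0.0, 2: 0.0, 3: -500.0, 4: -700.0, 5: -900.0}
FIXED_LOGISTICS_COST = 500.0  # fixed cost per maintenance action, in euros


def maintenance_cost(counts):
    """Eq. 7.6 (as read here): severity-dependent cost minus the fixed cost.

    `counts` maps a severity level (1..5) to its number of damage points.
    """
    variable = sum(n * MAINTENANCE_COST[k] for k, n in counts.items())
    return variable - FIXED_LOGISTICS_COST


def replacement_cost(diameter, length):
    """Eq. 7.7: replacement cost as a function of D and L."""
    return -(450.0 + 0.66 * diameter + 0.0008 * diameter**2) * length


if __name__ == "__main__":
    # Hypothetical pipe: 40 damage points spread over severities 1..5.
    example_counts = {1: 20, 2: 10, 3: 6, 4: 3, 5: 1}
    print(maintenance_cost(example_counts))
    # Hypothetical D and L values, chosen only to exercise the formula.
    print(replacement_cost(300.0, 40.0))
```

Both functions return negative numbers, so they can be added directly to a reward signal in which costs act as penalties.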

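The effect of the "perform maintenance" and "replace" actions on the damage-point counts, as described in the bullets preceding Section 7.5.4, can be sketched as follows. This is a minimal illustration, not the authors' environment code: the function names are hypothetical, the counts are assumed to be ordered as [k=1, k=2, k=3, k=4, k=5, k=F], and the pre-maintenance counts in the example are invented so that the result matches the [24, 16, 0, 0, 0, 0] vector quoted in the text. The update of the pipe age and of $p_k(t)$ is not modeled here.

```python
# Illustrative sketch only; function names and example inputs are hypothetical.


def maintain(counts):
    """Move damage points with severity 3, 4, 5 to severity 2.

    Severities 1, 2 and F are left untouched by the move itself.
    `counts` is ordered [k=1, k=2, k=3, k=4, k=5, k=F].
    """
    k1, k2, k3, k4, k5, kf = counts
    return [k1, k2 + k3 + k4 + k5, 0, 0, 0, kf]


def replace(counts):
    """Reset the pipe to as-good-as-new: every damage point is at severity 1."""
    return [sum(counts), 0, 0, 0, 0, 0]


def fractions(counts):
    """Convert counts to the per-severity fractions used in the state vector."""
    total = sum(counts)
    return [n / total for n in counts]


if __name__ == "__main__":
    before = [24, 10, 3, 2, 1, 0]  # hypothetical counts before the action
    after_maintenance = maintain(before)
    print(after_maintenance, fractions(after_maintenance))
    after_replacement = replace(before)
    print(after_replacement, fractions(after_replacement))
```

With these invented inputs, maintenance yields the fractions 0.60 and 0.40 for k = 1 and k = 2, and replacement yields 1.0 for k = 1, in line with the state vectors shown in the text.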