
16 Chapter 1. Introduction

gaming (Arulkumaran, Deisenroth, Brundage, et al., 2017), as well as applications in Maintenance Management (Fink, Q. Wang, Svensén, et al., 2020; Ogunfowora and Najjaran, 2023; Siraskar, Kumar, Patil, et al., 2023). X. Wang, S. Wang, Liang, et al. (2024) classify DRL algorithms into three categories: value-based, policy-based, and maximum-entropy-based; the categories differ primarily in their learning objectives. Section III.4.2 provides formal definitions of DRL; here, we present an example to illustrate the concept.

Figure 1.11(a) presents the same grid-world problem discussed earlier. Figure 1.11(b) demonstrates how DRL could be applied to “solve” the MDP. Here, a Neural Network (NN) is shown, conceptually representing the mouse’s behaviour in the grid world; essentially, the NN is the virtual agent.

[Figure 1.11: Solving the grid-world problem using DRL (example). (a) Grid-world problem, with the grid labelled A to C and 1 to 3. (b) Policy network with input, hidden, and output layers; outputs: P(action=Up) = 0.8, P(action=Down) = 0.0, P(action=Right) = 0.0, P(action=Left) = 0.2.]

The state (the mouse’s current position on the grid) is encoded in the input layer, and this information is propagated through the hidden layers, a series of mathematical transformations, to the output layer, where four nodes correspond to the possible actions the agent can take. As mentioned, various DRL algorithms can be used to train the agent. After training, the virtual mouse should ideally navigate the grid-world using the optimal policy; for the state in Figure 1.11(a), the NN suggests moving Up with 80% probability, bringing the mouse a step closer to the cheese!

The background and examples provided in this section should adequately convey the philosophy behind the key concepts in this dissertation. In the next section, we will discuss the research gaps addressed by this work.
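The forward pass described for the policy network in Figure 1.11(b) can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the implementation used in this dissertation: the 3x3 grid size, the one-hot state encoding, the single hidden layer of 8 tanh units, and all function names are hypothetical choices. The weights are random and untrained, so the printed probabilities will not match the values shown in the figure; a DRL algorithm would adjust the weights so that the probability mass moves towards the optimal action.

```python
import math
import random

ACTIONS = ["Up", "Down", "Right", "Left"]  # one output node per action


def one_hot(cell, n_cells):
    """Encode the mouse's grid position (a cell index) as a one-hot vector."""
    vec = [0.0] * n_cells
    vec[cell] = 1.0
    return vec


def dense(x, weights, biases):
    """Affine layer: one weighted sum of the inputs per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw output scores into a probability distribution."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def policy(state_vec, params):
    """Input layer -> hidden layer (tanh) -> output layer (softmax)."""
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(h) for h in dense(state_vec, w1, b1)]
    return softmax(dense(hidden, w2, b2))


rng = random.Random(0)
N_CELLS = 9  # assumption: a 3x3 grid, one input node per cell
N_HIDDEN = 8  # assumption: arbitrary hidden-layer width
params = (init_layer(N_CELLS, N_HIDDEN, rng),
          init_layer(N_HIDDEN, len(ACTIONS), rng))

probs = policy(one_hot(4, N_CELLS), params)  # mouse in cell index 4
for action, p in zip(ACTIONS, probs):
    print(f"P(action={action}) = {p:.2f}")

# The agent acts by sampling from the output distribution.
chosen = rng.choices(ACTIONS, weights=probs, k=1)[0]
print("Sampled action:", chosen)
```

Sampling from the output distribution, rather than always taking the most probable action, is one common design choice for policy-based agents because it preserves some exploration during training.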
1.3 Research gaps

The previous section provided an overview of the main concepts and relevant literature used in this dissertation, with PHM as the overarching concept. This dissertation is organised into three parts: Part I (with the main concepts discussed in Section 1.2.2) addresses how to obtain efficient and compact Fault Tree models
