Thesis

GENERAL INTRODUCTION
of data in sports is considerably more than the ability to extract meaningful insights from it [39]. Six years later, a 2020 survey on the use of wearables producing GPS data in English professional soccer [40] reported that distilling predictions remains a struggle because of the amount and complexity of the data, and that the outcomes are often inconclusive.
A potential solution to the second problem is to use more sensitive performance measures and to apply various machine learning algorithms. The more sensitive a physical performance measure is, the more responsive it is to changes in the underlying system being measured [42]. The fundamental consideration in machine learning is not only which algorithm is superior but also the conditions (e.g. the physical performance measures used) under which an algorithm can outperform others [43]. Therefore, to determine the relative applicability and sensitivity of these physical performance measures in prediction models, we examine the effectiveness of converting raw physical activity data from daily life and sports into various physical performance measures. We combine these physical performance measures with multiple machine learning algorithms to identify the optimal combination that enables meaningful predictions.
The third problem is the use of simplified models and assumptions. When using data from wearables and monitoring systems, applying models to derive meaningful insights is complex. The complexity arises because extracting insights with traditional statistical models often entails unrealistic assumptions about the underlying reality and often reduces complex questions to simple statistical assumptions with a fixed set of parameters [41], [44]–[47]. For example, the often-used Generalised Linear Models depend on the linearity and completeness of the parameters. However, these conditions are seldom met, which yields suboptimal results [48].
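The effect of a violated linearity assumption can be illustrated with a minimal sketch. It is not taken from the studies in this thesis: the data are synthetic, the inverted-U relation between a standardised training load and performance is an assumption made purely for illustration, and the names `fit_line`, `r_squared`, `load` and `performance` are hypothetical. The sketch fits a straight line by ordinary least squares, first on the raw load and then on the squared load, and compares the explained variance.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical standardised training load, evenly spaced on [-1, 1].
load = [i / 50 - 1 for i in range(101)]

# Assumed inverted-U response plus small Gaussian noise (synthetic data).
performance = [1 - l ** 2 + rng.gauss(0, 0.05) for l in load]

# Straight line in the raw load: the linearity assumption is violated.
r2_linear = r_squared(load, performance)

# Straight line in the squared load: the specification matches the data.
r2_transformed = r_squared([l ** 2 for l in load], performance)

print(f"linear in load:   R^2 = {r2_linear:.3f}")
print(f"linear in load^2: R^2 = {r2_transformed:.3f}")
```

Under these assumptions the straight line in the raw load explains almost none of the variance, whereas the line in the squared load explains most of it. The second fit is still linear in its parameters, so the sketch shows only that the specified functional form must match the underlying relationship. In real monitoring data that relationship is rarely known in advance.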
A potential solution to the third problem is to apply a causal roadmap in combination with a causal model. The causal roadmap is a framework for designing and evaluating causal inference studies [49]. It provides a structured approach for identifying potential sources of bias and confounding in observational studies and for selecting appropriate methods to adjust for them. The causal roadmap demands a well-defined causal model of reality. A causal model, as proposed by Judea Pearl in The Book of Why [50], is a mathematical framework for understanding the relationships between variables in a system and how changes in one variable can cause changes in another, while identifying the potential sources of bias and confounding. Pearl’s causal model allows for the analysis and manipulation of complex systems.
A fourth problem is the absence of confounding variables from the available data. For example, contextual variables in soccer, such as match location (home or away), score (win, draw or lose), and rival level, and individual characteristics [51], such as the athlete’s psychological well-being or social support, might influence physical performance during a match. Although monitoring systems and log data may include some of these variables, the availability and integration of all relevant variables in sports is a specific problem for the prediction of physical performance or injury risk [52]. The inability to include all relevant variables in sports analytics is
