6 139 GENERAL DISCUSSION drive the team’s physical performance and the potential consequences of substitution. Moreover, applying the Directed Acyclic Graph (DAG) [8] as a causal model helped identify and explicate unobserved confounding variables that may influence the results. This approach ensured that the assumptions were based on a thorough understanding of substitution and the team’s physical performance. For example, in the case of a soccer match, by explicating that the physical performance of the team is influenced by contextual variables such as match location (home or away), score (win, draw or lose), or competitive level of the opponent [9]–[11], while not present in the dataset, prevents the reliance on oversimplified assumptions. However, the second assumption of the applied causal roadmap dictates that there should be no unmeasured confounding between the variable of interest’s change and the outcome response. For instance, in Chapter 4, the second assumption can be expressed as no unmeasured confounding influences the outcome of substitution on the team performance. Therefore, explicating unmeasured confounding variables using the DAG while ensuring an explicit definition of reality contradicts this assumption. It is challenging to include all confounding variables, as numerous variables can affect the variable of interest’s change and the outcome response. Therefore, it is important to acknowledge the limitations of following the causal roadmap in combination with DAG. A solution might be found in the application of a less strict but formal and explicit data analytics method such as ‘Knowledge Discovery in Databases’ [12]–[14] (KDD) in sports and daily life. KDD provides a framework for formalising data analytics, data handling, machine learning applications, and statistical modelling [12]. As such, explicating the steps taken during data analytics offers guidance and insights that avoid the unrealistic assumption of the causal roadmap that there is no unmeasured confounding between the variable of interest’s change and the response of the outcome in sports and daily life. Using statistical methods that account for the absence of confounding variables improves the quality of the statistical insights. In Chapter 4, we combined TMLE with the ensemble machine learning method Super Learner to determine the influence of substitution on the soccer team’s physical performance. The findings revealed that TMLE effectively reduces the negative impact of missing or unmeasured variables on the calculated accuracy of the effect of substitution. Also, the applied ensemble machine learning method Super Learner is known for its accurate predictions on data containing missing variables [6], [15]. Additionally, the combination of TMLE and the Super Learner could be effectively employed in other fields where unmeasured confounding variables are a concern. Although the combination of TMLE with the Super Learner reduces the negative impact of missing variables in statistical insights, as highlighted by references [16], [17], the
RkJQdWJsaXNoZXIy MjY0ODMw