Thesis

TRANSFERRING TARGETED MAXIMUM LIKELIHOOD ESTIMATION INTO SPORT SCIENCE
system under study. The aim of this paper is to introduce readers to TMLE and the causal roadmap. To limit the complexity of the paper, we simplified the causal model by leaving out some possible time-dependent relations. We believe that the impact of this simplification is low, but we advise readers who are dealing with time-series data to look into TMLE methods that make use of time-series data. TMLE is known as a doubly robust estimator, meaning that it is consistent whenever either the propensity score model or the outcome regression is correctly specified [6]. Although there are other doubly robust estimators, such as the Augmented Inverse Propensity Weighted (AIPW) estimator, we limit ourselves to one method. Van der Laan and Rose [2] compared different methods and found that maximum likelihood estimation (MLE) based methods and estimating-equation methods (IPTW and AIPTW) underperform in comparison with TMLE. Because we aimed to introduce causal inference and targeted learning in sport science, we chose the novel TMLE, which uses machine learning and targeted learning.
In our experiments on the observed data, TMLE and TMLEH outperformed GLM when comparing the causal model with the misspecified model. However, the difference in effect size between the causal model and the misspecified model was considerable for every method. This difference may be affected by the limited selection of contextual factors: well-known contextual factors with an important influence on physical performance, such as match location (home or away), score (win, draw or lose) and rival level [7]–[9], were not available in our dataset and were therefore not taken into account. Consequently, our study does not fully meet the second assumption, that there is no unmeasured confounding between treatment A and outcome Y, hence the use of the convenience assumption.
In contrast, in the simulation study we have full control over the data-generating distributions and their relations, and this study therefore allows us to fulfil the second assumption. Our goal with the simulation study is to show the applicability of the roadmap and TMLE to a practical problem, whilst having an objective means to compare the performance of TMLE with that of other methods. The double robustness of TMLE implies more resilience to endogeneity, although it does not solve the endogeneity problem completely. A study in pharmacoepidemiology found that the more factors are taken into account, the better TMLE performs and the less it depends on the treatment model specification [12]. When the complete set of factors was applied, the outcomes were correct regardless of the treatment model specification [12]. In theory, when all factors that affect the performance of a soccer team are taken into account, TMLE will gauge the true influence of a substitution.
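
The double robustness property discussed above can be illustrated with a small simulation. The sketch below is not the implementation used in this study. It assumes a toy data-generating process invented for illustration (one binary confounder W, binary treatment A, continuous outcome Y, true average treatment effect 1.0), and it uses a simplified TMLE targeting step with a linear fluctuation rather than the logistic fluctuation that is often recommended. The function names `simulate`, `fit_outcome`, `fit_propensity` and `tmle_ate` are hypothetical.

```python
import random
from statistics import fmean


def simulate(n, seed=0):
    """Toy data (an assumption for illustration): W confounds A and Y; true ATE = 1.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.8 if w else 0.2) else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def fit_outcome(rows, use_w):
    """Outcome regression Q(a, w) as cell means of Y; use_w=False ignores W (misspecified)."""
    key = (lambda a, w: (a, w)) if use_w else (lambda a, w: (a, 0))
    cells = {}
    for w, a, y in rows:
        cells.setdefault(key(a, w), []).append(y)
    means = {k: fmean(v) for k, v in cells.items()}
    return lambda a, w: means[key(a, w)]


def fit_propensity(rows, use_w):
    """Propensity score g(w) = P(A=1 | W=w); use_w=False ignores W (misspecified)."""
    key = (lambda w: w) if use_w else (lambda w: 0)
    cells = {}
    for w, a, _ in rows:
        cells.setdefault(key(w), []).append(a)
    probs = {k: fmean(v) for k, v in cells.items()}
    return lambda w: probs[key(w)]


def tmle_ate(rows, q, g):
    """Simplified TMLE of the ATE: fluctuate Q along the clever covariate H, then plug in."""
    def h(a, w):
        return a / g(w) - (1 - a) / (1 - g(w))

    # Targeting step: least-squares fit (no intercept) of the residual Y - Q on H.
    num = sum(h(a, w) * (y - q(a, w)) for w, a, y in rows)
    den = sum(h(a, w) ** 2 for w, a, y in rows)
    eps = num / den
    # Plug-in estimate using the updated outcome regression Q* = Q + eps * H.
    return fmean(
        (q(1, w) + eps * h(1, w)) - (q(0, w) + eps * h(0, w)) for w, _, _ in rows
    )


rows = simulate(20000, seed=1)
q_ok, q_bad = fit_outcome(rows, True), fit_outcome(rows, False)
g_ok, g_bad = fit_propensity(rows, True), fit_propensity(rows, False)
estimates = {
    "outcome ok, propensity misspecified": tmle_ate(rows, q_ok, g_bad),
    "outcome misspecified, propensity ok": tmle_ate(rows, q_bad, g_ok),
    "both misspecified": tmle_ate(rows, q_bad, g_bad),
}
for label, est in estimates.items():
    print(f"{label}: {est:.2f}")
```

Under these assumptions, the first two estimates should lie close to the true effect of 1.0, because one of the two nuisance models is correct in each case. The third estimate retains the confounding bias, which mirrors the point that double robustness does not help when the confounder is missing from both models.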
