was conducted to differentiate outcome variables as functions of time. A variance components model with no predictors was established for each outcome measure before sequentially allowing intercepts and slopes to vary. A combination of random slopes and intercepts was employed based upon Bayesian information criterion assessments of model fit. One of the conclusions was that substitutes covered greater (p < 0.05) total (+67 to +93 m) and high-speed (+14 to +33 m) distances during the first five minutes of match-play versus all subsequent epochs.

M. Lorenzo et al. [16] aimed, among other things, to analyse the physical and technical performance of substitute players versus entire match players or players who were replaced. Linear mixed models were used to analyse the differences between the performance of substitute, replaced, and entire match players. Group comparisons were made with Bonferroni’s post-hoc test, and effect sizes were expressed as Cohen’s d. One of the results was that substitute players showed higher total distance covered (ES: 0.99–1.06), number of sprints (ES: 0.60–0.64), and number of fast runs (ES: 0.83–0.91) relative to playing time than replaced and entire match players.

The studies mentioned above and the methods they applied have in common that they indicate an association between elements of a soccer match but leave out many factors that influence the association’s actual effect size. Taken together, the results of Modric et al. [14] and of the remaining three studies [3], [4], [16] indicate that a substitute player has a better game performance. Even this combination leaves out the influence of the substitutions on the total performance. The methods used and the factors investigated capture only a small part of the overall complex system of a soccer match. As Morgulev et al. [18] indicate, it is hard to conclude causality in complex sports systems due to endogeneity problems, even when a correlation is found.
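The difficulty described above, a correlation that does not reflect a causal effect because a relevant factor was left out, can be illustrated with a small simulation. The sketch below is not taken from any of the cited studies: the variable names, the coefficients (0.8 and 1.0), and the function `simulate_omitted_variable_bias` are hypothetical and chosen only for illustration. An unobserved factor drives both the exposure `x` and the outcome `y`, while the true effect of `x` on `y` is set to zero.

```python
import numpy as np


def simulate_omitted_variable_bias(n=20_000, true_effect=0.0, seed=0):
    """Return (naive_slope, adjusted_slope) for the effect of x on y.

    All quantities are simulated; the coefficients are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical left-out factor that influences both exposure and outcome.
    confounder = rng.normal(size=n)

    # Exposure depends on the confounder plus independent noise.
    x = 0.8 * confounder + rng.normal(size=n)

    # Outcome depends on the confounder; x itself has effect `true_effect`.
    y = true_effect * x + 1.0 * confounder + rng.normal(size=n)

    # Naive least-squares fit of y on x only (confounder omitted).
    design_naive = np.column_stack([np.ones(n), x])
    beta_naive = np.linalg.lstsq(design_naive, y, rcond=None)[0]

    # Adjusted fit that includes the confounder as a covariate.
    design_adjusted = np.column_stack([np.ones(n), x, confounder])
    beta_adjusted = np.linalg.lstsq(design_adjusted, y, rcond=None)[0]

    # Index 1 is the coefficient on x in both fits.
    return float(beta_naive[1]), float(beta_adjusted[1])


if __name__ == "__main__":
    naive, adjusted = simulate_omitted_variable_bias()
    print(f"naive slope (confounder omitted): {naive:.3f}")
    print(f"adjusted slope (confounder included): {adjusted:.3f}")
```

Under these assumptions the naive fit reports a clearly non-zero slope even though `x` has no effect on `y`, whereas the adjusted fit recovers a slope close to zero. In real match data the left-out factors are usually not observed, so this adjustment is not directly available.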
Endogeneity arises either when a variable is correlated with both the independent variable in the model and the error term, or when a left-out variable affects the independent variable and, separately, the dependent variable. Complex sports systems are influenced by various factors left out of the studied phenomenon, which makes causal inference difficult [18].

2.2. TMLE and causal modelling

Targeted learning is a unique methodology that reconciles advanced machine learning algorithms and semi-parametric inferential statistics [2]. The data available for analysis in sports is proliferating [19] and presents a challenge to both inferential statistics and machine learning. The vast amount of data in sports from, for instance, semiautomatic multiple-camera video technology in soccer, combined with the inherent complexity of the data-generating process, complicates statistical inference and the underlying mathematical theory. Such as limiting the use of misspecified models, acknowledging that the models do not contain and compensate for the truth, looking for