Thesis

Alongside its strengths, this thesis has limitations. One limitation is the low number of variables in the datasets, which limits the accuracy and usability of the predictions and of the identified relationships. Additionally, as mentioned in the discussions of Chapters 2, 3, 4, and 5, the studies use only the physical activity data captured by wearables, monitoring systems, or athletes’ logs, which leaves the endogeneity problem unsolved. Therefore, it may be necessary to design and construct monitoring systems according to the selected physical performance measures and to add internal and contextual factors that may improve the applicability of machine learning in practice. For instance, as mentioned earlier, in predicting injury risk in running, adding internal factors, such as the personality dimensions attribution style or cognitive style [23], and contextual factors, such as social context or a negative life event [24], may contribute to the applicability of machine learning. Similarly, in soccer, adding contextual factors, such as match location (home or away), match outcome (win, draw, or loss), and rival level, may improve the accuracy of machine learning models [9]–[11].
Another limitation of this thesis is the labour-intensive process of data preparation. The physical activity data used in this thesis were initially raw data from wearables, monitoring systems, and athletes’ logs, and raw data must be prepared before physical performance measures can be constructed. For every study, the data were prepared by hand: erroneous data were excluded, and missing data were either excluded or imputed. This preparation took considerable time before statistical analysis or machine learning could be applied. To apply machine learning in a live situation, and so enable timely prediction and more informed decisions, labour-intensive data preparation must be eliminated.
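As an illustration only, the manual steps described above (excluding erroneous data, imputing missing data, and constructing physical performance measures) could be automated along the following lines. This is a minimal sketch and not the procedure used in the studies of this thesis. The plausibility bounds, the choice of linear interpolation, the measure names, and the functions `clean_speeds`, `impute_linear`, and `session_measures` are all hypothetical assumptions made for the example.

```python
from statistics import mean
from typing import List, Optional

# Hypothetical plausibility bounds for a speed sample in m/s.
# These values are an assumption for illustration, not taken from the thesis.
MIN_SPEED_MS = 0.0
MAX_SPEED_MS = 12.5


def clean_speeds(raw: List[Optional[float]]) -> List[Optional[float]]:
    """Mark implausible samples as missing (None) instead of dropping them."""
    return [
        s if s is not None and MIN_SPEED_MS <= s <= MAX_SPEED_MS else None
        for s in raw
    ]


def impute_linear(values: List[Optional[float]]) -> List[float]:
    """Fill gaps by linear interpolation between the nearest valid samples.

    Gaps at the start or end take the nearest valid value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid samples to impute from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def session_measures(raw: List[Optional[float]], hz: float) -> dict:
    """Turn raw speed samples (m/s, sampled at `hz` Hz) into session measures."""
    speeds = impute_linear(clean_speeds(raw))
    return {
        "total_distance_m": sum(speeds) / hz,  # each sample covers 1/hz seconds
        "mean_speed_ms": mean(speeds),
        "peak_speed_ms": max(speeds),
    }
```

For example, `session_measures([2.0, None, 4.0, 40.0, 4.0], hz=1.0)` treats the 40.0 m/s sample as erroneous, interpolates both gaps, and returns the three measures for the session. Whether interpolation or exclusion is appropriate depends on the measure and the study, as it did in the manual preparation.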
To minimise data preparation and enable live prediction, predictive monitoring systems must be developed with the selected physical performance measures in mind. These systems must also transform tracking data directly, in a pipeline, into physical performance measures that enable machine learning and prediction in the short term or even in real time. Preferably, the predictive monitoring systems should conform to the guidelines for trustworthy AI, as stated by the European Union [25]: the systems must be lawful (respecting all applicable laws and regulations), ethical (respecting ethical principles and values), and robust (from a technical perspective, while taking their social environment into account). Taking advantage of data analytics and machine learning requires transparency and trustworthiness. By ensuring that machine learning and prediction are built on solid ethical principles and practices and are open to scrutiny and validation, we can increase confidence in the results and enable their wider adoption. While we hope this thesis has contributed to confidence in data analytics and machine learning in daily life and elite sports, how to realise their adoption, transparency, and trustworthiness remains to be examined.
