Thesis

5B MACHINE LEARNING PREDICTING RUNNING INJURIES

DISCUSSION

The main objective was to explore whether machine learning can predict the occurrence of an injury in competitive runners, and thereby support trainers and runners in adjusting the training load in time. The Random Forest model outperformed the Naïve Bayes algorithm. However, none of the models accurately predicts sustaining an injury. Although the overall accuracy and F1-score of the Random Forest models are reasonably high (between 0.86 and 0.92), the recall and precision are very low (between 0.00 and 0.11). A very low precision means that almost none of the predicted injuries are actual injuries, and a very low recall means that almost all actual injuries are missed. Although the AUC of the Naïve Bayes models was still above 0.5, their precision and recall are too low to be of practical use.

A limitation of the study was the small number of injuries relative to the available training weeks. Only 31 injuries were identified as applicable in a data set covering two years of training of 22 competitive runners, so the dataset was small and sparse. Lövdal et al. constructed machine learning models predicting injuries on an extended dataset of the same team, using seven years of data from 64 runners and considerably more features [7]. Lövdal et al. suggest that a practical application of machine learning to predict injuries is possible [7]. However, the specificity of their predictions (how good the model is at avoiding falsely identified injuries) was between 0.741 and 0.746, and the sensitivity (= recall; how good the model is at identifying the injuries) was between 0.504 and 0.584 [7], which is of limited use in practice.

We found that the ACWR relates to sustaining an injury [1]; however, the relation seems too weak to train a machine learning model. Although Random Forest is generally a well-performing algorithm, as we showed in [8], some algorithms, such as the Artificial Neural Network algorithm, are supposed to perform better on small data sets [9].
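The gap between a high accuracy and a very low precision and recall, as discussed above, is typical for rare events such as injuries. The following Python sketch illustrates the effect under stated assumptions. The confusion-matrix counts and the 2% injury prevalence are hypothetical values chosen for illustration and are not taken from this study or from [7]; the function names are likewise illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Precision implied by sensitivity, specificity and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# HYPOTHETICAL counts, not study data: 1000 runner-weeks, 20 of them with an injury.
# The model flags 30 weeks as "injury" and only 2 of those flags are correct.
metrics = classification_metrics(tp=2, fp=28, fn=18, tn=952)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Sensitivity and specificity roughly in the range quoted from reference [7],
# combined with an ASSUMED prevalence of 2% injured weeks.
ppv = positive_predictive_value(sensitivity=0.55, specificity=0.744, prevalence=0.02)
print(f"implied precision: {ppv:.3f}")
```

With these hypothetical counts, accuracy exceeds 0.95 because the many injury-free weeks dominate the total, while precision and recall stay at or below 0.10. Under the assumed prevalence, even a model with a sensitivity above 0.5 would raise mostly false alarms, which is consistent with the limited practical use noted above.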
Using machine learning algorithms that are explicitly suited to small, sparse datasets might be interesting for further research. However, more data is probably needed to train the machine learning algorithm effectively.

CONCLUSION

The prediction of sustaining an injury using the ACWR and machine learning is inaccurate. The realized machine learning models offer no support for trainers or runners in practice, because injury cannot be predicted precisely enough.

Acknowledgments

The authors would like to thank Henk van der Worp for identifying the injuries in the runners’ data and Marco Aiello for suggestions on improving the original paper.
