issue of incomplete datasets in personalised predictions is a common challenge that can significantly impact the accuracy and reliability of the predictions. Gaining insight into the relations between physical activity, physical performance, and the resulting outcomes is sometimes the most that can be achieved. Chapter 5A used an incomplete dataset on running and overuse injuries to perform statistical analysis. The analysis revealed that increased load is associated with increased injury risk. However, the low specificity and sensitivity of this relationship made load an insufficient predictor of injury risk. Low specificity means that runners who are not at risk of sustaining an injury may be falsely identified as being at risk. Conversely, low sensitivity means that runners who are at risk of injury may not be identified. Although it is a meaningful insight that increased load is associated with increased risk, this also highlights the limitations of statistics when the necessary variables are not present in the dataset.
The practical use of machine learning predictions can also be limited when an incomplete dataset is used. For example, in Chapter 5B, machine learning did not contribute to correctly predicting an overuse injury. We applied machine learning to the load dataset of competitive runners from Chapter 5A. Our findings revealed that the model's precision and recall in predicting an overuse injury from an increase in load were lower than desired. The low precision implies that runners may be falsely identified as being at risk of injury. In addition, the low recall suggests that the model fails to identify a significant number of runners who are at risk of injury. Our results align with those of Lövdal et al., who applied machine learning to a partially similar dataset [18].
For statistical or machine learning models to be practical, high specificity, sensitivity, precision, and recall are essential. Although the insight that load can impact injury risk is valuable, predicting an overuse injury with load as the only variable, without considering other important confounding variables, is unlikely to be useful [19]. Regularly and intentionally increasing the load is a crucial part of training to improve physical performance [20]–[22]. A pattern of increased load is frequently observed before an injury occurs, but it is even more prevalent across the dataset as a whole because it is a standard training pattern. Thus, it is crucial to incorporate confounding internal and external variables in the dataset to improve the reliability of the injury risk prediction. For instance, in predicting injury risk in running, adding internal factors, such as personality dimensions like attribution style or cognitive style [23], and contextual factors, such as social context or a negative life event [24], may contribute to the applicability of machine learning.
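The effect of a pattern that is common both before injuries and during normal training can be illustrated with a small calculation. The sketch below is illustrative only: the function name `classification_metrics` and all counts are hypothetical and are not taken from the datasets of Chapter 5A or 5B. It assumes a simple rule that flags a training week as "at risk" whenever the load increased.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute sensitivity (recall), specificity, and precision
    from the four cells of a binary confusion matrix."""
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan  # injured weeks that were flagged
    specificity = tn / (tn + fp) if (tn + fp) else nan  # uninjured weeks that were not flagged
    precision = tp / (tp + fp) if (tp + fp) else nan    # flagged weeks that were truly injured
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
    }


# Hypothetical counts (assumption, not study data):
# 1000 runner-weeks, of which 50 are followed by an overuse injury.
# Load increased in 30 of the 50 injury weeks   -> tp = 30, fn = 20
# Load increased in 380 of the 950 other weeks  -> fp = 380, tn = 570
metrics = classification_metrics(tp=30, fp=380, fn=20, tn=570)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, sensitivity and specificity are both 0.6, yet precision is only about 0.07, because load increases occur far more often in weeks without injury than in weeks with injury. This mirrors the argument above that a standard training pattern on its own is a weak basis for prediction.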