Physical performance in daily life and sports: bridging the data analytics gap Talko B. Dijkhuis

Cover: Ilse Modder | Layout: Ilse Modder | Printed by: Ipskamp Printing | © Copyright 2024, T.B. Dijkhuis, The Netherlands. All rights reserved. No part of this thesis may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, by photocopying, recording, or otherwise, without prior written permission of the author.

Physical performance in daily life and sports: bridging the data analytics gap PhD thesis to obtain the degree of PhD at the University of Groningen on the authority of the Rector Magnificus Prof. J.M.A. Scherpen and in accordance with the decision by the College of Deans. This thesis will be defended in public on Wednesday 7 February 2024 at 16.15 hours by Talko Bernhard Dijkhuis born on 28 February 1969

Supervisors Prof. K.A.P.M. Lemmink Prof. M. Aiello Co-supervisor Dr. H. Velthuijsen Assessment Committee Prof. M. Biehl Prof. J.E.W.C. van Gemert-Pijnen Prof. J.N. Kok

Paranymphs Dr. Frank J. Blaauw Dr. Wico Mulder

TABLE OF CONTENTS
Chapter 1 General Introduction
Chapter 2 Personalized physical activity coaching: a machine learning approach
Chapter 3 Early prediction of physical performance in elite soccer matches - a machine learning approach to support substitutions
Chapter 4 Transferring Targeted Maximum Likelihood Estimation for causal inference into sports science
Chapter 5A Increase in the Acute:Chronic Workload Ratio relates to injury risk in competitive runners
Chapter 5B Prediction of running injuries from acute:chronic workload ratio: a machine learning approach
Chapter 6 General Discussion
Appendices
Summary
Samenvatting (summary in Dutch)
Achtergrond en academisch werk (background and academic work)
Dankwoord (acknowledgements)
Research Institute SHARE

CHAPTER 1 General introduction

BACKGROUND
Running a marathon at the Olympic level represents the ultimate challenge for an elite athlete, whereas walking independently to the supermarket may be an outstanding achievement for a frail older person. While both engage in physical activity, defined as ‘any bodily movement produced by skeletal muscle that requires energy expenditure’ [1], the difference in the level of physical performance between an Olympic athlete and a frail older person illustrates that valuing physical performance depends on an individual’s physical capacities and context [2], [3]. Zehr categorises the physical performance of humans as a continuum: from the lowest category, humans hindered from performing basic activities due to illness or injury, through a middle category of humans able to perform in an everyday setting, to the highest category of humans who can perform at an athletic level [4]. In this thesis, we concentrate our research on humans functioning in daily life and elite sports.
In everyday life, being physically active helps to maintain general functioning as long as it is done regularly and with sufficient duration and intensity [5]. Physical activity in daily life can be undertaken in many ways, such as walking, cycling, gardening, household activities, sports, and work. Regular physical activity generally prevents chronic or noncommunicable diseases such as heart disease, stroke, and diabetes and supports mental health, quality of life, and well-being [5]. Conversely, insufficient physical activity may lead to deterioration of physical performance in daily life and higher risks of health problems, illness, and lower life expectancy [6]–[8]. Consequently, it is essential to monitor physical activity and to intervene when it is insufficient to maintain general functioning.
In elite sports, such as high-level running or professional soccer, athletes require many years of targeted physical activity (i.e., training). During these years of training, a delicate balance between training (referred to as ‘load’), recovery, and individual capacity must be maintained to increase physical performance and prevent injuries and illnesses [9]. A short-term increase in training frequency, duration, or intensity provokes a short-term decrement in performance, termed functional overreaching (FOR) [10]. FOR is considered a necessary component of a training programme to improve physical performance [11]–[13]. On the other hand, high training loads without sufficient recovery over a longer period may lead to overtraining, causing a decrease in physical performance and a higher risk of sustaining an injury [14]–[19]. Therefore, balancing load, recovery, and capacity is crucial for elite athletes to maintain or increase physical performance and prevent injuries. Hence, monitoring and adjusting this balance is essential for elite athletes.

The frequency, duration, and intensity of physical activity in daily life or elite sports can be expressed in physical performance measures, such as the number of steps in a day, the covered distance at different running intensities in a soccer match, or the perceived exertion of a training session in elite runners. An intriguing question is whether physical activity or performance can be predicted based on earlier physical activity or physical performance, such as predicting the total number of steps on a day based on the number of steps in the morning or predicting the covered distance at the end of a soccer match based on the covered distance at the beginning of a match. Also, the potential of physical activity or physical performance measures to predict other outcomes, such as the risk of sustaining injuries [14]–[19] or the rate of improvement in the performance of running a marathon [20], is of interest. The combination of monitoring physical activity and physical performance, along with predictive analytics of physical activity, performance, and injuries, presents an opportunity to intervene in physical activity patterns throughout the day, during training sessions, races, or matches. This intervention can help optimize physical performance and reduce the risk of injuries.
MONITORING AND ANALYTICS
In the last decade, options for measuring physical activity and physical performance have dramatically increased [21], [22]. This increase stems mainly, though not exclusively, from technological developments such as wearable sensor devices, automatic tracking systems, video-based motion analysis, and Global Positioning Systems (GPS) [21], [22]. These systems increase the ability to provide insight into physical activity patterns in daily life and sports [23]. For example, wearable sensor devices, such as the activity trackers of Fitbit or Garmin, can monitor physical activity in daily life.
The global shipment volume of wearable sensor devices was 266.3 million units in 2020 and is expected to reach 776.23 million units by 2026 [24]. In the last decade, researchers have taken advantage of Fitbit’s public appeal, prominence, and relatively low cost by incorporating these devices into their studies on physical activity [25]. In sports, computer-aided tracking technology has developed substantially to monitor athletes’ physical activity and physical performance during training and match play. Monitoring systems are used in all kinds of team sports, such as soccer [26], Australian football [27], basketball [28], and hockey [29], and in individual sports like speed skating [30] and running [31]. Monitoring systems using tracking technology have evolved considerably. For example, Van Gool et al. were the first to track a soccer match in the eighties, filming at 5 Hz and processing the footage afterwards [32]. Nowadays, technology can quickly record and process the data of all athletes’ physical activity throughout an entire match or training session [33]. These monitoring systems have become commonplace in professional sports [34], enabling automatic analysis of physical activity during races or match-play [35].
Wearable sensor devices and monitoring systems provide an immense amount of physical activity and physical performance data [27]–[29], which presents opportunities to develop knowledge in scientific areas like behavioural science, human movement, and sports science [23], [36] and to translate this knowledge to daily practice. For example, this data can be utilised in lifestyle interventions, rehabilitation programmes, or training programmes for elite athletes. In addition, training logs kept by athletes and coaches provide a wealth of data on physical activity, physical performance, psychological wellbeing, injury, and recovery. Data analytics, a field that encompasses techniques for working with data, such as machine learning and advanced statistics [37], [38], enables the extraction of insights and predictions from the collected data. However, the use of data analytics in behavioural, human movement and sport sciences is limited so far [23], [35]. Also, the application of artificial intelligence (AI) and machine learning (ML) based on wearable data and monitoring systems data in sports is still in its preliminary stage [38], [39].
Although data analytics offers opportunities, there are various problems in extracting actionable and meaningful insights and predictions based on physical activity and physical performance data. Four problems can be identified in data analytics of physical activity and physical performance data, creating a data analytics gap. We propose potential solutions for each of them.
The first problem is the limited use of individualised prediction based on personalised data [39]–[41], limiting the provision of meaningful insights and predictions for individuals, athletes and coaches.
To provide individualised insights and predictions, a boundary condition is that the data contains sufficient personal information. A potential solution to the first problem is to use datasets containing personalised data: data from wearable sensor devices such as Fitbit or Garmin, or from optical tracking systems such as SportsVU, that monitor each individual during daily life, training, or match play, or individual test and exercise log data collected by smart apps such as Sports Tracker or Runkeeper.
The second problem is the vast amount and complexity of data. Several authors acknowledged the complexity of data analysis and data analytics in the past decade, given the vast amount of data. For example, Silver’s book on prediction, published in 2012, indicated that realising a correct prediction model, in basketball among other fields, is challenging because the amount of meaningful information is declining relative to the increasing overall amount of data [41]. In 2014, Davenport concluded that the amount of data in sports considerably exceeds the ability to extract meaningful insights from it [39]. Six years later, a 2020 survey on the use of wearables producing GPS data in English professional soccer indicated that distilling predictions remains a struggle due to the amount and complexity of the data, and that the outcomes are often inconclusive [40]. A potential solution to the second problem is using more sensitive performance measures and applying various machine learning algorithms. The more sensitive a physical performance measure, the more responsive or reactive it is to changes in the underlying system being measured [42]. The fundamental consideration in machine learning is not only which algorithm is superior but also the conditions (e.g. physical performance measures used) under which an algorithm can outperform others [43]. Therefore, to determine the relative applicability and sensitivity of these physical performance measures in prediction models, we examine the effectiveness of converting raw physical activity data from daily life and sports into various physical performance measures. We combine these physical performance measures with multiple machine learning algorithms to identify the optimal combination enabling meaningful predictions.
The third problem is the use of simplified models and assumptions. When using data from wearables and monitoring systems, applying models to derive meaningful insights is complex. The complexity arises from the fact that extracting insights using traditional statistical models often entails unrealistic assumptions on the underlying reality and often reduces complex questions to simple statistical assumptions using a fixed set of parameters [41], [44]–[47]. For example, the often-used Generalised Linear Models depend on the linearity and completeness of the parameters. However, these conditions are seldom met, which yields suboptimal results [48].
A potential solution to the third problem is to apply a causal roadmap in combination with a causal model. The causal roadmap is a framework for designing and evaluating causal inference studies [49]. It provides a structured approach for identifying potential sources of bias and confounding in observational studies and selecting appropriate methods for adjusting for these sources of bias. The causal roadmap demands a well-defined causal model of reality. A causal model, as proposed by Judea Pearl in The Book of Why [50], is a mathematical framework for understanding the relationships between variables in a system and how changes in one variable can cause changes in another while identifying the potential sources of bias and confounding. Pearl’s causal model allows for the analysis and manipulation of complex systems.
The fourth problem is the absence of confounding variables. For example, contextual variables in soccer, such as match location (home or away), score (win, draw or lose), and rival level, and individual characteristics [51], such as the athlete’s psychological well-being or social support, might influence physical performance during a match. Although monitoring systems and log data may include some of these variables, the availability and integration of all relevant variables in sports is a specific problem for the prediction of physical performance or injury risk [52]. The inability to include all relevant variables in sports analytics is known as the endogeneity problem [53]. A potential solution to the fourth problem is using statistical methods that account for the absence of confounding variables. One such method involves utilising an ensemble of machine learning techniques in conjunction with Targeted Maximum Likelihood Estimation (TMLE). The goal of this strategy is to minimise the impact of the missing variables by using TMLE, which is known to be more robust to inaccuracies in modelling the underlying reality compared to traditional statistical methods [44], [45], [54]. As an alternative potential solution to the fourth problem, we take a two-way approach by applying traditional statistical analysis and machine learning techniques to the same incomplete data set. The use of traditional statistical analysis on the one hand and machine learning on the other provides insight into their relative performance when dealing with incomplete data [55]. This approach allows us to understand the applicability and discuss the strengths of traditional statistical analysis and machine learning.
AIM AND OUTLINE
This thesis aimed to reduce the data analytics gap represented by the four identified problems while examining the associated potential solutions to enable meaningful insights and predictions related to physical activity and physical performance. These insights and predictions can be used to make more informed decisions regarding physical activity and physical performance interventions. In Figure 1, we present an outline of the thesis and visualise how each chapter is mapped to one or more problems and solutions to reduce the identified data analytics gap. The structure of this thesis is as follows:
In Chapter 2, we investigated the possibility of predicting the daily physical performance of employees based on wearable data.
The study involved coaching Hanze University of Applied Sciences employees to increase their physical activity during the daytime and monitoring their steps using Fitbits. These steps were subsequently transformed into physical performance measures such as the number of steps per hour and the total number of steps until a particular hour. In addition, we applied machine learning to predict whether an employee would achieve his or her overall daily step goal during the working day. Through automated analysis of physical activity and physical performance, timely detection of behavioural anomalies and identification of effective coaching strategies may become feasible.
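As an illustration of the two step-based measures named above, the following minimal sketch derives the steps per hour and the cumulative steps up to each hour from hourly step counts, together with a label indicating whether the daily goal was reached. The function names, the example day, and the goal of 7,500 steps are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from typing import List, Tuple


def hourly_step_features(hourly_steps: List[int],
                         start_hour: int = 0) -> List[Tuple[int, int, int]]:
    """Return one row per hour: (hour, steps in that hour, cumulative steps so far)."""
    rows = []
    cumulative = 0
    for offset, steps in enumerate(hourly_steps):
        cumulative += steps
        rows.append((start_hour + offset, steps, cumulative))
    return rows


def reached_goal(hourly_steps: List[int], daily_goal: int) -> bool:
    """Label for a prediction model: was the (assumed) personal step goal met?"""
    return sum(hourly_steps) >= daily_goal


# Hypothetical hourly step counts from 07:00 up to 18:00 (11 hours).
day = [300, 1200, 400, 250, 900, 1500, 300, 200, 350, 1800, 600]
rows = hourly_step_features(day, start_hour=7)
label = reached_goal(day, daily_goal=7500)

for hour, steps, total in rows:
    print(f"{hour:02d}:00  steps={steps:5d}  cumulative={total:5d}")
print("goal reached:", label)
```

Rows of this kind, one per hour and paired with the end-of-day label, are the sort of input a classifier could be trained on to estimate during the day whether the goal will be met.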

Figure 1. Mapping of the problems creating a data analytics gap, the proposed potential solutions, and the outline of the thesis.

In Chapter 3, we studied the predictability of physical performance in elite soccer matches using various physical performance measures and machine learning techniques. Match data was collected from 302 matches in elite soccer throughout one season. Semi-automatic multiple-camera video technology, the SportsVU optical tracking system, recorded each player’s position over time. The individual positions over time were translated into three increasingly sensitive physical performance measures, i.e. distance covered, distance in speed category, and energy expenditure in power category. These physical performance measures were used in different machine learning models to identify and predict the physical performance of individual players throughout an elite soccer match.
In Chapter 4, we investigated the effect of substitution in soccer on a team’s physical performance and the suitability of following the causal roadmap and a causal model in this context. A causal model of the relationships between substitution variables and the soccer team’s physical performance was created as an essential step in following the causal roadmap. The causal model included variables such as the number of substitutions made by a team, the moment of the substitution and the soccer team’s total distance covered. We used the causal model to identify the assumptions needed to infer causality from the data and the potential sources of bias and confounding that may affect the causal effect estimates. We also provided an in-depth analysis of statistical methods. We evaluated the accuracy of estimating the impact of substitutions on a soccer team’s physical performance using synthetically generated position and substitution data and data from the SportsVU optical tracking system. We compared the accuracy of TMLE and a generalized linear model using the complete data set versus the accuracy of these models when a crucial variable was removed from the data set.
The difference in accuracy between the two methods indicated whether a more robust statistical method such as TMLE could provide more accurate insight into the effect of substitutions on a soccer team’s physical performance when a crucial variable is absent, compared to a traditional generalized linear model.
In Chapter 5, we investigated the relationship between training characteristics and injuries in competitive runners while evaluating the feasibility of using statistical analysis and exploring the potential of machine learning. The dataset comprised test, training, and injury log data collected from individual competitive runners and their coach. We used the log data on multiple physical load measures, such as training intensity and rating of perceived exertion, to construct the physical performance measure acute:chronic workload ratio. Running injuries typically have a complex origin, and datasets often lack relevant confounding variables that can provide insight into injury risk factors [56]. We aimed to assess the viability of using statistical analysis and prediction in injury risk and evaluate the potential of using machine learning to predict injuries using a dataset known for its absence of several relevant variables.
Finally, Chapter 6 provides a general discussion of the results and an overall conclusion, and ends with concrete ideas on practical applications and possible directions for further investigation.
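To make the acute:chronic workload ratio mentioned for Chapter 5 concrete, the sketch below computes it as the mean daily load over a recent (acute) window divided by the mean daily load over a longer (chronic) window. The 7-day and 28-day windows, the rolling-average form in which the acute window is part of the chronic window, and the load values are assumptions made for illustration; the definition used in Chapter 5 may differ.

```python
from typing import List, Optional


def acute_chronic_workload_ratio(daily_loads: List[float],
                                 acute_days: int = 7,
                                 chronic_days: int = 28) -> Optional[float]:
    """Ratio of mean load over the last `acute_days` to mean load over the
    last `chronic_days` (assumed rolling-average, coupled variant).

    Returns None when there is too little history or no chronic load.
    """
    if len(daily_loads) < chronic_days:
        return None
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        return None
    return acute / chronic


# Hypothetical loads in arbitrary units: three steady weeks, then a heavier week.
loads = [100.0] * 21 + [150.0] * 7
ratio = acute_chronic_workload_ratio(loads)
print(f"ACWR = {ratio:.2f}")  # acute mean 150 / chronic mean 112.5
```

A ratio above 1 indicates that the most recent load exceeds what the athlete has been accustomed to over the longer window, which is the kind of increase the injury analyses relate to injury risk.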

REFERENCES
[1] World Health Organisation, “Global recommendations on physical activity for health,” 2010.
[2] L. Holsbeeke, M. Ketelaar, M. M. Schoemaker, and J. W. Gorter, “Capacity, Capability, and Performance: Different Constructs or Three of a Kind?,” Arch. Phys. Med. Rehabil., vol. 90, no. 5, pp. 849–855, 2009, doi: 10.1016/j.apmr.2008.11.015.
[3] W. Dunn and C. Brown, “The Ecology of Human Performance: A Framework for Considering the Effect of Context,” Am. J. Occup. Ther., vol. 48, pp. 595–607, 1994, doi: 10.5014/ajot.48.7.595.
[4] E. P. Zehr, “The potential transformation of our species by neural enhancement,” J. Mot. Behav., vol. 47, no. 1, pp. 73–78, 2015, doi: 10.1080/00222895.2014.916652.
[5] World Health Organisation, Global action plan on physical activity 2018-2030: more active people for a healthier world. 2019.
[6] I.-M. Lee et al., “Effect of physical inactivity on major non-communicable diseases worldwide: An analysis of burden of disease and life expectancy,” Lancet, vol. 380, no. 9838, pp. 219–229, 2012, doi: 10.1016/S0140-6736(12)61031-9.
[7] U. Ekelund et al., “Does physical activity attenuate, or even eliminate, the detrimental association of sitting time with mortality? A harmonised meta-analysis of data from more than 1 million men and women,” Lancet, vol. 388, no. 10051, pp. 1302–1310, 2016, doi: 10.1016/S0140-6736(16)30370-1.
[8] E. Losina, H. Y. Yang, B. R. Deshpande, J. N. Katz, and J. E. Collins, “Physical activity and unplanned illness-related work absenteeism: Data from an employee wellness program,” PLoS One, vol. 12, no. 5, pp. 1–8, 2017, doi: 10.1371/journal.pone.0176872.
[9] R. T. A. Otter, “Monitoring endurance athletes,” 2016.
[10] P. Bellinger et al., “Muscle fiber typology is associated with the incidence of overreaching in response to overload training,” J. Appl. Physiol., vol. 129, no. 4, pp. 823–836, 2020, doi: 10.1152/japplphysiol.00314.2020.
[11] S. L. Halson, “Monitoring Training Load to Understand Fatigue in Athletes,” Sports Medicine, vol. 44, no. Suppl 2. Springer, pp. 139–147, Nov. 2014, doi: 10.1007/s40279-014-0253-z.
[12] S. L. Halson and A. E. Jeukendrup, “Does overtraining exist? An analysis of overreaching and overtraining research,” Sport. Med., vol. 34, no. 14, pp. 967–981, 2004, [Online]. Available: http://search.ebscohost.com/login.aspx?direct=true&db=ccm&AN=106594041&site=ehost-live.
[13] R. Meeusen et al., “Prevention, diagnosis, and treatment of the overtraining syndrome: Joint consensus statement of the European College of Sport Science and the American College of Sports Medicine,” Med. Sci. Sports Exerc., vol. 45, no. 1, pp. 186–205, 2013, doi: 10.1249/MSS.0b013e318279a10a.
[14] A. Esmaeili, W. G. Hopkins, A. M. Stewart, G. P. Elias, B. H. Lazarus, and R. J. Aughey, “The individual and combined effects of multiple factors on the risk of soft tissue non-contact injuries in elite team sport athletes,” Front. Physiol., vol. 9, no. SEP, pp. 1–16, 2018, doi: 10.3389/fphys.2018.01280.
[15] N. B. Murray, T. J. Gabbett, A. D. Townshend, B. T. Hulin, and C. P. Mclellan, “Individual and combined effects of acute and chronic running loads on injury risk in elite Australian footballers,” Scand. J. Med. Sci. Sport., no. 2007, pp. 1–9, 2016, doi: 10.1111/sms.12719.
[16] J. D. Ruddy, C. W. Pollard, R. G. Timmins, M. D. Williams, A. J. Shield, and D. A. Opar, “Running exposure is associated with the risk of hamstring strain injury in elite Australian footballers,” Br. J. Sports Med., p. bjsports-2016-096777, 2016, doi: 10.1136/bjsports-2016-096777.
[17] B. Hulin et al., “Spikes in acute workload are associated with increased injury risk in elite cricket fast bowlers,” Br. J. Sports Med., 2013, doi: 10.1136/bjsports-2013-092524.
[18] B. T. Hulin, T. J. Gabbett, D. W. Lawson, P. Caputi, and J. A. Sampson, “The acute:chronic workload ratio predicts injury: high chronic workload may decrease injury risk in elite rugby league players,” Br. J. Sports Med., vol. 50, no. 4, pp. 231–236, 2016, doi: 10.1136/bjsports-2015-094817.
[19] A. Jaspers, J. P. Kuyvenhoven, F. Staes, W. G. P. Frencken, W. F. Helsen, and M. S. Brink, “Examination of the external and internal load indicators’ association with overuse injuries in professional soccer players,” J. Sci. Med. Sport, vol. 21, no. 6, pp. 579–585, 2018, doi: 10.1016/j.jsams.2017.10.005.
[20] V. L. Billat, “Current perspectives on performance improvement in the marathon: From universalisation to training optimisation,” New Stud. Athl., vol. 20, no. 3, pp. 21–39, 2005.
[21] A. Rossi, L. Pappalardo, P. Cintia, F. M. Iaia, J. Fernàndez, and D. Medina, “Effective injury forecasting in soccer with GPS training data and machine learning,” PLoS One, vol. 13, no. 7, pp. 1–15, 2018, doi: 10.1371/journal.pone.0201264.
[22] M. Herold, F. Goes, S. Nopp, P. Bauer, C. Thompson, and T. Meyer, “Machine learning in men’s professional football: Current applications and future directions for improving attacking play,” Int. J. Sport. Sci. Coach., vol. 14, no. 6, pp. 798–817, 2019, doi: 10.1177/1747954119879350.
[23] B. Caulfield, B. Reginatto, and P. Slevin, “Not all sensors are created equal: a framework for evaluating human performance measurement technologies,” npj Digit. Med., vol. 2, no. 1, 2019, doi: 10.1038/s41746-019-0082-4.
[24] “Industry report on smart wearables market.” smart-wearables-market.
[25] R. G. St Fleur, S. M. St George, R. Leite, M. Kobayashi, Y. Agosto, and D. E. Jake-Schoffman, “Use of fitbit devices in physical activity intervention studies across the life course: Narrative review,” JMIR mHealth uHealth, vol. 9, no. 5, 2021, doi: 10.2196/23411.
[26] J. Castellano, D. Alvarez-Pastor, and P. S. Bradley, “Evaluation of research using computerised tracking systems (amisco® and prozone®) to analyse physical performance in elite soccer: A systematic review,” Sport. Med., vol. 44, no. 5, pp. 701–712, 2014, doi: 10.1007/s40279-014-0144-3.
[27] S. Ryan, T. Kempton, and A. J. Coutts, “Data reduction approaches to athlete monitoring in professional Australian football,” Int. J. Sports Physiol. Perform., vol. 16, no. 1, pp. 59–65, 2021, doi: 10.1123/IJSPP.2020-0083.
[28] A. Heishman et al., “Associations Between Two Athlete Monitoring Systems Used to Quantify External Training Loads in Basketball Players,” Sports, vol. 8, no. 3, p. 33, 2020, doi: 10.3390/sports8030033.
[29] T. Kim, J. H. Cha, and J. C. Park, “Association between in-game performance parameters recorded via global positioning system and sports injuries to the lower extremities in elite female field hockey players,” Cluster Comput., vol. 21, no. 1, pp. 1069–1078, 2016, doi: 10.1007/s10586-016-0690-6.
[30] T. Purevsuren, B. Khuyagbaatar, K. Kim, and Y. H. Kim, “Investigation of Knee Joint Forces and Moments during Short-Track Speed Skating Using Wearable Motion Analysis System,” Int. J. Precis. Eng. Manuf., vol. 19, no. 7, pp. 1055–1060, 2018, doi: 10.1007/s12541-018-0125-9.
[31] T. Haugen and M. Buchheit, “Sprint Running Performance Monitoring: Methodological and Practical Considerations,” Sport. Med., vol. 46, no. 5, pp. 641–656, 2016, doi: 10.1007/s40279-015-0446-0.
[32] D. Van Gool, D. Van Gerven, and J. Boutmans, “The physiological load imposed on soccer players during real match-play,” in Science and Football, T. Reilly, A. Lees, K. Davids, and W. J. Murphy, Eds. London: Spon, 1988, pp. 51–59.
[33] M. Buchheit, A. Allen, T. K. Poon, M. Modonutti, W. Gregson, and V. Di Salvo, “Integrating different tracking systems in football: multiple camera semi-automatic system, local position measurement and GPS technologies,” J. Sports Sci., vol. 32, no. 20, pp. 1844–1857, 2014, doi: 10.1080/02640414.2014.942687.
[34] L. Torres-Ronda, E. Beanland, S. Whitehead, A. Sweeting, and J. Clubb, “Tracking Systems in Team Sports: A Narrative Review of Applications of the Data and Sport Specific Analysis,” Sport. Med. - Open, vol. 8, no. 1, 2022, doi: 10.1186/s40798-022-00408-z.
[35] S. Barris and C. Button, “A review of vision-based motion analysis in sport,” Sport. Med., vol. 38, no. 12, pp. 1025–1043, 2008, doi: 10.2165/00007256-200838120-00006.
[36] Statista Technology & Telecommunications, “Connected wearable devices worldwide 2016-2022,” Statista, 2021. (accessed Jun. 30, 2021).
[37] T. A. Runkler, Data Analytics, 3rd ed. Wiesbaden: Springer Fachmedien Wiesbaden GmbH, 2020.
[38] W. E. Nagel and T. Ludwig, “Data Analytics,” Informatik-Spektrum, vol. 42, no. 6, pp. 385–386, 2020.
[39] T. H. Davenport, “Analytics in sports: The new science of winning,” Int. Inst. Anal., vol. 2, no. February, pp. 1–28, 2014.
[40] P. Nosek, T. E. Brownlee, B. Drust, and M. Andrew, “Feedback of GPS training data within professional English soccer: a comparison of decision making and perceptions between coaches, players and performance staff,” Sci. Med. Footb., vol. 5, no. 1, pp. 35–47, 2021, doi: 10.1080/24733938.2020.1770320.
[41] N. Silver, The Signal and the Noise: Why So Many Predictions Fail--but Some Don’t. Penguin Press, 2012.
[42] M. Buchheit and B. M. Simpson, “Player-Tracking Technology: Half-Full or Half-Empty Glass?,” Int. J. Sports Physiol. Perform., vol. 12, no. S2, pp. 35–41, 2017.
[43] F. J. Osisanwo, J. E. T. Akinsola, O. Awodele, J. O. Hinmikaiye, O. Olakanmi, and J. Akinjobi, “Supervised Machine Learning Algorithms: Classification and Comparison,” Int. J. Comput. Trends Technol., vol. 48, no. 3, pp. 128–138, 2017, doi: 10.14445/22312803/ijctt-v48p126.
[44] M. J. van der Laan and S. Rose, Targeted Learning. 2012.
[45] M. L. Petersen and M. J. van der Laan, “Causal models and learning from data: Integrating causal modeling and statistical estimation,” Epidemiology, vol. 25, no. 3, pp. 418–426, 2014, doi: 10.1097/EDE.0000000000000078.
[46] M. J. van der Laan and S. Rose, Targeted Learning, vol. 20. New York, NY: Springer-Verlag New York, 2011.
[47] M. L. Petersen, “Applying a Causal Road Map in Settings with Time-dependent Confounding,” Epidemiology, vol. 25, no. 6, pp. 898–901, 2014, doi: 10.1117/12.2549369.Hyperspectral.
[48] A. S. Benjamin et al., “Modern machine learning outperforms GLMs at predicting spikes,” bioRxiv, pp. 1–13, 2017, doi: 10.1101/111450.
[49] M. L. Petersen and M. J. van der Laan, “Causal models and learning from data: Integrating causal modeling and statistical estimation,” Epidemiology, vol. 25, no. 3, pp. 418–426, 2014, doi: 10.1097/EDE.0000000000000078.
[50] J. Pearl and D. Mackenzie, The Book of Why. New York: Basic Books, 2018.
[51] C. Lago, L. Casais, E. Dominguez, and J. Sampaio, “The effects of situational variables on distance covered at various speeds in elite soccer,” Eur. J. Sport Sci., vol. 10, no. 2, pp. 103–109, 2010, doi: 10.1080/17461390903273994.
[52] J. G. Claudino, D.-O. Capanema, T.-V. De-Souza, J. C. Serrão, A.-C. Machado Pereira, and G.-P. Nassis, “Current Approaches to the Use of Artificial Intelligence for Injury Risk Assessment and Performance Prediction in Team Sports: a Systematic Review,” Sport. Med. - Open, vol. 5, no. 1, 2019.
[53] E. Morgulev, O. H. Azar, and R. Lidor, “Sports analytics and the big-data era,” Int. J. Data Sci. Anal., vol. 5, no. 4, pp. 213–222, 2018, doi: 10.1007/s41060-017-0093-7.
[54] M. J. van der Laan and S. Rose, Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. 2018.
[55] D. Bzdok, N. Altman, and M. Krzywinski, “Points of Significance: Statistics versus machine learning,” Nature Methods, vol. 15, no. 4. Nature Publishing Group, pp. 233–234, Apr. 03, 2018, doi: 10.1038/nmeth.4642.
[56] D. van Poppel et al., “Risk factors for overuse injuries in short- and long-distance running: A systematic review,” J. Sport Heal. Sci., vol. 10, no. 1, pp. 14–28, Jan. 2021, doi: 10.1016/J.JSHS.2020.06.006.

CHAPTER 2
Personalized physical activity coaching: a machine learning approach

Based on: Talko B. Dijkhuis, Frank J. Blaauw, Miriam W. van Ittersum, Hugo Velthuijsen, and Marco Aiello. 2018. “Personalized Physical Activity Coaching: A Machine Learning Approach.” Sensors 18(2):1–20. doi: 10.3390/s18020623.

ABSTRACT

Living a sedentary lifestyle is one of the major causes of numerous health problems. To encourage employees to lead a less sedentary life, the Hanze University started a health promotion program. One of the interventions in the program was the use of an activity tracker to record participants’ daily step count. The daily step count served as input for a fortnightly coaching session. In this paper, we investigate the possibility of automating part of the coaching procedure on physical activity by providing personalized feedback throughout the day on a participant’s progress in achieving a personal step goal. The gathered step count data was used to train eight different machine learning algorithms to make hourly estimations of the probability of achieving a personalized, daily steps threshold. In 80% of the individual cases, the Random Forest algorithm was the best performing algorithm (mean accuracy = 0.93, range = 0.88–0.99, and mean F1-score = 0.90, range = 0.87–0.94). To demonstrate the practical usefulness of these models, we developed a proof-of-concept Web application that provides personalized feedback about whether a participant is expected to reach his or her daily threshold. We argue that the use of machine learning could become an invaluable asset in the process of automated personalized coaching. The individualized algorithms allow for predicting physical activity during the day and provide the possibility to intervene in time.

Keywords: Physical activity; machine learning; coaching; sedentary lifestyle.

1. INTRODUCTION

Unhealthy lifestyles lead to increased premature mortality and are a risk factor for noncommunicable diseases (NCDs) such as cardiovascular diseases, cancers, chronic respiratory diseases, and diabetes [1]. NCDs caused 63% of all deaths that occurred globally in 2008 [1]. Four behavioural factors have a significant influence on the prevention of NCDs: healthy nutrition, not smoking, maintaining a healthy body weight, and sufficient physical activity. Insufficient physical activity is one of the leading risk factors for the major NCDs, and not meeting the recommended level of physical activity is associated with approximately 5.3 million deaths that occurred globally in 2008 [2]. A high amount of sedentary time without sufficient daily physical activity leads to a higher rate of all-cause mortality [3]. Besides the increased risk of premature mortality in the long term, short-term quality of life, the ability to work, and social participation are also threatened by insufficient physical activity [4]. Fortunately, these risks are eliminated when this sedentary time is compensated for with sufficient physical activity of moderate intensity [3].

In Western civilization, living a sedentary lifestyle is the rule rather than the exception, as many people work in office environments. To prevent the negative effects of insufficient physical activity in the workplace, the Hanze University of Applied Sciences Groningen (HUAS), a large university in the northern part of the Netherlands, started a novel initiative named (in Dutch) ‘Het Nieuwe Gezonde Werken’ (The New Healthy Way of Working; HNGW). With HNGW, the HUAS aims to promote a healthy lifestyle and physical activity during the workday. 
HNGW consists of providing participants with educational group meetings, food boxes with healthy recipes, and individual coaching sessions supplemented with an activity tracker. Although participants are coached every two weeks and measured continuously, it remains difficult for a coach to provide timely personalized feedback. The manual task of creating personalized feedback is time-consuming, and as such it is not always possible for the participants to get in-depth and timely daily feedback on their progression. Furthermore, current activity trackers do not provide a prediction for reaching the daily goal.

In order to fill this gap, we propose a novel, personalized, and flexible machine-learning-based procedure that can automate a part of the coaching process and serve as a source of information on a participant’s progress with physical activity during the day. The personalized model provides, throughout the day, information on the probability of the participant meeting his or her daily physical activity goal. We demonstrate the accuracy and effectiveness of this solution in practice by training different machine learning

algorithms and evaluating their performance using a train-test split dataset from the HNGW data. We apply techniques like grid search and cross-validation to optimize each model in order to find their best configuration. To show the applicability of this research in practice, we developed a proof-of-concept Web application, which has, to the best of our knowledge, not been done before. With the personalized, actionable information the application provides, we demonstrate that automating part of the coaching process with machine learning is feasible.

The techniques described in this work could serve two goals in the field of personalized coaching. Firstly, coaches could use such applications to gain detailed insight into the participants’ activity during the day. Secondly, the tool could be used as a self-support tool, in which the participants’ engagement with their lifestyle might increase as a result of the extra feedback.

2. RELATED WORK

A number of studies have been performed on physical activity over days, where the sources of variance in activity are related to the subject, the day of the week, the season, and occupational and non-occupational days [5]. Tudor-Locke et al. (2005) showed that the individual is the main source of variability in physical activity, in addition to the difference between Sunday and the rest of the week [6]. Another study found that physical inactivity was lower on weekend days and that Saturday was the most active day of the week for both men and women [5].

To reduce sedentary time and increase physical activity levels, individuals need to change their behaviour and daily routines. This is hard to achieve for various reasons and requires interventions and coaching strategies that use well-established techniques to induce a behaviour change. A review by Gardner et al. 
(2016) found that self-monitoring, problem solving, and restructuring the social or physical environment were the most promising behaviour change strategies and, although the evidence base is quite weak, advises environmental restructuring, persuasion, and education to enhance self-regulatory skills [7].

Interventions aimed at increasing physical activity levels or reducing sedentary time vary widely in content and in effectiveness. For example, studies focusing on exercise training and behavioural approaches have demonstrated conflicting results, whereas interventions focusing on reducing sedentary time seem to be more promising [8]–[12]. The use of active video games seems to be effective in increasing physical activity, but findings are inconsistent on whether they are suitable to meet the recommended levels [13]. Also, interventions targeting recreational screen time reduction might be effective when using health promotion curricula or counselling [14]. Web- or app-based interventions to improve diet, physical activity, and sedentary

behaviour can be effective. Multi-component interventions appear to be more effective than stand-alone app interventions, although the optimal number and combination of app features and level of participant contact needed remain to be confirmed [15], [16].

The workplace is often used for health promotion interventions. Recent reviews on workplace interventions for reducing sitting at work found initial evidence that the use of alternative workstations (sit-stand desks or treadmills) can decrease workplace sitting by thirty minutes to two hours. In addition, one review found that interventions promoting stair use and personalized behavioural interventions increase physical activity, while the other found no considerable or inconsistent effects of various interventions [17], [18].

Step counters provide an objective measure of activity levels and enable self-monitoring. Furthermore, most modern consumer-based activity trackers already contain several behaviour change models or theories [19], [20]. Therefore, using activity trackers in interventions to promote healthy lifestyles is promising. Meta-analyses by Qiu et al. and Stephenson et al. concluded that step counter use was indeed associated with small but significant effects in reducing sedentary time [21], [22]. Adding an activity tracker to physical therapy or counselling was effective in some populations [23]–[25]. Besides collecting activity data for therapy or counselling, it is known that the Fitbit itself also serves as an intervention mechanism [26]. The mere fact of wearing an activity tracker (even without any form of coaching) could motivate physical activity and improve health-related quality of life [27], [28]. On the other hand, studies on workplace interventions using activity trackers report conflicting results [29]–[33]. 
Several studies use sensor or activity tracker data to build a custom-made application to support research. An example is the social computer game Fish’n’Steps, which connects the daily steps of an employee to the growth and activity of the individual avatar fish in a virtual fish tank. The more active one is, the faster the fish grows and prospers [34]. Another example is the study on increased physical activity as the effect of social support groups using pedometers and an app [35].

Although applying machine learning to coaching is new, machine learning techniques in combination with sensors have been applied before to identify the type of activity. Identifying human activity using machine learning and sensor data has been studied, for example, by Wang et al. for recognizing human daily activities from an accelerometer signal [36], by Li et al. on the quantification of the lifetime circadian rhythm of physical activity [37], and by Catal et al. on the use of an ensemble of classifiers for accelerometer-based activity recognition [38]. Only a few studies have investigated the use of actionable, data-driven predictive models. A study on creating a predictive physical fatigue model based on sensors identified relevant features for predicting physical fatigue, however

the model was not proven to be predictive enough to be applied [39].

In order to improve physical activity in combination with activity trackers, a coaching feature is helpful, but only when the messages are personal and placed in context [40]. Perceiving the coaching information as personal and relevant is crucial for the effectiveness of (e)Coaching [41]. Such tailored (e)Coaching has many aspects, two of which are personalization and timing [42]. Timeliness of information is important for participants to be able to process the information and apply the advice while it is still relevant for them. In order to provide such advice, access to real-time predictions is vital, as it allows for timing the moment of coaching, either virtual or in real life, as flexibly as needed.

To the best of our knowledge, no studies exist about the use of sensor data combined with machine learning techniques for creating validated and individualized predictive models on physical activity. The individualized models could help the coach and the participant in the process of behaviour change and increased physical activity.

3. MATERIALS AND METHODS

The present work revolves around the HNGW project. This project was started in 2015 and focuses on promoting a healthy lifestyle. We describe the design of this study and how the resulting data is used in the present work. Next, we describe our analysis pipeline: the conversion of the raw data set into a feature set, the evaluation methods of the predictive models, and the choice of the algorithms. Finally, we shed light on the proof-of-concept application we created to demonstrate how these techniques could be used in practice.

3.1. Study Design

The goal of the workplace health promotion intervention HNGW at the HUAS was to increase physical activity during workdays and to improve both physical and mental health, as well as several work-related variables. 
In the study, several performance-based tests and self-reported questionnaires were used to assess its effectiveness on a group level. Forty-eight eligible participants from the HUAS were randomized into two groups, stratified according to age, gender, BMI, and baseline self-reported health. One group followed a twelve-week workplace health promotion intervention; the other served as a control during the first twelve weeks and thereafter received the twelve-week workplace health promotion intervention. During the study, minutely step count data of the participants was collected. Step count was measured using a wrist-worn activity tracker, the Fitbit Flex. The Fitbit Flex has

been shown to be a reliable and valid device for step count and suitable for health enhancement programs [13]. Further details of the trial design of HNGW at the HUAS are presented in the manuscript of van Ittersum et al. [43].

3.2. Data Set

The anonymized data used in the present study was collected from participants during their participation in the HNGW health promotion program. All participants provided informed consent for participation in the HNGW study and for the use of their anonymized data for research purposes. We used the steps per minute of each participant, resulting in a total of 349,920 measurements across all participants. We only considered the step data collected during the intervention period. That is, for both the intervention and the control group, we used the last twelve weeks of available step data. By focusing on the intervention period, we have a more homogeneous sample than we would have had if we had included both the intervention and control data. While the Fitbit platform provides us with several minutely measures (e.g., steps, metabolic equivalent of tasks [METs], calories, and distance), in our analysis we only included the steps variable. We used the steps variable as we expect it to be the most accurate and relevant, as all other variables are by-products derived using approximation algorithms.

3.3. Data Processing, Transformation, and Performance

To prepare the available minutely step data as input for training the algorithms, we first performed a data cleaning, reformatting, and pre-processing step. First, we removed incomplete days from the data set. We also removed all days with zero steps and weekend days. 
We then converted all provided variables into a format that could be used by our algorithms, by augmenting our initial data set with several new variables, such as the hour of the workday, the number of steps for that hour, and a cumulative sum of the number of steps up to that hour. Note that we define a workday as one of the weekdays Monday to Friday. The normal working hours at the university are between 8:00 AM and 5:00 PM. The HNGW tried to motivate the participants to walk at least a part of the distance they commute daily. As a consequence, the hours of interest are the combination of the working hours and the period of commuting. Therefore, we only considered the number of steps per hour between 7:00 AM and 6:00 PM. As features for training the algorithms, we used the hour per workday (ranging from 7:00 AM to 6:00 PM), the number of steps in that hour, and the cumulative sum of the number of steps up to that hour.
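The feature construction described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the analysis code used in the study: the function name `hourly_features`, the input format (a list of timestamp and step-count pairs, one per minute), and the reading of "7:00 AM to 6:00 PM" as the eleven hourly blocks starting at 7:00 through 17:00 are all assumptions made for illustration. The removal of incomplete days is left out because the completeness criterion is not specified here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FIRST_HOUR = 7   # hourly block starting at 7:00 AM
LAST_HOUR = 17   # hourly block ending at 6:00 PM (assumed to be the last block)


def hourly_features(minute_records):
    """Turn (timestamp, steps) minute records into per-hour feature rows.

    Returns a dict mapping each retained workday (a date) to a list of
    (hour, steps_in_hour, cumulative_steps) tuples.
    """
    per_day = defaultdict(lambda: defaultdict(int))
    for timestamp, steps in minute_records:
        if timestamp.weekday() >= 5:           # drop Saturday and Sunday
            continue
        if FIRST_HOUR <= timestamp.hour <= LAST_HOUR:
            per_day[timestamp.date()][timestamp.hour] += steps

    features = {}
    for day, by_hour in per_day.items():
        if sum(by_hour.values()) == 0:         # drop days with zero steps
            continue
        rows, cumulative = [], 0
        for hour in range(FIRST_HOUR, LAST_HOUR + 1):
            cumulative += by_hour[hour]        # missing hours count as 0 steps
            rows.append((hour, by_hour[hour], cumulative))
        features[day] = rows
    return features


# Hypothetical example: two hours of 10 steps per minute on a Monday.
start = datetime(2016, 3, 7, 7, 0)
example = [(start + timedelta(minutes=m), 10) for m in range(120)]
for row in hourly_features(example)[start.date()]:
    print(row)
```

Each printed row corresponds to one hourly observation with the three features named in the text: the hour, the steps in that hour, and the cumulative steps up to and including that hour.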

As the outcome measure, we calculated the average number of steps for all workdays over all weeks. That is, for each individual, we calculated one average for all workdays. We considered the number of steps between 7:00 AM and 6:00 PM. Note that this outcome measure is not used as input in the training process. We constructed a binary outcome variable represented by the indicator variable $Y_j = \mathbb{1}(X_j \geq \tau_j)$, in which $X_j$ refers to the number of steps on a workday for individual $j$, and $\tau_j$ refers to the specific step goal for that individual. The indicator function returns one (the ‘true’ label) when the condition inside it holds, and zero (the ‘false’ label) otherwise.

Three days of repeated measures are necessary to represent adults’ usual activity levels with an 80% confidence [6]. Forty-four participants met the criteria. The processing and transformation for these forty-four participants resulted in a total of 120,480 data blocks (for the number of steps, mean = 9,031, median = 8,543, range = 0–47,121). The total number of positives, that is, cases in which the threshold was met at 6:00 PM, is 1,528. The total number of negatives, that is, cases in which the threshold was not met at 6:00 PM, is 1,879. Note that we did not include any of the group level/baseline variables like age or gender, as we only considered personalized models. Although these variables might affect the outcome, they do not vary within the individual and as such do not add information.

3.4. Evaluation of the Performance of Algorithms and Models

We trained eight different machine learning algorithms. To compare their performance, we used confusion matrices.
The confusion matrices give an overview of the true positives (TP; the model predicted a ‘true’ label and the actual data contained a ‘true’ label), true negatives (TN; the model predicted a ‘false’ label and the actual data turned out to have a ‘false’ label), false positives (FP; the model predicted a ‘true’ label, but the actual data contained a ‘false’ label), and false negatives (FN; the model predicted a ‘false’ label, but in fact the data contained a ‘true’ label) of a model. An example of a confusion matrix is provided in Table 1. These confusion matrices served as a basis for the calculation of two other performance measures: the accuracy and the F1-score [15].

Table 1. Confusion matrix.

                        True class: Yes         True class: No
Predicted class: Yes    True Positives (TP)     False Positives (FP)
Predicted class: No     False Negatives (FN)    True Negatives (TN)

True Positive: the threshold of daily steps was met and was predicted to be met; True Negative: the threshold of daily steps was not met and was predicted not to be met; False Negative: the threshold of daily steps was met but was predicted not to be met; False Positive: the threshold of daily steps was not met but was predicted to be met.
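To make the evaluation step concrete, the following Python sketch derives the binary outcome from a personal threshold and then computes the confusion matrix counts, the accuracy, and the F1-score. It is an illustration under stated assumptions rather than the study's implementation: the function names (`label_day`, `confusion_counts`, `accuracy`, `f1_score`) and the example numbers are hypothetical, and the accuracy and F1-score follow their standard definitions, (TP + TN) / (TP + TN + FP + FN) and 2TP / (2TP + FP + FN).

```python
def label_day(total_steps, personal_threshold):
    """Indicator outcome: 1 if the steps reach the personal threshold, else 0."""
    return 1 if total_steps >= personal_threshold else 0


def confusion_counts(actual, predicted):
    """Count TP, FP, FN, and TN for two equally long lists of 0/1 labels."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1        # predicted 'true', actually 'true'
        elif p == 1 and a == 0:
            fp += 1        # predicted 'true', actually 'false'
        elif p == 0 and a == 1:
            fn += 1        # predicted 'false', actually 'true'
        else:
            tn += 1        # predicted 'false', actually 'false'
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical example: four workdays, a personal threshold of 8,000 steps.
actual = [label_day(steps, 8000) for steps in [9000, 7500, 8200, 6000]]
predicted = [1, 1, 0, 0]   # made-up model output for the same four days
tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp, fn, tn, accuracy(tp, fp, fn, tn), f1_score(tp, fp, fn))
```

Reporting the F1-score next to the accuracy is useful when the two classes are not equally frequent, since the F1-score ignores the true negatives and therefore does not reward a model for simply predicting the majority class.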