
Accuracy is a metric that measures how close the predictions are to the true values; an accuracy close to one indicates the best performance. It is calculated as the ratio between the correctly classified cases and all cases:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. Besides the accuracy metric, we calculated the F1-score for each model. Similar to the accuracy metric, the F1-score takes values between zero and one, with one corresponding to the best performance. To calculate the F1-score, we use two other metrics, known as the precision and the recall of the model. Precision is the proportion of true positives among all cases predicted as positive and is calculated as

\[ \text{Precision} = \frac{TP}{TP + FP} \]

Recall is the true positive rate, which is calculated as

\[ \text{Recall} = \frac{TP}{TP + FN} \]

Using these definitions of precision and recall, the F1-score can be calculated as

\[ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]

3.5. Computing the Personalized Predictive Model

We aim to predict, throughout the day, whether or not an individual will meet his or her daily step goal. Predicting whether a set goal will be met is a supervised two-class classification problem. Nowadays, many different algorithms for performing such classifications are available. Unfortunately, it is generally considered impossible to determine a priori which algorithm will perform best on any given data set [44]. Although distinct algorithms are better suited to different types of data and problems, the type of problem gives only an indication of the most suitable algorithm. Currently, the preferred way to find the best-performing algorithm is to test each of them empirically [45]. Nevertheless, there exist general guidelines to direct the search for suitable algorithms for the problem at hand. scikit-learn.org, the website of one of the leading open-source machine learning libraries, offers a flowchart indicating which algorithms can be chosen in which situation [46]. Microsoft also provides a 'cheat sheet' on their Azure machine learning platform [47]. The flowchart and 'cheat sheet' served as a basis for our selection process, and we chose the following machine learning classification algorithms: (i) AdaBoost (ADA), (ii) Decision Trees (DT), (iii) KNeighborsClassifier (KNN), (iv) Logistic Regression (LR), (v) Neural Network (NN), (vi) Stochastic Gradient Descent (SGD), (vii) Random Forest (RF), and (viii) Support Vector Classification (SVC). The performance of each of these algorithms was evaluated using the accuracy and F1-score metrics described above.
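To make these metric definitions concrete, the short Python sketch below computes accuracy, precision, recall, and the F1-score directly from confusion-matrix counts. The function name and the example counts are hypothetical and serve only as an illustration.

    # Sketch: evaluation metrics computed from confusion-matrix counts.
    def evaluate(tp, tn, fp, fn):
        """Return accuracy, precision, recall and F1-score for the given counts."""
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * (precision * recall) / (precision + recall)
        return {"accuracy": accuracy, "precision": precision, "recall": recall, "F1": f1}

    # Hypothetical example: 40 correctly predicted goal-met days, 35 correctly
    # predicted not-met days, 10 false alarms and 15 missed goal-met days.
    print(evaluate(tp=40, tn=35, fp=10, fn=15))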

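The empirical comparison of the selected classifiers could, for example, be set up with scikit-learn as sketched below. The synthetic data set, the single train/test split, and the default hyperparameters are illustrative assumptions and do not reflect the features or settings used in this study.

    # Sketch: comparing the eight selected classifiers on a placeholder data set.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LogisticRegression, SGDClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    # Placeholder two-class data standing in for the engineered step-count features.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    classifiers = {
        "ADA": AdaBoostClassifier(),
        "DT": DecisionTreeClassifier(),
        "KNN": KNeighborsClassifier(),
        "LR": LogisticRegression(max_iter=1000),
        "NN": MLPClassifier(max_iter=1000),
        "SGD": SGDClassifier(),
        "RF": RandomForestClassifier(),
        "SVC": SVC(),
    }

    for name, clf in classifiers.items():
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.3f}, "
              f"F1={f1_score(y_test, y_pred):.3f}")

In practice, each classifier would typically be tuned, for example with cross-validation, before the accuracy and F1-scores of the candidates are compared.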