Real-Time Learning from an Expert in Deep Recommendation Systems with Application to mHealth for Physical Exercises
2022
Arash Mahyari, Peter Pirolli, Jacqueline A. LeBlanc
Summary
This paper presents a novel deep learning-based recommendation system for physical exercises, specifically in the context of mHealth applications. The system leverages user profiles and exercise characteristics to make personalized recommendations. A key feature is the integration of active learning to incorporate expert feedback, which is critical for exercise recommendations given the real-time nature of the problem.
Full Transcript
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022
Article in IEEE Journal of Biomedical and Health Informatics, April 2022. DOI: 10.1109/JBHI.2022.3167314
Real-Time Learning from an Expert in Deep Recommendation Systems with Application to mHealth for Physical Exercises
Arash Mahyari, Peter Pirolli, Jacqueline A. LeBlanc
Research reported in this paper was supported by the National Institute on Aging of the National Institutes of Health under award number R01AG053163. A. Mahyari is a Research Scientist at Florida Institute for Human and Machine Cognition (IHMC), Pensacola, FL 32502 USA. P. Pirolli is a Senior Research Scientist at Florida Institute for Human and Machine Cognition (IHMC), Pensacola, FL 32502 USA. J. LeBlanc is a certified health and fitness coach and independent consultant. The code repository: https://github.com/arashmahyari/ExerRecomActiveLearn.
Abstract: Recommendation systems play an important role in today's digital world. They have found applications in various areas such as music platforms, e.g., Spotify, and movie streaming services, e.g., Netflix. Less research effort has been devoted to physical exercise recommendation systems. Sedentary lifestyles have become the major driver of several diseases as well as healthcare costs. In this paper, we develop a recommendation system to recommend daily exercise activities to users based on their history, profiles and similar users. The developed recommendation system uses a deep recurrent neural network with user-profile attention and temporal attention mechanisms. Moreover, exercise recommendation systems are significantly different from streaming recommendation systems in that we are not able to collect click feedback from the participants in exercise recommendation systems. Thus, we propose a real-time, expert-in-the-loop active learning procedure. The active learner calculates the uncertainty of the recommendation system at each time step for each user and asks an expert for a recommendation when the certainty is low. In this paper, we derive the probability distribution function of the marginal distance, and use it to determine when to ask experts for feedback. Our experimental results on an mHealth dataset and the MovieLens dataset show improved accuracy after incorporating the real-time active learner with the recommendation system.
Index Terms: active learning, recommendation system, deep learning, attention networks, marginal distance.
I. INTRODUCTION
A major driver of healthcare costs in different countries is unhealthy behavior such as physical inactivity, increased food intake, and unhealthy food choice. Behavioral and environmental health factors account for more deaths than genetics. Pervasive computational, sensing, and communication technology can be leveraged to support individuals in their everyday lives to develop healthier lifestyles. For instance, the pervasive use of smartphones is a potential platform for the delivery of behavior-change methods at great economies of scale. Commercial systems such as noom aim to provide psychological support via mobile health (mHealth) systems. Research platforms, such as the Fittle+ system, have demonstrated the efficacy of translating known behavior-change techniques into personal mHealth applications. However, most work in the mHealth domain is limited to the development of smartphone applications and connecting subjects and expert knowledge.
On the other hand, recommendation systems are becoming popular in various applications. E-commerce websites have been using recommendation systems to suggest new products and items to existing and new users to entice them into purchasing new items. With the advent of movie and music streaming services, e.g., Netflix and Spotify, service providers need to keep their users interested or lose their customers and profits. Thus, these streaming service providers deploy recommendation systems to suggest new movies and music to their existing and new users based on their history of watching movies and listening to music. However, not many research studies have been devoted to developing recommendation systems for exercise activities.
In this paper, we develop an attention-based recommendation system that recommends exercise activities to new users of an mHealth application. The proposed recommendation system is based on a deep recurrent neural network that takes advantage of users' profiles and exercise characteristics as features and temporal attention mechanisms. However, one major difference between exercise activities and other domains is that the recommendation system is not able to collect users' feedback. In movie, e-commerce, and music applications, recommendation systems constantly receive feedback from users' click data. When a recommendation system suggests a new piece of music (or a movie) to a user, the user may click on the suggestion and listen to (or watch) it completely. The click and the duration of playback are used as feedback to fine-tune the recommendation system. Although users are able to provide ratings in these systems, they may choose not to do so. As a result, the systems use the click data, duration of playback, etc. to assess whether the recommended music (or movie) was a good recommendation or not.
However, when a recommendation system suggests an exercise activity to the user, the application is not able to collect click data because the application cannot observe the user, and the only way to collect feedback is if the user chooses to rate the exercise. In other words, the recommendation system doesn't know whether the user completed the exercise, i.e., whether it was a good recommendation. In some mHealth apps, users were asked to provide that information manually, but a large share of the data is missing because most users ignored the request. The problem is more challenging when the recommendation system faces a new user without any history. To address this issue, in this paper, we take advantage of a real-time expert-in-the-loop mechanism. Individuals trust expert personal trainers to provide them with exercise plans based on the experts' knowledge and experience. Our proposed system will leverage this trust to use experts' knowledge when the system is uncertain. As a result, more individuals will have access to exercise plans with expert personal trainers in the loop. Our proposed network will calculate the certainty of a new recommendation using the probability distribution of the marginal distance, the difference between the highest probability and the second highest probability of exercise classes in the output of the recommendation system. To quantify the certainty, we derive the probability distribution of the marginal distance from the probability distribution of the last layer of the recommender. Even though the marginal distance has been used in active learning, this is the first work to provide its probability distribution for statistical hypothesis testing.
The other challenge with most recommendation systems is the initialization of the recommendation system for new users. Much work has been devoted to addressing this issue, including meta-learning approaches. In this paper, we leverage the questionnaire filled in by users and their demographic information to find the existing users with similar interests and demographic information. Then, the global recommendation model is fine-tuned with the history of the similar users. Compared to other existing methods, this approach is less complex. Fig. 1 shows the overall architecture of the recommendation system.
The rest of this paper is organized as follows: Section II describes the mHealth data used in this study. The architecture of the proposed recommendation system is explained in Section III-A. Section III-B explains how we take advantage of the user profiles to initialize a recommendation system for new users. Section IV describes the proposed active learning procedure based on the distribution of the marginal distance. Section V is devoted to the experimental results and discussion.
A. Related Work
More recently, mHealth systems have found applications in different healthcare domains since the advent of smartphones. Dunsmuir et al. developed an mHealth system for diagnosis and management of pregnant women with pre-eclampsia. In another study, inter-pulse-interval security keys were used to authenticate entities for various mHealth applications. Schiza et al. proposed a unified framework for a national eHealth healthcare system for the European Union. In another work, WE-CARE, a mobile 7-lead ECG device, was developed to provide a 24/7 cardiovascular monitoring system. In a further study, an mHealth system was designed and developed to provide exercise advice to participants based on their Body Mass Index, Basal Metabolic Rate, and the energy used in each activity or sport, e.g., aerobic dancing, cycling, jogging, walking, and swimming. However, this work does not use machine learning algorithms.
On the other hand, recommendation systems have been used in e-commerce and online shopping for several years. The goal of recommendation systems is to recommend products that suit the consumers' tastes. Traditional recommendation systems have used collaborative filters to suggest products similar to those the consumers have purchased. With the advancements in deep learning algorithms, several studies have proposed deep learning-based recommendation systems. In one study, a multi-stack recurrent neural network (RNN) architecture is used to develop a recommendation system to suggest businesses on Yelp based on their reviews. Wu et al. used an RNN with long short-term memory (LSTM) units to predict future behavioral trajectories. A few studies have proposed exercise recommendation systems. Sami et al. used several independent variables to recommend various sports such as swimming using collaborative approaches. Ni et al. developed an LSTM-based model called FitRec for estimating a user's heart rate profile over candidate activities and then predicting activities on that basis. The model was tested against 250 thousand workout records with associated sensor measurements including heart rate.
In the health care domain, Yoon et al. developed a recommendation system for personalized clinical decision making. The system used electronic health records of different patients and their clinical decisions to recommend clinical decisions for new patients. In a recent work, support vector machines (SVMs), random forests, and logistic regression were used to recommend skin-health products based on genetic phenotypes of consumers. In a similar study, user contextual features and daily trajectories of steps over time were used to develop a recommendation system for planning hour-by-hour activity. The model classifies users into subgroups and recommends activities based on the history of similar users. That method doesn't use any time series modeling, e.g., recurrent neural networks, to learn patterns, and thus lacks the ability to generalize to more users.
In a related field, several works have been devoted to the cold-start problem in recommendation systems. The cold-start problem refers to new users whose historical data is not available to the recommendation system. Thus, the recommendation system is not able to accurately recommend items, e.g., music, movies, etc., to the new users. Meta-learning approaches try to learn a global model from all users to initialize the recommendation systems for new users. The shortcoming of these approaches is that they are not personalized in the beginning. The second approach is zero-shot, one-shot, or transfer learning methods. These approaches transfer a global model learned from other users' historic data to new users. These approaches perform well as classifiers, but they still have shortcomings as recommendation systems dealing with sequences of data. The proposed approach leverages experts (human personal trainers) to actively learn personalized exercise programs for new users. While many people use trainers to get recommendations for their daily exercise activities, the proposed approach leverages the expert knowledge to reach a broader group of users at a lower cost because the expert does not need to be continuously involved in recommending new exercises, just in the beginning when the system is uncertain about new users.
II. MHEALTH DATA
The data we use comes from the Konrad et al. mHealth experiment with DStress. DStress was developed to provide coaching on exercise and meditation goals for adults seeking to reduce stress. The purpose of the experiment was to test the efficacy of an adaptive daily exercise recommender (DStress-adaptive) against two alternative exercise programs in which the daily exercises changed according to fixed schedules (Easy-fixed and Difficult-fixed).
The DStress-adaptive recommender was a hand-engineered finite state machine. The transition rules are described in more detail in Konrad et al., but they implement a policy whereby, if a person successfully completes all three exercises assigned for a day, they advance to the next higher level of exercise difficulty. If they do not succeed at exercises or meditation activities, then they are regressed to exercises or meditation activities at an easier level of difficulty. The 44 exercises used in DStress and their difficulty ratings were obtained from three certified personal trainers (e.g., Wall Pushups, Standing Knee Lifts, Squats, and Burpees). The experiment took place over a 28-day period. In a given week, users encountered three kinds of days: Exercise Days (occurring on Mondays, Wednesdays, Fridays), Meditation Days (Tuesdays, Thursdays, Saturdays), and Rest Days (Sundays).
Fig. 1: The overall architecture of the proposed recommendation system with expert-in-the-loop. The exercise activities are recommended through a smartphone app, and their completion is collected from smartphones. The deep recommendation system is trained on the collected history data and its augmentation. The new recommender system is initialized for each new participant from the global trained model, and fine-tuned with similar users based on their profiles. At each time step, a new exercise is recommended to users. If the recommendation system is uncertain about the new recommendation, i.e., whether the user will complete the exercises or not, the recommendation system will ask the expert for correction.
72 adult participants (19-59 yr) were randomly assigned to three conditions with different 28-day goal progressions: (1) a DStress-adaptive condition using the adaptive coaching system in which goal difficulties adjusted to the user based on past performance, (2) an Easy-fixed condition in which the difficulty of daily goals increased at the same slow rate for all participants assigned to that condition, and (3) a Difficult-fixed condition in which the goal difficulties increased at a greater rate. Konrad et al. found that the DStress-adaptive condition produced significant reductions in self-reported stress levels compared to the Easy-fixed and Difficult-fixed goal schedules. The DStress-adaptive condition also produced superior rates of performing assigned daily exercise goals.
User Profile: A variety of pretest survey data were collected in the Konrad et al. study that provides our user data. These included (1) the Perceived Stress Scale (PSS), a 10-item psychometric scale assessing perceived stress over the past month, (2) the Depression, Anxiety, Stress Scale (DASS), a 21-item assessment of depression, anxiety, and stress, (3) the Cohen-Hoberman Inventory of Physical Symptoms (CHIPS), a 33-item scale measuring concerning physical symptoms over the past 2 weeks, (4) BMI, the Body Mass Index, (5) the Godin Leisure-Time Exercise Questionnaire (GLTEQ), a 4-item scale measuring frequency of physical activity during leisure time, and (6) the Exercise Self-Efficacy Scale (EXSE), an 8-item assessment of self-efficacy about exercising in the next 1-8 weeks. The Perceived Stress Scale (PSS) is the most widely used instrument for measuring the perception of stress. The Depression Anxiety Stress Scale is a valid, reliable instrument measuring depression, anxiety, and stress. The Cohen-Hoberman Inventory of Physical Symptoms (CHIPS) measures concerns about physical symptoms over the past 2 weeks and was included because stress is often manifested in such symptoms. The Godin Leisure-Time Exercise Questionnaire (GLTEQ; 4-item test) was included to assess pre-experimental activity levels. The Exercise Self-Efficacy Scale (EXSE) is an 8-item test assessing individuals' beliefs in their ability to exercise.
Exercise Profile: The original Konrad et al. study obtained difficulty ratings of the exercises from three subject matter experts (SMEs; personal trainers) that were predictive of the probability of performing the exercises. We augmented these data with exercise classification, attributes, and relations obtained from a 60-minute structured interview with an SME (fitness coach) that was followed up with specific clarification questions. The structured interview consisted of a card-sorting task and an exercise program planning task.
For the card-sorting task, exercise names and descriptions were placed on 3x5 cards. The cards were shuffled and the SME was asked to go through the cards to familiarize herself with the exercises. The SME was asked to sort the exercises into piles by whatever criteria "seemed natural." The SME was asked to label those piles. These original piles were grouped into super-categories and labeled. Then the original piles were sorted into subcategories and labeled recursively until no further subgroups made sense to the SME. The SME was then asked if there was a possible "alternative grouping" of the exercises. The SME was also asked to rate the difficulty of each exercise. This card sorting produced an initial hierarchical classification of the exercises into a top level of resistance exercises and metabolic conditioning exercises. Within those supercategories there were subcategories for push, pull, squats, lunges, single leg stance, and core exercises. Further subcategories consisted of back, chest, legs, legs/glutes, core/abs, and abs/glute exercises. The alternative grouping consisted of full body, compound, and power categories.
III. PROPOSED RECOMMENDATION SYSTEM
The goal of this paper is to recommend the next exercise for a given participant based on the history of exercises the participant completed. Let $X^i(0), X^i(1), \ldots, X^i(T)$ represent the exercise history for the ith participant, where $X^i(\cdot) \in \mathbb{R}^{N \times 1}$ is the one-hot encoding of the exercise and N is the total number of exercises (N = 44 in the mHealth dataset). Let $U^i \in \mathbb{R}^{N_u \times 1}$ and $E^j \in \mathbb{R}^{N_e \times 1}$ represent the ith participant's profile and the jth exercise's profile, respectively. The recommender gets $X^i(t-\omega), \ldots, X^i(t-1)$ as the input in addition to the ith user's profile and the exercises' profiles and predicts $X^i(t)$ to recommend to the user. The length of the window, $\omega$, is determined by the developer. In this paper, we use the autocorrelation function (ACF) of X to determine the appropriate window length, similar to statistical time series analysis methods.
A. Network Architecture
The proposed network consists of five modules: encoder, decoder, recurrent neural network (RNN), user attention, and exercise temporal attention. The hyper-parameters of this architecture are selected empirically.
Fig. 2: The architecture and modules of the proposed deep recurrent neural network. The recommendation system uses user and exercise profiles as attention mechanisms. The attention mechanism will highlight the most relevant characteristics of the exercises to each user.
Exercise input: The encoder is a fully connected (FC) linear layer that embeds the N-dimensional exercise names onto a $K_X$-dimensional vector space:
$H_X^i(t) = \mathrm{ReLU}(W_X \times X^i(t))$, (1)
where $H_X^i(t) \in \mathbb{R}^{K_X \times 1}$ is the exercise name's embedding at time t for the ith participant, $W_X \in \mathbb{R}^{K_X \times N}$ is a trainable weight matrix learned from the training data, and ReLU is the activation function.
User Profile input: The profile of the ith user $U^i \in \mathbb{R}^{N_u \times 1}$ is provided as an $N_u$-dimensional vector. The user profile is embedded onto a $K_u$-dimensional vector space:
$H_u^i = \mathrm{ReLU}(W_u \times U^i)$, (2)
where $H_u^i \in \mathbb{R}^{K_u \times 1}$ is the embedding of the ith user's profile, and $W_u \in \mathbb{R}^{K_u \times N_u}$.
Users' profiles, e.g., demographic information, provide valuable information about paying attention to specific aspects and features of exercises. For example, age and gender are two important variables that can significantly affect the types of exercises users are likely to perform. The attention mechanism will highlight the features of the exercise (extracted by the neural network) that are more relevant to the current user based on their demographic information. Thus, we combine the exercise name embedding and the user profile embedding using:
$p_u(t) = \mathrm{softmax}(W_1 \times H_X^i(t) \oplus W_2 \times H_u^i)$, (3)
where $W_1 \in \mathbb{R}^{K \times K_X}$ and $W_2 \in \mathbb{R}^{K \times K_u}$ are trainable mapping matrices, and $p_u(t) \in \mathbb{R}^{K \times 1}$ is the combination vector of the exercise and the user profiles, which provides the attention probability for each entry of the input exercise. The final vector is the element-wise multiplication of the exercise name embedding and the attention probability: $\psi^i(t) = p_u(t) \odot (W_1 \times H_X^i(t))$. The exercise name embedding $H_X^i$ is a vector in a $K_X$-dimensional space, and multiplying it with the attention probability $p_u(t)$ rotates this vector in the $K_X$-dimensional space.
Exercise Profile input and temporal attention mechanism: Similarly, the profile of the jth exercise is provided as an $N_e$-dimensional vector $E^j \in \mathbb{R}^{N_e \times 1}$ and embedded using a linear layer:
$H_e^j(t) = \mathrm{ReLU}(W_e \times E^j(t))$, (4)
where $H_e^j(t) \in \mathbb{R}^{K_e \times 1}$ is the embedding of the jth exercise, and $W_e \in \mathbb{R}^{K_e \times N_e}$ is the trainable matrix. Note that we use (t) after the exercise profile, as with the exercise names X(t), to represent the time information.
At each time step, the user performs an exercise that has a large impact on the future exercises the user wants to complete. For example, a user who is doing exercises focused on the upper body may want to continue upper-body exercises for a day and then focus on lower-body exercises on another day. In another example, the difficulty of the current exercise affects the difficulty level of the short-term future exercises. To incorporate this information into the recommendation system, we use the exercise profiles as an attention mechanism to give weights to different exercises at different time steps. The temporal attention mechanism assigns probability values to different time steps within the $\omega$-length window. The exercise embeddings are combined with $\psi^i(t)$:
$p_e(t) = \mathrm{softmax}(W_3 \times \psi^i(t) \oplus W_4 \times H_e^j(t))$, (5)
where $W_3 \in \mathbb{R}^{\omega \times K}$ and $W_4 \in \mathbb{R}^{\omega \times K_e}$ are trainable mapping matrices, and $p_e(t) \in \mathbb{R}^{\omega \times 1}$ is the combination vector of the user-attention exercise and exercise profiles giving the attention probability for each time step. The final time series of vectors with length $\omega$ used as the input to the RNN module is: $\phi^i(t-k) = p_e(k) \times \psi^i(t-k)$ for $k \in \{1, 2, \ldots, \omega\}$.
RNN: The RNN module is responsible for learning the sequential pattern of the exercise history. Although there are different variants of RNN modules with long short-term memory (LSTM) and gated recurrent units (GRU), we use regular RNN modules. The reason is that LSTMs and GRUs have built-in units to learn the dependencies of time series over time and accentuate or de-emphasize relevant information at different time steps. In this paper, we use a temporal attention mechanism that takes into account the importance of different time steps. Moreover, our autocorrelation function (ACF) analysis of this dataset shows short-term dependencies for exercise recommendation. However, in developing recommendation systems for datasets with long-term dependencies, we will replace regular RNNs with RNNs with LSTM or GRU units. The RNN module consists of one hidden layer with a ReLU activation function:
$h(t) = \mathrm{ReLU}(W_\phi \times \phi^i(t) + W_h \times h(t-1))$, (6)
where $W_\phi$ and $W_h$ are trainable weights and $h(t)$ is the hidden state at time t.
Exercise name prediction: The decoder converts the predicted hidden state at time t into exercise names:
$Y^i(t) = \mathrm{softmax}(W_y \times h(t))$, (7)
where $W_y$ is a trainable matrix and $Y^i(t) \in \mathbb{R}^{N \times 1}$ is the multinomial distribution over N exercises.
Training: The whole network is trained end-to-end with the cross-entropy loss function:
$\min\; -\mathbb{E}_{X(t) \sim p_{data}}\left[\log P(Y(t) \mid X(t-1), \ldots, X(t-\omega), U, E)\right]$ (8)
All modules are differentiable. Thus, the gradient can back-propagate through the decoder, the recurrent layers through time, and the user and temporal attention modules. The Adam optimizer is used for training the network.
B. New User Initialization
The recommendation system is trained with a set of training data collected from different users. The system is general and not personalized for new users. To personalize the recommender, we fine-tune our network with the training data of the users whose profiles are very similar to the new user. To achieve this goal, we calculate the similarity between $U^i$ and the existing users by the Euclidean distance, $\|U^i - U^j\|_2$ for $i \neq j$, to find the most similar users. Then, users are sorted from the most similar to the least similar, and the k most similar users are selected. Afterwards, the recommender system is fine-tuned with the training data of the selected users, using a small learning rate and one epoch.
IV. ACTIVE LEARNING
Active learning is used to ask experts to provide annotations for unlabeled samples. However, it is impossible to ask experts to annotate a large set of unlabeled samples. The active learner is usually presented with a limited budget to ask experts for annotation. The active learner selects the most informative samples based on the uncertainty that a trained classifier has about these samples. The two most common methods of measuring the uncertainty are entropy and marginal distance.
In this paper, we use active learning to personalize the recommender for each user. The recommender is trained on a set of training data from different people. Thus, the recommender is not tailored for new users. The profile of the users provides the attention mechanism in the feature space, but this is not enough to personalize the recommender. When we get a new user, the recommender will be used. However, when the recommender is uncertain about the next recommended exercise, the recommender will ask the expert to intervene and provide the next recommendation. We use the marginal distance and entropy of the series of recommendations as criteria to decide when to ask the expert. However, the existing work has set an arbitrary threshold on the marginal distance. In this paper, we derive the probability distribution function of the marginal distance.
A. Marginal Distance Random Variable
The output of the classifier (the last layer of the recommender) is a vector of N random variables, $Y = [y_1, y_2, \ldots, y_N]^T$, with a Multinomial distribution. Ideally, one of the $y_i$s is one and the rest are zeros. However, the N random variables represent the probability of the input sample belonging to each of these classes. Let $p = [p_1, p_2, \ldots, p_N]^T$ represent the probability values. $p_i$ has a Beta distribution and the vector p is drawn from a Dirichlet distribution. These N random variables are sorted in ascending order $y_{(1)} \leq y_{(2)} \leq \ldots \leq y_{(N)}$ and the marginal distance is defined as $Z = y_{(N)} - y_{(N-1)}$. The distribution of the marginal distance is (see Appendix A for proof):
$f(z) = \frac{\Gamma(\alpha_1 + \alpha_2 + \alpha_3)}{\Gamma(\alpha_1)\Gamma(\alpha_2)\Gamma(\alpha_3)} \int_{y_{(N-1)}=0}^{\frac{1-z}{2}} (y_{(N-1)} + z)^{(\alpha_1 - 1)}\, y_{(N-1)}^{(\alpha_2 - 1)}\, (1 - 2y_{(N-1)} - z)^{(\alpha_3 - 1)}\, dy_{(N-1)}$ (9)
The Monte Carlo method will be used to approximate this distribution.
B. Active Learning Procedure
The key component in active learning is to determine when to ask the expert for feedback. The recommender system needs to ask for the expert opinion when the input sample is out of the distribution of the training dataset. Let $D_s$ represent the distribution of the training dataset. We use $D_s$ to derive the probability distribution function of the marginal distance as defined in § IV-A. Let $D_M$ represent this distribution, and $z^i(t)$ represent the marginal distance for the ith user at time t. Then, our hypothesis is:
$H_0: z^i(t) \in D_M$
$H_1$: otherwise
The $\alpha$-level hypothesis test determines whether the recommended exercise should be given to the user or whether to ask the expert for the right exercise recommendation (feedback or label). Then, the trained recommender is fine-tuned with the feedback from the expert for personalization.
V. EXPERIMENTAL RESULTS
In this section, the proposed approach with different components is evaluated on the offline mHealth dataset. The proposed architecture is also evaluated on the MovieLens100K dataset.
A. mHealth
This section is devoted to the evaluation of the proposed exercise recommendation system on an exercise activity dataset. The dataset was described comprehensively in Section II.
1) Data Augmentation: The challenge with recommendation systems is not having enough training data. One way to increase the amount of training data is to use data augmentation. While data augmentation in computer vision is straightforward and can be achieved by adding noise or cropping images randomly, it requires careful attention for sequential and symbolic data, e.g., language. The random creation of sequential data has a negative effect as it introduces noise to the data that does not follow the sequential pattern in the real data.
For exercise activities, there are two ways to augment the sequential data: asking the human expert or using association mining on the training data. In the first approach, an exercise expert was asked to categorize the different exercises available to participants. Then, the augmentation algorithm goes over the sequence of exercises for each participant in the training data, chooses 10% of the exercises and replaces them with similar exercises in the same category.
In the second approach, we propose to use association rule mining to extract rules from frequent itemsets. Then, the augmentation algorithm goes over the sequence of exercises of each participant, chooses 10% of the exercises and replaces them with similar exercises based on these rules.
2) Experiment Setup and Discussion: In order to find the appropriate window length for the RNN model, we looked at the autocorrelation function of the exercise sequence. The autocorrelation function shows the degree of the dependencies of time series data and is often used to select the order of time series analysis methods. Our ACF analysis shows that the length of the sequence for the RNN model should be w = 3. Processing the data with w = 3 results in 2343 training sequences. The architecture of the proposed recommendation system is selected empirically as follows: $K_X = 20$, $K_u = 15$, $K_e = 3$, and $K = 10$.
Baseline: We use the proposed RNN model with the user profile attention mechanism and the exercise profile temporal attention mechanism (described in § III-A) as our baseline method. The model is trained with the cross-entropy loss function and the Adam optimizer for 30 epochs. We use a k-fold approach for evaluating the performance of the recommender. In our k-fold setup, we kept one participant out of the training dataset and used the remaining 71 participants' data for the training. Then, the left-out participant is treated as a new participant and the trained recommender is used to recommend exercises to the new participant. The actual data from the left-out participant is used as the ground truth to calculate the accuracy of the recommendation system. We calculate the top-1, top-5, and top-10 accuracy for evaluating the recommendation system. The experiment is repeated 5 times and the averages of the top-k accuracy are reported in Table I. In the first experiment, we only use the demographic information of users in the attention mechanism of the proposed recommendation system. Table I row 1 shows the accuracy. In the second experiment, we used all information extracted from questionnaires in addition to the demographic information for the attention mechanism, and observed a slight decline in the accuracy (Table I row 2). We hypothesize that most people don't have an accurate evaluation of their own ability. Thus, the answers to the questionnaires may not accurately represent their profiles, leading to a conflict between the exercises they performed and what they answered. For example, two persons may give exactly the same answers to the questions (possibly not accurate answers), but have different ability and interest to perform exercises. In the rest of this paper, we only use demographic information to represent the user's profile.
We compared the baseline model with user demographic information against a slightly different network architecture to study the effect of the architecture. In the new architecture, we reduced the dimension of the RNN module by half, and we also reduced the dimension of the output module. The top-1, top-5, and top-10 accuracies are 60.87%, 90.55%, and 94.71%. Compared to row 1 of Table I, the alternative architecture performs worse. The original structure proposed in Section III-A has the best performance for the mHealth data and is selected empirically.
Baseline with Data Augmentation: The training dataset was augmented with the two approaches described in § V-A1. The training and augmented data were used to train the baseline model, which was evaluated as described in the baseline section. Table I row 3 shows the accuracy results of the baseline model trained with the training data and the expert-augmented data. The data augmentation generalizes the model and improves the accuracy compared to the baseline (Table I row 1).
On the other hand, the top-1 accuracy is lower when the baseline model is trained on the training dataset plus data augmented by association rule mining algorithms (Table I row 4) than with the expert-augmented data (Table I row 3). This observation points to the importance of having the expert in the loop for exercise recommendation systems. Because in this experiment the augmented dataset with expert knowledge gives higher accuracy, we use this method for the other experiments in this paper.
TABLE I: The accuracy of the recommendation system.
Method                                        Top-1 Accuracy   Top-5 Accuracy   Top-10 Accuracy
1. Baseline (Demographic)                     63.78%           92.19%           97.33%
2. Baseline (Full Profile)                    61.16%           90.76%           96.98%
3. Baseline + Data Augmentation (Expert)      72.53%           95.53%           98.56%
4. Baseline + Data Augmentation (Rule based)  69.74%           95.28%           98.68%
5. Baseline (Demographic)                     74.45%           95.29%           98.48%
Baseline with Active Learning: The training data was used to estimate the parameters of the Dirichlet distribution: $[y_{(N)}, y_{(N-1)}, y]^T \sim D(1.59, 0.42, 0.31)$. The marginal distribution is calculated numerically, and the $\alpha = 0.01$-level hypothesis test results in:
$H_0: z^i(t) \geq 0.18$
$H_1: z^i(t) < 0.18$
where $P(z^i(t) \geq 0.18) > 1 - \alpha$. During the test, if $z^i(t)$ falls below 0.18, then the recommender asks the expert for feedback and fine-tunes the network with the feedback. Because we are evaluating the proposed model on a dataset collected in the past, we cannot ask the expert for feedback in our evaluation.
Therefore, whenever z i (t) falls bellow 0.18, + Active Learning we take the actual exercise the test participant performed at 6. Baseline (Demographic) 65.91% 93.14% 97.65% + New user Init that time step and provide it as the feedback by an expert to 7. Baseline + Data Aug- our active learning algorithm (Table I row 5). Comparing the mentation (Expert) + Active 80.12% 97.23% 99.26% results with our baseline model, the top-1 accuracy is increased Learning by 10%. The increased in accuracy is the results of fine-tuning 8. Baseline + Data Augmen- tation (Expert) + New user 71.90% 95.27% 98.42% the recommender system with the feedback received from the Init expert, which makes the recommendation system personalized. 9. Baseline + Data Augmen- tation (Expert) + New user 80.08% 97.00% 99.11% Baseline with New User Initialization: We calculated the Init + Active Learning pairwise similarity across all participants. For each new par- ticipant, we fine-tuned the recommendation system with the TABLE II: Comparison of the accuracy across different rec- training data of the three most similar participants based on ommendation systems for the mHealth data. their profiles. The recommendation system is initialized for top-1 top-5 top-10 the new user by the fine-tuned network. We didn’t use data Method Accuracy Accuracy Accuracy augmentation and active learning in this experiment just to examine the effect of new user initialization. We see a minor GRU4REC 7.89% 22.74% 40.04% Pooling 8.83% 41.16% 50.38% improvement in top-1 accuracy (Table I row 6). However, we CNN 12.97% 33.83% 53.00% believe that when it comes to more diverse participants, e.g., Mixture 5.45% 20.48% 40.41% different age groups, race, ethnicity, the proposed initialization Baseline (Demographic) 63.78% 92.19% 97.33% strategy improve the accuracy significantly. 
Baseline with Data Augmentation and Active Learning: In this experiment, we combined the data augmentation and To have a fair comparison, the baseline model with user active learning as we hypothesise that the accuracy will demographic is used in this experiment. Table II shows the improve. Table I row 7 shows the results. As we expected, Top-k accuracy for different models. The proposed baseline the accuracy has improved with respect to only active learning model outperforms the existing state-of-the-art in top-1, top- (Table I row 5) and only augmentation (Table I row 3). 5, top-10 accuracy. The best performance after the proposed baseline method in Table II belongs to the CNN method. Baseline with Data Augmentation, New User Initialization, However, the performance of the CNN method is not as good and Active Learning: In the last experiment, we combined as the proposed method. The superior performance of the all modules. Table I row 9 represents the results indicating proposed baseline model extends to the other variations of a very minor decline in accuracy by adding the new user the proposed model, e.g. the baseline with active learning. initialization procedure (compared to Table I row 7). More study with different questionnaires, medical records, etc. may lead to increase in the accuracy of the new user initialization B. MovieLens method. In the second experiment, the proposed method is applied to Comparison with the state-of-the-art: The proposed model the MovieLens100K dataset and compared with the state is evaluated against some of the well-known sequential rec- of the art sequential recommendation systems , , , ommendation systems , , , †. Because the. The dataset contains 100,000 user IDs, movie IDs, the sequential models, including the proposed method, use the user ratings of movies, and timestamps. We group the dataset history of items for each users to propose the next item, by the user IDs and sorted them with their timestamps. 
they cannot be compared directly with matrix factorization 1) Experiment Setup and Discussion: Our proposed rec- approaches, e.g. SVD. The factorization approaches estimate ommendation system was developed specifically for recom- the rating of different items without explicitly recommending mending exercises, when the number of exercises is limited the next item. However, the sequential models take into (unlike the number of movies). However, the proposed method account the order of items in the history and recommend the is evaluated on the MovieLens dataset to show its strength in next item without estimating the user’s rating. scaling to other applications. To this end, we pre-processed the dataset and limited the number of movies in this dataset to the † Implemented in. 100 most watched movies in Movielens100K. Thus, the new IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022 8 TABLE III: Comparison of the accuracy across different recommendation systems for a subset of MovieLens100K. top-1 top-5 top-10 Method Accuracy Accuracy Accuracy GRU4REC 2.5% 12.27% 21.55% Pooling 3.93% 18.91% 39.66% CNN 2.26% 10.91% 20.23% Mixture 2.47% 11.37% 21.34% Baseline (Demographic) 5.90% 26.38% 46.32% (a) (b) Fig. 3: (a) The valid surface of y(N ) , y(N −1) , y variables with subset of MovieLens100K is more comparable to the exercises the Dirichlet distribution. (b) The integral area projected on activity dataset when the number of items are limited. y(N ) − y(N −1) plane. Similar to the mHealth dataset, we selected the window length of w = 3 resulted in 29931 sequence samples, and are split to 80% training and 20% test samples. The architecture of distance is defined as M = y(N ) − y(N −1). When the value the proposed recommendation system is selected empirically of the marginal distance falls below a given threshold θm , the as follows: KX = Ku = 1000, Ke = 3, and K = 300. recommender asks the human expert for labeling. 
While in Table III shows top-1, top-5, and top-10 accuracy of different prior works the threshold was determined by users, we define recommendation systems. In recommending new movies in the the probability distribution of the marginal distance. The out- subset of MovieLens100K, the proposed method surpasses the put variable is drawn from a Dirichlet probability distribution accuracy of the state of the art methods. After the proposed function, D(α1 , α2 ,... , αN ). In the DirichletPdistribution, the method, the pooling method comes second with accuracy N amount of the variables sum up to one: k=1 y(k) = 1. less than the proposed baseline method. In the marginal distance, we are only interested in y(N ) and y(N −1). To simplify the calculation, we define a new VI. C ONCLUSION PN −2 random variable y = k=1 y(k). Note that the marginal In this paper, a physical exercise recommendation system distribution of y(k s are Beta distribution and the summation was developed. The main challenge in developing recom- of several Beta distribution, y, is a Beta distribution. The mendation systems is to make them personalized, especially joint distribution of Y = [y(N ) , y(N −1) , y]T is a Dirichlet for new users when the training dataset does not exist. An- distribution, D(α1 , α2 , α3 ),: alyzing the outcomes of the experimental results indicates the importance of user and exercise profiles as attention Γ( P αk ) (α1 −1) (α2 −1) (α3 −1) mechanisms.Thus, the developed system took advantage of f (y(N ) , y(N −1) , y) = Q k y(N ) y(N −1) y k Γ(α k) the health and demographic questionnaires filled out by users (10) before joining the program to use as attention mechanism. where 0 ≤ y(N ) , y(N −1) , y < 1, and y(N ) + y(N −1) + y = 1. In spite of all fine tuning and the attention mechanism, The parameters α1 , α2 and α3 are estimated from the training the experimental results show that the perfect personalization dataset using Maximum Likelihood approach ,. cannot be achieved. 
While constant feedback from the user The probability distribution of the marginal distance is interaction can be collected in e-commerce, music and movie derived from transforming y(N ) and y(N −1) based on Z = recommendation systems, having an expert in the loop is y(N ) − y(N −1). Because the support of y(N ) , y(N −1) , y is inevitable in exercise recommendation systems because we restricted to the hyper-plane defined by y(N ) +y(N −1) +y = 1, don’t know whether the user performed the exercise or not the third argument, y, is known given y(N ) and y(N −1). Fig. 3a (while in e-commerce the user either buys the product or shows the y(N ) + y(N −1) + y = 1 hyper-plane. Because listen completely to a music on a music platform). The expert the support of y(N ) , y(N −1) , y is restricted and the value of knowledge provides information when the recommender is y depends on y(N ) and y(N −1) , we can just visualize the uncertain and fails to make accurate predictions, which is why projection of the hyperplane on the y(N ) − y(N −1) plane. a real time active learning mechanism in conjunction with This makes defining the boundaries of the integrals easier. our developed recommendation systems showed a significant Fig. 3b shows this projection with y(N ) > y(N −1) and increase in the accuracy of the system. Z = y(N ) − y(N −1) lines. In the first step, we derive the cumulative probability distribution F (z) = P (Z ≤ z), as in ACKNOWLEDGMENT Eq. 11. Then, to get the probability distribution function, we The authors would like to thank Dr. Choh Man Teng for take the derivatives from both sides with respect to z to obtain assisting with the expert studies. Eq. 12. Thus, we get the probability distribution function the marginal distance as in Eq. 13. 
We calculate f (z) numerically A PPENDIX A by approximating the integral with a summation for different P ROBABILITY D ISTRIBUTION OF THE Marginal Distance value of z, even though calculation of the closed form is not In marginal distance, the output variables are sorted in impossible. ascending order as y(1) ≤ y(2) ≤... ≤ y(N ). The marginal IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022 9 1−z Z 2 Z y(N −1) +z P (y(N ) − y(N −1) ≤ z) = f (y(N ) , y(N −1) , 1 − y(N ) − y(N −1) )dy(N ) dy(N −1) y(N −1) =0 y(N ) =y(N −1) Z 0.5 Z y(N −1) +z + f (y(N ) , y(N −1) , 1 − y(N ) − y(N −1) )dy(N ) dy(N −1) (11) y(N −1) = 1−z 2 y(N ) =y(N −1) 1+z −1 1−z 1−z Z 2 f (z) = f (y(N ) , , 1 − y(N ) − )dy(N ) 2 y(N ) = 1−z 2 2 2 Z 1−z Z 1+z 2 1 2 1−z 1−z + f (y(N −1) + z, y(N −1) , 1 − 2y(N −1) − z)dy(N −1) + f (y(N ) , , 1 − y(N ) − )dy(N ) y(N −1) =0 2 1−z y(N ) = 2 2 2 Z 1−z 2 = f (y(N −1) + z, y(N −1) , 1 − 2y(N −1) − z)dy(N −1) (12) y(N −1) =0 Z 1−z Γ(α1 + α2 + α3 ) 2 (α1 −1) (α2 −1) (α3 −1) f (z) = (y(N −1) + z) y(N −1) (1 − 2y(N −1) − z) dy(N −1) (13) Γ(α1 )Γ(α2 )Γ(α3 ) y(N −1) =0 A PPENDIX B A. Konrad, V. Bellotti, N. Crenshaw, S. Tucker, L. Nelson, H. Du, C OMPUTATIONAL C OST P. Pirolli, and S. Whittaker, “Finding the adaptive sweet spot: Balancing compliance and achievement in automated stress reduction,” in Pro- ceedings of the 33rd Annual ACM Conference on Human Factors in In this section ,the computational cost of different modules Computing Systems, 2015, pp. 3829–3838. are calculated. The computational cost of the exercise input H. Bharadhwaj, “Meta-learning for user cold-start recommendation,” module is O(N ) because N > Kx. The computational cost in 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 2019, pp. 1–8. of the User Profile input module is O(NU ), and that of the D. T. Dunsmuir, B. A. Payne, G. Cloete, C. L. Petersen, M. Görges, Exercise Profile input is O(Ne ). The computational cost of the J. Lim, P. Von Dadelszen, G. A. 
Dumont, and J. M. Ansermino, RNN module is O(K ω ). K is smaller than N but not a lot “Development of mhealth applications for pre-eclampsia triage,” IEEE journal of biomedical and health informatics, vol. 18, no. 6, pp. 1857– smaller. So the overall computational cost is dominated by the 1864, 2014. RNN module. Thus, the overal computational cost is O(K ω ). R. M. Seepers, C. Strydis, I. Sourdis, and C. I. De Zeeuw, “Enhancing heart-beat-based security for mhealth applications,” IEEE journal of biomedical and health informatics, vol. 21, no. 1, pp. 254–262, 2015. E. C. Schiza, T. C. Kyprianou, N. Petkov, and C. N. Schizas, “Proposal for an ehealth based ecosystem serving national healthcare,” IEEE R EFERENCES journal of biomedical and health informatics, vol. 23, no. 3, pp. 1346– 1357, 2018. A. Huang, C. Chen, K. Bian, X. Duan, M. Chen, H. Gao, C. Meng, W. T. Riley, W. J. Nilsen, T. A. Manolio, D. R. Masys, and M. Lauer, Q. Zheng, Y. Zhang, B. Jiao, et al., “We-care: an intelligent mobile “News from the nih: potential contributions of the behavioral and social telecardiology system to enable mhealth applications,” IEEE journal of sciences to the precision medicine initiative,” Translational behavioral biomedical and health informatics, vol. 18, no. 2, pp. 693–702, 2013. medicine, vol. 5, no. 3, pp. 243–246, 2015. P. Wuttidittachotti, S. Robmeechai, and T. Daengsi, “mhealth: A de- K. E. Thorpe, “The future costs of obesity: National and state estimates sign of an exercise recommendation system for the android operating of the impact of obesity on direct health care expenses,” A collaborative system,” Walailak Journal of Science and Technology (WJST), vol. 12, report from United Health Foundation, the American Public Health no. 1, pp. 63–82, 2015. Association and Partnership for Prevention, 2009. S. Reddy, S. Nalluri, S. Kunisetti, S. Ashok, and B. Venkatesh, “Content- “noom,” www.noom.com, 2021. 
based movie recommendation system using genre correlation,” in Smart P. Pirolli, G. M. Youngblood, H. Du, A. Konrad, L. Nelson, and Intelligent Computing and Applications. Springer, 2019, pp. 391–397. A. Springer, “Scaffolding the mastery of healthy behaviors with fittle+ M. Khoali, A. Tali, and Y. Laaziz, “Advanced recommendation systems systems: Evidence-based interventions and theory,” Human–Computer through deep learning,” in Proceedings of the 3rd International Confer- Interaction, vol. 36, no. 2, pp. 73–106, 2018. ence on Networking, Information Systems & Security, 2020, pp. 1–8. S. Michie, R. West, R. Campbell, J. Brown, and H. Gainforth, “Abc C. Panagiotakis, H. Papadakis, A. Papagrigoriou, and P. Fragopoulou, of behaviour change theories: an essential resource for researchers,” “Improving recommender systems via a dual training error based cor- Policy Makers and Practitioners. Silverback IS: Silverback Publishing, rection approach,” Expert Systems with Applications, p. 115386, 2021. vol. 402, 2014. F. Fessahaye, L. Perez, T. Zhan, R. Zhang, C. Fossier, R. Markarian, G. Linden, B. Smith, and J. York, “Amazon. com recommendations: C. Chiu, J. Zhan, L. Gewali, and P. Oh, “T-recsys: A novel music rec- Item-to-item collaborative filtering,” IEEE Internet computing, vol. 7, ommendation system using deep learning,” in 2019 IEEE international no. 1, pp. 76–80, 2003. conference on consumer electronics (ICCE). IEEE, 2019, pp. 1–6. A. V. Bodapati, “Recommendation systems with purchase data,” Journal D. Ravı̀, C. Wong, F. Deligianni, M. Berthelot, J. Andreu-Perez, B. Lo, of marketing research, vol. 45, no. 1, pp. 77–93, 2008. and G.-Z. Yang, “Deep learning for health informatics,” IEEE journal J. Bennett, S. Lanning, et al., “The netflix prize,” in Proceedings of KDD of biomedical and health informatics, vol. 21, no. 1, pp. 4–21, 2016. cup and workshop, vol. 2007. New York, NY, USA., 2007, p. 35. D. Z. Liu and G. Singh, “A recurrent neural network based recommen- Y. 
Song, S. Dixon, and M. Pearce, “A survey of music recommendation dation system,” tech. rep., Stanford University, 2016. systems and future perspectives,” in 9th International Symposium on C.-Y. Wu, A. Ahmed, A. Beutel, A. J. Smola, and H. Jing, “Recurrent Computer Music Modeling and Retrieval, vol. 4. Citeseer, 2012, pp. recommender networks,” in Proceedings of the tenth ACM international 395–410. conference on web search and data mining, 2017, pp. 495–503. IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022 10 A. Sami, R. Nagatomi, M. Terabe, and K. Hashimoto, “Design of S. Yu, J. Yang, D. Liu, R. Li, Y. Zhang, and S. Zhao, “Hierarchical data physical activity recommendation system.” in IADIS European Conf. augmentation and the application in text classification,” IEEE Access, Data Mining, 2008, pp. 148–152. vol. 7, pp. 185 476–185 485, 2019. J. Ni, L. Muhlstein, and J. McAuley, “Modeling heart rate and activity K. Kafle, M. Yousefhussien, and C. Kanan, “Data augmentation for data for personalized fitness recommendation,” in The World Wide Web visual question answering,” in Proceedings of the 10th International Conference, 2019, pp. 1343–1353. Conference on Natural Language Generation, 2017, pp. 198–202. J. Yoon, C. Davtyan, and M. van der Schaar, “Discovery and clinical de- P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to data mining. cision support for personalized healthcare,” IEEE journal of biomedical Pearson Education India, 2016. and health informatics, vol. 21, no. 4, pp. 1133–1145, 2016. B. Hidasi, A. Karatzoglou, L. Baltrunas, and D. Tikk, “Session- X. Liu, C.-H. Chen, M. Karvela, and C. Toumazou, “A dna-based based recommendations with recurrent neural networks,” arXiv preprint intelligent expert system for personalised skin-health recommendations,” arXiv:1511.06939, 2015. IEEE journal of biomedical and health informatics, vol. 24, no. 11, pp. P. Covington, J. Adams, and E. Sargin, “Deep neural networks for 3276–3284, 2020. 
youtube recommendations,” in Proceedings of the 10th ACM conference Z. Li, S. Das, J. Codella, T. Hao, K. Lin, C. Maduri, and C.-H. Chen, on recommender systems, 2016, pp. 191–198. “An adaptive, data-driven personalized advisor for increasing physical N. Kalchbrenner, L. Espeholt, K. Simonyan, A. v. d. Oord, A. Graves, activity,” IEEE journal of biomedical and health informatics, vol. 23, and K. Kavukcuoglu, “Neural machine translation in linear time,” arXiv no. 3, pp. 999–1010, 2018. preprint arXiv:1610.10099, 2016. H. Lee, J. Im, S. Jang, H. Cho, and S. Chung, “Melu: Meta-learned user M. Kula, “Mixture-of-tastes models for representing users with diverse preference estimator for cold-start recommendation,” in Proceedings interests,” arXiv preprint arXiv:1711.08379, 2017. of the 25th ACM SIGKDD International Conference on Knowledge ——, “Spotlight,” https://github.com/maciejkula/spotlight, 2017. Discovery & Data Mining, 2019, pp. 1073–1082. T. Minka, “Estimating a dirichlet distribution,” 2000. J. Li, M. Jing, K. Lu, L. Zhu, Y. Yang, and Z. Huang, “From zero- E. Suh, “Dirichlet python package,” https://github.com/ericsuh/dirichlet, shot learning to cold-start recommendation,” in Proceedings of the AAAI 2021. conference on artificial intelligence, vol. 33, no. 01, 2019, pp. 4189– 4196. A. Mahyari and T. Locker, “Robust predictive maintenance for robotics via unsupervised transfer learning,” in The International FLAIRS Con- ference Proceedings, vol. 34, 2021. S. Cohen, T. Kamarck, and R. Mermelstein, “A global measure of perceived stress,” Journal of health and social behavior, pp. 385–396, 1983. S. Cohen, “Perceived stress in a probability sample of the united states.” 1988. A. Coker, O. Coker, and D. Sanni, “Psychometric properties of the 21- item depression anxiety stress scale (dass-21),” African Research Review, vol. 12, no. 2, pp. 135–142, 2018. M. M. Antony, P. J. Bieling, B. J. Cox, M. W. Enns, and R. P. 
Swinson, “Psychometric properties of the 42-item and 21-item versions of the depression anxiety stress scales in clinical groups and a community sample.” Psychological assessment, vol. 10, no. 2, p. 176, 1998. A. C. McFarlane, M. Atchison, E. Rafalowicz, and P. Papay, “Physical symptoms in post-traumatic stress disorder,” Journal of psychosomatic research, vol. 38, no. 7, pp. 715–726, 1994. G. Godin, “The godin-shephard leisure-time physical activity question- naire,” The Health & Fitness Journal of Canada, vol. 4, no. 1, pp. 18–22, 2011. E. McAuley, “Self-efficacy and the maintenance of exercise participation in older adults,” Journal of behavioral medicine, vol. 16, no. 1, pp. 103– 113, 1993. P. Pirolli, “A computational cognitive model of self-efficacy and daily adherence in mhealth,” Translational behavioral medicine, vol. 6, no. 4, pp. 496–508, 2016. J. D. Hamilton, Time series analysis. Princeton university press, 2020. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014. H. H. Aghdam, A. Gonzalez-Garcia, J. v. d. Weijer, and A. M. López, “Active learning for deep detection neural networks,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 3672–3680. B. Zhang, L. Li, S. Yang, S. Wang, Z.-J. Zha, and Q. Huang, “State- relabeling adversarial active learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8756–8765. K. Wang, D. Zhang, Y. Li, R. Zhang, and L. Lin, “Cost-effective active learning for deep image classification,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 12, pp. 2591–2600, 2016. A. J. Joshi, F. Porikli, and N. Papanikolopoulos, “Multi-class active learning for image classification,” in 2009 ieee conference on computer vision and pattern recognition. IEEE, 2009, pp. 2372–2379. S. Gu, Y. Jiao, H. Tao, and C. Hou, “Recursive maximum margin active learning,” IEEE Access, vol. 
7, pp. 59 933–59 943, 2019. C. Forbes, M. Evans, N. Hastings, and B. Peacock, Statistical distribu- tions. Wiley Hoboken, 2011, vol. 4. F. M. Harper and J. A. Konstan, “The movielens datasets: History and context,” Acm transactions on interactive intelligent systems (tiis), vol. 5, no. 4, pp. 1–19, 2015. View publication stats
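The numerical evaluation of the marginal-distance density in Appendix A (Eq. 13), and the α-level threshold used in the active learning experiment, can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the integral is approximated with a midpoint sum, and the density is renormalised over [0, 1] before the threshold is read off, which is an assumption of this sketch.

```python
import math


def dirichlet3_pdf(y_top, y_second, y_rest, a1, a2, a3):
    """Three-component Dirichlet density (Eq. 10); zero outside the simplex."""
    if min(y_top, y_second, y_rest) <= 0.0:
        return 0.0
    log_norm = (math.lgamma(a1 + a2 + a3)
                - math.lgamma(a1) - math.lgamma(a2) - math.lgamma(a3))
    return math.exp(log_norm
                    + (a1 - 1.0) * math.log(y_top)
                    + (a2 - 1.0) * math.log(y_second)
                    + (a3 - 1.0) * math.log(y_rest))


def marginal_distance_pdf(z, a1, a2, a3, steps=200):
    """Eq. 13: integrate over y_(N-1) in [0, (1 - z) / 2] with a midpoint sum."""
    upper = (1.0 - z) / 2.0
    if z < 0.0 or upper <= 0.0:
        return 0.0
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y_second = (i + 0.5) * h  # midpoint of the i-th cell
        total += dirichlet3_pdf(y_second + z, y_second,
                                1.0 - 2.0 * y_second - z, a1, a2, a3)
    return total * h


def density_on_grid(a1, a2, a3, grid=200, steps=200):
    """Evaluate f(z) at the midpoints of `grid` equal cells on [0, 1]."""
    h = 1.0 / grid
    zs = [(i + 0.5) * h for i in range(grid)]
    dens = [marginal_distance_pdf(z, a1, a2, a3, steps) for z in zs]
    return zs, dens, h


def alpha_level_threshold(a1, a2, a3, alpha=0.01, grid=200, steps=200):
    """First grid point theta at which the normalised P(Z < theta) reaches alpha.

    Assumption of this sketch: f(z) is renormalised to integrate to one.
    """
    zs, dens, h = density_on_grid(a1, a2, a3, grid, steps)
    total = sum(dens) * h
    cumulative = 0.0
    for z, d in zip(zs, dens):
        cumulative += d * h
        if cumulative / total >= alpha:
            return z
    return 1.0


def needs_expert(probabilities, threshold):
    """True when the gap between the two largest outputs is below the threshold."""
    top, second = sorted(probabilities, reverse=True)[:2]
    return (top - second) < threshold
```

For example, `needs_expert([0.45, 0.40, 0.15], 0.18)` flags the sample for expert feedback because the gap between the two largest outputs is only 0.05. With parameters such as the fitted D(1.59, 0.42, 0.31), two exponents in Eq. 13 are negative, so the integrand has integrable singularities at the ends of the integration range and a plain midpoint sum converges slowly; a finer grid or an adaptive quadrature routine would be a safer choice there, and this sketch is not claimed to reproduce the reported threshold of 0.18.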