Recommender Systems Lecture Notes PDF

Recommender systems
Computing & Information Sciences
W. H. Bell

Motivation
Local shops focus on popular items.
  Limited stock.
  Unlikely to have rarer books.
Online shops.
  Unlimited stock compared with a local shop.
  Niche products are a large part of total sales.
  Need to provide users with recommendations.

Examples
Amazon: recommends products.
Netflix: recommends videos.
  https://www.netflixprize.com/ - $1M prize for a better recommender system algorithm.
YouTube: recommends similar videos.
News applications: recommend similar stories.

Types of recommender system
Content-based filtering.
  Captures a user’s behaviour to build up a user profile.
  For example, a user’s search history, watch history or time spent reading an article.
  The user profile is used to predict preferences and make recommendations.
Collaborative filtering.
  Relies on the behaviour of other users towards the documents or products.
  Other users submit their preferences towards a product, as ratings or Boolean values.
  Recommendations are based on the assumption that people are similar.
Hybrid.
  Uses both content-based and collaborative filtering.

Content-based filtering

Overview
Capturing data.
User profiles.
Item-based comparisons.
User-profile comparisons.
Combining data sources.

Capturing data
Websites record user data:
  Cookies: stored locally on the user’s computer.
  Account information: stored on the website’s server.
Capture the user’s:
  Order history.
  Time reading an article.
  Watch history and time watching a video.
  Browsing history.
  Time and spatial data.

Build user profile
Use captured data to build a user profile.
Collected information has features:
  Type of video watched.
  Description of item purchased.
Use the features to build a user profile.
Use the profile to recommend products or services.
More available data gives better recommendations.
  Reduced statistical and systematic uncertainties.

Item-based comparisons
User has selected an item.
Use the item to select other items.
Implement with the vector-space model.
[Figure: vector-space diagram with axes property1 and property2, showing the query vector Q and the document vectors D1 and D2.]
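
The vector-space matching used for item-based comparisons can be sketched as follows, taking the selected item as the query vector Q and the other items as document vectors D1 and D2. This is a minimal illustration rather than a definitive implementation: the two-property vectors and the names `cosine_similarity`, `query`, `documents` and `ranked` are assumptions made for the example and do not come from the slides.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between two property vectors of equal length."""
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_q * norm_d)


# Hypothetical items described by (property1, property2).
query = [2.0, 1.0]            # Q: the item the user selected
documents = {
    "D1": [1.0, 3.0],         # mostly property2
    "D2": [4.0, 2.0],         # same mix of properties as Q, larger magnitude
}

# Rank the other items from most to least similar to the selected item.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

Cosine similarity compares the direction of the vectors and ignores their length, so in this made-up data D2 ranks above D1 even though its property values are twice those of Q.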
Selected item is the query (Q).
Other items are the document vectors (D1, D2, ...).
Cosine similarity matching.
  Searches for items with similar proportions of each property.

User-profile comparisons
Profile built from many sources.
User has preferences from these items.
Implement with the vector-space model.
User-profile properties form the query (Q).
Find matching items (D1, D2, ...).
Simple (dot product) matching.
  The user profile may have many more parameters than an item.
[Figure: vector-space diagram with axes property1 and property2, showing the query vector Q and the document vectors D1 and D2.]

Combining data sources
Inputs from different sources can be combined.
  For example, social media and sales history data.
Difficult to combine with simple linear methods.
Use deep learning/neural networks.

Deep learning approach: benefits
Capable of modelling non-linear data.
Able to combine inputs from different sources.
Reduces the effort of designing features of the data.
Can include different types of data, such as images and text.
Can model sequential patterns.
  Two or more user actions, considered as a pattern of behaviour.
Flexibility to model changes or requirements.
Can combine content-based and collaborative filtering.

Deep learning approach: limitations
Requires a large data sample.
Difficult to interpret in terms of understanding the system.
Many hyperparameters, which require fine tuning.

Collaborative filtering
The behaviour of many users is recorded.
  Binary information, e.g. watched/not watched.
  Ratings information, e.g. number of stars assigned to a product.
  Continuous information, e.g. reading or viewing time.
Build a matrix of user responses with respect to observables.
Use the matrix to extract general results.

Users and behaviour: Boolean
Example: Boolean decision.
All users have reviewed all products.

          Item 1  Item 2  Item 3  Item 4
User 1      1       0       1       1
User 2      0       0       1       0
User 3      1       1       0       1
User 4      0       1       0       1

Users and behaviour: ratings
Example: Ratings (1-5).
All users have reviewed all products.

          Item 1  Item 2  Item 3  Item 4
User 1      2       3       2       4
User 2      1       2       1       2
User 3      5       3       5       5
User 4      2       3       2       4

Users and behaviour: similarities
Users are similar.
Exaggerated example, but real effect.
Same ratings table as above; for example, Users 1 and 4 give identical ratings.

Users and behaviour: similarities
Items are similar.
Exaggerated example, but real effect.
Same ratings table as above; for example, Items 1 and 3 receive identical ratings from every user.
Items may contain features of other items.
  Item 4 may contain features from Items 1 and 3: for Users 1, 2 and 4 its rating is the sum of their ratings for Items 1 and 3.
  User 3 is unable to score higher than 5, though.

Users and behaviour: sparse matrix
Matrix is normally sparse.
  Users have only reviewed a few items.
Compensate with a large number of users.
  However, statistical uncertainty remains.
Ratings recorded against Item 1, Item 2, Item 3 and Item 4 (blank cells omitted):
User 1: 2, 2, 4
User 2: 1, 1
User 3: 5, 3, 5
User 4: 3, 2, 4

Collaborative filtering approaches
Memory-based (or neighbourhood-based).
  Comparisons between the input user and known users.
  Comparisons between the input item and known items.
Model-based: prediction from a model.
  Matrix factorisation.
  Singular value decomposition (SVD).
  Neural networks.

Memory-based predictions
User-based.
  Compare the input user with other users.
  Use information on the input user to form a comparison with known users.
  Recommend products that the similar users liked.
Item-based.
  Compare the input items with known item combinations of other users.
  Recommend other items those users liked.

User-based predictions
Implement with the vector-space model.
User-profile properties form the query (Q).
Find matching users (D1, D2, ...).
Simple (dot product) matching.
  Other users may have rated many more items.
[Figure: vector-space diagram with axes property1 and property2, showing the query vector Q and the document vectors D1 and D2.]

Matrix factorisation
Use matrix factorisation.
  Possible since users and items are similar.
  Reduces the space required to store the matrix.
  Allows the prediction of missing values.
  Having factorised, multiply the factors to predict missing values.
Numeric factorisation methods:
  Alternating least squares.
  Gradient descent.
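
Factorising a sparse ratings matrix by gradient descent and then multiplying the factors to predict a missing value can be sketched as below. This is an illustrative sketch under stated assumptions, not the method used in the lecture: the positions of the missing ratings (`None`), the function names `factorise`, `predict` and `rmse`, and the values of `n_factors`, `lr`, `reg`, `epochs` and `seed` are all hypothetical. A small regularisation term `reg` is included as a common safeguard against overfitting.

```python
import math
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root-mean-square error over the known (non-None) ratings."""
    sq = [(r - predict(P, Q, u, i)) ** 2
          for u, row in enumerate(ratings)
          for i, r in enumerate(row) if r is not None]
    return math.sqrt(sum(sq) / len(sq))


def factorise(ratings, n_factors=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Stochastic gradient descent on the squared error of the known ratings.

    Returns P (one row of latent factors per user) and Q (one row per item).
    Q is stored item-by-factor here, i.e. transposed relative to a
    factors-by-items layout.
    """
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(n_items)]
    known = [(u, i, r) for u, row in enumerate(ratings)
             for i, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        for u, i, r in known:
            err = r - predict(P, Q, u, i)
            for k in range(n_factors):
                p_uk, q_ik = P[u][k], Q[i][k]
                # Step along the negative gradient; reg shrinks the factors.
                P[u][k] += lr * (err * q_ik - reg * p_uk)
                Q[i][k] += lr * (err * p_uk - reg * q_ik)
    return P, Q


# Ratings example with some entries hidden (hypothetical placement of gaps).
ratings = [
    [2, None, 2, 4],
    [1, 2, 1, None],
    [5, 3, None, 5],
    [None, 3, 2, 4],
]

P0, Q0 = factorise(ratings, epochs=0)   # random starting factors only
P, Q = factorise(ratings)               # factors after training
print("RMSE before:", rmse(ratings, P0, Q0), "after:", rmse(ratings, P, Q))
print("Predicted rating of User 1 for Item 2:", predict(P, Q, 0, 1))
```

For brevity the error here is measured on the same ratings used for fitting; a fair evaluation would hold back a separate test set of ratings.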
Introduce two or more latent factors.
  May correspond to real features of a user or an item.

Matrix factorisation
$U = \{u_1, \dots, u_M\}$: set of users, where $M$ is the number of users.
$I = \{i_1, \dots, i_N\}$: set of items, where $N$ is the number of items.
$r \in X$: each user expresses their rating ($r$), where $X$ is the set of rating values.
The sparsely populated matrix $R$ has entries $r_{ui}$.
$R$ (users by items) is approximated by the product of the user feature matrix $P$ (users by latent factors) and the item feature matrix $Q$ (latent factors by items). With two latent factors and example values:

$$
\underbrace{\begin{pmatrix} r_{ui} & \cdots \\ \vdots & \ddots \end{pmatrix}}_{R}
\approx
\underbrace{\begin{pmatrix} 0.2 & 0.1 \\ 0.3 & 0.2 \\ 0.4 & 0.3 \end{pmatrix}}_{P}
\cdot
\underbrace{\begin{pmatrix} 0.2 & 0.3 & 0.2 & \cdots \\ 0.2 & 0.4 & 0.1 & \cdots \end{pmatrix}}_{Q}
$$

Matrix factorisation
Split the data within the matrix $R$ into two:
  Training ($S_{Tr}$) and test ($S_{Te}$) sets, which do not overlap ($S_{Tr} \cap S_{Te} = \emptyset$).
Evaluate the root-mean-square error (RMSE):

$$
\mathrm{RMSE} = \sqrt{\frac{1}{N_{Te}} \sum_{(u,i) \in S_{Te}} \left( r_{ui} - \hat{r}_{ui} \right)^2}
$$

  $\hat{r}_{ui}$: prediction of $r_{ui}$, obtained from the training set $S_{Tr}$.
  $N_{Te}$: number of ratings in the test set $S_{Te}$.
Minimise the error, using its derivative with respect to the elements of $P$ and $Q$.
Fit becomes better with more latent factors.
  Too many latent factors can make the model too flexible.
  Add a regularisation term to the error (and hence to its derivative) to reduce this effect.

Cold-start problem
Sufficient data do not exist:
  New products are added.
  New users are added.
  New community: a completely new recommender system.

Cold start: New products or services
New products or services are added.
  No users have reviewed these items.
  Collaborative filtering data are not available.
A reduced list of item properties may be available.
  The full property list is extracted from user interactions.
Content-based filtering is possible.
  Matching may have reduced efficiency.
Increase the score of new products to compensate.
  Random selection and then re-weight the result.

Cold start: New users
New users pose a similar issue.
Content-based filtering may not work.
  May be able to extract data from interaction with other websites.
  Acquire preference data from account settings: age, gender, preferences.
No data for collaborative filtering.

Netflix recommender system
Carlos A. Gomez-Uribe and Neil Hunt. 2015.
“The Netflix recommender system: Algorithms, business value, and innovation”. ACM Trans. Manage. Inf. Syst. 6, 4, Article 13 (December 2015). DOI: http://dx.doi.org/10.1145/2843948

Motivation
Users are quickly overwhelmed by choices.
  Find it difficult to choose between many options.
  Choose none of those offered, or make a bad choice.
Internet feed provides more choice, making this problem worse.
The average Netflix user concentrates on choosing a video for 60 to 90 seconds.
Recommender system has two screens to provide choices.
  Users have enough concentration to review two screens of options.
Have moved away from predicting high star ratings.
Now use a range of algorithms that work together.

Netflix homepage layout
Taken from: http://dx.doi.org/10.1145/2843948
Matrix layout that provides many types of recommendation.
40 to 75 rows of options, depending on device.

Algorithms
Personalised video ranker (PVR).
  Blends personal information with general popularity.
Top-N video ranker.
  Best personalised recommendations from the catalogue.
  Unlike PVR, which only works on a subset of the catalogue.
Trending Now.
  Uses short-term temporal trends, from minutes to days of data.
  Picks up effects such as Valentine’s Day or storms.

Algorithms
Continue Watching.
  Picks up episodes of video watching.
  Estimates if the viewer is likely to continue watching.
  Uses the time elapsed since viewing and the point at which the viewer stopped the video (beginning, middle, end).
Video-Video Similarity.
  The similarity algorithm is not personalised.
  Rows of similar videos are arranged in a personalised manner.

Page generation (user interface)
Uses the output of all algorithms.
Considers recommendations and diversity.
Fully personalised.
  May use many or few video-video similarity rows.
Search interface.
  80% of videos watched are based on recommendations.
  20% of videos watched are found via the search feature.
  Search history is used for recommendations too.
