Million Song Dataset Recommendation Project Report


Yi Li, Rudhir Gupta, Yoshiyuki Nagasaki, Tianhe Zhang
Cornell University
tz249@cornell.edu

Abstract—For our project, we decided to experiment with, design, and implement a song recommendation system. We used the Million Song Dataset [1] (MSDS) to find correlations between users and between songs, in order to recommend songs that users would prefer to listen to. In this paper, we discuss the problems we faced, the methods we considered in implementing this system, and the results and analysis of our findings. We also compare our findings with other experiments done in the field. From our findings, we also recognize numerous ways to improve the results of our recommendation system; we believe we have found a novel way to recommend songs that a user might enjoy.

I. INTRODUCTION

With the advent of online services such as Pandora and iTunes, the music industry has shifted away from the traditional distribution model of selling physical copies of music to a cloud-based model that provides music for users to listen to. In this model, value is derived when these services present songs that the customer is interested in; either the customer purchases a subscription, or the customer pays for the song. In both cases, these music services can derive financial gain by improving their recommendations of the songs potential customers would like. Thus, there is a strong financial incentive to implement a good song recommendation system.

There are many interesting problems to tackle when building a song recommendation system. One issue that makes recommending songs difficult is that there is no straightforward similarity measure between songs; two songs that sound similar may be very different songs with completely different audiences. Thus, we needed to figure out how to gain useful information that could be used to recommend other songs.
Another issue is identifying useful features of the songs; determining which features are best for a learning algorithm is important, since good feature selection could improve recommendations. Lastly, we decided to use the MSDS, as well as data from participating partners, to train and test our recommendation system, and the scope of this data presented issues. The MSDS alone contained more than 280 gigabytes of data, which was a challenge to process and use efficiently.

To address the problem of effectively measuring song similarity and recommendation, we used song metadata from the MSDS and user play-history data from EchoNest. After experimenting with several baselines, such as K-Nearest Neighbors (KNN) and matrix factorization, we created a program that utilized the user listening data and song metadata to suggest new songs that a user would most likely prefer. The metadata that we deemed to have the greatest value in this metric were used as features. Lastly, we had several ways to address the large amount of data. The simplest was to run our training and testing on a subset of the data. We also decided to filter our data in order to preserve the useful information while ignoring information with little value.

Our basic approach is to use a combination of clustering, song-based similarity metrics, and user-based similarity metrics to recommend songs. When recommending to a user in the test group, our algorithm produces three different sets of recommendations from these three different metrics. From these recommendations, as well as properties of the test user, we choose the songs with the highest likelihood of being songs that the test user prefers.

Papers on related work in this field discuss how a specific implementation will improve the precision of recommendations. For example, the winner of the Kaggle MSDS competition provided a memory-based collaborative filtering method.
Also, many other papers discussed different song-similarity approaches, such as using acoustic data. These publications focused on deriving methods that provide better precision. We surveyed some of these algorithms in our work and used them as baselines and in designing our final solution.

From our experimentation, we have derived several conclusions. First, we discovered that the precision of our results tended to be low, which we attribute mainly to the lack of useful features and to the fact that our program only recommends a constant number of songs for a given user. (The user listening data we used is the Echo Nest Taste Profile subset, the official user data collection for the MSDS.)

Second, we found that certain methods that we surveyed were more useful than others when the properties of the users in the test set were considered (i.e., the number of songs in a user's listening history). Lastly, we believe our method for calculating song similarity is a novel solution to the song recommendation problem, as it optimizes among numerous approaches to predict the best songs for users.

II. PROBLEM DEFINITION

The central issue is the recommendation of songs to a user. For a given user, we have their song history and the play count for each song. From this, we want to produce a set of ten recommendations for the user. We then try to answer the question, "How do we use the song history of a user to provide recommendations that they will like?" Generally, this is done by looking at the songs that are most similar to the user's songs, as well as the users who are most similar to the user according to their listening histories.

In order to discern whether songs are similar to those in a query user's listening history, our methods use the metadata collected from a song to weigh its similarity with another song. From this, we can choose the songs with the highest weights as the recommended songs. However, given that there is a lot of metadata about each song, a significant issue was deciding which features should be used to calculate song similarity. For this project, we focused on the features that we deemed most likely to distinguish similar songs from different songs.

We also determine the best songs for a given user, in part, based on the song histories of similar users. However, this raises another problem: how do we measure similarity among users? Our approach measured closeness among users by taking the cosine similarity between the two users' song histories. This is a major component in some of our algorithms.

Another problem that we considered is the cold-start and near-cold-start problem.
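The cosine similarity between two users' play-count histories, as described above, can be sketched as follows. The dictionary representation and function name here are our own illustration, not code from the project:

```python
from math import sqrt

def cosine_similarity(hist_u, hist_v):
    """Cosine similarity between two users' play-count histories,
    each a dict mapping song id -> play count."""
    dot = sum(count * hist_v[song]
              for song, count in hist_u.items() if song in hist_v)
    norm_u = sqrt(sum(c * c for c in hist_u.values()))
    norm_v = sqrt(sum(c * c for c in hist_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with an empty history matches no one
    return dot / (norm_u * norm_v)

# Identical histories score 1; disjoint histories score 0.
u = {"song_a": 3, "song_b": 1}
v = {"song_a": 3, "song_b": 1}
w = {"song_c": 5}
```

Using raw play counts as the vector weights means that two users who both listen heavily to the same songs score higher than users who merely overlap.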
The cold-start problem, in terms of song recommendations, refers to the situation in which the test user has no song history; since all our metrics use this history for song prediction, its absence is a difficult problem for our algorithms. It can be addressed by simply recommending a subset of the most popular songs. This scenario also brings up another interesting situation, the near-cold-start problem: if the user has a very small song history, our algorithms may not return an accurate set of recommended songs. To address these issues, and to try to achieve the best possible precision under any circumstance, we ran several experiments with our algorithms and devised a multi-faceted prediction algorithm.

Lastly, we have the issue of dealing with large quantities of data. The MSDS and accompanying datasets offer over 280 gigabytes of information about songs, their metadata, and user play counts. How do we deal with all the data? Often, we simply use a random sample of the data for training, validation, and testing. For example, we restrict our user song-history set to the top one thousand songs appearing in the song histories. We also prune our data so that we focus on the information that maximizes the value-to-performance tradeoff. Furthermore, we developed multithreaded, multiprocessor, and map-reduce implementations of certain methods to accelerate processing.

III. DATA ANALYSIS

The main source of data for this project was the Million Song Dataset (MSDS), released by Columbia University's Laboratory for the Recognition and Organization of Speech and Audio. It contains metadata and audio features for one million popular and contemporary songs, varying in genre, time period, and country of origin. Supplementary data provided by other groups and matched to songs in the MSDS was also used. This includes lyrics provided by musiXmatch, user-defined tags provided by Last.fm, and user listening data provided by the EchoNest.
A. User Listening Data

This data comes in the form of the number of times a given user has listened to a given song (its play count). Throughout the project, we modeled user preferences by their played-song history, so the algorithms we implemented that use this per-user play-count data learn and are evaluated based on this assumption.

B. Song Metadata

Among the large amounts of song metadata, we decided to focus on the features that we deemed most relevant in determining similarity between songs, as well as those that we thought were interesting candidates for enhancing this metric. We decided that information like artist name, year composed, and some audio-signal measurements like tempo would be most relevant in characterizing a song, whereas data such as artist location would not matter significantly. We used the lyrics data as features for those songs whose lyrics were provided. To respect copyright privileges, lyrics data for only about 23% of the MSDS songs were released, in a bag-of-(stemmed)-words format. Ultimately, we decided to include this data because we believed that it could exploit the semantic relationship among like-themed songs.
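As a concrete sketch, the per-user listening data described above can be grouped from (user, song, play count) records into per-user histories; the triplet form and identifiers below are illustrative, not the project's actual loader:

```python
from collections import defaultdict

def build_histories(triplets):
    """Group (user_id, song_id, play_count) triplets into per-user
    play-count histories: user_id -> {song_id: play_count}."""
    histories = defaultdict(dict)
    for user_id, song_id, play_count in triplets:
        histories[user_id][song_id] = play_count
    return dict(histories)

# Hypothetical sample records in the triplet format.
sample = [
    ("u1", "s1", 3),
    ("u1", "s2", 1),
    ("u2", "s1", 7),
]
```

This per-user dictionary is the representation the similarity and clustering methods below would operate on.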

IV. METHODS AND ALGORITHMS

For all of our methods, we created sets of training, validation, and test data. For training data, we used unmodified user histories and song metadata. For validation and test data, we took the user histories and split them into two halves. The first half was used to test our algorithms, while the second half was used to evaluate our results. If a song recommendation for a user, predicted based on the history for that user in the first half, existed in the second set of data, then that recommendation was considered relevant. This is the same method used to create training, validation, and test data in [1].

A. Baseline Algorithms

For this project, we devised a few baseline algorithms against which to compare our results. These methods provided a solid basis for evaluating the effectiveness of our later work. The first and simplest baseline method we implemented was a straightforward popularity-based algorithm that simply recommends the most popular songs, as outlined in [1].

1) K-Nearest Neighbors (KNN): We also used the k-Nearest Neighbors algorithm on the user history data. For each test user, we found the set of users who were closest, using each song as a feature weighted according to the play count for that song. From these results, we recommended songs (that the test user had not listened to) from the listening histories of the closest users. Closeness between users is calculated by the minimum cosine distances between users' song histories.

2) Matrix factorization: Singular Value Decomposition (SVD) is among the most successful recommender-system algorithms. In its simplest form, SVD decomposes the user-song play-count matrix M into a latent feature space that relates users to songs (and vice versa). This is usually achieved by factoring M according to

M ≈ U^T V,

where U ∈ R^(k×m) is a low-rank representation of the m users, and V ∈ R^(k×n) represents the n items.
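A minimal sketch of this factorization's shapes, using randomly initialized factors (in practice U and V would be learned from the play-count matrix; the function name and sizes here are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
k, m, n = 4, 10, 15          # latent topics, users, songs

U = rng.normal(size=(k, m))  # U[:, u] is user u's k-dimensional vector
V = rng.normal(size=(k, i))  if False else rng.normal(size=(k, n))  # V[:, i] is song i's vector

M_hat = U.T @ V              # predicted user-song matrix, shape (m, n)

def recommend(u, top_n=10):
    """Rank songs for user u by the predicted feedback U_u^T V_i."""
    scores = U[:, u] @ V                     # one score per song
    return np.argsort(scores)[::-1][:top_n]  # highest scores first
```

Reading off a user's row of M_hat (equivalently, the dot product of the user's latent vector with each song's) is what turns the factorization into a ranking.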
Once the representations U and V have been constructed, personalized recommendations are calculated for a user u by ranking each item i in descending order of the predicted feedback

w_i = U_u^T V_i.

The interpretation is that the users and songs can each be represented by k-dimensional vectors, where each dimension corresponds to one of k topics (rock music, artist name, etc.). Each row of U represents user u's degree of interest in each topic. Each row of V represents the relevance of song v to each topic.

The computation of the SVD in our case is non-trivial due to the large number of missing values in M; the exact decomposition is possible only if M is complete. Simon Funk has described a method to estimate the SVD using only the known weights (play counts) [2]. The idea is to view the decomposition as a regression problem of finding the parameters of the vectors u and v so as to minimize the error between the known song weights and the predictions:

Err(w) = argmin Σ_{u∈U, s∈S} (w_{u,s} − ŵ_{u,s})² + R,

where ŵ_{u,s} is the predicted value. As only a few songs per user are known, an L2-norm regularizer is added to prevent overfitting:

R = γ(‖u‖² + ‖s‖²).

We then use adaptive stochastic gradient descent to minimize the above objective, training the user and song vectors one feature at a time. The update equations for user- and song-vector feature i are

u_i ← u_i + λ((w_{u,v} − u′ · v′) v_i − γ u_i)
v_i ← v_i + λ((w_{u,v} − u′ · v′) u_i − γ v_i),

where u′ and v′ represent the current feature vectors for user u and song v, γ is the regularization parameter, and λ is the learning rate. For our experiments, we varied the number of latent factors from k = 1 to 20, and varied γ ∈ {10⁻³, 10⁻⁴} with λ_init = 10⁻².

B. Main Algorithms

1) K-Means clustering: One of the approaches we used was K-Means clustering. For all the users in the training data, we found clusters of users based on their song histories. From the clusters, we predicted for each user which cluster was the best fit according to proximity to its centroid.
The most highly weighted songs in each cluster were selected as the songs to recommend for that particular user. Although this algorithm only approximates the accuracy of KNN, it is much faster and easier to tweak.

2) User-Based Recommendations: Generally, given a new user for whom we want to obtain predictions, the set of items (in this case, songs) to suggest is computed by looking at similar users. This strategy is typically referred to as user-based recommendation. In user-based recommendation, the scoring function, on the basis of which recommendations are made, is computed as

sim(u, v) = (u · v) / (‖u‖ ‖v‖),

which is just the cosine similarity between two users u and v. One common way of computing this cosine similarity is from the number of common items the users share. Thus,

sim(u, v) = #common_items(u, v) / (#items(u)^(1/2) · #items(v)^(1/2)).

The strategy relies on the fact that each user belongs to a larger group of similarly behaving individuals. Consequently, items frequently purchased (listened to) by the various members of the group can be used to form the basis of the recommended items. The overall procedure is as follows: we find the weight of each item I_i in the system for the new user u by finding the set of users V in the training set who have listened to item I_i, and then summing their similarities to u. Thus, the weight w_{I_i} for song I_i is

w_{I_i} = Σ_{v∈V} sim(u, v).

Finally, all the items are sorted by their weights and the top N are recommended to the user.

The cosine similarity weighs each of the users equally, which is usually not appropriate. In fact, a user should be weighted less if he has shown interest in a wide variety of items, on the grounds that he does not discern between songs based on their quality, or simply likes to explore; likewise, a user should be weighted more if he listens to a very limited set of songs. Thus, we re-modeled the similarity measure by parametrizing it with this weighting rule:

sim(u, v) = #common_items(u, v) / (#items(u)^α · #items(v)^(1−α)),  α ∈ [0, 1).

We also added another parameter, γ, to sharpen this normalization. The effect of this exponentiation is that when γ is high, smaller weights drop to zero while larger weights are (relatively) emphasized. Thus, our item-weight equation is now

w_{I_i} = Σ_{v∈V} (sim(u, v))^γ.

3) Item-Based Recommendations: User-based recommendations identify a neighborhood of users that, in the past, have exhibited similar behavior. An alternative approach is to build recommendation models that are based on items.
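The parameterized user-based scoring just described can be sketched as follows, treating each history as a set of songs; the function names and default parameter values are our own illustration:

```python
def sim(u_items, v_items, alpha=0.5):
    """Asymmetric set similarity: common items over
    |u|^alpha * |v|^(1-alpha). alpha = 0.5 recovers the
    plain cosine similarity on binary item sets."""
    common = len(u_items & v_items)
    if common == 0:
        return 0.0
    return common / (len(u_items) ** alpha * len(v_items) ** (1 - alpha))

def recommend_user_based(u_items, train_histories, gamma=3.0):
    """Weight each unseen song by the sharpened similarity of the
    training users who listened to it, then rank by weight."""
    weights = {}
    for v_items in train_histories:
        s = sim(u_items, v_items) ** gamma  # sharpen: de-emphasize weak neighbors
        if s == 0.0:
            continue
        for song in v_items - u_items:      # only recommend unseen songs
            weights[song] = weights.get(song, 0.0) + s
    return sorted(weights, key=weights.get, reverse=True)

train = [{"a", "b", "c"}, {"x", "y"}]
```

Raising the similarity to the power γ before summing is what makes a few strong neighbors dominate many weak ones.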
The key motivation behind this scheme is that a user is more likely to purchase (or listen to) items that are similar to items he has already purchased; thus, in this approach, the historical information is analyzed to identify relationships between the items, such that the purchase of one item often leads to the purchase of another. The algorithm first determines the similarities between the various items, then uses them to identify the items to be recommended. The key steps in this approach are (i) the method used to compute the similarity between items, and (ii) the method used to combine these similarities and provide recommendations.

During the model-building phase, for each item i, the k most similar items {i₁, i₂, …, i_k} are computed and their corresponding similarities are stored. Now, for each user who has listened to a set U of items, we first identify the set C of candidate items for recommendation by taking the union of the k most similar items of each item already in U. Then, for each item c ∈ C, we compute its similarity to the set U as the sum of the similarities between all items u ∈ U and c, using only the k most similar items of u. Finally, as in user-based recommendation, the candidate items are sorted by weight and the top N are recommended:

w_c = Σ_{u∈U} sim(u, c).

Item similarity: We used two approaches to calculate the item-item similarity used in the item-based recommendations.

(i) Without metadata: We used a method similar to the user-user similarity, finding the common users that are shared by items i and j. Thus, the similarity function is

sim(i, j) = #common_users(i, j) / (#users(i)^(1/2) · #users(j)^(1/2)).

(ii) With metadata: The second approach involves using the song metadata to compute a cosine-similarity score between two songs. The features used for each song in this calculation are artist name, tempo, loudness, energy, year the song was composed, the five most common lyrics in the song, and the five most heavily weighted user tags for the song.
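A sketch of the metadata-based cosine similarity over these features; the metadata field names (artist_name, top_lyrics, etc.) are hypothetical, chosen only to illustrate the feature construction:

```python
from math import sqrt

def song_features(meta):
    """Turn one song's metadata dict into a sparse feature vector.
    Artist, year, lyric words, and tags become weight-1 features;
    tempo, loudness, and energy keep their numeric values."""
    feats = {("artist", meta["artist_name"]): 1.0,
             ("year", meta["year"]): 1.0}
    for word in meta["top_lyrics"]:
        feats[("lyric", word)] = 1.0
    for tag in meta["top_tags"]:
        feats[("tag", tag)] = 1.0
    for field in ("tempo", "loudness", "energy"):
        feats[(field, None)] = float(meta[field])
    return feats

def metadata_similarity(meta_i, meta_j):
    """Cosine similarity between two songs' metadata feature vectors."""
    fi, fj = song_features(meta_i), song_features(meta_j)
    dot = sum(v * fj[key] for key, v in fi.items() if key in fj)
    ni = sqrt(sum(v * v for v in fi.values()))
    nj = sqrt(sum(v * v for v in fj.values()))
    return dot / (ni * nj) if ni and nj else 0.0

meta = {"artist_name": "A", "year": 2000, "top_lyrics": ["love"],
        "top_tags": ["rock"], "tempo": 120.0, "loudness": -5.0,
        "energy": 0.8}
```

Sharing an artist, a lyric word, or a tag each contributes equally to the dot product, while the audio measurements contribute in proportion to their magnitudes.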
All the features except tempo, loudness, and energy were treated as binary features. Lyrics were included as features in an attempt to take advantage of any possible semantic similarity between songs. For example, if a user enjoys love songs, then this approach could exploit the fact that love songs often share love-themed lyrics.

4) Multi-faceted Prediction Algorithm: The multi-faceted prediction algorithm uses multiple algorithms to recommend songs for the user more precisely. We do this because we observed that different methods work better under different circumstances. By combining the different methods and weighing a specific method's prediction more heavily based on the size of the test user's listening history, we can maximize our song recommendation system's precision.

The multi-faceted prediction algorithm uses the trends observed in Figure 7 and Figure 8. For different sizes of the test user's song history, different methods of prediction give optimal recommendations. Specifically, when the test user's song history is small, predictions based on song-similarity metrics tend to perform better; when the test user's song history is large, predictions based on similar-user metrics tend to perform better. This has a direct application to the cold-start or near-cold-start situation, in which predictions based on a test user's song history tend to be noisy and inaccurate, since the similar users do not accurately represent the test user. In these situations, making recommendations based on song similarity is better. Thus, the multi-faceted prediction algorithm aims to use this phenomenon to make the best predictions. Using our validation sets, we tuned the weights to produce the optimal set of predictions.

V. RESULTS

A. Precision Metric

The metric we used to determine the qual

