Generating a Spotify Playlist

Over the past decade the music industry has undergone its biggest structural change since the arrival of the mp3 and Apple's launch of individual track purchases in 2003. Driving this change has been the Swedish company Spotify, launched in 2008 and today serving 50 million songs to 248 million monthly users. Streaming technology lets Spotify deliver a massive song library digitally, so that users can listen to any song they want without having to buy individual albums or songs.
As the average listener has gained access to far more music, demand has risen for recommendation algorithms: ways of cutting through the noise and delivering just the right song at the right time. Many companies, such as SoundCloud, 8tracks, Stereomood and LastFM, have developed capable algorithms of their own, but Spotify is still pushing the category forward. Its Discover Weekly playlist recommends music each user has not yet heard but might like, based on their listening preferences. Its Year In Review has become an annual cultural moment, summarizing each user's listening history and top songs of the past year. To stay ahead, Spotify has developed an open API that developers can use to access musical features of the songs in its library.
Our contribution to the music recommendation field is an algorithm that generates a complete thematic playlist from a cold start of only a few seed songs. We leverage Spotify's open API and a starting dataset of 1,000,000 playlists, from which we learn families of similar songs, and we use these families to produce exciting new playlists for our users. This site explains the data we started with, how we enriched it, and how we identified important musical features and grouped songs into families. It also provides a generation algorithm that readers can use to build their own playlists.
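To make the idea of generating a playlist from seed songs and song families concrete, here is a minimal sketch. It is not the authors' implementation: the function name `generate_playlist`, the toy feature values, the family labels, and the choice to rank candidates by Euclidean distance to the seed centroid are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical toy data: two audio features per track (for example,
# energy and danceability) and a precomputed family (cluster) label.
FEATURES = {
    "a": (0.90, 0.80), "b": (0.80, 0.90), "c": (0.85, 0.70),
    "d": (0.60, 0.95), "e": (0.10, 0.20), "f": (0.20, 0.10),
}
FAMILY = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}


def generate_playlist(seeds, features, family, n=10):
    """Return up to n non-seed tracks from the seeds' dominant family,
    ordered by distance to the centroid of the seed songs."""
    # Family that appears most often among the seed songs.
    top_family = Counter(family[s] for s in seeds).most_common(1)[0][0]
    # Mean feature vector of the seeds.
    dims = len(features[seeds[0]])
    centroid = [sum(features[s][d] for s in seeds) / len(seeds)
                for d in range(dims)]
    # Candidates share the dominant family and are not seeds themselves.
    candidates = [t for t in features
                  if family[t] == top_family and t not in seeds]
    return sorted(candidates, key=lambda t: dist(features[t], centroid))[:n]


playlist = generate_playlist(["a", "b"], FEATURES, FAMILY, n=2)
```

With the toy data above, seeds `a` and `b` both fall in family 0, so only tracks `c` and `d` are candidates, and `c` ranks first because it lies closer to the seed centroid. In a real setting the features would come from the Spotify API and the family labels from a clustering step.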
We conclude with a gift from the authors to thank you for the time spent reviewing our work: a playlist generated by our recommendation algorithm from seed songs that are all favorites of the authors. Please enjoy the music while you read, and thank you for your interest in our work!
Harvard University AC 209a: Introduction to Data Science (Fall 2019). Group 24: Hardik Gupta, Johannes Kolberg, and Will Seaton.