How Spotify Uses Machine Learning Models to Recommend You The Music You Like

In the early 2000s, Songza implemented a manual music recommendation system for its listeners, where a team of music experts and curators would create playlists. But these recommendations were not objective, as they were dependent on the personal taste of the curators.

It was an average experience for listeners, with a fair share of hits and misses, because it was impossible to make a playlist which catered to the varied tastes of a diverse set of people. The technology and the data did not exist back then to build a playlist that would be personalised to the taste of each individual listener.

Along came Spotify a few years later, offering a highly personalised weekly playlist called Discover Weekly that quickly became one of their flagship offerings.

Every Monday, millions of listeners receive a fresh playlist of new song recommendations, customised to their personal tastes based on their listening history and the songs they’ve engaged with. Spotify uses a combination of different data aggregation and sorting methods to create their unique and powerful recommendation model that’s powered by machine learning.

One of our flagship features is called Discover Weekly. Every Monday, we give you a list of 50 tracks that you haven’t heard before that we think you’re going to like. The ML engine that’s the main basis of it, and it’s advanced some since, had actually been around at Spotify a bit before Discover Weekly was there, just powering our Discover page” – David Murgatroyd, Machine Learning Leader at Spotify.

Spotify uses three forms of recommendation models to power Discover Weekly.

1. Collaborative Filtering

Collaborative Filtering is a popular technique used by recommender systems to make automated predictions about the preferences of users, based on the preference of other similar users.

On Spotify, the collaborative filtering algorithm compares multiple user-created playlists that have the songs that users have listened to. The algorithm then combs those playlists to look at other songs that appear in the playlists and recommends those songs.

This framework executed by matrix math in Python libraries. The algorithm first creates a matrix of all the active users and songs. The Python library then runs a series of complex factorisation formulae on the matrix. The end result is two separate vectors, where X is the user vector representing the taste of an individual user. Vector Y represents the profile of a single song. To find out users with similar taste, collaborative filtering will compare a given user vector with each and every single user vectors to give a similar user vector as the output. The same procedure is applied to the song vectors.

Spotify does not only rely on collaborative filtering. The second recommendation model used is NLP.

2. Natural Language Processing

NLP is the ability of an algorithm to understand speech and text in real-time. Spotify’s NLP constantly trawls the web to find articles, blog posts, or any other text about music, to come up with a profile for each song.

With all this scraped data, the NLP algorithm can classify songs based on the kind of language used to describe them and can match them with other songs that are discussed in the same vein. Artists and songs are assigned to classifying keywords based on the data, and each term has a certain weight assigned to them. Similar to collaborative filtering, a vector representation of the song is created, and that’s used to suggest similar songs.

3. Convolutional Neural Networks

Convolutional Neural Networks are used to hone the recommendation system and to increase accuracy because less-popular songs might be neglected by the other models. The CNN model ensures that obscure and new songs are considered.

The CNN model is most popularly used for facial recognition, and Spotify has configured the same model for audio files. Each song is converted into a raw audio file as a waveform. These waveforms are processed by the CNN and is assigned key parameters such as beats per minute, loudness, major/minor key and so on. Spotify then tries to match similar songs that have the same parameters as the songs their listeners like listening to.

With these key machine learning models, Spotify is able to tailor a unique playlist of music that surprises its listeners every week with songs they would have never found otherwise.

A key problem in many machine learning models is the lack of access to clean, structured data that can be processed. Spotify has been able to circumvent that problem due to their access to massive amounts of data that they collect from their users. They’ve been able to shine as a great example of effective use of Machine Learning models to give their users an unrivalled personalised experience.


Original post:

Leave a Reply

Your email address will not be published. Required fields are marked *