USF Students Develop Algorithm to Predict Popular Music Success

American rapper and country artist Lil Nas X made history when the remix of his song “Old Town Road” became Billboard’s longest-running No. 1 single ever. “Old Town Road” brought him seventeen weeks of success, millions of streams, and multiple remixes. Could this popularity have been predicted? USF students and researchers Kai Middlebrook and Kian Sheik developed an algorithm to answer that question.

The algorithm allows computers to “listen” to songs and predict, based on numerous factors and criteria, whether or not they will become hits. Middlebrook and Sheik’s study, which reports the algorithm’s findings, was pre-published on arXiv, a repository for electronic research papers that have yet to be peer reviewed. It was quickly picked up and shared by several media outlets, including TechXplore and Popular Mechanics.

Middlebrook, a senior data science major and avid music enthusiast, came up with the idea to create this algorithm after working in USF’s Machine Learning Artificial and Gaming Intelligence and Computing at Scale (MAGICS) lab.  

Computer science Professor David Guy Brizan and data science Professor Paul Intrevado run MAGICS, which the MAGICS Lab website describes as a hands-on lab for students working on technological advances outside of the classroom. According to Brizan, he first conducted research with Middlebrook to create a program that would allow a computer to detect a song’s genre, essentially serving as deciphering ears.

Middlebrook then partnered with Sheik, a recent USF data science graduate, to develop the program even further.

Having worked together on a project in a data mining class, Middlebrook said he felt that teaming up again with Sheik on his endeavor was a no-brainer. While filtering through copious amounts of music, Sheik and Middlebrook discussed why all hit songs sounded similar to them, and what those similarities were. Inspired by that conversation, the two data scientists got to work.

Middlebrook and Sheik then began creating the algorithm by using Spotify’s Application Programming Interface (API), which provides developers with public access to its music and data. It took them about a month to complete their research and develop a program that could be tested.  

“Our results were basically four models that could take in data [a song], give a prediction of whether the song would be a hit or not, then each model was tested for accuracy,” Middlebrook said. The four models were a neural network, a logistic regression, a support vector machine (SVM), and a random forest (RF). Each evaluated songs on criteria such as tempo, volume, and danceability, and used those features to classify each song as a hit or a non-hit. Every song was run through all four models, and the predictions were then compared to measure how accurately each model separated hits from non-hits.
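To make the idea of a hit/non-hit classifier concrete, here is a minimal sketch of one of the four model types mentioned above, a logistic regression, written in plain Python. It is not Middlebrook and Sheik's code: the songs are randomly generated, the three features are assumed to be scaled to the range 0 to 1, and the rule that labels a synthetic song a "hit" is invented purely so the model has something to learn. All function names are hypothetical.

```python
import math
import random

# Hypothetical features, loosely based on the criteria named in the article.
# Each value is assumed to be scaled to the range [0, 1].
FEATURES = ("tempo", "loudness", "danceability")


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_songs(n, seed=0):
    """Generate made-up songs. The labeling rule is invented for this demo."""
    rng = random.Random(seed)
    songs, labels = [], []
    for _ in range(n):
        tempo, loudness, dance = (rng.random() for _ in FEATURES)
        songs.append([tempo, loudness, dance])
        # Assumption: danceable, loud songs are "hits"; tempo is irrelevant.
        labels.append(1 if dance + 0.5 * loudness > 0.75 else 0)
    return songs, labels


def predict_proba(weights, bias, song):
    """Estimated probability that a song is a hit."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, song)))


def train(songs, labels, epochs=2000, lr=0.5):
    """Fit logistic regression with batch gradient descent on the log-loss."""
    weights, bias = [0.0] * len(FEATURES), 0.0
    n = len(songs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for song, label in zip(songs, labels):
            error = predict_proba(weights, bias, song) - label
            grad_b += error
            for j, x in enumerate(song):
                grad_w[j] += error * x
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def accuracy(weights, bias, songs, labels):
    """Fraction of songs whose predicted class matches the label."""
    correct = sum(
        (predict_proba(weights, bias, s) >= 0.5) == bool(y)
        for s, y in zip(songs, labels)
    )
    return correct / len(songs)


if __name__ == "__main__":
    songs, labels = make_synthetic_songs(300)
    # Train on the first 200 songs, test on the 100 the model has not seen.
    w, b = train(songs[:200], labels[:200])
    print(f"held-out accuracy: {accuracy(w, b, songs[200:], labels[200:]):.2f}")
```

The final step mirrors the testing Middlebrook describes: the model is fitted on one set of songs and then scored on songs it has never seen. In a real study, the features would come from actual audio data and the labels from chart performance rather than from a made-up rule.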


This technology has real-world applications in the music industry. One advantage for record labels that use it is the ability to cut down the time and energy it takes to sort through the vast amounts of content they receive.

“There is a lot of music that can make it difficult for A&R’s [artists and repertoire representatives] and managers to filter through the noise in the music industry,” Middlebrook said, adding that the program allows more artists to be discovered.

On the other hand, Middlebrook explained that a downside to his research and algorithm is that the popular-music industry could potentially become wrapped up in the process of mass production, meaning that all major hit songs could gradually start sounding similar and repetitive in order to “ensure” that they’ll become hits. This could “stifle creativity for artist[s] while labels try to mass produce a specific sound,” Middlebrook said.

When asked about their future plans for the algorithm, Sheik said that he and Middlebrook “hope to develop more algorithms and advance the field, while maintaining the sanctity of art.” 

Middlebrook added that he and Sheik are in the process of launching a startup, and that their new algorithms will be more useful for music industry businesses. “I cannot say too much other than I’m very excited,” he said.
