Newswise — At a recent IEEE technology conference, UC San Diego electrical engineers presented a solution to their problem with the song "Bohemian Rhapsody," and it's not that they don't like this hit from the band Queen. The electrical engineers' issue with "Bohemian Rhapsody" is that it is too heterogeneous. With its mellow piano, falsetto vocals, rock opera sections and crazy guitar solos, "Bohemian Rhapsody" is so internally varied that the machine learning algorithms at the heart of their experimental music search engine have trouble labeling the song. The solution presented at the 2009 International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in Taiwan could lead to improvements in the electrical engineers' song labeling and search engine system.
The system "listens" to songs it has never heard before, labels them based on the actual sounds in the song, and then retrieves songs, as appropriate, when people type descriptive words, such as "mellow jazz," into the team's experimental search engine.
At ICASSP, UC San Diego electrical engineering Ph.D. student Luke Barrington presented a new model for music segmentation that can capture both the sound of a song and how this sound changes over time. By modeling music in this way, Barrington showed how to automatically segment songs such as "Bohemian Rhapsody" into homogeneous sections such as verses, choruses and bridges. This new approach to training computers to dissect heterogeneous songs into homogeneous segments and then accurately label each chunk could improve the accuracy of the new music search engine built by engineers from the Jacobs School of Engineering at UC San Diego.
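As a rough illustration of what "segmenting a song into homogeneous sections" means, the sketch below splits a sequence of per-frame feature values wherever the sound changes sharply. It is not the dynamic texture model presented at ICASSP; it is a much simpler running-mean change detector, and the function name, the threshold, and the feature values are all assumptions made up for this example.

```python
from statistics import mean

def segment_by_change(features, threshold):
    """Split a 1-D sequence of per-frame feature values into homogeneous runs.

    A new segment starts whenever a frame differs from the running mean of the
    current segment by more than `threshold`. Returns (start, end) index pairs,
    with `end` exclusive.
    """
    if not features:
        return []
    segments = []
    start = 0
    for i in range(1, len(features)):
        if abs(features[i] - mean(features[start:i])) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(features)))
    return segments

# Hypothetical loudness-like values: quiet intro, loud middle, quiet ending.
frames = [0.1, 0.12, 0.11, 0.9, 0.95, 0.92, 0.15, 0.1]
print(segment_by_change(frames, threshold=0.3))  # [(0, 3), (3, 6), (6, 8)]
```

Each resulting chunk could then be labeled on its own, which is the step that a single label for the whole song makes difficult.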
The team's nickname for their experimental music search engine is "Google for music." Users type descriptive words (rather than song titles, album names or artist names) and the search engine returns specific song suggestions. The engine currently works for more than 100 words that cover music genres, emotions and instruments. The Jacobs School engineers are working to expand the search engine's "vocabulary" before opening it up to the public later this year.
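To make the idea of searching by descriptive words concrete, here is a minimal sketch that ranks songs by how strongly their labels match the query words. The article does not describe how the team's engine ranks results, so the scoring rule, the `search` function, and the catalog of tag weights below are hypothetical.

```python
def search(query, song_tags, top_k=3):
    """Rank songs by the summed weight of the query words among their tags."""
    words = query.lower().split()
    scored = []
    for title, tags in song_tags.items():
        score = sum(tags.get(word, 0.0) for word in words)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]

# Hypothetical tag weights in [0, 1], as a labeling system might produce them.
catalog = {
    "Song A": {"mellow": 0.9, "jazz": 0.8},
    "Song B": {"rock": 0.9, "guitar": 0.7},
    "Song C": {"mellow": 0.6, "piano": 0.8},
}
print(search("mellow jazz", catalog))  # ['Song A', 'Song C']
```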
Teaching Computers to Label Songs
In order to "teach" the search engine new words, the engineers need to show it many different examples of songs that fit that description. Initially, the engineers paid UC San Diego undergraduates to manually label songs that would serve as training materials for machine learning algorithms. But instead of continuing to rely on this expensive option, the engineers built online music games that encourage people connected via the Internet to do the song labeling while listening to music online.
In April, the electrical engineers launched their games on Facebook as an application called Herd It (http://apps.facebook.com/herd-it).
Watch a two-minute video about the making of the games here: http://www.jacobsschool.ucsd.edu/news/news_video/play.sfe?id=28
To play Herd It, log in to Facebook, open the Herd It app, select a genre of music, and start listening to song clips and playing the games. Some games ask users to identify instruments, while others focus on music genres, artist names, emotions triggered by the song, and activities you might do while listening to a song. The more your answers align with the rest of the online crowd playing the game at the same time, the more points you score.
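The scoring idea, where agreeing with the crowd earns more points, can be sketched as follows. The actual Herd It scoring formula is not given in the article; this example assumes points are proportional to the share of other players who gave the same answer, and the player names and `max_points` value are invented.

```python
from collections import Counter

def agreement_scores(answers, max_points=100):
    """Give each player points in proportion to how many other players agree.

    `answers` maps a player name to that player's answer for one question.
    """
    counts = Counter(answers.values())
    others = len(answers) - 1
    scores = {}
    for player, answer in answers.items():
        if others == 0:
            scores[player] = 0  # nobody to agree with
        else:
            agreeing = counts[answer] - 1  # exclude the player's own vote
            scores[player] = round(max_points * agreeing / others)
    return scores

# Hypothetical answers to "Which instrument do you hear?"
round_answers = {"ana": "piano", "ben": "piano", "cy": "guitar", "di": "piano"}
print(agreement_scores(round_answers))
# {'ana': 67, 'ben': 67, 'cy': 0, 'di': 67}
```

A rule like this rewards consensus, which is what makes the collected song-word pairs useful as training labels.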
"The Facebook games are a lot of fun and a great way to discover new music. At the same time, the games deliver the data we need to teach our computer audition system to listen to and describe music like humans do," said Gert Lanckriet, the electrical engineering professor and machine learning expert from the Jacobs School of Engineering steering the project. Lanckriet also leads UC San Diego's Computer Audition Laboratory, housed at the UC San Diego division of Calit2.
For the system to "listen and describe music like a human," it must find patterns in the songs using the tools of machine learning. For example, for the system to learn to identify and label romantic songs, it must be exposed to many different romantic songs during the training period.
This exposure enables the machine learning algorithms to find patterns in the waveforms of the songs that make the songs romantic. Once trained, the system can identify romantic songs that it has never before encountered, offering the tantalizing possibility of amassing a huge database of songs that can be tagged and retrieved based on text-based searches with no human intervention.
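The train-then-label loop described above can be illustrated with a toy classifier. This is only a sketch under stated assumptions: it uses a nearest-centroid rule rather than the team's actual models, and the two-number "features" stand in for whatever the real system extracts from audio waveforms.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(examples):
    """Learn one centroid per label from {label: [feature_vector, ...]}."""
    return {label: centroid(vectors) for label, vectors in examples.items()}

def predict(model, vector):
    """Return the label whose centroid is closest to `vector`."""
    return min(model, key=lambda label: math.dist(model[label], vector))

# Hypothetical two-number features (say, tempo and loudness scaled to [0, 1]).
training = {
    "romantic": [[0.2, 0.3], [0.3, 0.2], [0.25, 0.35]],
    "not romantic": [[0.8, 0.9], [0.9, 0.7], [0.7, 0.8]],
}
model = train(training)
print(predict(model, [0.3, 0.3]))  # romantic
```

The song passed to `predict` never appears in the training data, mirroring how the trained system labels songs it has not heard before.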
"The more examples of romantic songs our search engine is exposed to, the more accurately it will be able to identify romantic songs it has never heard before," explained Barrington.
Part of Barrington's Ph.D. dissertation will involve demonstrating that data collected from the Facebook games reliably improves the accuracy of the search engine.
"Once enough people play our new music discovery games on Facebook, I'll have the data I need to both improve our search engine and finish my Ph.D.," said Barrington.
The song-word combinations collected by the Facebook games will also enable the researchers to grow their music search engine's vocabulary and increase its coverage in genres and classes of music.
2009 ICASSP (IEEE International Conference on Acoustics, Speech, and Signal Processing): "Dynamic Texture Models of Music," by Luke Barrington, Antoni Chan and Gert Lanckriet from the Electrical and Computer Engineering Department at UC San Diego's Jacobs School of Engineering. To appear in ICASSP 2009.
"Online Game Feeds Music Search Engine Project," UC San Diego press release from 2007
"A Search Engine with Ears," Jacobs School of Engineering alumni magazine, Pulse
The National Science Foundation (NSF) funded some of the research leading to this publication, as well as some of the students who contributed.
UC San Diego's von Liebig Center provided funding that enabled the researchers to create Herd It's professional interfaces for the music games. The von Liebig Center also provided the engineers with entrepreneurship advisory services and "incubation space."