Shazam Music Identification Service – Case Study
Yahav Biran
CMGT/557
December 23, 2013
Ken Orgill
Table of Contents
Shazam Music Identification Service
History
Shazam – Affected Technologies
Query by Example (QBE)
Acoustic fingerprint
NoSQL Database Management
Machine Learning
Shazam Evolvement through Technology
Company Product Line
User Perspective
Conclusion
References
Table of Figures
Figure 1 Shazam early interface
Figure 2 Basic Fingerprints Transformation
Figure 3 Shazam iTunes User base (Asymco.com, 2013)
Figure 4 Shazam User Experience Evolution
Shazam Music Identification Service
Music plays an active role in people's lives; individuals are regularly exposed to music in various settings: driving, dining, or even swimming. According to Music Reports (2013), 90% of music content is stored in a digital medium, and 60% is offered online through music services such as iTunes, Spotify, and Shazam. The music services offer the music data through content aggregators such as Catapult, CDBaby, Tunecore, and The Orchard. One of the main challenges music content providers face today is maximizing their return on investment. The challenge has two main components. The first is the legal rights to maintain, manage, and distribute the content. The second is storing, searching, discovering, and maintaining the digital media on a cloud-based service. A content monetization strategy is one way to increase the return on investment. This approach includes advanced search capabilities that maximize search efficiency. Moreover, it helps content providers predict which content or other properties to promote.
Music, like any textual content, includes properties, also known as tags, that help search engines classify the searched item. Shazam is a music service provider that identifies music by audio search rather than textual search.
The audio-based identification process relies on an emerging identification technology, the acoustic-similarity method (Wang, 2006). Although the underlying theory has been in use for almost 10 years, it has proved effective only recently. The growing amount of music content has enriched the index on which the acoustic-similarity method is based, and a richer index is what makes the method effective.
This paper reviews the history of Shazam and the technology its service uses. It discusses how Shazam's music identification technology makes it unique among existing music service providers.
History
Music recognition research started in the early 1980s at Broadcast Data Systems, which used a range of audio wave correlation techniques (Miotto & Nicola, 2012).
In 1996, Musclefish developed a new identification method based on multidimensional feature analysis and Euclidean distance metrics (Miotto & Nicola, 2012). According to Miotto and Nicola (2012), the method's limitation to clean audio samples does not fit common scenarios, e.g., being in a car, restaurant, or shopping mall.
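The idea of matching by Euclidean distance over feature vectors can be illustrated with a minimal Python sketch. It is not the Musclefish implementation: the three-element feature vectors, the track names, and the helper names euclidean and nearest are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Return the Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, catalog):
    """Return the catalog key whose feature vector is closest to the query."""
    return min(catalog, key=lambda name: euclidean(query, catalog[name]))


# Hypothetical catalog: each track is reduced to a small feature vector.
catalog = {
    "track_a": (1.0, 0.0, 0.3),
    "track_b": (0.0, 1.0, 0.8),
}

# A slightly perturbed copy of track_a's features still lands nearest to track_a.
print(nearest((0.9, 0.1, 0.3), catalog))
```

Because every feature contributes to the distance, heavy background noise shifts the whole query vector, which is consistent with the limitation to clean samples noted above.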
Music identification services were offered as early as 1999 by StarCD. StarCD was a service that allowed a user who heard a song on the radio to call the service, play the song over the phone, find the music track, and purchase the album containing that track. StarCD sent a request to the content provider about the song being played on the radio station at the time of the transaction. Because it lacked acoustic-similarity methods, the StarCD service was limited to songs played on participating radio stations.
Shazam Entertainment was the first to introduce a query-by-example (QBE) music search service that enables users to search for a music track based on an audio sample, using a mobile phone as the recording device (Wang, 2006). Figure 1 shows the early Shazam interface, the audio sample recording, and the query results.
Figure 1 Shazam early interface
Shazam Entertainment was founded in 2000. Initially it developed the first QBE music recognition service, targeted mainly at basic ("dumb") cellular devices. Because Internet connectivity was not part of the carriers' offerings, and running an application on the device was a complex and expensive task, the initial interface was a voice call, as with the StarCD service. The Shazam service offered several advantages, such as reaching the service through a short dial code rather than a full number. Moreover, after the sampling phase, the server would hang up the call and return the results in a short text message. In 2004, Verizon Wireless was the first to offer the first Motorola RAZR. As a feature phone, it had some Internet and Java capabilities that allowed an application to run in the form of an applet. The significant improvement in the new method was the applet's ability to work locally: the applet performed the extraction operation that produced the signature file. This signature, in the form of a string, was sent to the server, which used it as the main parameter in the query by example. The server could read the signature, search, and return the results (title, artist, album) to the client (the applet).
Having the new data flow mechanism in place did not eliminate the main challenge of the music detection task: noise and distortion. The detection method was changed to combinatorial hashing, which significantly improved the detection quality (Garcia-Hernandez, Feregrino-Uribe & Cumplido, 2013).
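One common reading of combinatorial hashing in the fingerprinting literature is to pair spectral peaks and hash each pair, so that a sample still matches when some peaks are lost or added by noise. The Python sketch below follows that reading under stated assumptions: peaks are already extracted as (time, frequency bin) tuples, and the names pair_hashes, build_index, best_match, fan_out, and max_dt are hypothetical. It is an illustration, not Shazam's production algorithm.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Pair each anchor peak with up to `fan_out` later peaks.

    Yields ((f_anchor, f_target, dt), t_anchor) for every pair.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are even farther
            if dt <= 0:
                continue  # skip peaks in the same time frame as the anchor
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in pair_hashes(peaks):
            index[h].append((track_id, t))
    return index


def best_match(index, sample_peaks):
    """Vote per (track, time offset); return the winning track id or None."""
    votes = Counter()
    for h, t_sample in pair_hashes(sample_peaks):
        for track_id, t_track in index.get(h, ()):
            votes[(track_id, t_track - t_sample)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = votes.most_common(1)[0]
    return track_id
```

Voting on a consistent time offset, rather than on raw hash hits, is the design choice that gives robustness here: a spurious noise peak produces hashes that either match nothing or match at scattered offsets, so they do not accumulate votes.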
While the identification method improved significantly with combinatorial hashing, the service hit another difficulty, this time on the service side. The growing demand and content provided by Shazam in 2004 created a capacity issue in managing the growing number of titles. The database management issue of audio “fingerprints”