Writeup
Our shared interest in classical music motivated us to pursue this visualization project. We had previously noted the lack of a visual, interactive web application that lets users easily see where, when, and what the top classical artists will perform. We were also interested in the influence record labels exert over given geographical regions, as well as in characterizing the number of top-level performances in different regions of the world. Essentially the question became: what world regions or cities provide the most opportunities to see great performances of classical music?
Our data came entirely from record label websites. We chose the two biggest names in the classical recording industry, Deutsche Grammophon (DG) and EMI, both of whose websites list their recording artists' upcoming performances. We scraped both sites with a Python script that extracted the essential data (artist name, concert date, repertoire being performed, venue, city, country) and used part of it (the city and country) to geocode each event with the Google geocoder, called from Python, giving us a latitude/longitude position for every event. We geocoded ahead of time for reasons of efficiency: the script does all the work and all the waiting so that the user (and the browser) doesn't have to. The meaning of the data is straightforward: as a set, it characterizes the number of high-level concerts by region while also providing a detailed account of each concert. Our visualization exploits this by letting the user take in the general view (density of concerts by geographic region) quickly and reach the specific view (concert dates, locations, repertoire, artists) through interactive features.
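The offline geocoding step described above can be sketched as follows. This is a minimal illustration rather than our actual script: the function name geocode_events, the event dictionary keys, and the injected geocode callable (standing in for the call to the Google geocoder) are all assumptions made for the example. The one idea it shows is that each distinct city/country pair only needs to be looked up once.

```python
from typing import Callable, Dict, List, Optional, Tuple

LatLng = Tuple[float, float]


def geocode_events(
    events: List[dict],
    geocode: Callable[[str], Optional[LatLng]],
) -> List[dict]:
    """Attach lat/lng to each event, looking up each city only once.

    `geocode` is a hypothetical stand-in for the real geocoder call: it
    takes a "City, Country" string and returns (lat, lng), or None when
    the place cannot be resolved.
    """
    cache: Dict[Tuple[str, str], Optional[LatLng]] = {}
    located = []
    for event in events:
        key = (event["city"], event["country"])
        if key not in cache:
            # One lookup per distinct city, however many concerts it hosts.
            cache[key] = geocode(f"{key[0]}, {key[1]}")
        position = cache[key]
        if position is None:
            continue  # Skip events whose location could not be resolved.
        located.append({**event, "lat": position[0], "lng": position[1]})
    return located
```

Because the geocoder is passed in as a parameter, the same function can be run against the real service when the data is scraped and against a simple lookup table when testing.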
Our visualization was built on the Google Maps API. We chose this platform so that a wide variety of users could easily access our application via the web; having never encountered a visualization of this nature, we wanted to make it available to as many people as possible. Rather than place markers for individual events, our design places a marker at each city where one or more events are planned. The marker is a transparent circle of varying size and color: the color denotes the record label, and the size denotes the number of concerts that label's artists are scheduled to play in the city. This lets the user both see the influence each label exerts on different geographical regions and gauge the relative amount of performance activity in a given city. For greater clarity, the user can toggle each label's markers on and off using a control at the top-right corner of the map, showing just the EMI markers or just the DG markers. The user can also zoom in on an area that is too densely populated in order to carefully select the city of interest. Clicking on a city's marker displays the name of the city, the number of concerts scheduled there for that label, and a list of the concert dates, artists, and planned repertoire. These last few features add interactivity to the application and let users delve into the data either for practical purposes (looking up a favorite artist's schedule or planning concert attendance during travel to a specific city) or to explore the data and find interesting relationships (for example, although EMI only has concerts in the USA, it dominates the San Francisco market, while DG, which mostly dominates the European market, dominates the Los Angeles market). A search bar allows the user to search for a specific artist: clicking "Go!" displays one marker for each of that artist's performances and hides the label markers that were previously showing. Clicking on the control buttons hides these search markers and restores the label markers with which the map opens.
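The marker design described above, one circle per label per city with its size driven by the concert count, amounts to a grouping step over the event list. The sketch below illustrates that step in Python under stated assumptions: the function name build_city_markers, the dictionary keys, and the base_radius parameter are hypothetical, and the square-root scaling (which makes a circle's area, rather than its radius, proportional to the count) is one reasonable choice, not necessarily the scaling our map uses.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def build_city_markers(events: List[dict], base_radius: float = 4.0) -> List[dict]:
    """Collapse individual concerts into one marker per (label, city)."""
    groups: Dict[Tuple[str, str, str], List[dict]] = defaultdict(list)
    for event in events:
        groups[(event["label"], event["city"], event["country"])].append(event)

    markers = []
    for (label, city, country), concerts in sorted(groups.items()):
        count = len(concerts)
        markers.append({
            "label": label,      # Determines the circle's color.
            "city": city,
            "country": country,
            "lat": concerts[0]["lat"],
            "lng": concerts[0]["lng"],
            "count": count,
            # Assumed scaling: area grows linearly with the concert count.
            "radius": base_radius * math.sqrt(count),
            # Shown in the info window when the marker is clicked.
            "concerts": sorted(concerts, key=lambda c: c["date"]),
        })
    return markers
```

Keeping the per-concert details inside each marker record means the click handler needs no second lookup to list dates, artists, and repertoire.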
The answer to our initial question seems to be that central Europe is "the place to be" for classical music lovers. Specifically, Berlin, London, Paris, and Milan easily carry the largest number of concerts of any cities included in our map. This map has turned out to be even more fascinating than we first imagined. For example, we never expected EMI to hold concerts only within the USA, while DG's influence in the USA is far greater than expected. One interesting relationship perhaps worth pursuing was already mentioned above: are EMI and DG somehow in competition in different portions of the California market? The data seem to suggest that DG covers the LA area very well, while EMI covers the SF area. Another interesting observation within the US is that the major cities on the West Coast have more concerts than the major cities on the East Coast. This is quite surprising, considering that New York has always been so active in all sorts of cultural activities and is home to some of the best symphonic and operatic ensembles and companies in the USA.
The most important improvement we can suggest concerns the application's speed. The searches are sometimes a bit slow, and it would be nice to do things in real time rather than populate the markers from a static XML document. This would entail reworking the marker-generating code and using a MySQL table to store the data. It would also make searches much faster and allow a script to run every few hours to check for any changes in the data. An obvious extension is to allow users to search on more fields; right now only the "artist name" field can be searched.
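The database-backed search proposed above could look roughly like the following sketch. It is an illustration only: Python's built-in sqlite3 module is swapped in for MySQL so that the example is self-contained, and the table name concerts, its column names, and the functions create_store and search_concerts are all hypothetical. It shows how one table could serve searches on several fields rather than on the artist name alone.

```python
import sqlite3
from typing import List, Tuple

# Fields a user may search on. The current application supports only the
# artist name; the others are the proposed extension.
SEARCHABLE_FIELDS = {"artist", "city", "venue", "repertoire"}


def create_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) concerts table if it does not exist."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS concerts (
            id INTEGER PRIMARY KEY,
            label TEXT NOT NULL,
            artist TEXT NOT NULL,
            concert_date TEXT NOT NULL,  -- ISO format, e.g. 2009-03-01
            repertoire TEXT,
            venue TEXT,
            city TEXT NOT NULL,
            country TEXT NOT NULL,
            lat REAL,
            lng REAL
        )
        """
    )


def search_concerts(conn: sqlite3.Connection, field: str, term: str) -> List[Tuple]:
    """Return concerts whose `field` contains `term`, ordered by date."""
    if field not in SEARCHABLE_FIELDS:
        # Column names cannot be bound as parameters, so check an allow-list.
        raise ValueError(f"cannot search on field: {field!r}")
    query = (
        "SELECT artist, concert_date, venue, city, lat, lng "
        f"FROM concerts WHERE {field} LIKE ? ORDER BY concert_date"
    )
    return conn.execute(query, (f"%{term}%",)).fetchall()
```

With the data in a table, a scheduled script could re-scrape the label sites and update rows in place, and the map would pick up changes on the next query instead of waiting for the XML file to be regenerated.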
The most satisfying part of this project was seeing all the different parts come together. While the individual parts may have seemed unexciting at the time (e.g., scraping for data), the end result was truly rewarding. The most challenging part was finding a good way to debug the site; the Firefox debugger tool was very helpful in this respect. Another challenge, and something we would do differently if we were to do this again, was the lack of a design document or a truly complete design concept. While we had a specific question and an almost-complete idea of what we wanted in the end, making the last push towards completely solidifying our vision of the project at the outset would have saved us some time. Nevertheless, we are very happy with the end result and very excited to see how it can be extended and made widely available.