Thank you very much. So, I will only give a brief introduction to the project. I am Horacio Saggion from the TALN Research Group, and this is a collaboration with the Music Technology Group; our principal investigator is Ricardo Baeza-Yates. Then I will pass the floor to Luis. The outline of the presentation will be, first, the motivation for this activity within the María de Maeztu Initiative, and the challenges of natural language processing, in particular in the music domain. Then Luis will talk about identifying musical entities in free text; in addition to identifying musical entities, he will tell us how he identifies relationships between entities, and how he has carried out a study on sentiment analysis in music repositories, with the objective of applying it to musicology. Then we will also describe some dissemination and outreach activities. So, the motivation. Textual information about music is created every day, and making sense of what is written in those documents can contribute to musicology and to research in music information retrieval in general. By transforming textual content into some knowledge representation, you will be able to infer new knowledge. For example, you may have a huge database of musical genres, artists, and so on, and you may infer that at a particular time a particular music genre was born because some political or social situation appeared. You will also be able to ask, and receive an answer to, complex questions. For example, you may ask how many guitar concertos Rodrigo composed and which one is the most popular. In order to do that, the system first needs to understand that Rodrigo is a composer, understand what a guitar concerto is, and extract information about the concertos composed by this composer.
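As a toy illustration of the kind of question answering described above (this is not the project's system): once textual information has been turned into structured records, the Rodrigo question reduces to a filter and a maximum over a database. The records and popularity scores below are illustrative placeholders.

```python
# Hypothetical structured records extracted from free text; the
# "popularity" numbers are invented for the sake of the example.
works = [
    {"composer": "Rodrigo", "title": "Concierto de Aranjuez",
     "form": "guitar concerto", "popularity": 98},
    {"composer": "Rodrigo", "title": "Fantasia para un gentilhombre",
     "form": "guitar concerto", "popularity": 80},
    {"composer": "Rodrigo", "title": "Concierto madrigal",
     "form": "guitar concerto", "popularity": 55},
    {"composer": "Beethoven", "title": "Symphony No. 9",
     "form": "symphony", "popularity": 99},
]

def guitar_concertos_by(composer):
    """Answer 'how many guitar concertos did X compose?' over the records."""
    return [w for w in works
            if w["composer"] == composer and w["form"] == "guitar concerto"]

concertos = guitar_concertos_by("Rodrigo")
most_popular = max(concertos, key=lambda w: w["popularity"])
```

The hard part, of course, is not this query but populating the records from free text, which is what the rest of the talk is about.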
And then, finally, query some databases in order to find which one is the most popular. Transforming textual information into actual knowledge will also allow the visualization of information: not the typical graph of co-occurrence between entities, but more meaningful relationships, such as why two artists are related, what the relationship between them is, or which set of relationships links them. By having these relationships between artists and works of art, we will also be able to improve navigation and recommendation, for example. So, Music Mix NLP is a project that brings together people from the natural language processing group and the music technology group, and it grew out of a collaboration between Luis and Sergio that started back in 2014. This project will fill an existing gap between music information retrieval, or music technology, and natural language processing, and we will try to bridge this gap with our technology. This kind of activity has been well received in the venues where Sergio and Luis have presented their work; people in music technology and in NLP are very keen on this type of collaboration. So, what are the challenges? One of the principal activities in natural language processing is to identify entities in text. This is a core task where you try to identify the name of a person, the name of an organization, and so on. But in the case of music this is particularly difficult, because band names, album names, and song titles have very specific characteristics, so a standard off-the-shelf natural language processing tool will likely fail on any text that talks about music and be unable to identify any music-related entity. So, we need to deal with this difference in domain.
So, ad hoc systems should be adapted or improved, or machine learning techniques should be used, in order to operate on text that pertains to the music domain. Entity recognition is one of the particular activities that we carry out in this project. The typical procedure you will find in available tools relies heavily on lexicons or gazetteers to help identify entities. This can be efficient when, for example, you try to identify a title like "Symphony No. 9", because you have a specific lexicon that you can take advantage of. But what happens if you have different names for the same musical entity, like "the Ninth"? "The Ninth" can refer to anything; if you don't have enough context, you will not be able to determine that it refers to an actual symphony by Beethoven or by some other composer. We also have the problem of ambiguity, as ambiguity is inherent in natural language. If you have "Carmen", it can refer to the opera, to the character in the opera, or to the character in the novella by Mérimée. You face all these difficulties. There is also variability: you can have names that look like common noun phrases, and variations of the names of musical groups, all of which have to be taken into account. A name like Madonna will be used for an artist, and you don't know whether it refers to the Queen of Pop or to a depiction of the Virgin Mary. If we take the case of Spain, you may read "Plácido Domingo in Madrid this Saturday". Because "plácido domingo" is also common vocabulary in our language, meaning "pleasant Sunday", a natural language processing system for Spanish will have difficulty identifying that Plácido Domingo is the renowned artist, and will probably believe the text is talking about the weather.
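A minimal sketch of why plain gazetteer lookup is not enough in the music domain: the same surface form maps to several candidate entities, so a lexicon match alone cannot decide. The gazetteer entries below are illustrative, not from any real resource.

```python
# Toy gazetteer: each surface form maps to its possible entities.
gazetteer = {
    "the ninth": ["Beethoven: Symphony No. 9", "Mahler: Symphony No. 9"],
    "carmen": ["Bizet: Carmen (opera)", "Carmen (opera character)",
               "Mérimée: Carmen (novella)"],
    "madonna": ["Madonna (pop artist)", "Madonna (Virgin Mary depiction)"],
}

def candidate_entities(text):
    """Return every gazetteer entry whose surface form occurs in the text."""
    text = text.lower()
    return {form: cands for form, cands in gazetteer.items() if form in text}

hits = candidate_entities("She sang an aria from Carmen after the ninth.")
# Every matched mention is ambiguous: lookup alone yields multiple
# candidates, and disambiguation requires context.
```

This is exactly the gap that context-aware disambiguation, discussed later in the talk, has to fill.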
In any case, we face various difficulties in applying natural language processing to the music domain. It is also true that current natural language processing systems rely on the availability of knowledge bases like YAGO, DBpedia, Freebase, or Wikidata to create lexicons or vocabularies that you can then match against the text in order to find the entities of interest. The problem with these knowledge repositories is that they lack many pieces of information: they cover the most popular things, but they don't cover the long tail, and the long tail is what you find in free text. This is the challenge that we face. These knowledge repositories also need to be constantly updated, and currently they are updated only by looking at structured sources of information, not at unstructured sources. There is a lot of unstructured information in textual format available out there, like artist biographies, articles about specific pieces of music or artists, and of course free text from Wikipedia, which can be exploited to identify useful entities in the music domain, to contribute to creating knowledge bases, and to enable useful applications. Now, in what follows, Luis will present three case studies on the application of natural language processing to the music domain. Thank you. So, thank you, Horacio. Well, after the introduction, I'm going to go very quickly over three projects that are already framed within the María de Maeztu Initiative. The first one is about what Horacio just explained: the task of identifying music entities in text. Because of the problems that Horacio was describing, one possible approach, one of the alternatives, could be to train an algorithm on very high quality data in terms of entities annotated in the music domain.
So all these characteristics, all these idiosyncrasies that he has been describing, would be manually checked, and then, ideally, we would have a clever way to deal with this kind of variability. In this case, one of our proposals was to automatically construct a very large document collection with thousands of music entities; specifically, we focused on albums, songs, artists, and record labels. The idea was that, even if you didn't have every single mention annotated in your data set, whatever you had would be of very high quality, because this way you would have high quality seed data, for example for propagation, semi-supervised learning, et cetera. In this paper, what we did was to process thousands of biographies from the website Last.fm. The method, which we named ELVIS, is an Entity Linking Voting and Integration System. The basic idea behind it is that even if the isolated tools, which, as Horacio said, do pretty poorly on musical text, are unreliable on their own, if you build a framework where you can plug in an arbitrary number of them and then look at the degree of agreement they have on different text spans, you can use this information as a confidence measure. Sergio is the developer behind it, and the code is available: it's a framework where you can plug in any named entity recognition or entity linking system, and it unifies the outputs and provides you with a confidence measure for each of its predictions. The data set that I will describe in the next slide is also available from the MTG website. And this is to show that it actually worked: we manually checked 1,400 predictions, doing an extensive evaluation of the quality of the disambiguation with different types of configurations and different degrees of agreement.
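The agreement idea behind ELVIS can be sketched in a few lines (this is an illustration of the voting principle, not the actual implementation, which is Sergio's): several entity linkers annotate the same text span, and the share of tools that vote for the same entity serves as a confidence measure. The linker names and outputs below are made up.

```python
from collections import Counter

def vote(annotations):
    """annotations: {tool_name: entity_id} produced by several entity
    linkers for one and the same text span.
    Returns (winning_entity, confidence), where confidence is the
    fraction of tools that agree on the winner."""
    counts = Counter(annotations.values())
    entity, votes = counts.most_common(1)[0]
    return entity, votes / len(annotations)

# Hypothetical outputs of three off-the-shelf linkers for the span "Carmen".
span_annotations = {
    "linker_a": "wiki/Carmen_(opera)",
    "linker_b": "wiki/Carmen_(opera)",
    "linker_c": "wiki/Carmen_(1983_film)",
}
entity, confidence = vote(span_annotations)  # two of three tools agree
```

This matches the evaluation result mentioned next: spans where all plugged-in systems agree (confidence 1.0) turned out to be correct with very high precision.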
And, well, what I would like to highlight is that when all the systems we tried in our paper agreed, the precision, that is, the correctness of their predictions, was very high. So that was one project that should set the foundations for what we want to do in the future, which is training entity linking and named entity disambiguation algorithms specific to the music domain, ideally able to cope with the variability that Horacio was explaining earlier. Another interesting thing that can be done is to take advantage of the fact that you already have entities identified and disambiguated in large text collections to encode relationships in natural language, so that you can, for example, build a large knowledge graph out of them. From there, this graph can be used for many other applications. Without going into much detail, we were very happy to see nice results from a collaboration that relies a lot on linguistic information. Our approach was based on crafting a set of linguistically motivated rules over the syntactic tree of every sentence where at least two entities were already disambiguated. In this case, for example, we would have the sentence "Hated Here was written by Wilco frontman Jeff Tweedy". The algorithm would first identify the text spans, the offsets of each of the entities, then assign them a type, and then disambiguate them against any available knowledge repository: you would have your MusicBrainz ID for "Hated Here", or, in the case of Wilco and Jeff Tweedy, their corresponding Wikipedia pages. As I said, instead of just looking at co-occurrence or other bag-of-words approaches that put entities together according to how near they appear in the text, we wanted something a little bit more linguistically motivated, and the triples we extracted, of the form left argument, relation, right argument, look like this.
So, "Hated Here was written by Jeff Tweedy", or "Jeff Tweedy was the frontman of Wilco". If you do this over thousands of sentences, with thousands of triples, not only can you construct a large graph in the music domain, but you can also reward better relations and penalize the ones that are sparse, redundant, or wrong because something went wrong during the pipeline, et cetera. A very nice application of this graph was actually not music recommendation itself, but rather the explanation of music recommendations. The idea is that, on paper, certain artists or certain songs should not have anything in common, and traditional recommender systems work a lot on the principle that if user A and user B listen to the same set of songs, and user C has something in common with user B, then maybe what user A listens to can be recommended to them as well. We found that it could be interesting to provide not the recommendation itself, but rather the reason why you got a recommendation. And we found a very nice example that I wanted to show. Imagine you like Tom Waits. Tom Waits, for those of you who don't know, is a very obscure and, well, not the happy type of musician, we could say. And the recommender system built out of the graph recommended a song by Lady Gaga, which is techno pop, so on paper they shouldn't have anything to do with each other. The explanation comes from the path that exists between them, and since the edges in the path are relations expressed in natural language, you can actually see why this is the case, why you get a recommendation of Lady Gaga if what you like is Tom Waits. Apparently Bruce Springsteen covered "Jersey Girl", which is a Tom Waits song, and Bruce Springsteen has performed fairly often with someone called Clarence Clemons, who, I found out, is this saxophonist over here. And this same saxophonist has performed live with Lady Gaga.
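The path-based explanation above can be sketched as a breadth-first search over a graph whose edges keep their natural-language relation, so the path itself reads as the explanation. The triples reproduce the Tom Waits example from the talk (the "wrote" edge is my phrasing; the others are the relations just described); the real system works over a much larger extracted graph.

```python
from collections import deque

# (head, relation, tail) triples, as extracted from text.
edges = [
    ("Tom Waits", "wrote", "Jersey Girl"),
    ("Bruce Springsteen", "covered", "Jersey Girl"),
    ("Bruce Springsteen", "performed with", "Clarence Clemons"),
    ("Clarence Clemons", "performed live with", "Lady Gaga"),
]

# Undirected adjacency list; each edge keeps its relation as a sentence.
graph = {}
for head, rel, tail in edges:
    graph.setdefault(head, []).append((tail, f"{head} {rel} {tail}"))
    graph.setdefault(tail, []).append((head, f"{head} {rel} {tail}"))

def explain(source, target):
    """Shortest path of relation sentences linking source to target (BFS)."""
    queue, seen = deque([(source, [])]), {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for neighbor, sentence in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [sentence]))
    return None

path = explain("Tom Waits", "Lady Gaga")
```

Printing `path` yields the chain of human-readable sentences connecting the two artists, which is exactly what was shown to survey participants.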
So, well, I'm not going to play them, but if you actually listen to them, they are much more similar than you would expect in theory. So we found this was a very nice application. We ran a survey where people rated the quality of the recommendations, and this kind of explanation was highly valued by people who were not trained in terms of musical knowledge. People who were a little bit ignorant, like myself, see this and say, "this makes a lot of sense, and if this is actually true, this is great." People who knew a lot about music didn't find it very useful, but still. There were other applications, but we didn't include them here. And finally, the last of the three papers we have done within the María de Maeztu collaboration so far was to work with a large collection of reviews in the music domain, of music products from Amazon. The idea was to extract correlations between how people felt about certain music products, especially albums, and either the date of their review or the date of publication of the album. It was a very heterogeneous paper: half of it was on machine learning and text classification, and the other half was on musicology, and a little bit of a digression. But I think that by far the most interesting part is the application to musicology. The sentiment analysis framework came from a group in Dublin, where Sergio stayed last year. Basically, it works like this: instead of giving you a score for the whole piece of text, it performs what they call aspect-level sentiment analysis. First it identifies key elements in the review, and then it assigns each element, like guitar riffs or vocals, a score according to the presence or absence of certain adjectives and so on. So, well, we carried out a kind of extensive study.
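A toy sketch of aspect-level sentiment as just described (this is not the Dublin framework itself): instead of one score per review, each aspect mentioned in the text gets its own score, based on sentiment-bearing adjectives found near it. The aspect list, adjective lexicon, and window size are all illustrative choices.

```python
import re

aspects = ["guitar riffs", "vocals", "production"]
adjective_scores = {"brilliant": 1, "catchy": 1, "dull": -1, "muddy": -1}

def aspect_sentiment(review, window=4):
    """Score each aspect mentioned in the review by summing the scores
    of sentiment adjectives within `window` words of the aspect."""
    # Strip punctuation and tokenize naively on whitespace.
    words = re.sub(r"[^\w\s]", " ", review.lower()).split()
    scores = {}
    for aspect in aspects:
        head = aspect.split()[0]
        for i, w in enumerate(words):
            if w == head:
                nearby = words[max(0, i - window):
                               i + window + len(aspect.split())]
                scores[aspect] = sum(adjective_scores.get(n, 0)
                                     for n in nearby)
    return scores

review = "Brilliant, catchy guitar riffs, but the vocals sound dull and muddy."
scores = aspect_sentiment(review)
```

Here "guitar riffs" comes out positive and "vocals" negative, while "production", which is never mentioned, gets no score; aggregating such per-aspect scores over many reviews is what enables the longitudinal analysis described next.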
And, well, one of the interesting findings was that if you look at the reception and criticism of albums in the genres of reggae and pop across the years, it was pretty clear that there were significant peaks in the 70s in the case of reggae and in the 60s in the case of pop. We thought it made a lot of sense to associate these peaks with the appearance of key artists in both genres. As I said, this part of the project is a little bit more speculative, but we thought it was very interesting as a first step towards making life easier for researchers in the humanities and in musicology, so that they can benefit from the information they get from automatic processing of textual content. In terms of dissemination and outreach, all this and much more we will explain at ISMIR, in August, in a tutorial called "Natural Language Processing for Music Information Retrieval". But it is also for musicology, so I'm expecting, I'm hoping for, a little bit of discussion with people who disagree with what I just showed. The idea is to present both the part that can be applied to music information retrieval (recommendation, artist similarity, automatic playlist generation, et cetera) and its application to musicology. This is actually the first slide of the tutorial, which we have almost ready; Sergio and myself will be among the speakers, and we are advised by Horacio and Xavier. And that's everything, so thank you.