So, thank you for staying this long. I'm going to talk about a project we have proposed for the María de Maeztu program, and it is about applying data analytics in the context of technology-enhanced music learning. The team for this project is mainly composed of MTG people, people working in the Music Technology Group, but it is also multidisciplinary: it also includes Davinia, who just spoke. In the MTG we have experience in music processing and all the technologies related to music, and Davinia, I don't have to say it, she just spoke about her expertise, so we will be combining these two. We also have some members or collaborators of the project who are external to the UPF: Gualtiero Volpe from the University of Genoa, who has experience in gesture capture and image processing, which, as I will tell you, is an important issue in the project; and George Waddell and Aaron Williamon, a postdoc and a professor at the Royal College of Music in London, one of the most influential conservatories in the world, who will be very important in the sense that they will be dealing with the pedagogy of the project.

To put all this in context, I will show you two very short videos of two people with slightly different expertise in playing the instrument. So, I forgot to put the sound on; let me play it again. You can hear it more or less, and this is the other player. Well, you get the idea. So the question is: what is required to go from person A to person B in terms of experience in playing an instrument? This is a complicated question. At least we can say there is a lot of practice, but probably not only that. What we are interested in is how we can use technology to enhance this transition from A to B, basically to enhance the learning process. That is exactly what we intend to do.

To put the María de Maeztu project in context, also in the academic context, we have another project, a European Union project called TELMI. It is an H2020 Research and Innovation Action, a three-year project that started very recently. The partners are the usual suspects from the team slide: we are the coordinators, the Royal College of Music is doing the pedagogy part, and the University of Genoa the gesture and image analysis. Then we have two companies: HighSkillz, a gamification company with experience in taking learning tasks and building gaming frameworks so that the learning process becomes a game; and Saico, a Catalan company, a Spanish company, which will be in charge of dissemination and possible exploitation of the results. So the María de Maeztu project is very much in line with the TELMI project.

What are the motivations for the project? The motivation is based on three observations. The first one is that when you learn a musical instrument, the benefits are not only the trivial one, that you learn to play an instrument; many other benefits have been shown, cognitive, social, and well-being benefits. There are many, many benefits you get from learning an instrument which are not exactly, or directly, related to playing the instrument.
The second observation is that music education is elitist, in the sense that only a small percentage of the population has access to formal music education, which means, following from the first observation, that only a small percentage of the population can benefit from the direct and indirect outcomes of music learning. And the third one is that even among the people who are lucky enough to have access to formal music education, there is a high rate of abandonment: many people start an instrument as children, or later, and many of them drop out along the way and just leave music education, which is a pity.

So our aims in the project are several. One is to widen access to music education, and with it to have more people benefit from music learning, and another is to lower the abandonment rate that I mentioned. And this by doing two things. One is investigating how we learn music instruments, trying to understand exactly the process I was describing before: how we can make life easier for someone starting as person A to become person B, or at least get more people further along the way from A to B. The other is creating systems and interactive technologies that, we have to emphasize, complement music education. We are not intending to replace teachers or schools; it is just a complement to music education.

The project overview is as follows. At the center of the project is, of course, the music student. We have a set of sensors around the student. These sensors are cameras of different qualities, ranging from a smartphone camera to really high-resolution cameras, passing through all types of cameras in between; and also sound: we will set up microphones to record the sound, which is probably the central part of learning an instrument, and again these will range from a very basic smartphone microphone to really high-quality pickups placed inside the violin. We will also be collecting data through other sensors, for example ECG, accelerometer data, or EEG data, and so on. We can get all this data to understand exactly what is happening in the learning process. The information from these sensors is fed to a computer, which on the one hand will be providing real-time feedback to the student. This feedback can take many forms, depending on the purpose of the moment and what the student is practicing. For example, the student may be practicing vibrato, and then the system will visualize the amplitude and the frequency of the vibrato and compare them with those of other musicians, because the second role of the computer is to provide access to a dataset of expert performances. We will have a set of recordings made by expert musicians, so that when practicing vibrato, for example, the student can compare how his vibrato relates to that of different masters. This database is of course a multimodal database, because it will contain all the information from the sensors acquired here: sound, video, and the other kinds of sensor information. I will talk more about it in the next slide.
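Just to illustrate the kind of real-time feedback I mentioned for vibrato practice, here is a minimal sketch of how vibrato rate and extent could be estimated from a pitch contour; the frame rate, the 3-10 Hz search band, and the toy signal are assumptions for the example, not the settings of our actual system.

```python
import numpy as np

def vibrato_rate_and_extent(f0_hz, frame_rate=100.0):
    """Estimate vibrato rate (Hz) and extent (cents) from an f0 contour.

    f0_hz      : 1-D array of fundamental-frequency estimates for one note
    frame_rate : f0 frames per second (hypothetical hop rate)
    """
    # Work in cents so the oscillation is symmetric around the note's centre.
    cents = 1200.0 * np.log2(f0_hz / np.mean(f0_hz))

    # Remove the slow trend (e.g. drifting intonation) with a linear fit.
    t = np.arange(len(cents)) / frame_rate
    cents_detrended = cents - np.polyval(np.polyfit(t, cents, 1), t)

    # Dominant oscillation frequency from the magnitude spectrum,
    # restricted to a plausible vibrato range (3-10 Hz).
    windowed = cents_detrended * np.hanning(len(cents_detrended))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(windowed), d=1.0 / frame_rate)
    band = (freqs >= 3.0) & (freqs <= 10.0)
    rate = freqs[band][np.argmax(spectrum[band])]

    # Extent as half the peak-to-peak excursion, in cents.
    extent = 0.5 * (np.max(cents_detrended) - np.min(cents_detrended))
    return rate, extent

# Toy usage: a 440 Hz note with a 6 Hz, +/-50 cent vibrato.
t = np.arange(0, 2.0, 0.01)
f0 = 440.0 * 2 ** (50.0 / 1200.0 * np.sin(2 * np.pi * 6.0 * t))
print(vibrato_rate_and_extent(f0, frame_rate=100.0))
```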
And the last component of the system is a kind of small-scale social network in which students interact with the database, by exploring it and by comparing their performance to the performances in the database, and also, probably very interestingly, interact among themselves: a student can decide to upload a performance of a particular exercise, and the other students, or teachers, in this network can comment and give feedback on that performance.

About the multimodal database: it is called repoVizz. It has been developed in the context of other projects at the MTG, and I will show you a very brief video just to give an idea of what it is and of the capabilities of the database as it is now. I lost the sound for some reason, but the idea is that we have different modalities of data and we can choose which ones to visualize; they are synchronized, so we can see the video and the gesture data at the same time (the sound came back; that was not me). It also allows for manipulation of the data: once we capture the data we can label it, we can extract higher-level features from it, and we can annotate events. This is exactly what we are interested in: having gesture capture from violin performances, together with audio and all the other types of data I mentioned before.

Okay, so if we go back to the system overview: where are the analytics? There are plenty of places where we can extract information from data. The most obvious one is that we have rich multimodal information stored in the database, and we can apply analytics to it in many ways. I will detail one of them, but we can detect patterns between gestures and sound, between physiological sensors and sound, and so on, and we can apply machine learning techniques to extract information from there. A second one, which we will be exploring, is taking an individual student and trying to find useful patterns in that person's interaction with the system, especially with the database, and then intelligently recommending things to do to improve their playing. It would be a kind of Amazon: it records your interaction with the system and, depending on how well you perform some tasks, it gives you recommendations of things to do to improve them (there is a toy sketch of this idea below). And the last one concerns the interaction among people: it is interesting to see whether the way people interact makes them learn better or not, what these patterns of interaction are, and whether they have an impact on the learning process.

One part of the analytics is not only theoretical, to understand how learning works, but has a practical application. Recording this material for the database is not cheap: we need an expert to record it, we have to set everything up, and we cannot exhaust the whole repertoire of violin studies. So what happens when a violin player wants to compare himself to a target performance, but we don't have that recording in the database? One way, since we already have a lot of information in the database, is to try to extract patterns and extrapolate from the recordings we have to the pieces we have not recorded yet.
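To make that recommendation idea a bit more concrete, here is a toy sketch; the exercise catalogue, skill names, and scores are all hypothetical, just to show the shape of the logic rather than any actual component of the system.

```python
from collections import defaultdict

# Hypothetical catalogue mapping each exercise to the skill it targets.
EXERCISES = {
    "slow-scale-G-major": "intonation",
    "detache-open-strings": "bowing",
    "vibrato-long-notes": "vibrato",
    "rhythm-clapping-6-8": "rhythm",
}

def recommend(practice_log, n=2):
    """Recommend exercises for the n weakest skills in a student's log.

    practice_log : list of (skill, score) pairs, score in [0, 1],
                   as automatic assessments might produce them.
    """
    scores = defaultdict(list)
    for skill, score in practice_log:
        scores[skill].append(score)
    # Average score per skill; lower means more room for improvement.
    averages = {s: sum(v) / len(v) for s, v in scores.items()}
    weakest = sorted(averages, key=averages.get)[:n]
    return [ex for ex, skill in EXERCISES.items() if skill in weakest]

log = [("intonation", 0.55), ("bowing", 0.9), ("vibrato", 0.4), ("rhythm", 0.8)]
print(recommend(log))   # exercises targeting the two weakest skills
```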
For that, we have been working for a while on expressive performance modelling. For the non-musicians, very quickly: what we listen to is not exactly what was written by the composer; it is based on two things. It is based, of course, on the composition, the piece as it is supposed to be, plus what the musician adds to the performance, so what we listen to is the sum of these two. Since the score is something fixed and static, the interesting thing to model is what the performer adds to the piece in order to interpret it.

In order to do that, on the one side we have the score of the piece, and on the other we have the performer. We encode the score into some machine representation, which could be a text file, and then we take the recordings and apply signal processing techniques in order to obtain a symbolic description of the audio file: instead of having just an audio file, we split it into notes and higher-level events, like a MIDI file. Once we have this, we compute the difference between what was in the score and what the performer actually played; these differences are the expressive aspects of the performance. We also analyze the structure of the piece, and we feed all of this into machine learning in order to obtain an expressive performance model of that particular player. We can tweak the model, changing some parameters of the machine learning or some of the description, and once we are happy with it, we can give it a new score, one we have not recorded, and the model will predict, or produce, a performance in the same style as the musician would have played it if he had recorded that piece.

There are many expressive parameters a musician manipulates in order to add expression to a performance, but the ones we have been studying are: dynamics, when a note should be played louder or softer depending on the context; onset timing, when a note in the score should be advanced or delayed with respect to what the score specifies; duration, when a note should be played longer or shorter than stated in the score; ornamentation, which in classical music is not always a big issue, but in popular music, and especially in jazz, is a big expressive resource: when notes should be added to the ones specified in the score, or when they should be removed; gestures, because with all this multimodal data we can also predict, for example, what bow force or what bow velocity should be applied to a note in a given context; and articulation: given two notes in the score, how they should be played, legato, really linked together, or staccato, separated from each other.

So how do we do this? We start from the score, we focus on one note, and we characterize it, for example its duration and pitch, and also its metrical strength, where it appears in the bar; and we also characterize the notes around the one we are focusing on. So, note by note, we extract the set of descriptors I just mentioned from the score. Then we look at the performance and focus on the same note, but now we look at what was actually played for that note; for example, if we are interested in duration, we measure how different the performed duration is from the specified duration. So for each note we can annotate the difference between the score and the performance, which is exactly what we want to predict. Once we have this description we can apply machine learning algorithms, this is a typical regression problem in machine learning, and we obtain a model; again we can play with the parameters, and once we are happy with the model, we give it a new score, one we have not recorded, and it will produce an expressive score, in the sense of volumes and durations.
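To make this concrete, here is a minimal sketch of such a per-note regression; the descriptors, the made-up deviation annotations, and the choice of a random-forest regressor from scikit-learn are illustrative assumptions, not the exact setup of our models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each note is described by features taken from the score: its own duration
# (beats), MIDI pitch, metrical strength, and the pitch interval and duration
# ratio with respect to the previous and next notes (hypothetical choices).
def note_descriptors(score):
    feats = []
    for i, (pitch, dur, strength) in enumerate(score):
        prev_p, prev_d = score[i - 1][:2] if i > 0 else (pitch, dur)
        next_p, next_d = score[i + 1][:2] if i < len(score) - 1 else (pitch, dur)
        feats.append([dur, pitch, strength,
                      pitch - prev_p, next_p - pitch,
                      dur / prev_d, next_d / dur])
    return np.array(feats)

# Training data: an aligned (score, performance) pair; the target for each
# note is the deviation actually played, e.g. performed / notated duration.
score = [(67, 1.0, 1.0), (69, 0.5, 0.3), (71, 0.5, 0.6), (72, 2.0, 1.0)]
performed_ratio = np.array([1.05, 0.92, 0.97, 1.12])   # made-up annotations

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(note_descriptors(score), performed_ratio)

# For a new, unrecorded score the model predicts one deviation per note,
# which is then applied to the nominal durations before synthesis.
new_score = [(60, 1.0, 1.0), (62, 1.0, 0.3), (64, 2.0, 0.6)]
print(model.predict(note_descriptors(new_score)))
```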
Then we have a synthesis engine in order to produce an audio file. I will play you something we did with the saxophone. This is automatically generated; it is not perfect, but it is expressive compared to the straightforward rendition going directly from the score to the synthesis engine.

On gesture capture we have also been working; it is preliminary work, but, as I was saying, there are different solutions, in terms of cost and complexity, for the different data we want to collect, and the same holds for gestures. At the high end, the expensive one, we have high-definition cameras; normally you use a lot of them, like 15 or 20 cameras, in order to capture all the articulations, or points of interest, of the body, and you get those images you always see with the skeletons moving around. These are very, very accurate, but they are expensive and you need a special room set up for them; this is something we are going to use, but it is not amenable to being taken to the student level. Then we have a second level, which is physical sensors placed on the instrument, on the bow and on the violin, that can track the position, velocity, force, and other parameters of the performance. Maybe I can even show you: you can see in real time the exact position of the bow and where the violin is, and you can calculate, for example, the distance to the bridge, which is an important measurement for timbre aspects, the velocity, and so on. But this is still not very practical: it is very good for research, but the violin has to be wired, you have a wire coming out of the bow, and violinists in general don't like to put anything on their violin, so this is just for research. The next step, what we are doing now, is using the Kinect, a camera that not only gives you the image but also gives you, for every pixel, the depth of that point in the image, which is very useful for tracking movement in 3D, so we are starting to work with this. And the lower end is web cameras: you put some colour markers, and we have been playing with this, literally playing, and it will track the bow; we also use an augmented-reality technique that gives you the coordinates of a rigid object, so you can compute the coordinates of the bow, or the bridge, or whatever you want, with respect to it. So this is the low-end solution: you don't need to put almost anything on your violin, just a small colour marker.
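As a small illustration of this low-end option, here is a sketch of how a single coloured marker could be tracked from webcam video with OpenCV (4.x API); the colour thresholds and the file name are hypothetical, just to show the basic idea of threshold-and-centroid tracking.

```python
import cv2
import numpy as np

def track_marker(video_path, hsv_low=(40, 80, 80), hsv_high=(80, 255, 255)):
    """Track one coloured marker (here a green one; thresholds are
    illustrative) in a video and return its centroid per frame."""
    positions = []
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, np.array(hsv_low), np.array(hsv_high))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            m = cv2.moments(max(contours, key=cv2.contourArea))
            if m["m00"] > 0:
                positions.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
            else:
                positions.append(None)
        else:
            positions.append(None)   # marker not visible in this frame
    cap.release()
    return positions

# print(track_marker("bow_practice.mp4")[:10])   # hypothetical file name
```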
Also, something that happens in many of these projects, in my experience, is that we researchers think we know what would be perfect to provide to the students, or to the final users, but if you don't have a close relationship with them, what you produce is not actually what they would really have liked to have. So we have been conducting workshops with students to ask them what exactly they use and which things they would see as useful in the future. We have done one at the Royal College of Music in London, and there is one here on Friday, from 11 to 2, if anyone wants to come, with music teachers and students; the idea is to discuss technology in this setting, the music learning setting, and get feedback for developing the prototypes. We also have an online survey that you can complete; it takes about 10 minutes and asks what you use, your level of expertise in music, and so on. And, as happens every year, this is the ninth year we organize the Music and Machine Learning workshop; this year we are going to do it in the area of music learning, and it is in Italy in September.

Almost to finish: we have developed some initial prototypes. As I said, we present these prototypes to the students, ask what they think, and take that feedback into account in order to refine them. One of them shows a curve that follows what you have to play; it has a piano roll, it tracks your pitch to see if you are in tune, it tracks your energy, and you can also load the expert performance so you can see how close you are to the expert. We have some that are more like games, because they give you scores for different parameters, for how well you have done. And, as some of you will recognize, this one is a web-based interface, so you don't need to download anything to your computer; it runs in the browser, you can play through the browser, and it will give you information similar to the other two.
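As a minimal sketch of the kind of intonation comparison such a prototype computes, assuming the student's and the expert's pitch contours have already been extracted and time-aligned, and with an arbitrary in-tune tolerance:

```python
import numpy as np

def intonation_feedback(student_f0, reference_f0, tolerance_cents=25.0):
    """Frame-by-frame comparison of two pitch contours (Hz), as a web
    practice tool might do; arrays are assumed already time-aligned."""
    n = min(len(student_f0), len(reference_f0))
    cents = 1200.0 * np.log2(np.asarray(student_f0[:n]) /
                             np.asarray(reference_f0[:n]))
    in_tune = np.mean(np.abs(cents) <= tolerance_cents)   # fraction of frames
    return {"mean_abs_deviation_cents": float(np.mean(np.abs(cents))),
            "fraction_in_tune": float(in_tune)}

# Toy example: a student who is about 20 cents sharp on average.
ref = np.full(200, 440.0)
student = 440.0 * 2 ** (np.random.normal(20, 10, 200) / 1200.0)
print(intonation_feedback(student, ref))
```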
So that's it. Thank you.

Thank you, really nice talk and work. Now that you have such detailed movements of the players, are you considering using them not just to assess performance, but also to check whether the posture is correct for the body? Normally, when you are learning an instrument, there are specific movements you have to avoid, so are you considering also testing those and telling the student: this is correct in terms of performance, but not correct as a movement, because it will hurt you?

Actually, the two are very much linked, because in the violin, for example, the posture, the way you place your elbow, will change your sound. So of course we are dealing with this: you can compare yourself with the expert database, where we have all the exact movements, so someone can compare their movements to those of the expert. And not only that: we are exploring right now a nice sensor that has an accelerometer and a gyroscope but also some haptics, you can make it vibrate. It could stop vibrating when you are in the exact position and vibrate more strongly as you move away from it (you could even add electric shocks to discourage mistakes), so you can feel when you are doing it correctly; it is like someone holding your hand while you do it. Any other questions?

I wanted to know: you have all this numerical data about how a person plays with respect to an expert. How are you going to translate these numbers into words that do not collide with existing teaching methods? Is there a risk of changing the way instruments are learned just because you are looking at some numbers? And if this translates into language, how are you going to overcome the language barrier, say whether we are using English or Spanish? What are your views on that?

The first thing is that this is not really trying to replace formal education, so it would be a complement. The ideal setting would be that the teacher is there, they record, and they can look at it together: look here, you didn't do it correctly, because look at this extra information. Sometimes it is difficult for the teacher to use very accurate adjectives objectively; they say it sounds too woody, or too something, and each professor or teacher may use different terminology. So this would be a concrete way for both the student and the teacher to see that something is not going right, because you can see the tuning is not there, or some timbre aspect is not as it is supposed to be. But the main idea is that we don't want to change how music instruments are learned; we want to complement, to add information, so it can be used by humans. It is not going to be a completely automatic system. Does that answer your question, Esteban?

OK, anything else? Well, I guess we're done, so thank you, Rafael, and thank you all for staying. This has been quite nice; I think it's the first time we have had this type of workshop in our department. We have been able to see quite a lot of the research that is going on in the department, in the context of a project that basically would like to help all these projects have a better impact and to promote collaboration among all of us, so that maybe next year, when we have the next workshop, we see that we have improved and that we are really doing interesting research with high impact, which is what we should all aim for. So thank you for coming, thank you all for being here, and thanks to all the people who are not here but who have helped in organizing all this. Thank you.