...at the Italian National Research Council (CNR). I am also part of the OpenAIRE team; in particular, I am the technical director. In this presentation, however, I will not speak about OpenAIRE itself. I will describe an activity performed in the context of another project, called iCORDI, which is strongly linked to what we are doing in OpenAIRE: it is funded by iCORDI, but carried out in collaboration with OpenAIRE. So first I will briefly introduce iCORDI and a related initiative, the RDA, which you may be aware of, and then I will explain the prototype we are developing in the context of this project and the interoperability issues behind it.

Let me start with iCORDI. iCORDI is a coordination action that started in September 2012, and its objective is to establish a coordination platform between Europe and the US to discuss interoperability across infrastructures. While we were preparing this proposal and Europe was moving in this direction, a similar initiative was launched in the US, at the time called the Data Web Forum. More or less in parallel, the Australian National Data Service also started considering funding something similar. So at a certain point Europe, the US and Australia decided to merge the three initiatives, giving rise to what is called the Research Data Alliance. You may have heard about it, because many people are now starting to participate in this initiative. Its main goal, as they say, is to facilitate focused short-term efforts that accelerate the sharing and exchange of research data.
So, again, the problem is to make infrastructures interoperable, and the instruments being set up are working groups that should work on sharing and agreeing on, for example, standards, principles for infrastructures, policies, best practices, and so on. That is the goal, and you can find more information on the initiative's website.

There is a lot of interest behind the Research Data Alliance, but why? I think it is important to understand the reason. What do we have at the moment? A number of data infrastructures are already in place. However, keep in mind that there is no single notion of infrastructure. An infrastructure is always created to serve someone: a group of people with particular needs agrees on a common set of services, and instead of each implementing those services by themselves, they create an infrastructure, and then each of them uses the services it provides. So an infrastructure is driven by the needs of the community that decides to set it up. If you look at the infrastructures that exist now, you will find many, many different kinds. This is also a problem of language, because when I speak about infrastructure it may not be the same notion of infrastructure that you have. For example, some organizations have created infrastructures to support the storage, curation and preservation of large amounts of data; they offer services for the preservation of data. Others have created infrastructures to give uniform access to a set of different data sources: they have aggregated data, and they offer a single, uniform access point.
Others have created infrastructures not only for giving access to the data, but also to support data analysis and mining, in the vision of the cloud: an infrastructure that offers services not only for accessing the data, but also tools, for example, for mining the data, provided as a service to the scientific community. And there are infrastructures, like OpenAIRE, which support and offer services to a community of scientists and researchers. So if you look at all these, they offer different services. And then there is another dimension: each kind of infrastructure may be created by a different discipline; there is an infrastructure for volcanologists, and so on. It is a very, very heterogeneous universe.

However, setting up an infrastructure is very expensive. So people are starting to realize that when you create an infrastructure, especially one like OpenAIRE that offers high-level services, you cannot reimplement the whole stack. If there are already infrastructures that provide certain services, a new infrastructure should use the services, or the content, provided by the others. What is happening is that a notion of what we call an infrastructure ecosystem is being created: I create my infrastructure, but at the same time my infrastructure can exploit something that another infrastructure provides. And so you understand that in this vision the notion of interoperability is key: in order to implement this infrastructure ecosystem, the key point is to implement interoperability. And I also want to stress again what Jeffrey said before.
Interoperability is a key issue, but interoperability means a lot of different things, because it depends on the task and on the reason why you want to have interoperability. You may want interoperability because you want to be able to discover resources published by another infrastructure; but once you have discovered a resource, you may also want to access it, or to use it. According to what you want to do, the problem of interoperability becomes more and more complex: simple access and retrieval is one aspect, but if you want to reuse a resource you need to understand what you have, which is much more complex. And this, let's say, is why there is so much interest in interoperability.

Now, interoperability is a key issue also in the context of OpenAIRE, because the OpenAIRE mandate is to create an infrastructure for scientific information, which essentially means serving scientists in what they are doing. If you look at the process researchers follow, in order to perform their work they need to retrieve, access and use a large variety of resources. It is not only accessing the article: they need to access services. We have already discussed datasets, but also, as I was trying to point out earlier, tools for doing analysis and mining. And in order to use a tool, I need, for example, access to the documentation of the tool, and so on. Or when I elaborate the data and produce something, say a graph, then other scientists may need to have access to the graph; or, if the data are projected on a map, access to the map. So, in the everyday work of researchers, they access a lot of different elements.
What is happening, as Alicia already mentioned, is that the publication model is changing. In some cases it is the authors themselves who try to publish, together with the paper, a lot of connected information that is useful for those who read it and try to understand the research result; the author is pointing to a number of relationships. This is explicit in the new publication model, and a lot of different models have been proposed at the moment. But that is what is done as foreseen by the author. What we have seen in OpenAIRE is that OpenAIRE is trying to do something a little bit different: it essentially looks at the paper and tries automatically to understand, for example, which dataset is linked to the paper. So this is something that is not done by the author; it is done automatically, and the author can then validate what has been suggested by OpenAIRE. In the context of OpenAIRE, this is done by working on the metadata: we harvest metadata from papers and metadata from datasets, and by analyzing the metadata of the two we try to establish, or at least identify, possible links.

In the context of iCORDI, we are trying a different experiment; for the time being, it is just an experiment. Instead of starting from the paper, we want to be able to start from any of these different resources. For example, the idea is that someone can identify a relevant experimental dataset and want to understand, given that dataset, which is the working dataset that was used for making an experiment. You know that there are observation data, and before doing the analysis one selects the most interesting observation data, for example for building a model. Then, given this working dataset, one can go to the paper.
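To make the metadata-matching idea above concrete, here is a minimal sketch: compare harvested paper metadata with dataset metadata and propose candidate links when they look similar enough. The record structure and the scoring rule (title-token overlap plus an author bonus) are illustrative assumptions, far simpler than what OpenAIRE actually runs.

```python
# Sketch of metadata-based link inference between papers and datasets.
# The scoring heuristic is an assumption for illustration only.

def tokens(text):
    """Lower-cased word set of a title string."""
    return set(text.lower().split())

def link_score(paper, dataset):
    """Jaccard overlap of the two titles, plus a bonus for a shared author."""
    a, b = tokens(paper["title"]), tokens(dataset["title"])
    title_sim = len(a & b) / len(a | b)
    author_bonus = 0.5 if set(paper["authors"]) & set(dataset["authors"]) else 0.0
    return title_sim + author_bonus

def suggest_links(papers, datasets, threshold=0.6):
    """Candidate (paper, dataset) pairs for an author or curator to validate."""
    return [(p["title"], d["title"])
            for p in papers for d in datasets
            if link_score(p, d) >= threshold]

# Toy harvested metadata.
papers = [{"title": "Analysis of ocean temperature trends",
           "authors": ["M. Rossi"]}]
datasets = [{"title": "Ocean temperature trends dataset",
             "authors": ["M. Rossi"]},
            {"title": "Alpine flora survey", "authors": ["K. Braun"]}]

candidates = suggest_links(papers, datasets)
```

Note that the output is only a suggestion list: as in the talk, the final say stays with the author, who validates or rejects each proposed link.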
And maybe once they have the paper, they go to another paper that contains complementary information. So the difference in this prototype is twofold: first, we want to be able to start from any particular point; second, instead of harvesting the metadata, the idea is, from the technical point of view, to query the different information sources directly. Now, you understand that from the interoperability point of view, which is the topic of this workshop, this is quite complicated, because the different resources I am referring to may also be hosted by different infrastructures.

I don't know if you can see anything, but just to show that this is a running prototype (it is still a prototype, so the interface looks like this): you can search, for example here we are searching in DataCite, you receive a result, and then you can say: given this result, I want to search for other resources related to it, for example in DRIVER. The relation I am trying to explore is, for example, something that has been published by the same author and is described in terms of the same or similar keywords. At that point you also retrieve the results located in the other source. As I told you, this is only a first prototype; the effort is in understanding how we can address this interoperability issue. An important point is that we are building the prototype in such a way that the relation criteria are configurable: a relation may be by author, as I told you, but there are many other types of relation, so this should be configurable. We have adopted a very simple algorithm, and one may want more selective algorithms.
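The configurable "related resources" step can be sketched as follows. The relation predicates and record fields here are assumptions for illustration, not the actual iCORDI prototype code; the point is that the relation criteria are pluggable, exactly as described above.

```python
# Sketch of a configurable relation-based search across two sources.
# Each relation is a plain predicate, so criteria can be swapped or combined.

def author_relation(seed, candidate):
    """Relation: at least one author in common (naive exact string match)."""
    return bool(set(seed["authors"]) & set(candidate["authors"]))

def keyword_relation(seed, candidate, min_overlap=1):
    """Relation: seed and candidate share at least `min_overlap` keywords."""
    return len(set(seed["keywords"]) & set(candidate["keywords"])) >= min_overlap

def find_related(seed, target_source, relations):
    """Records of `target_source` satisfying ALL configured relations."""
    return [r for r in target_source if all(rel(seed, r) for rel in relations)]

# Toy stand-ins for a DataCite result and a DRIVER-like corpus.
seed = {"title": "Ocean temperature dataset",
        "authors": ["M. Rossi"], "keywords": ["ocean", "temperature"]}
corpus = [
    {"title": "Modelling sea warming", "authors": ["M. Rossi"],
     "keywords": ["ocean", "model"]},
    {"title": "Alpine flora survey", "authors": ["K. Braun"],
     "keywords": ["botany"]},
]

related = find_related(seed, corpus, [author_relation, keyword_relation])
```

In a real deployment the `corpus` list would be replaced by live queries against each source's search API, which is precisely where the cross-infrastructure interoperability problem discussed in the talk appears.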
Another important point is that we want to integrate into this what we call authority registries, which are essentially like authority files: for example ORCID, which has general applicability. So when I say I want a dataset by the same author, I can use ORCID in order to, let's say, retrieve better-quality results. ORCID is of general applicability, but we are also thinking, if we identify particular collections, of using resources created for very domain-specific areas in order to refine the relations. For example, I don't know if you are familiar with the Catalogue of Life, which is a taxonomy of all the species: if you are looking at the biodiversity area, you can search for something and then use this taxonomy to also look for all the other information that uses a different common name, or a name in a different language, and so on. This is the way in which we want to retrieve better results.

So, this is my concluding slide. You understand that the difficulty of what we are doing is certainly in the fact that our set of sources is not fixed. We have an open set of data sources, so as computer scientists we need to find solutions that are not tailored to the ten data sources that we have now, but that can expand. The other consideration I want to make, and this is my concluding consideration, is that one of the problems is that we are linking resources that traditionally have been managed by different organizations. We are speaking of libraries, which have been managing and curating content, and describing it, in a certain way.
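The two refinement ideas above can be sketched together: resolving author name variants through an authority registry (ORCID-style) and expanding search terms through a domain taxonomy (Catalogue of Life-style synonyms). The registries here are toy in-memory tables, standing in for real ORCID and Catalogue of Life lookups.

```python
# Sketch of authority-based author matching and taxonomy-based query expansion.
# Both lookup tables are illustrative stand-ins, not real registry data.

AUTHORITY = {  # name variants -> one authority identifier (ORCID-like)
    "M. Rossi": "0000-0001-0000-0001",
    "Mario Rossi": "0000-0001-0000-0001",
}

TAXONOMY = {  # accepted scientific name -> common names / synonyms
    "Puma concolor": ["cougar", "mountain lion", "puma"],
}

def same_author(name_a, name_b):
    """Two name strings match if the registry maps both to the same ID."""
    id_a, id_b = AUTHORITY.get(name_a), AUTHORITY.get(name_b)
    return id_a is not None and id_a == id_b

def expand_query(term):
    """Expand a search term with its accepted name and synonyms, if any."""
    expanded = {term}
    for accepted, synonyms in TAXONOMY.items():
        if term == accepted or term in synonyms:
            expanded.add(accepted)
            expanded.update(synonyms)
    return expanded
```

This is what "retrieve better results" means in practice: "M. Rossi" and "Mario Rossi" collapse to one author, and a query for "puma" also matches records that only mention "cougar" or the scientific name.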
We are dealing with data centers, which have been doing the same operations but in completely different ways, and with software initiatives and software repositories, which have been working in different ways again. What is happening now is that each of the three is starting to move in the direction of the others: there are now data journals, so data centers are trying to do something that is typical of libraries, while libraries are trying to, let's say, also maintain the data, moving in the direction of data centers. But you see that interoperability is also complex because we have to change something that is historical: the three different types of organization should actually start working together and exchanging experiences. So, this is actually my last slide.