 to add something to what Alessio said. In the beginning, when we were thinking of the research community dashboard, the main question was what we could accept as a community. What is a community, basically? That's the basic question. And in the beginning we said, OK, we're not going to think about it; let's fix something that allows us to group all these results of science in different ways. We came up with possible ideas on how, by a plurality of means, to include objects coming from this information space to build a community. But then, as we kept thinking, we realized that there was a slight difference between the communities that we were contacting. The immediate distinction that we felt was there is actually the intent: what these communities are using these tools for. So there is a distinction, and this is the only one we found; maybe you can find another one. On one side there is a group of scientists that is a group because of its research, so a research group: I'm here, I want to know everything that can help me perform my research. On the other side there is a group of scientists or an organization, like a research infrastructure, for example, that wants to collect information about all the objects that have been produced thanks to the research infrastructure. This is more like a research impact analysis. So you don't want to have in this research community objects that were not somehow produced with, or involved or engaged with, the services of the research infrastructure or with what the scientists are doing. This is a big difference.

And we started with EGI. EGI was actually our first use case, and it fell exactly in this second case. They wanted to collect all this information to show to their funders and to their scientists that their existence is worth something: look at how many publications, how many datasets, how much software there is thanks to EGI. So this is a completely different thing.
And in fact, in support of EGI, we started defining customizable mining tools. We were looking into the publications, the full texts of the accessible publications that we have, to find evidence of the fact that this science was performed thanks to EGI services, or virtual organizations, or disciplines and so on. And this took us to a more general point: we can do the same thing for all these kinds of research infrastructures. So today, if you look at the beta version, you can see that some are defined as research communities and others as research initiatives. There is a slight distinction between them. Among the research initiatives, so the research-impact-related communities, you will find, for example, RDA. RDA is an initiative that has been funded by several projects, but still exists. So it is an initiative whose related publications can be collected across the years. And even in this case, you have a clear and neat distinction between the research community and the research initiative. There is no such thing as research of interest to RDA; it's too broad. The right question is: what has been produced thanks to RDA? So what is acknowledged in the articles or in the dataset metadata. So you will find a distinction which is actually crucial, and maybe there will be other, more fine-grained distinctions in the future, but this one proved to be very important for us.

So I was asked to comment on the challenges that we found. There were several challenges that had to do with what to show, how to search, what to provide, how to make it useful, and so on. These concern the user-interface interaction, and this may seem obvious, but it is not, because those who actually developed it and came up with the solution found it very hard, and I think they did an excellent job. But then there were some crucial key aspects.
One of them, in the context of OpenAIRE-Connect, which is the project in which we developed this, was to actually identify what are the entities of interest in the scholarly communication domain. We had literature, we had datasets, and that was the previous stage. But if you look at datasets, a dataset could be so many things, and publications could be so many things. So we had to understand what, across all of this, would be of interest to communities as an entity. The first intuition was of course software; software is everywhere. Software is something that all researchers, regardless of the community, say they need. So we wanted it to be different from data, something that is, as a concept, different. Data is something you process, software is something you execute. So conceptually we wanted to have this clear distinction. Those preparing datasets have a different intent with respect to those developing and sharing software. When you share software, you often share some logic and the way you want it to be performed. When you share data, there is also some scientific thinking in what you are sharing, so observational or secondary data or whatever, but it is something you want to be processed and to serve as evidence. So this was the first distinction.

Then, if you look at the publications coming from institutions, again they had a variety of things. They range from the general concept of publication and, if you also look at the standard vocabularies used out there, they go from images and multimedia, to software, to datasets, to whatever you can think of, from slides, to all the literature typologies, and then also to things like research objects. If you look at every repository, it's crazy what you find. So we wanted to have a very clear distinction. So we came up with literature, which is any product that is intended for reading.
So the intent is to narrate something: I want to tell you a story with a slide or whatever. So we came up with literature, data and software, the rest being what we called, and it's not a very nice name, but we couldn't come up with anything better, so please tell us if you can, other research products. Other research products is like a storage. You put there everything that does not fit the other three, in categories that are very community specific. This was our choice. So each community can actually select the specific vocabulary of other products that they manage every day. This is important because we want each community to recognize itself in some way in something specific. The same holds for software. Software, again, is something that can be classified in different ways, with a jargon that is that of the community. So it was really hard to come to a conclusion. The solution that we adopted is very generic, and it allows for custom solutions for the individual communities. Of course, from other products you may, one day, identify another critical mass of objects that are common to different communities and therefore take it out. So we're exploring, for example, virtual appliances, so virtual machines, which could be, again, another typology or another entity of scientific products that deserves this first-class citizen status.

The second issue was how to embed mining tools into the dashboard. This wasn't part of the demo, but it's there. Before, our developers and data scientists had to contact every single community to come up with specific solutions for mining the publications, to identify those publications that were part of the community. And this was some work. So what we decided to do is to come up with user interfaces that allow them to do that.
So basically they abstract high-level concepts in text mining, and allow the end user, with very minimal knowledge, to customize these algorithms and run them to test how successful they are. They can also upload their own corpora if they want to run these tests. Once they're happy, they send the solution to the mining team, who validate and check it. This actually was a lot of effort, mostly in understanding which is the right balance between technicalities and high-level concepts.

Then the identification of community, I mentioned it before: propagation of community tags. That's another interesting thing. You have a graph in which these objects are connected. So there are several things you can do to propagate this notion of community. If my object belongs to a community, then possibly all the objects related to it belong to the community. To which extent? Can this be planned by the community administrator? When is this true? Should it be just one step, or two steps? Can you propagate the level of trust of the conclusion that we are drawing, so that users can visualize only entities with a certain level of trust, and so on? We solved this problem by taking unilateral decisions on what we are propagating and adopting very conservative solutions. For example, if you have a paper that belongs to the community and is supplemented by a dataset, then the dataset belongs to the community. That is something that we do. But we don't go beyond that. And this is still under discussion, because one day what we hope to have in the dashboard is a list of possible propagation tasks that the administrator can select, to choose how and to which extent they want to color the graph with the color of their community. This is for the future.
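The conservative rule just described, where a dataset that supplements a community paper inherits the community but nothing travels further, can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual implementation: the function name `propagate_one_hop`, the relation label `isSupplementedBy`, the triple layout of the edges and all identifiers in the example are hypothetical.

```python
def propagate_one_hop(tags, edges, relation="isSupplementedBy"):
    """Return a new tag map where the target of each `relation` edge
    inherits the community tags of its source.

    tags  : dict mapping object id -> set of community ids (hypothetical layout)
    edges : iterable of (source id, relation label, target id) triples
    """
    # Start from a copy so that the caller's tag map is left untouched.
    result = {obj: set(communities) for obj, communities in tags.items()}
    for source, rel, target in edges:
        # Read only the ORIGINAL tags: a tag acquired during this pass is
        # never propagated again, so tags travel exactly one step.
        if rel == relation and source in tags:
            result.setdefault(target, set()).update(tags[source])
    return result


# Hypothetical example graph: one tagged paper and a chain of related objects.
tags = {"paper:1": {"community-x"}}
edges = [
    ("paper:1", "isSupplementedBy", "dataset:1"),    # propagated
    ("dataset:1", "isSupplementedBy", "dataset:2"),  # second step: not propagated
    ("dataset:1", "isRelatedTo", "software:1"),      # other relation: not propagated
]
tagged = propagate_one_hop(tags, edges)
print(sorted(tagged))
```

Keeping the rule to one relation and one step is what makes it conservative; a configurable version would let the administrator choose the relation labels and the number of steps.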
For the future, what we are willing to introduce is some logic that allows us not only to specify a data source among the ones that we have, so Alessia has shown before that you can select a content provider and say, everything in this content provider belongs to my community, so when you collect from there, please add it, but also to do this by criteria. So not all content from this data source will belong to my community, but only the records matching certain conditions. This is because in most cases the data source can include several things that do not necessarily belong to the community. Even if it's 90% of the objects, saying that everything belongs to my community is wrong. So we'd like to have ways of doing this. We're thinking about it; we have ideas, but of course we're willing to discuss them with you.

Metadata quality. This is one of the main issues the whole of scholarly communication is facing, in fact, so it's not only our problem. And since users can claim, we would like to have experts validate their claims. So these objects belong to the community, but we'd like to have a way to say that three other experts said this is fine, so a certain level of confidence must be recorded to make sense out of this. And there is also the fixing of the metadata that we have, so it's not just about saying these objects belong to my community, but also about fixing the records, with the side effect, in the content provider dashboard that we're not showing today, that we can send the fixes to the original sources, because that's what the content providers would like. So this is a good scenario, but of course we need to think and rethink to make these things work.

And finally, community identity. This is another issue that is coming up. I don't know if you were present at the previous session, but it's coming up over and over again. Basically all infrastructures are facing this issue, which is that of identifying communities.
They need to come up with a notion of community to classify whatever they offer in terms of these communities. So if you look at EOSC, if you look at EUDAT, if you look at EGI, they all have their own internal notion of communities. This is very important because this is what we do: we would like to offer services to these communities. What we cannot do today is harmonize this logic. So from a catalogue, for example, I'd like to know for each service which are the communities involved, and so which may be interested in that service, and I'd like to have my community identification aligned with that. In the context of EOSC this is extremely important, so that we can actually map into each other straightforwardly. This is something that we are also extremely interested in.