[unintelligible] to be here to give this presentation. I will present a vision that we developed over the last few months together with the OpenAIRE project on how an augmented OpenAIRE service could be suitable for EGI, for an infrastructure, to solve the problem of opening up the scientific output of the infrastructure. [unintelligible] In this presentation [unintelligible], and then we will see what the idea is, in terms of collaboration with OpenAIRE, to solve the problem of tracking the scientific output [unintelligible] in EGI. [unintelligible] EGI is for digital research [unintelligible]. [unintelligible; a use case from high-energy physics, with a data volume on the order of 15 petabytes] and they need to process this data. For instance, in the last year they consumed about 1 billion CPU hours [unintelligible]. In the middle you can see the other use case, which is about the diagnosis of Alzheimer's disease. NMR technologies are used today, and hospitals are generating lots of images [unintelligible].
[unintelligible; a further use case, apparently in medicine and biology] If you want to know more about this, you can follow the link at the bottom. So we have seen what researchers can do with EGI; we saw some examples. Now let's see who is providing EGI, who is making this infrastructure. EGI is a federation, a European federation of resources that are funded at the national level by each country. Through the technology we provide, we offer the framework to federate all of them, to enable researchers to use them all together, to break up the islands of individual research centres and turn them into a virtual computing capability. At the moment we have more than 35 countries. Each country is establishing a national coordinating body, which is responsible for the national computing resources in the country, and there is a European coordinating body called EGI.eu, which is the foundation I work for. It is located in Amsterdam, with around 26 people at the moment, who perform the coordination of this federation. I will not go into much detail on the technical part; I will just give you an idea of what EGI provides. On the ground we have the infrastructure, which is the IT assets for research that are funded by individual countries, and what we provide is a technology to make all these individual resources work as a single federation. On top of this technology we provide a number of services, for instance authentication, so that a user with a single credential can access resources everywhere without applying for different authentication IDs. We also provide support services, and we enable user communities to deploy their own services on this infrastructure.
Above this stack of services there are the users, who can run their own digital research, and everything is coordinated by EGI.eu. Looking at this in perspective, technology evolves. The concept of the grid was invented during the 90s and was implemented in Europe in the last decade, but now things change. Virtualization is now a commodity, and most of you hear about cloud computing on a day-to-day basis. What we are doing now is this: we have a roadmap to also offer cloud-like services in the EGI infrastructure. The idea is to do for the cloud what we did 15 years ago for the grid, when we federated individual data centres. We want to inject cloud technologies into the data centres and make a cloud federation, as if you could use Amazon, Google and other cloud providers all together, because our challenge is to work with many independent, autonomous providers. Why do we want to do this? Because virtualization provides more flexibility for the users. At the moment there are user communities who are not interested in EGI because the kind of services that have been developed for the physicists and for the biologists are not good for the humanities, for instance. They have their own services, which they would need to deploy on the infrastructure in a flexible way. Through virtualization we can enable this: we can, in a way, give the key to the virtual infrastructure to the users, and they can deploy their own services. Looking a bit at the history, the idea of this European grid started in 2001 with a pilot project; OpenAIRE is now a similar example, going from pilot to service. Then we had the second phase, from 2004 to 2010, where we moved from a prototype to a production service.
And now, from a project-based organization, through the EGI-InSPIRE project, we are moving to a network of permanent organizations in order to provide a sustainable service for the future. What is also interesting to see in this diagram, if you think about OpenAIRE and about scientific impact, is that many projects contributed to building the current EGI. But if today I would like to know how many scientific publications are linked to EGI, we don't know, because usually this kind of tracking is done by programme or by project, and there are many side projects that contributed to EGI, as well as national funding that you cannot see here at the moment. So it is important to see that research infrastructures have a long life, and they may even change name along the way: EGI was called EDG before, then EGEE, and now it is EGI. There are also different streams of funding, and each funding agency would like to know the scientific impact of the money it invested in the infrastructure. Just a few figures on the current project: EGI-InSPIRE is an FP7 project running from 2010 to 2014, which is helping us to transition to a network of permanent organizations. There is only a 25 million contribution from the Commission, but the whole effort, including the member states, is 330 million, because the infrastructure, the servers, computers and storage, is not paid for by the European Commission but by the member states. We have at the moment 50 partners in this project. In these slides I would like to tell you a bit about how we see EGI contributing to the Europe 2020 strategy. There are two main flagships where we see EGI having an impact, where we are making a contribution. One is the Digital Agenda for Europe, and the other is the Innovation Union.
In the Digital Agenda for Europe, we see EGI first as supporting a single digital market for computing resources, because by enabling this federation we offer research communities, spread across Europe but also outside, a single, uniform way to access them. They don't need to build their own adapters or try to combine resources, because we already provide this abstraction. It is also a way to stimulate competitiveness and interoperability, because EGI works on open standards: for all the technology we develop, we mandate the adoption of standards, and having open standards means that new actors can enter with their own solutions if they are better at implementing a software component. It improves the efficiency of spending, because EGI is a framework for implementing distributed computing infrastructure for research that pulls together funding from member states and from the European Commission in one single initiative. So it avoids islands and duplication of spending. It also addresses grand challenges, because given that we pull all the resources together, we offer scientists a more capable and higher-capacity infrastructure for their digital research. In terms of the Innovation Union, we see EGI also as a tool to implement the digital European Research Area. It promotes skills development, because in the last decade a generation of IT experts has been trained on distributed technologies, and researchers as well. EGI also has connections: we have many agreements with partners outside Europe, so we cover almost all the globe in connecting the European infrastructure to infrastructures in the US, Latin America or Asia, for instance. After this first part on EGI, we now approach the context that is more connected to this conference, which is how we track the scientific output of all this infrastructure.
Here we have some numbers about EGI. At the moment there are 351 different research data centres that provide computing capabilities and are federated through the NGIs. We have 470,000 logical CPUs, 143 petabytes of disk and 138 petabytes of tape. One thing we need to consider here, in order to understand the challenge for EGI in tracking the scientific output, is that this is a federation, so each centre retains its own autonomy. There is no single, central control through which we could say: to this discipline, today, we allocate this computing power and this amount of storage. Each resource centre retains its negotiation capability with the research community. Say there is an Italian structural biology group that wants a certain amount of computing capability and storage to do its digital research: they go to the national grid initiative and make the agreement with them. If they have a connection with another group in the Netherlands or in Spain, those researchers will do the same in their own country, and through EGI they can use all these resources together. So this kind of autonomy is, in a way, a barrier for us to understand who is accessing what on the infrastructure and what kind of research they are doing. It is difficult to have a single enforcement point where we can say to researchers: okay, if you use the infrastructure, then in return for all the services provided you need to report the resulting scientific publications back to me. And if this is done, it is done at the national level, so it is difficult to pull everything together. At the moment we have more than 20,000 users spread over 233 different groups, what we call virtual organizations, which are groupings of researchers spread across Europe and outside who work together toward the same goal on the same topic.
So today, with EGI, researchers can easily share ICT infrastructure, data, applications and training material, but we still need to improve the tracking of the scientific output. What is the current approach? How do we track? Because we have also been funded by European projects, we need to report to the Commission on the publications that have been made possible thanks to EGI. At the moment we ask our partners, the NGIs, that is, the national consortia, to periodically provide the publications that the national researchers produce using the infrastructure. Since these organizations are new, they don't have mature processes in place, so in the end we have partial results, and it is difficult for us to collect everything that has been published thanks to EGI. What would be ideal for us? Since we want to be a sustainable infrastructure for the long term, we need to establish policies, processes and tools that help us to track all the scientific output made possible thanks to EGI, and also to open it up to the researchers. In order to tackle this problem, in June we created a dedicated task force to analyze it and see how we could solve it. I was leading this task force, and we have a recommendation document that you can see linked there. Because of this task force I am here today, and I will show you briefly what we did. We tried to understand why it is difficult today to track all this scientific output from EGI. First, we came up with a lack of awareness. Researchers may get access to EGI, they may get a digital certificate [source unclear], and they may use EGI through a web portal that has been developed for their specific community, so EGI is behind that portal and they may not be aware they are using EGI. That is the first problem: if they don't know, they will not be able to cite us.
The second problem is that even if they know, they may not appreciate the importance of citing EGI. They usually get the infrastructure for free, let's say, and they may not understand the importance of citing EGI as the enabling tool that let them achieve their results, and of reporting back to us so that we can demonstrate the value of the infrastructure. The third problem is that there is a lack of efficient or effective services for us to track publications, even when researchers know about us. Even if they cite us, it is difficult to track the publications from 21,000 researchers with whom we don't have day-to-day contact, because they work autonomously in the federation. We identified these three main barriers, and we also defined some actions to address them. In terms of awareness, we are planning to create a kind of "cite EGI" campaign. In EGI we are defining a champion programme to promote EGI toward researchers. So what we will do is run a campaign and ask all the developers of user interfaces, what we call science gateways, which are the web interfaces that researchers use to access the infrastructure, to make EGI visible in the interface and link to this web page. Researchers will then click and maybe understand what is behind the web interface they are using. The second action is to stimulate engagement. How can we do this? First, they need to share our view of the importance of EGI, so we need to make them aware of the added value of EGI. Second, we need a bit of a stick-and-carrot technique here. Each user, when they get access to the infrastructure, needs to accept an acceptable use policy. There are a number of rules they need to comply with. What we want to do is add one rule: if you make a publication that relies on work done on EGI services, then you have to cite EGI in the publication. That is the stick. As for the carrot, we run two main conferences a year, where we attract around 500 people.
The idea is to award a prize for, or run a review of, publications. We still need to identify how to select the authors of publications that cite EGI. So these are some ideas on how we can stimulate engagement. If we succeed in actions one and two, we will succeed in having our name in the publications. The third action is how we collect all these publications. We need to establish appropriate services. We want to minimize human intervention, because researchers usually already deposit the publication at their university and do not want to be bothered to deposit it twice. So it should be easy to use. This service should also be infrastructure-specific, meaning that along with the publications it needs to collect all the extra metadata that we feel is useful for an infrastructure. Of these three actions, we now focus on the third, because it is the one that connects to OpenAIRE. We went further in the analysis phase on what we want from this service. We want a service that enables researchers to simply link or claim their publication. In general, we are not interested in the paper itself. We just want to know that the paper exists and that it was possible thanks to EGI. So it should be very minimal for the researchers: if the publication does not cite EGI, so that it is not possible to discover it automatically, researchers should be able to register it just by entering the DOI, the digital object identifier. Eventually we also want to offer the possibility to deposit the file, but as I said, that is not a priority for us, because there is already a main repository somewhere from which we can pull the publication if we need it. And then there is a very important feature: we want to reduce human intervention. Once researchers cite EGI in a publication, we want tools that enable us to discover these publications automatically. And we want some extra information.
We want to know that the publication was possible thanks to EGI. We also want to know the virtual organization, the research group to which this publication is linked, not just the author. And also the EGI scientific discipline; we have our own categorization. We would also like to support federated authentication. We don't want researchers to have an extra authentication ID and password. They already have an ID that they use to access EGI, and we would like them to use the same one. That is what is nowadays called federated authentication. Once a publication is mentioned or claimed in this service, we would like to receive a notification and to verify it from the EGI perspective. We would also like to have a number of statistics that help us to understand the publications by scientific discipline, by group, who is performing well in terms of research outcome, and so on. Once we had this list, we started to think about how to implement this service. Should we have our own repository, or should we rely on something external? We were aware of the OpenAIRE initiative, so I contacted the project coordinators. We had some dialogue on this. I have to say I had only a high-level view of OpenAIRE and was not sure it was a suitable solution for us. And I thought: yes, this already provides a lot of what we want to do, but it is not yet complete. So the idea was: instead of reinventing the wheel, why don't we work together to extend OpenAIRE, and then we can use it as a service, without installing any software on our side? We don't want to take care of managing an extra service, because this is not our core business. So we came up with this idea: why don't we deepen this analysis, understand what is missing in OpenAIRE to satisfy our needs, and then OpenAIRE would be what EGI proposes as the repository where we track all the scientific output of EGI.
Coming back to the list, as we can see, half of these requirements are already met. The other half need some change, some development in OpenAIRE, but the initial estimate is that they are very easy to implement; this can be done in the order of man-months of work. So early next year we can already have a prototype. So what is the idea? First, once EGI researchers cite EGI in their own publications, these will appear automatically. Second, if they have publications in which there is no explicit mention of EGI, we would like them to go to, say, a customized OpenAIRE portal, which they may simply reach from the EGI website. We can brand an OpenAIRE EGI portal. Once they log in, it is already known that they are part of EGI, so they don't need to state that. We already know the virtual organization they belong to, so they just need to say: I did this publication. They enter the DOI and click: I did this through this virtual organization. The virtual organization is already mapped to a discipline, so it should be a very minimal effort. The idea is to give them this custom service that they can find on the EGI website, while in the back it would be OpenAIRE. What are the benefits of this? If we achieve this, we will have a single access point for all the scientific publications of EGI, and since the data will then go into OpenAIRE, there will be a single access point for all of Europe. This will accelerate discovery of the publications across the EGI users, as well as access and reuse. It will minimize the burden on researchers, because through these modifications to OpenAIRE we can keep it to the minimum. And it will help us to demonstrate the scientific impact of the infrastructure and also of the virtual research collaborations.
We can also have a number of very nice statistics that we can use either to plan the evolution of the services or to demonstrate to the funding agencies what we were able to do, or what researchers were able to do, thanks to EGI. And we expect EGI not to be the only one with this problem. If you know the S3 project [name unclear], there will also be projects that last for decades, and they will have the same problem. So once OpenAIRE does this pilot with us, they can offer this kind of service to others. This would be, in a way, a win-win situation, because we get our problem solved through OpenAIRE, and OpenAIRE can open up this solution to more infrastructures. So, concluding: EGI is an infrastructure that pulls together ICT resources for research, enables digital research in Europe and supports virtual research collaborations. I explained the current barriers for us in tracking the scientific output of EGI. I also explained the vision that we have developed over the last months about how an augmented OpenAIRE can remove these barriers for us, and can also be an enabler for us to open up the scientific output of EGI, enabling us to demonstrate the impact and enabling researchers to use and reuse the scientific output. Thank you very much. Do you have any questions?
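As a minimal sketch of the claim step described in the talk (a logged-in researcher enters a DOI, picks a virtual organization, and the discipline follows from that organization), the record might look like the Python below. Everything here is an assumption made for illustration: the names `PublicationClaim`, `claim_publication`, `claims_per_discipline` and the `VO_DISCIPLINE` table are hypothetical, the discipline labels are invented, and nothing reflects the actual OpenAIRE or EGI interfaces. The DOI check is a loose shape test only.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical mapping from virtual organization to scientific discipline.
# The talk says each VO is already tied to a discipline; the entries and
# labels below are illustrative only.
VO_DISCIPLINE = {
    "biomed": "Life Sciences",
    "atlas": "High Energy Physics",
}

# Loose shape check for a DOI ("10.<registrant>/<suffix>"); not a full validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass(frozen=True)
class PublicationClaim:
    """Minimal record: the paper exists and was possible thanks to the infrastructure."""
    doi: str
    user_id: str
    virtual_organization: str
    discipline: str
    infrastructure: str = "EGI"


def claim_publication(doi: str, user_id: str, vo: str) -> PublicationClaim:
    """Build a claim from a DOI and a VO; the discipline is derived, not typed in."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI-shaped string: {doi!r}")
    if vo not in VO_DISCIPLINE:
        raise ValueError(f"unknown virtual organization: {vo!r}")
    return PublicationClaim(doi, user_id, vo, VO_DISCIPLINE[vo])


def claims_per_discipline(claims):
    """Count claims by discipline, the kind of statistic mentioned in the talk."""
    return Counter(claim.discipline for claim in claims)
```

A claim would then be created with something like `claim_publication("10.1000/xyz123", "user-1", "biomed")`, where the DOI and user ID are placeholders. Deriving the discipline from the virtual organization keeps the researcher's input to a single identifier, which matches the stated goal of minimal effort.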