Just to introduce, let me make the link between the previous presentations and this one very briefly. Catherine has presented the six working groups that are in operation. One of them is dedicated to architecture, the EOSC Architecture Working Group, which I chair. Within that working group, we have organized two task forces, one on AAI and one on persistent identifiers. So, during this session, you will hear Christos, and I take the opportunity to thank Christos Kanellopoulos for making the presentation today. As he said in the intro slide, this presentation is on behalf of the AAI task force, which I just mentioned. Similarly, this afternoon, there will be a presentation on persistent identifiers in the same spirit, with the same mechanism. So, with that as an intro, I will give the floor to Christos. Christos. Thank you very much. Hello, everyone. So, as Daniel said, this is a presentation about the work we have been doing in the EOSC AAI task force. The task force was created in October last year, and the mandate we were given was to deliver a consistent architecture for authentication, authorization and access control for the European Open Science Cloud. The task force has 24 members from various disciplines: we have members from research communities, from research infrastructures, from national initiatives, but also from the e-infrastructures. So, we have a very well-balanced group of people contributing to the work of this group. The group is chaired by Klaas Wierenga from GÉANT and Leif Johansson from SUNET. When we put together the task force charter, we set out four goals. The first one was to take a step back. A lot of things have happened in the AAI context with previous initiatives like Authentication and Authorisation for Research Collaboration, the AARC initiatives, and within the context of EOSC-hub. So, we said, okay, we are now at the point where we have a lot of input.
Let's take a step back to revisit where we are, revisit what the first principles are, what is important and what is less important, and also what our requirements are. So, the first goal is exactly that: first principles and requirements for the EOSC AAI. The second is to provide a baseline architecture, the starting point for the EOSC AAI. We do expect that the EOSC AAI will continuously evolve, as the needs and requirements for accessing and sharing resources through EOSC will continue to grow. So the goal of this task force was, first, to provide an initial baseline that we call the EOSC AAI Architecture 2019, which will be our starting point, and then, towards the end of the year, to produce an iteration over that architecture, which we call the EOSC AAI Architecture 2020. I will say a few more things about this in a few moments. In addition, we set two other goals for this task force. One was to provide a catalog of good examples: what is already out there, how it can be used, and so on. The other, also very important, is the rules of participation. This is linked to the presentation that was made a few minutes ago by Juan, on the results of the Rules of Participation Working Group. This is basically where we want to provide our feedback, from the point of view of AAI, on what the rules of participation are for resource providers, for users, for research communities, but also for identity providers, all these components in the EOSC AAI ecosystem that have to work together. So let's start with the first principles. Users, whether in the form of researchers, graduate students, teachers, or even administrators, are the reason why we build IT infrastructures of any kind. The authentication and authorization process is a necessary part of the security requirements for the infrastructure of any AAI-protected service. So AAI is important, but it is important in order to drive the user experience.
So the first and foremost principle we set is that user experience should be the first measure of success. The second principle was that all trust flows from the communities, and I will expand a bit more on this in a few moments. And the third principle, which is also very important, is that the EOSC AAI is going to be a federated infrastructure, a federation of services. It is inherently a distributed system, and there is no center in a distributed system. So this applies also to the EOSC AAI: there is no center within the context or concept of the EOSC AAI itself. There should be no center. So let's think a bit more deeply about the user-experience aspects. As I mentioned, the authentication and authorization process is really a necessary part of the security and risk-management infrastructure for any AAI-protected service, in this case for the EOSC AAI. Security, while important, must always be deployed in conjunction with proper risk management, to avoid over-engineering security controls at the expense of usability. Again, this links back to the user-centric approach: we build technology for users, not users for technology. Since the process of authenticating to, and obtaining the rights to use, an AAI-protected service has no intrinsic value in itself, it must be made as unobtrusive as humanly possible. In the EOSC AAI, what we want to adopt is a scientific approach to usability: to quantify, measure and evaluate services from a usability perspective. As I said, access to the EOSC AAI should be as unobtrusive as humanly possible. Regarding the second principle, all trust flows from the communities. It is very important here to highlight that trust does not derive from technology itself. Technology can enable trust, but trust derives and emerges from the communities and the research collaborations that will be using it. Communities may act as an interface between individual users and the resources.
The EOSC AAI should build on the trust that exists within well-managed scientific collaborations, and also provide the necessary structures to cater for all those cases that are not covered by a given specific scientific discipline; for example, how the long tail of science, the individual users, will fit in. But in the context of trust, the trust anchors will be the communities themselves. So the EOSC AAI will be a trust fabric within which many such scientific communities, collaborations and infrastructures can coexist and interoperate. A very important design consideration is that what works today for the users should, and will, work tomorrow; our goal is only to make it better. The third principle is that there is no center in a distributed system. There has been a very common misconception that there is going to be a single EOSC AAI, a singular instance of what would be the EOSC AAI architecture. This is very far from the truth. The way that we see the EOSC AAI is as a set of principles and governance structures for how the architecture is applied, evolves and grows over time. The objective of the EOSC AAI is not to create a central structure that will control users or that will build walled gardens, but rather to provide an open and fair playing field for service delivery to the scientific community. So starting with these three principles, we will use them, and have already started using them, as the driving force for defining the EOSC AAI. And I am saying "defining the EOSC AAI", but actually one of the very first things we discussed, even before we started working in the task force, was that by no means should we reinvent the wheel. I mentioned AARC in my introduction.
AARC, which stands for Authentication and Authorisation for Research Collaboration, has been a series of projects and initiatives in our community, active since 2015, with roots in other activities that I think go back even to 2011. It created an open dialogue between technologists, infrastructure providers, communities and research infrastructures, in order to understand the requirements for an AAI that will enable seamless access to resources. So within the context of the EOSC AAI, we want to adopt the AARC Blueprint Architecture with all of its extensions, and the ongoing governance that is provided by the AEGIS group, and use this as the starting point for the work here. In a bit more detail, the AARC BPA builds on existing best practices in the scientific community, and that is something that we want to adopt. It provides clear guidance on how campus identities integrate with science, and it comes with the beginnings of a governance structure in the AEGIS group, which we can talk a bit more about later. And, also very important, it has international buy-in. The AARC Blueprint Architecture is not something confined within European borders; it already has adoption outside of Europe, in the US, in Canada, in Latin America, in the Asia-Pacific region. So the starting point for the EOSC AAI architecture is the AARC Blueprint Architecture version 2019 that was published last October. The AARC Blueprint Architecture basically defines an architecture of five layers, where we have the identities at the top layer, and there you can find all the possible identity sources.
Campuses, universities and research institutions are one type of such identity source, coming in through eduGAIN, but we also envision connections with other identity sources, such as social IDs, other identity systems like ORCID, governmental IDs from eIDAS, which are becoming more and more used nowadays in Europe, but also, potentially, identity providers coming from the commercial sector. At the bottom layer we have the services: services providing resources to the EOSC users and communities, and they need to be able to consume identities in a homogeneous manner. On the left side you see what we call the community attribute services. This is basically where we define communities and the community structures. So individual users, able to authenticate at their home organizations or using social IDs, will be able to join communities, be assigned access rights, be assigned roles within these communities, and then they should be able to consume resources provided to those communities. But of course access has to be authorized. Not all services or resources are provided, or can be provided, to everyone. Perhaps some communities have specific rights to given data sets, so the authorization aspect is very important. At the center of this architecture we have what we call the access protocol translation layer. Basically this is the glue that brings everything together. If I were to describe the architecture in one term, I would say this is a proxy architecture. Effectively we have a layer which acts as an integration point between all the other components: between the identity sources, the resource and service providers, the communities that are going to consume them, and the technologies and authorization frameworks that make it possible to have seamless access to these services.
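To make the proxy idea concrete: the access protocol translation layer can be pictured as a component that consumes identity assertions in whatever form each source provides them, and hands the services one homogeneous user representation. Below is a minimal sketch in Python; the attribute names and the mapping are illustrative assumptions on the editor's part, not part of the actual architecture documents:

```python
# Sketch of the "access protocol translation" idea: the proxy accepts
# identity assertions from different sources (e.g. SAML attributes from
# a campus IdP via eduGAIN, OIDC claims from a social or governmental
# IdP) and presents one homogeneous user representation to the
# connected services. All field names here are illustrative assumptions.

def normalize(source: str, raw: dict) -> dict:
    """Map source-specific attributes onto one common schema."""
    if source == "saml":          # e.g. a campus IdP
        return {
            "id": raw["eduPersonUniqueId"],
            "name": raw["displayName"],
            "email": raw["mail"],
        }
    if source == "oidc":          # e.g. a social or e-government IdP
        return {
            "id": raw["sub"],
            "name": raw["name"],
            "email": raw["email"],
        }
    raise ValueError(f"unknown identity source: {source}")


# A service behind the proxy only ever sees the common schema,
# regardless of where the user authenticated:
user = normalize("saml", {
    "eduPersonUniqueId": "jdoe@example.edu",
    "displayName": "Jane Doe",
    "mail": "jdoe@example.edu",
})
```

The point of the sketch is only the shape of the design: services integrate once, against the common schema, and the proxy absorbs the per-source differences.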
On top of this basic architecture we have already been working on deployment implementations, and you see in the second and third graphs on your screen how we see things actually working within the context of the EOSC AAI. Basically we will have communities operating their own community AAI, or having it operated by third parties, which allows them to consume identities from multiple sources. Through the community AAI they will be able to consume their own services; they will be able to consume generic services available to multiple communities; but they will also be able to consume services provided by generic infrastructure providers, through a structure that we call the infrastructure proxy. So this is the general model. It doesn't fundamentally change what we have been doing in the past; it actually builds upon it. This is also the work that we are currently doing right now in the context of EOSC-hub, and we will hear more about this during the EOSC-hub Week. So, working on the architecture and using the AARC Blueprint Architecture as the starting point, perhaps the most important thing that we are doing at this level is identifying the gaps, what we are still missing, and the challenges that we need to address in the next months. We have already identified three very broad areas of challenges. These are, first of all, communities and attributes, and how they are linked to authorization; secondly, how multi-infrastructure workflows can work; and thirdly, scalability. Let me expand a bit on this. Regarding the communities, we have received a number of comments that communities are not the only source of attributes for access control. Indeed, whether a user should be able to access a resource or data set might depend on factors coming from different authorization endpoints, with different entities having a say in these decisions.
So this is something that we need to take into account: it is not only the roles and the access rights that a user might have within a given community. There might also be other factors, like, for example, a given grant that a user might have received from a funding body, that will determine whether they can access certain resources or the underlying data sets. Also, attributes may not only be user-specific; they might also be specific to other contexts and, therefore, not necessarily managed by the community itself. And the last point is that we have been lacking a community attribute profile. We were talking about attributes in an abstract manner, attributes that would enable authorization and that would enable a homogeneous, consistent representation of the users towards the services. But what that community attribute profile would look like is something that we are still missing. Having said this, we have already started working on creating just such an attribute profile for EOSC. And this is something that we have actually taken up as work in the international community in AARC, to create an international profile that we can then take and adapt for the needs of EOSC itself. The second challenging topic is a use case that is rather complex, but also very common. One use case that we have been facing again and again, and that we cannot address with the current architecture, is the case where we have services, compute services, for example, provided by one infrastructure, and then data services provided by another infrastructure. So a user would like to use something like a Jupyter notebook service provided by infrastructure A, do their calculations, experiment, then do some big runs, and when the experiment finishes, be able to have the data stored at the data store service provided by another infrastructure. Today, the current trust model that we have cannot cater for this.
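On the missing community attribute profile mentioned above: one concrete direction already available from AARC is the entitlement syntax of the AARC-G002 guideline, which encodes community, group hierarchy and role in a single URN-formatted attribute value. Here is a small parser sketch, assuming G002-style values; the concrete namespace, group and authority names below are made up for illustration:

```python
# Sketch: parsing an AARC-G002-style entitlement value, the kind of
# community attribute a service could use for authorisation decisions.
# Format: urn:mace:<namespace>:group:<group>[:<subgroup>...][:role=<role>]#<authority>
# The concrete values in the example below are made up.

def parse_entitlement(value: str) -> dict:
    """Split a G002-style entitlement into namespace, groups, role, authority."""
    body, _, authority = value.partition("#")
    parts = body.split(":")
    if parts[:2] != ["urn", "mace"] or "group" not in parts:
        raise ValueError(f"not a group entitlement: {value}")
    namespace = parts[2]
    after_group = parts[parts.index("group") + 1:]
    role = None
    if after_group and after_group[-1].startswith("role="):
        role = after_group[-1][len("role="):]
        after_group = after_group[:-1]
    return {
        "namespace": namespace,
        "groups": after_group,      # group and nested sub-groups, in order
        "role": role,               # None if no role component present
        "authority": authority or None,
    }


ent = parse_entitlement(
    "urn:mace:example.org:group:vo.example:analysis:role=member#aai.example.org"
)
```

A service receiving such a value could then authorize, for example, any user whose entitlement lists a given group with role `member`; whether EOSC adopts exactly this syntax is, per the talk, still open.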
It expects that basically all services used in the context of a given flow will be behind the same proxy. We identified this problem quite some time ago and we are already thinking about how it can be solved; we actually expect that within the next months we will have a first implementation that we can also use in the context of the EOSC AAI to deliver this functionality. Lastly, the third challenge, or group of challenges, is scalability. When the AARC Blueprint Architecture 2019 was introduced and brought the logical separation between the community AAIs and the infrastructure proxies, the assumption was that there would be a big number of communities, and that the number of infrastructure proxies providing services could be rather low. This assumption is good as a starting point, but we are already seeing that the number of resource providers, and of infrastructure proxies connecting them, will grow significantly. We will have national proxies, we are going to have thematic proxies, research infrastructures will be providing their own services through their own proxies. So this is something that we need to be able to deal with in the architecture, and we need to deal with it in a scalable manner. Today the trust between the various components of the EOSC AAI is more or less established on a manual basis. It does follow a set of interoperability principles that come with the AARC guidelines, but still, the connection between the components is a manual process. This is something that we definitely have to get away from, and we need to provide a scalable way of connecting resource providers and identity providers within the infrastructure. Which brings us also to the following question: what are the rules of participation for all of these entities and components? What are the rules of participation for community AAIs, for infrastructure proxies, and for other AAI services?
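For the multi-infrastructure workflow described above (compute at infrastructure A, storage at infrastructure B, with the two behind different proxies), one candidate mechanism, an assumption on the editor's part rather than a decision reported in the talk, is OAuth 2.0 Token Exchange (RFC 8693): the compute service swaps the token it holds for one accepted by the other infrastructure's proxy. A sketch of the request parameters such a service would send; the endpoint and token values are placeholders:

```python
# Sketch: cross-infrastructure delegation via OAuth 2.0 Token Exchange
# (RFC 8693). The compute service at infrastructure A asks its trusted
# proxy to exchange the user's token for one scoped to infrastructure
# B's proxy. Endpoint and token values here are placeholders.

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"

def build_exchange_request(subject_token: str, audience: str) -> dict:
    """Form parameters for a token-exchange request to a trusted proxy."""
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # Ask for a token usable at the other infrastructure's proxy:
        "audience": audience,
    }

# The Jupyter service at infrastructure A would POST something like this
# to its proxy's token endpoint before writing data to infrastructure B:
params = build_exchange_request(
    subject_token="eyJ...",                          # token from proxy A (placeholder)
    audience="https://proxy.infra-b.example.org",    # hypothetical proxy B
)
```

The hard part, as the talk notes, is not the protocol mechanics but the trust model: proxy B must be willing to accept tokens minted on the strength of proxy A's authentication, which is exactly what the current architecture does not yet cover.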
In this regard we have also already started working on a model where we see the EOSC AAI growing from a set of rules, principles and manual connections to a more federated approach, where each entity will be able to connect once and be made available to all other entities, without having to establish bilateral connections with each entity within the EOSC AAI. We expect to have something in this regard by the end of this year as well. So this brings me to the close of this very short 20-minute introduction; let me also give you the timeline again. In April we finished the work on the EOSC AAI first principles. We have already finished the work on the EOSC AAI Architecture 2019, and it is now going through the process of being published on the EOSC website and being approved through the relevant processes, so we should expect it to be there sometime in June. We have already started gathering requirements and use cases, and we expect to have, around September, a document that describes all the requirements and use cases gathered through this time. All this information will be fed back into the development of the next version of the AARC BPA, which we want to bring closer to the needs of EOSC, and then used to profile the EOSC AAI Architecture 2020, which should come by the end of the year and which will also include the rules of participation, as well as examples of technologies that are already out there and can be used to build AAI components for research infrastructures. And with this I think I am already at my 20-minute time limit. So I think we can open the floor for discussion now; if you want to dive into some particular aspect in more depth, or have general questions, please go ahead. Thanks a lot, Christos. Please try to raise your hand so that we can manage the discussion, but aside from that, the floor is open to the whole group for questions to Christos.
Just one comment from my side, because I cannot see the chat; if there is any question, please relay it to me. But perhaps I can take the opportunity, because I heard in the previous session there was a question about non-academic identity providers, and for sure this is something we are working on, how this could be integrated. Already in the architecture we support the connection of non-academic identity providers. How to bring in identity providers from the commercial world is still being discussed, but it is for sure within our vision of what we want to build as part of the EOSC AAI. Christos, we have two questions, one from Mateusz. Hi, a question regarding the proxy, from the point of view of a service provider: what does that mean? Until now the situation is that basically each service provider has to exchange metadata with every identity provider, more or less, or through the federations, if I understand correctly. Would that mean that, once there is a proxy, every IdP will only have a contract with the proxy, and every service provider a contract with the proxy? And then, who runs the proxy? Very good question, thank you very much. Actually, there is not going to be one proxy; that is one of the first principles I mentioned at the very beginning, that there is no center in a decentralized architecture. But one proxy per community, right?
Yes, we expect that there are going to be multiple proxies, at the national level, at the European level, multiple ones. But we do expect services to have one connection point, so from the point of view of the service and the identity provider, they should connect once and be made available to all the communities and users, in terms of the AAI aspects. So the principle is exactly what you said, but it is not going to be one central service provider doing this, one center. Okay, but so it could be, for example, a national identity federation? Yes, a proxy; they are going to exist at the national level. The work we are doing on scalability will possibly also bring the concept of the EOSC federation itself, where such proxies can connect directly, to automate the trust relationships between all these components. But again, a very important principle is that each service provider will have to connect once, to one entry point, and it will be possible for all users to access it. Service providers don't have to go and connect to multiple entry points in order to serve different communities. Okay, thank you. Okay, Mateusz, if you can lower your hand. Maybe one brief follow-up question: when will this start to happen, in 2021? I mean, you showed the time plan, but this is all blueprints and papers; when will this be in implementation, or in production? Actually, it is not all blueprints and papers. We already have production services running; we already have a set of production proxies connected and serving communities right now. We are not yet at the level of scalability that we want, and this is what we are improving. So the basic structure is already there; within the context of EOSC-hub we are doing the initial integration, and this is where we also see all these limitations in terms of scalability, and we have brought this into the discussion in the EOSC AAI task force to see how we can improve this aspect. So it is not just
paperwork; it is implementation happening that also drives the architecture on the paper side. Thank you. We had a question from Dragana Radulović, but I don't see it anymore; maybe she was disconnected, I don't know. Okay, other questions? Please raise your hand, or ask in the chat. Maybe, Klaas, thanks a lot for following the chat; do you want to summarize for everybody the questions and the answers that you monitored there? Yes. Okay, I could not unmute myself. So, there were a number of questions about the role of the proxy, the proxy being a single point of failure, an anchor, a trust anchor for a whole community, and I wrote down that there will be multiple proxies, one per community. There was a bit of discussion about what a community is, and there were remarks to the effect that security is going to be very important, and incident response, and the answer to that is obviously yes. I think the other important one is the relation to what happens after this year, and how the output of the working groups is going to be picked up in INFRAEOSC-03 and INFRAEOSC-07. As one of the authors of the INFRAEOSC-03 proposal that a group of e-infrastructures and research infrastructures are putting together, I can say that the output of the Architecture Working Group is taken as the starting point for INFRAEOSC-03, and it will have to be picked up by INFRAEOSC-07 as well. I think that is most of it; let me quickly scroll through. Yes, there are some useful pointers; maybe I will put in a pointer for people that want to know more, and in general I think it would be very useful to read up on the AARC Blueprint Architecture if you have not done so, because it contains a lot of the thinking behind this presentation. Thanks, Klaas. We have a question from Gergely Sipos, so go ahead. Yes, so my question is, hi, I'm Gergely Sipos from EGI. On the last slide, your timeline showed that there will be first the
release of the architecture, and then a collection of requirements and use cases? No, actually, the date that you see here is the publication of the document. We have already started gathering requirements and use cases, and as I said previously, we have not started from scratch; we have taken up the initial set of requirements and use cases coming from AARC and EOSC-hub, and we are now in the process of enriching them. The EOSC AAI Architecture 2019 is basically our starting point, built from what we have already gotten from other activities, and there we identify what we are still missing. So we are still working on getting more requirements and more use cases, and we expect to conclude this by September, and that will drive the AARC BPA 2020 and the EOSC AAI 2020 architecture. So the dates you see here are not when the work starts, but when we expect to have the results. Okay, Gergely, does that answer your question? Oh, you're muted. Sorry, sorry, how do we do that? I can't find you in the list. I'm unmuted already. Okay, now I am unmuted. Thank you for the answer. So, this point on requirements and use cases in September: does that mean that a survey will be opened, or does it mean that you will publish a document? And if you publish a document in September, how do you collect input for it, and when? So, we have already started collecting input from the various communities and the clusters that we are in touch with. At the moment we are doing this through multiple channels, but I do expect that in the beginning of summer, possibly in June, we will also have a perhaps more formal call for generic input on use cases. This is something we are discussing in the AAI task force at the moment, but we are already receiving a lot of input from many of the cluster research infrastructures that we are working with. Okay, so thanks a lot. Any other question? Well, as Christos explained, in general the documents from the AAI working group, on the four topics that were mentioned by Christos
earlier in this presentation, are starting to get out. So the first principles section will get out, and similarly for the others. So I assume the dialogue between the task force and the community will develop that way, and I'm sure that if you want to input use cases to the task force, they'll be more than happy to listen to those use cases.