Good morning, everyone. I hope you can all hear me well. Sorry for making you wait a couple of minutes, but we were just testing the facilities for this session, and we are now ready to start. So thanks for joining today. First of all, a recap on what this session is about: it is about scientific communities and the use of EOSC-hub services, to give you some examples of successful use cases that you can take as samples, or as ideas on what you can also do with your own research workflows. The other chair of this session is Gergely from EGI; I'm Debora Testi from CINECA, and we will be chairing this session together. We will give a short introduction on the topic of this session, and then we will have four speakers representing four different thematic communities who will give you an overview of what they were able to achieve in using and integrating with the EOSC services. As you probably already know from other sessions, you are all muted. So if you have any questions during the session, you can either write them into the chat, and at the right moment I will give you the word, or you can just raise your hand and we will unmute you at the right moment. That is everything from me, just a quick welcome, and I give the floor to Gergely for the introduction. Thank you.

Let me show these few slides. Can you see the full screen? Yeah, perfectly fine. So thank you very much, Debora, for the short introduction. We are running this session together because we are both working with the science communities, and I would like to briefly explain how the science communities you will hear about fit into the EOSC-hub project and how they benefit from it. You heard the project overview this morning from Per, and he explained that we work with a large number of scientific communities; actually, we work with more than 30 now, and they fit into the project consortium in two main ways. One is the thematic services, which is a dedicated work package, the one that Debora coordinates, work package 7. What these thematic service providers did in the early months of the project is integrate with different types of federation services that are operated and delivered by other parts of EOSC-hub. You can see here what these services do: they range from federation capabilities to baseline computing and storage, data management, and other software-sharing or data-sharing and management services. The other science community work package is the one I coordinate, work package 8, called competence centres. That one is less mature in terms of the setup of service delivery: the competence centres focus much more on the integration of services, trying out and testing those services with new users and early-adopter users from their communities, and then hopefully pushing out thematic services as well and delivering those services through the EOSC Portal. So the science communities that were more developed or more evolved from day one were selected for the thematic services work package, and those that had less experience with e-infrastructures and the generic services were put into the competence centres. All of them basically benefit from the project in three ways. One is that they benefit through this integration, and then they can receive generic services from the federation and common service stack.
The second way they benefit is from the training and support, which they perform based on the guidance and generic support provided by work package 11, in order to reach new audiences. These new audiences can be either production users, who can use the thematic services right away, or early-adopter users who test the new setups. And the third way they benefit from the project is through the EOSC Portal: they deliver the ready-to-use services through the EOSC Portal, and those services are then accessed by the end users. So today we will feature a subset of the communities we work with in this way. Here is the complete picture; I think Barry used that slide, or that diagram, over two slides in his talk. It shows the complete portfolio of competence centres that we have and the complete portfolio of thematic services that we work with. Together these are more than 30 communities. Nearly half of them came in through the early-adopter programme in 2020; the others started two and a half years ago. So today we will have four presentations from four thematic service providers, and you can see the range of activities that they carried out. We will start with Alexandre from the structural biology community. Then we will continue with Daniele, about a generic service that was originally developed for high-energy physics and then took up usage in other disciplines. Then Fabrizio will talk about an environmental sciences service on climate change, and Dieter will present about humanities and social sciences. So we basically cover the four main scientific disciplinary areas, life sciences, physics, environmental sciences and humanities, in this session today. Thank you; that was just a brief introduction to put the session into context.

Thanks a lot, Gergely. I don't see any questions for you at the moment; we might answer later on if there are any. So I will simply give the floor to Alexandre for his presentation.

Thank you. Let me share my screen. Okay, so everything is right, you are seeing my slides now? Yes, perfect. Well, good morning, everyone. I'm Alexandre Bonvin from Utrecht University, and I'm representing WeNMR today. I'm going to talk about structural biology in the clouds and discuss, say, more than 10 years of experience now of using EOSC and EGI services over the years. I saw in the keynote in the plenary that there were questions about chemistry: where is chemistry? I belong to a chemistry department, so even if this is about biological sciences, we are working at the chemical level; this is also chemistry. Just a short introduction: the WeNMR thematic services are integrated in EOSC-hub. You can find them from the EOSC-hub project, and you can also find them in the EOSC Portal, the Marketplace. Actually, tomorrow I will be part of the closing plenary and I will give a live demo of accessing those services, so I'm not going to do that today. They have been in production for more than 10 years now, under various European projects. It started with eNMR, a project where we developed our own infrastructure, including the compute infrastructure. The project evolved into WeNMR, where we opened the resources to worldwide users; that's what the W stands for. And then through several projects over the years, EGI-Engage, West-Life, INDIGO-DataCloud, we have been piloting developments and the use of new resources that are offered by the infrastructure. Our compute model is an opportunistic compute model.
So we are basically filling the gaps in the infrastructure resources, but the access to the resources has been formalized in a service level agreement that has been renewed several times over the years. The current one is valid until the end of this year, and it gives us access to about 60 million CPU hours and a number of sites that are committed to support us. We are mainly using high-throughput compute resources, the good old grid type of computing. We don't use much data: users submit data that are processed by our portals, and we also have some committed storage, but again, it's more about the processing of data and computing than about providing data in that sense. In terms of resource usage and the impact that the project has, you can see it here. To date we have, I think, probably more than 17,500 registered users, and this has been going up quite quickly in recent weeks. Since the start of the EOSC-hub project, so since 2018, more than 8,000 new users have joined. We are reaching more than 110 countries; you see the world map, which is also accessible directly online, the URL is below. This map is probably a few weeks old. All aggregated, the EU users this time have the majority if you put all the countries together, and then we have a lot of users from Asia, but also from the US. So we are serving a global community. This kind of research is global, so we cannot put frontiers on what we are doing, and over the years we have had support from the EU, but also from EGI and EOSC, to provide access to users outside the European borders; that's important. Users are submitting jobs to the portals, and the portals translate those submissions into high-throughput jobs: in the order of 20 million jobs have been submitted to HTC resources, which is more than 46 million CPU hours. We have several papers that basically describe the portals, or the software behind the portals. Those are highly cited, because our portals are also highly used, and we also have a mechanism for measuring user satisfaction. I think an important point about all these thematic services is that a service needs to be used: if it has no usage, then it's a useless service and a waste of money. So demonstrating usage is important in my view. Just to show you that yes, we are being used and we have sustained use of the services: what you are seeing here is a plot of unique users per month. We have more than 17,000 users registered to the service; of course, they are not all active at the same time. What you are seeing here is that per month we have, say, between 300 and 400 unique users that are using the services, and you also see an increase in usage in the last two months. This is basically a COVID-19 effect: we have seen increased registration rates to our services because we are providing tools that allow you to study the interaction of the virus with human proteins, or to target drugs to human proteins to try to block the infection. This is the number of submissions that those users are making to the portals. So we are, on average, in the order of 300 active users per month, and what you see here is that those users, on average, are each submitting at least 10 jobs to the portals, because on average, per month, it's in the order of 3,000 submissions that are processed.
And you see here, on the right side, columns appearing in red. These are, again, COVID-19 jobs: since April we have enabled tagging of submissions as COVID-19 for researchers that are specifically doing this kind of research. This also enables us to target those jobs to specific resources that are supporting computing for COVID-related projects. Now, how does all this look under the hood? This is not only the front end. We need, of course, the e-infrastructure; we need access to the EOSC high-throughput compute resources in order to be able to serve our users. But you don't only need access to the resources: you also need a complex infrastructure at your site, at the site where you are providing the services, to manage user registration, submission, pre-processing and post-processing of data, and to present the results to the user, all in a user-friendly way, for our thematic services. You see here on the right side a view of the WeNMR portals as we host them in my lab in Utrecht. They have a common look and feel for the end user. The user interacts with the web page; this is the front end of the service exposed to the user, so the user doesn't deal with the complexity of the infrastructure behind the portal. The back end, what is behind the portal, is a variety of software, scripts and workflows to manage all the computations. In order to distribute all our compute jobs efficiently onto the EOSC infrastructure, we make use of the DIRAC4EGI service; I'm going to come back shortly to those aspects. We use EGI HTC resources, and we also have our own resources: from time to time we divert jobs from, say, the EOSC resources to local resources, because from time to time there are small glitches and problems, and from our perspective as a service provider we want the end user to be able to use the service 24/7, which means that you also need a bit of local resources to catch problems and make things transparent to the end user. To facilitate registration to the service we make use of the EGI Check-in mechanism as a single sign-on. Some of the portals connect to data solutions like Onedata, and we are also making use of the INDIGO-DataCloud uDocker solution in some cases. So this is the typical architecture that you will find behind the portals. A user will never see that, but the thematic service provider has to handle all of it and make sure that all the components of these workflows are working. This is a view of one of our portals, DisVis, and you recognize the same look and feel; this is what the user will see. Once a user submits to the portal, first of all they will have to log in to the portal, so we have to check whether the user is properly registered. This login can happen through the EGI single sign-on. We also have to do a lot of validation on what the users are submitting to the portal, because you want to catch errors as soon as possible, so that you are not going to waste resources on the EOSC infrastructure or cause problems, say dead ends, in your workflow processing. So typically there are some processing steps that are done behind the portal, on the local infrastructure where the portal is hosted, and then we submit jobs to the grid, to the HTC resources, using DIRAC4EGI, which is the default mechanism, or we submit jobs to our local resources. In this particular example the portal is making use of GPGPU resources on the grid.
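As a concrete illustration of the submission step just described, here is a minimal sketch of how a portal back end might hand a short job to DIRAC4EGI through the DIRAC Python API; the executable, sandbox file names and CPU-time value are hypothetical placeholders, not the actual WeNMR portal scripts:

```python
# Minimal DIRAC job submission sketch (hypothetical job payload).
from DIRAC.Core.Base import Script
Script.parseCommandLine(ignoreErrors=True)  # initialise the DIRAC client environment

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName("docking-run-001")
job.setExecutable("run_docking.sh")                        # hypothetical wrapper prepared by the portal
job.setInputSandbox(["run_docking.sh", "input_data.tgz"])  # small input shipped with the job
job.setOutputSandbox(["results.tgz", "std.out", "std.err"])
job.setCPUTime(3600)                                       # short, CPU-bound job, as described in the talk

dirac = Dirac()
result = dirac.submitJob(job)
print("Submitted job ID:", result.get("Value"))
```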
Once the jobs are submitted, you have to monitor what is happening to them, collect the data back, post-process the data and at the end present them as a results page to the user, and notify the user that the results are ready. This is not a matter of seconds: depending on the portal, the computations can take days before the user gets the results back, so we have an email notification system to alert the user when things are ready. I already mentioned DIRAC4EGI as the mechanism to submit jobs to high-throughput resources. For us it has been working extremely efficiently. We have implemented it in our portals since 2015, so it has been in operation for five years now, thanks to the support of Ricardo and Andre, who basically helped and guided us in this implementation. Now, for those of you who are more into the compute side, there are a lot of advantages to using DIRAC4EGI. You don't need root access to install it and to run it, it is self-contained, so it's rather simple to maintain, and you can even transform your own laptop into an HTC server if you have it running, which I also have. The syntax is quite simple. It also depends very much on what type of computing you are doing: the typical use cases that we as the WeNMR thematic service providers have are compute-intensive, rather short jobs; they require CPU, but we don't have a large data volume, and DIRAC is perfectly suited for this kind of scenario. We send the data through DIRAC4EGI and we retrieve the data back through DIRAC4EGI as well. Another advantage for us is that DIRAC is able to handle both HTC compute resources, in terms of grid sites, and cloud resources, and it's completely transparent for us as a thematic service provider, which is a big advantage as well: we don't have to worry about the cloud side. If the grid disappears tomorrow and we get a lot of cloud resources, DIRAC should be able to handle those for us in a transparent way. Some of our portals, as already mentioned, are making use of GPGPU resources; you also find a number of those on the EOSC infrastructure. These are two of the portals that do that, and I'm not going to explain what they are doing. The only thing I want to mention here is that the software behind those portals requires quite a number of libraries, so you have complex software dependencies, and this is not something that you can ask a system administrator at a remote site to install for you. A good solution for this is to use Docker containers. But Docker containers have historically been a security issue, and the sites didn't like to run them because you need some root access. Thanks to a solution that was developed under the INDIGO-DataCloud project, we can use uDocker, which is basically a user-space version of Docker, meaning that we can run Docker containers on the grid, and this avoids all the software installation trouble on the remote sites (a short sketch of this pattern follows below). This is something that, as a community, we piloted in the previous EGI-Engage project. So now, to finish, a few short examples of what we have been doing recently in terms of thematic services in relation to COVID-19. One of the portals, the one which is actually the most used, our HADDOCK portal, allows you to model interactions between biomolecules: it can be protein-protein or protein-small molecule. This is by far the most used service out of the WeNMR thematic services.
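To illustrate the uDocker pattern mentioned above, here is a minimal sketch, with a hypothetical container image and job script, of how a grid job wrapper can run its payload in a user-space container without root access:

```python
# Minimal sketch: run a containerised payload with udocker in user space (no root needed).
import subprocess

IMAGE = "example/gpu-docking:latest"   # hypothetical container image bundling all dependencies
WORKDIR = "/tmp/jobdir"                # hypothetical job working directory on the worker node

subprocess.run(["udocker", "pull", IMAGE], check=True)
subprocess.run(["udocker", "create", "--name=payload", IMAGE], check=True)
# Mount the job's working directory into the container and run the analysis script inside it.
subprocess.run(
    ["udocker", "run", "--volume=%s:/work" % WORKDIR, "payload", "/work/run.sh"],
    check=True,
)
```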
The HADDOCK portal can be used for COVID research to model interactions, for example between viral and human proteins, or for drug-screening purposes. Since several weeks we have seen an increased number of registrations. You see here a registration curve, focusing on the period since the beginning of the year. This is, by the way, a logarithmic scale, so it looks linear but it's still exponential, and you see that the curve goes up from about mid-March. This is typically an effect of the COVID pandemic: we see a lot of researchers starting to work on this. To respond to this demand we have also doubled the processing capacity. This was not so much a limitation of the EOSC resources; it was more a limitation of the machinery at the back end of the portal. We could double the processing capacity to about 200 docking runs per day, and we also enabled tagging of the job submissions as COVID-related. This allows us to target those jobs specifically to sites that are supporting us. Since the beginning of the pandemic, for example, through contacts between EGI and the US Open Science Grid, we gained access to US HTC resources to run both COVID and regular jobs. We also had contacts through high-energy physics, again via EGI, and a number of sites have been providing us resources for a bit more than one month now, like Marseille, KIT in Karlsruhe (where we should have been these days), and also a Spanish high-energy physics site. And in DIRAC, thanks to Andre, we added the mechanism to tag the jobs as COVID and direct them to those resources. This is just a snapshot from the last months, running until the 14th of May. You see the number of jobs that are running on the HTC resources; the violet colours are COVID-tagged jobs, and you see peaks. Those peaks are our screening efforts, but you also see that there is a substantial fraction of jobs that are COVID-related, and this fraction is only increasing: over the last 30 days the majority of jobs were actually tagged as COVID, and you can see here where they have been running. These are the sites that are specifically providing resources for COVID-related research: a large fraction in the Netherlands, this is KIT in Karlsruhe, you find Marseille, and you find jobs here in the US. For ourselves, we have been doing a screening of about 2,000 existing, approved drugs against one of the COVID-19 proteins, and we were able to run this using the EOSC HTC resources in about three and a half days, which for HADDOCK is a very efficient and fast way of doing that. You can read more about those efforts, as there have been publications also from EOSC-hub and Instruct, and the results are available on our website. With that I want to finish. I want to acknowledge the support over the years of several European projects, but also of the Dutch Research Council, and I have to thank the people who are responsible for all the operations, making sure that our portals are up to date, and constantly developing our software: Brian and Rodrigo in particular, Panos and Manon for the COVID drug screening, former group members who have contributed to the development of the portals, and the EGI team for continuous support, with Andre for his DIRAC4EGI commitment to support us. Thank you very much for your attention.

Thanks a lot, it was a very interesting presentation.
I would first see if there is any question from the audience. Just to let you know, Alexandre unfortunately has to leave before the end of the session, so if you have questions specifically for him, this is the right moment to raise your hand and ask. I'm also looking through the chat to see if we have anything for you that we can address now; just one second. I think there is Sean's question. Yeah, there is a question from Sean. Sean, would you like to say it out loud? Let me look for you.

So, applications with licensed software: how do you deal with that question? As you know, we've been dealing with a similar tool for HPC and cloud resources, and we have a problem with licensing and trying to get hold of licenses on the cloud.

Yeah, so HADDOCK is using a computational engine called CNS, which is free to use for non-profit research. It means that we cannot really give access to companies. Well, in principle we can give access to companies, but the companies cannot do anything with it because they cannot protect their IP. The companies that would be using our software would be pharma companies, and the other issue is that they don't want to bring any of their IP and molecules outside their own network. We also had, in the past, a MATLAB-based service, and this was possible because at the time there were freely available runtime libraries for MATLAB, so you could deploy them on the grid, basically. That was my timer, sorry. You could deploy them on the grid. For HADDOCK, the registration is free, but we check all registrations and we have to approve them as well, so we are filtering commercial users, and we actually make them aware of the limitations.

Yeah, I was thinking more about applications that make use of licensed software like IDL or MATLAB.

Yeah, so except for this component, which is limited to non-profits, the other software is our own software and open source. You have to use open-source software as much as possible if you want to have a service which is open to everyone; we cannot pay for the licenses to support the community. Okay, thanks.

I also see a question from Mark Allen to everyone: how do you make sure that the services are sustainable? In EOSC we're speaking a lot about getting support for the infrastructure, but you also need manpower to operate those services. All the infrastructure that we're running at our local site doesn't run by itself: there are always glitches or things that you have to monitor, you need to provide support to users, you have to answer user questions, so you cannot leave a service out there by itself and hope that everything will be all right. It is a continuous struggle to make sure that I have people in my lab who can do all this work. I'm working in an academic environment, so I don't have any permanent position associated with this except my own, and that's very different from a research centre. That's something that I think EOSC should also realize: it's not only about putting services out there, but also about supporting them, and you need to support the people supporting the services. Any other questions? Oh, yeah, please.

Can I follow up on that? Please, Mark. I asked the initial question because, behind that, there is a clear cost in organizing and implementing your services in this type of framework.
And if it's only done in a project-based way, where a couple of years later you have to do it all again because the project that funded you to join EOSC is no longer there, you have to be very careful; that's why I asked about sustainability. We run a data centre where we do have permanent staff and long-term sustainability, but we still have to judge the cost of implementing based on sustainability. So for us it's a really important question, not just for projects that run for a couple of years, but for projects that run for something like 50 years.

Yeah. So we've been operating for 10 years now and there were gaps in the funding, if you look at the EU infrastructure funding, but we kept operating the services. That's also important from an end-user perspective: if you stop your services because you have a gap of six months, you lose your users, and they are not going to come back if you do that too often. All the software development is a research question on my side, and that is funded more by research grants, but the operation requires people, and that's always juggling with the resources that you have. I hope it will be realized that this also needs to be supported if you want to make sure that there is usage of the resources.

Tiziana has a question? It was a comment, Alexandre, if you allow me. I think this question of sustainability has two angles. One is how the services that you've mentioned in the workflow are sustained, and surely one concern that we have seen for years is how the depletable resources, like computing and storage, can be funded when you want to support loosely coupled communities of practice like WeNMR. The sustainable way to do that is to have programmes at national level that ensure funding on a long-term basis to communities that don't have a legal entity like an ESFRI or an ERIC, and that means recognizing sustained funding from ministries for that. In EGI we are able to do this because we have a quite densely distributed network of data centres, and there is a keen interest from the data centres in supporting successful user communities. But without more organic, long-term funding for this kind of community, it's harder to ensure this long-term perspective. The other point is how to sustain, in terms of operational effort, the added-value services like those presented today by Alexandre and the other thematic services. I can give my EGI perspective: we think these are integral parts of the infrastructures, and we are moving as EGI to embrace and increase the support, in terms of operations, training and outreach, for these components, which are as important as the infrastructure in terms of computing and storage. And this is a shift. It's an area of excellence in Europe that we should cherish and strengthen. So this is my EGI view, and EOSC has an opportunity to ensure that.

I guess for the sake of time we should move to the next presentation. Yeah, I was going to say the same. Thanks, Alexandre, for your presentation and your time. So I think the next presenter is Daniele. Daniele?

Hi, can you hear me? Yes, you can show your slides. Yeah. Can you see my slide? Perfectly fine, thanks. Okay, thank you. So good morning, everybody. I'm Daniele Spiga from INFN, and today I'm talking on behalf of the DODAS team about the experience that has been gained in recent years in two directions: in terms of communities as well as infrastructure. This is the outline of my talk.
I will try to set the common ground about the pillars of the DODAS thematic service. Sorry, Daniele, can you stop for a moment? It looks like we are seeing other windows from your computer popping up. Maybe we can just try to share it again; sorry for that. Give me one second. Okay. It's flickering between different windows. Oops. Oh, okay. Are you on the outline now? Yeah. Oh, my bad. But it's still flickering, I don't know why. Perhaps it has something to do with the fact that you are presenting via the browser, is that right? Maybe it's not as effective; maybe you can download it as a presentation and then it might work better. Let me see if it's going to be better now. Okay. Anything better? No, I think it's still coming through. Could you try sharing just a specific application? Maybe that way there won't be interference. So now I have changed it, can you see now? Yes, just a second, there still seems to be an issue. I have changed everything, so it should be fine. Indeed. Debora, do you have a copy of the slides? Maybe you can present them. Yeah, I have a copy of the slides, so I can try to present them for Daniele. Just one second. I have stopped sharing. Yeah, please stop sharing, I will present from my side. Just going to see it and then, ah. Yes. I'm trying to go to full screen. Okay, can you see it correctly now? It's not in full-screen mode on my screen. It is full screen on my screen, obviously. But let me just try; maybe I first have to go to full screen and then start sharing. Better now? Yeah, for me it's okay. Okay, sorry for that. Okay, please Daniele, let me know when I need to go to the next slide.

Thank you, so next slide. I was saying that this is the outline of my talk. There will be a very brief introduction setting the ground about the main pillars of the DODAS thematic service. Then I will move on to showing the evolution of DODAS from the two main points of view, communities and providers. I will make a walkthrough and try to show also the impact that the DODAS thematic service has had on research activities. And finally, I will summarize and try to provide some hints about the future directions. So next slide, please. This is, briefly, the main message that I want to share about the DODAS thematic service. First of all, DODAS stands for Dynamic On-Demand Analysis Service, and since the beginning it has been designed to provide the possibility to create and provision infrastructure deployments in an automatic and repeatable way, tending towards a zero-effort model. What this means in practice is shown in the four bullets that you see at the top right, where you see that with DODAS we try to implement resource abstraction, automation, support for multi-cloud, and of course integration with the federated authentication model. All of these pillars have been implemented also using the common solutions coming from the EOSC-hub portfolio. Going further, the paradigm that DODAS implements is based on infrastructure as code. To cut a long story short, this means that we are fully convinced we need to focus on what to deploy on the infrastructure, instead of how to deploy the services and the computing infrastructure itself.
What this means is that we want to let the underlying system abstract everything, quote unquote, for the end user and let it work under the hood to do the complex stuff for the user. If we want, we can say that DODAS in the end allows you to instantiate on-demand, container-based clusters everywhere, and when I say container-based clusters, I mean a broad set of services that ranges from big data pre-processing and post-processing platforms up to the most canonical batch system as a service, possibly federated over a distributed infrastructure. Next slide, please. This is a summary of the key information about DODAS from another perspective. In the box on the right, you see how DODAS is designed to be a Lego-blocks platform which can be customized in order to accommodate the needs of diverse communities. You also see from there that it has been built on top of modern industry standards in order to benefit from the evolution happening around us. At the core of all of this are most of the common services that we integrate from the portfolio we have in the EOSC-hub project. In terms of users, DODAS targets single researchers, or very small groups of researchers, that may have requirement-specific workflows which end up as specific requirements from the infrastructure point of view; think about specialized hardware access, GPUs, fast storage, quality of service, all these kinds of things. But it also targets big communities, because in the end it was born within a big community, and I will tell you about this, as well as small groups of researchers or small communities. And finally, resource providers are also users of DODAS: indeed, you may choose to provide your resources through this kind of technology and services. Next slide, please. This is a summary of the evolution of the project, as I introduced it. It started in 2017 within the INDIGO-DataCloud project, in order to integrate the first workflows from the CMS experiment, the Compact Muon Solenoid experiment at CERN. Since then there has been a quite interesting evolution and consolidation of the system, which has also been helped by the intense exploitation plan that started with the EOSC-hub project, when DODAS became a thematic service of the project itself. There are references in this timeline to the communities that over time started integrating and using DODAS. In 2018 the first and most important one was AMS, and I will come back to this, which started to be integrated and nowadays is in production. A year later we started the integration, still work in progress, of the gravitational wave experiments, in particular Virgo, and the most recent success from my perspective is the integration of the Fermi-LAT experiment, which is already running a first official data analysis that will be published very soon. In terms of resources, you see that most of the commercial clouds have been integrated along the way, but I want to draw your attention, as I will detail in the next part of the talk, to the fact that toward the end of 2019 we also performed a successful proof of concept integrating the EGI Federated Cloud. Finally, there are strong synergies that have been established and are in progress in recent months with the ESCAPE project and WLCG; I will conclude with information about that. So next slide, please.
Here are a couple of pieces of information; I don't want to give you all the details, of course, but this is to show a couple of interesting examples of researcher-specific activities, that is, research that has a requirement-specific workflow that needs to run on its own infrastructure and, generally speaking, has specific requirements in terms of hardware and configuration. On the left is an example from Imperial College London, which decided to use DODAS in order to cope with the production and generation of specific events in the scope of the CMS experiment; this happened more or less toward the end of 2019. In the same period, the data preservation and open access group also exploited DODAS, combining it with the REANA project, with the goal of provisioning open legacy data also to people coming from outside the experiment, so that they can process and access this kind of open data. Next slide, please. This is another example, referring to the very recent activity done by the Alpha Magnetic Spectrometer analysts. The key message here is what is highlighted in blue: there are two analyses that, at this moment in 2020 (these are the very recent activities; I could have mentioned what happened in the past, but I think focusing on the recent is more interesting), will be updated with the additional statistics and information that the experiment has collected, and both of them will be carried out using the resources provided through DODAS, exploiting some facilities that I will show in the next slide. So moving ahead, please, next slide. This is the final example that I will provide in terms of support and impact on communities: the Fermi-LAT analysis done using DODAS. We may perhaps summarize all this information by saying that the researcher, after trying and integrating the workflow on DODAS-provided resources, told us that it is very promising and that they need this kind of solution. Indeed, they are using high-throughput computing, like most of the communities I'm showing today; they have a repeated-step kind of analysis, and they need a solution that can scale with as much computing power as they can get. What we did was to port all the software they had into a DODAS-compliant environment, which in the end means HTCondor for them, so something pretty standard and easy. What they did in the very recent months is run over 60K analysis jobs, and what they have in the pipeline is to run a few more detections, which translates into a few million jobs that they need to run. The analysis produced with this infrastructure will be published very soon. So next slide, please. Now I move toward the second part of this talk, which is about the impact and the experience we gained integrating the resources. I'm not going to mention all the integrations and the impact that happened in past years with commercial clouds and other private infrastructures, but I do want to mention the most recent one, because it's interesting and new in terms of the specific integration that we carried out: the integration with the EGI Federated Cloud.
As you can see, the EGI Federated Cloud provides the infrastructure, this federation layer, on top of which, thanks to the DODAS deployer, we have put in place an overlay based on HTCondor. The schema here represents what the layers are and who is doing what. As you can see, we involved five sites providing cloud endpoints, and we implemented the overlay. The overlay means that in the end you see a single site, which is what both researchers and communities typically like to see: the less granularity the user or community has to deal with, the better it is on the front end, let me say. What happens in this proof of concept is shown in the next slide, so please go to the next one; it is represented here. If you look at the plot on the top left, this is taken from the official monitoring infrastructure of the CMS experiment. We used CMS to benchmark the integration I just showed between DODAS and the EGI Federated Cloud, and the information you can get out of this plot is that we had more or less 400 continuously running jobs during the test. There are several steps, highlighted by the white arrows in the plot: the steps are the sites that, one after the other, join the DODAS federation but remain hidden from the experiment. The experiment just sees more resources, and so a step appears each time in the amount of resources joining the system. But, as I said, in the end the experiment just sees one site, so one endpoint to submit jobs to, if we want to put it like that. In the plot on the right (sorry) there is the very same kind of information taken from the EGI accounting, and, if we want to have some fun, you can see that there is a correlation between the peaks that you see in the first plot I described and in the second one: sites are joining, and on the CMS side this is seen as usable slots where jobs can run. Finally, the pie chart shows you the site generated with DODAS on top of the EGI Federated Cloud compared with the other Tier-3 sites belonging to CMS. The bigger one is the Fermilab one, and the second one in terms of size during the exercise we did was the one I just described. So let's move to the final information that I want to share in terms of DODAS and providers. This is an interesting one: DODAS has the capability to manage stateless resource providers. What does that mean? This is a concrete example. We have the Space Science Data Center (SSDC) at the Italian Space Agency, ASI, which is hosting resources for the AMS collaboration. The key point is that there is no experiment-dedicated manpower at the SSDC centre, and there is no specific expertise on AMS software and computing either. So there is no one who can help there; there is just bare metal running an operating system. What we did, and what is in production now, is the integration of these resources in the DODAS fashion. This is a complete example of a stateless provider: a provider that has computing power but no manpower to support a specific activity. What we did was to extend the official, pledged batch system of AMS to include the resources I'm talking about.
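To give a feel for what this single-endpoint view means for an analyst in practice, here is a minimal sketch using the HTCondor Python bindings against such an overlay pool; the executable, file names and resource requests are hypothetical, and the real experiment workflows submit through their own frameworks:

```python
# Minimal sketch: submit jobs to the single HTCondor overlay endpoint provisioned by DODAS.
import htcondor

submit_description = htcondor.Submit({
    "executable": "analyze.sh",        # hypothetical user analysis wrapper
    "arguments": "input_list.txt",
    "output": "job_$(ProcId).out",
    "error": "job_$(ProcId).err",
    "log": "analysis.log",
    "request_cpus": "1",
    "request_memory": "2GB",
})

schedd = htcondor.Schedd()             # the one endpoint the user sees, wherever the jobs really run
result = schedd.submit(submit_description, count=100)
print("Submitted cluster", result.cluster())
```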
Moreover, we did a further integration using the thematic service resources that belong to the project, the EOSC-hub project, which in this specific example are running at CNAF, and we further extended DODAS to another stateless site which is running at INFN now. And you see that, as if by magic, jobs started running everywhere, in a completely transparent manner for the end users. What you can learn from this is that we have the handles and the competencies to benefit from any kind of resources we can get, in order to help researchers and even resource providers. Moving to the next slide, there is the final part of my talk, focusing on proof-of-concept activities that we are doing; here we move to a slightly different domain from what I have just explained. Sorry, step back to the previous slide, slide number 12. This is the use of DODAS as a decision service, meaning a smart service for implementing smart caching. I'm not going to discuss what a cache or a smart cache is, but think of something like a bot which helps the cache system understand whether a file is worth keeping or not within the disk or the memory of the cache itself. We have implemented it, we are in the process of developing the engine of the system, and everything is running thanks to DODAS. There is a TensorFlow-as-a-service platform which is completely embedded in DODAS; this is used as an inference system by the cache. But if you think of this as a generic system, it may be coupled to anything which requires a scalable approach to inference decisions. If you move to the next one, the very final one, this is another proof of concept which is going on thanks to the synergies I mentioned at the beginning with projects like ESCAPE, but also with the Data Organization, Management and Access (DOMA) working group in the context of WLCG. It's about implementing a real analysis facility which is capable of integrating a federated and distributed system of data, following the data lake model that is being tested and exploited in the context of WLCG in this case. If we want to give it a name, we are building a proof of concept in order to use the technology that I have shown today, what is implemented by the DODAS thematic service, to build a computing centre with a cache and a computing centre without a cache, just reading data remotely, streaming from the data lake. The key technical point here is that everything is going to be integrated using the token-based and capability-based authentication and authorization model, and this is done using the INDIGO Identity and Access Management (IAM) service. So, concluding with my summary slide; next slide, please. DODAS is a highly modular deployment manager built on the concept of infrastructure as code. The key point is that it has been designed to make it possible to automatically and repeatedly build and create infrastructures for data analysis and processing activities for the scientific community. It is evolving and consolidating over time, and the evolution has happened in terms of use cases, in terms of technologies that have been integrated and made available to the community, but also in terms of the scientific communities that are joining the project.
We have, I think, a well-established R&D and integration programme ahead, and the two key pillars are the big data platform and computing in the data lake landscape. I think we have got very interesting results for the communities that have been integrated so far, and the recent activities, I think, are setting the ground for promoting adoption in the near future by further communities and researchers. And that's it, so thanks for your attention, and I'm happy to take questions.

Thanks to you, Daniele. We are running a little bit late, not too much, but we can take one or two questions if anyone in the audience would like to ask one; please just raise your hand. No questions for the moment, it seems, for you, Daniele. Okay, thank you. Oh, no, we have one. Peter? Yeah, I have a very general question: what do you see as the core part of EOSC-hub that helps you make DODAS work? Sorry, can you repeat the first part? What about EOSC-hub? What is it? How does EOSC-hub help you do this? Well, I think the two main things that EOSC-hub in particular helped with, and this is something we should consider as a lesson learned, are the possibility to implement an exploitation plan and a dissemination plan that helped us get in contact with communities. Getting in contact with communities helped a lot in terms of defining the roadmap and the guidelines to perform the proper integration of services and to set priorities; that's one. And the second one is the fact that having a common set of services, instead of reinventing the wheel and implementing everything like this ourselves, is one of the big benefits that we can have in this context. So thank you, Daniele. I think we can move to the next speaker for the moment. Fabrizio?

Yes, can I share my screen? Yeah, please. Okay, I have a message that only the host is able to start screen sharing. I haven't disabled anything; Rob, have you disabled something? Okay, maybe now I can. So, okay, can you see my screen now? Perfectly, thanks. Okay, good morning, everybody. My name is Fabrizio Antonio, from the Advanced Scientific Computing Division of CMCC, the Euro-Mediterranean Center on Climate Change. In this presentation I will give you an overview of ECAS, a data science environment for climate change, its main features, and how ECAS has been integrated with the EGI infrastructure. ECAS is one of the EOSC-hub thematic services, as well as a compute service in the IS-ENES project. It enables scientific end users to perform data analysis experiments on large volumes of multi-dimensional data by exploiting a server-side and parallel approach. Looking at the architecture on the right, we can see that ECAS consists of multiple integrated components, from INDIGO-DataCloud, like IAM as the authentication and authorization infrastructure solution used to check the user credentials, as well as from EUDAT, ESGF and EGI, all centred around the Ophidia High Performance Data Analytics framework, which provides a paradigm shift from sequential, client-based scientific workflows to parallel, server-side data analysis. ECASLab is a scientific data analytics environment built on top of ECAS which integrates data and analysis tools to support data scientists in their daily research activities.
In particular, it consists of several components, like an ECAS cluster hosting an instance of the Ophidia framework and a JupyterHub instance, together with a large set of scientific Python libraries for data manipulation, analysis and visualization, and a monitoring system based on Grafana. The environment also provides access to some datasets through a THREDDS data server, and includes a number of example Jupyter notebooks and real-world workflows describing indicators from several use cases. Currently there are two main instances, hosted by CMCC and DKRZ. The Ophidia data analytics framework is the core component of ECAS. It is a complete open-source solution used to perform scientific data analytics by means of high-performance computing paradigms and in-memory data approaches, in multiple science domains such as climate change, astrophysics and so on. It provides parallel, server-side data analysis, an internal storage model to manage multi-dimensional datasets, and a hierarchical data organization to manage large volumes of scientific data. The features of the Ophidia framework can be directly exploited in the notebooks to run data analytics tasks on big datasets and plot the results on charts and maps using well-known Python libraries. This is thanks to PyOphidia, the Ophidia Python bindings, which allow an easy integration of the Ophidia operators and workflows into more articulated and shareable data science applications. ECAS has been integrated with Onedata in order to provide any ECAS user with read-only access to the data repository hosted in a Onedata space, allowing analyses on this shared data. In this context, a Oneprovider service has been deployed at the CMCC Supercomputing Center and attached to the ECAS cluster, to reduce networking latencies with respect to accessing remote external providers. To comply with the security policies of a data centre environment, a single Oneclient instance has been set up to interact with the provider, and the data folders have been mounted on the ECAS user zone through NFS in read-only mode. In this way, the data provider is well isolated from the ECAS resources, and communication with the other Onedata services occurs through the Onedata protocol. The integration of ECAS into the EGI FedCloud has been addressed by considering two different scenarios. For the aspects related to software setup and contextualization, both of them rely on the Ophidia Ansible role, which has been extended to include all the ECAS environment services. In the first scenario, an ECAS single-instance virtual machine image providing a ready-to-use ECAS environment has been created and uploaded to the EGI AppDB. The virtual machine image has been assigned to a set of trusted virtual organizations in order to be deployed on the FedCloud. Through the EGI AppDB dashboard, a user can deploy the pre-built VMI, get the public address of the running virtual machine and download the SSH key to access it. The second scenario refers to a multi-node ECAS environment, which can be dynamically provisioned on the FedCloud through the EC3 service according to the user requirements.
Without having to worry about the complexity of the underlying infrastructure, the EC3 service takes care of automatically installing and configuring the whole ECAS environment stack, including, as said before, services and tools such as JupyterHub, PyOphidia, a rich set of data science Python libraries and the Ophidia framework, as well as a comprehensive set of Jupyter notebooks for training. Moreover, through an Ansible recipe, EC3 can elastically scale the ECAS cluster size up and down according to the current user workload. In this scenario, a RADL file has been used for the Infrastructure Manager to define the cluster setup in terms of resources, infrastructure and software configuration, and contextualization. To configure and deploy a virtual elastic cluster using EC3, we can access the EC3 platform web page and follow the wizard, which guides the user through the cluster configuration process. We can choose ECAS from the list of local resource management systems that can be automatically installed by EC3, then specify the endpoint of the provider, the cluster operating system, the instance details in terms of CPU and RAM to allocate for the front-end and the working nodes, the name of the cluster, and the maximum number of nodes, not including the front-end. After a summary of the chosen cluster configuration, we can start the deployment and, after a few minutes, when the front-end node of the cluster has been successfully deployed, we can download the SSH private key provided by the EC3 portal and access the front-end node via SSH. Both the front-end and the working nodes are configured by Ansible. This process usually takes some time, and we can monitor the status of the cluster configuration by using the cluster-ready command-line tool, which returns a 'cluster configured' message when the cluster is successfully configured. ECAS allows the user to access its scientific ecosystem through the JupyterHub interface. We can log into the system using the username and password specified in the JupyterHub configuration file; in particular, in the Ansible role we created a whitelist of users for testing and set a global password using the dummy authenticator class. We can then access the JupyterHub interface using the cluster front-end address and the corresponding port. From JupyterHub in ECAS we can do several things, such as create and run a Jupyter notebook exploiting PyOphidia and Python libraries for visualization and plotting, execute operators and workflows directly from the Ophidia terminal, and browse directories and download or upload files in the home folder. Now, starting from one of the Jupyter notebooks provided for training, I would like to show you some basic operations we can perform with Ophidia through its Python interface, PyOphidia, and how we can easily plot and visualize the results. First of all, we need to import the corresponding modules and connect to the server using the default connection parameters defined in the ECAS environment. We can now create a data cube by importing the NetCDF file previously uploaded; here the most important parameters are, of course, the source path, the measure name, the implicit dimension (time), in order to arrange the data as a time series, and the I/O server type. In this case we are using the native in-memory I/O server, the Ophidia I/O server, to run the data analytics tasks in memory. We can check the data cube using the PyOphidia list method, which is a wrapper of the Ophidia OPH_LIST operator.
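A minimal sketch of the notebook operations walked through in this demo (connect, import, subset, reduce, export), assuming a recent PyOphidia and its cube module; the dataset path, variable name and subset ranges are hypothetical, and in ECASLab the connection parameters are pre-set in the environment:

```python
# Minimal PyOphidia sketch: connect, import a NetCDF file, subset, reduce and export.
from PyOphidia import cube

cube.Cube.setclient(read_env=True)              # use the connection parameters from the environment

# Import the NetCDF file as a data cube, with 'time' as the implicit (array) dimension.
mycube = cube.Cube(
    src_path="/data/tasmax_day_sample.nc",      # hypothetical path
    measure="tasmax",                           # hypothetical variable name
    imp_dim="time",
    ioserver="ophidiaio_memory",                # native in-memory I/O server
)
cube.Cube.list(level=2)                         # check the cubes in the virtual file system

# Subset on coordinates, take the maximum over the time series, then roll up and export.
sub = mycube.subset(subset_dims="lat|lon", subset_filter="30:45|0:40", subset_type="coord")
maxcube = sub.reduce(operation="max")
data = maxcube.rollup().export_array()          # Python-friendly structure, ready for plotting
```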
Now we perform a subset operation on the imported data cube, using a filter on dimension values. We can then perform a reduction operation by computing the maximum value over the time series for each point in the spatial domain. The last step before plotting the results consists of a rollup operation to reorganize the data structure. These are the data cubes produced so far. We can now export the data into a Python-friendly data structure and use it to create a map. If we want to consider the whole spatial domain and specify a particular time point, we don't need to import the NetCDF file again: we can use the first imported data cube to perform the subset operation, then the reduce and the rollup operations. Finally, we can export the results to create a new map. In the same way, if we are interested in the minimum value, we can apply the same operations to the subsetted data cube without the need to import or subset the dataset again. When we are done, we can clean our workspace by deleting all the generated data cubes as well as the container used to organize them. As anticipated before, we can also execute operators directly from the Ophidia terminal. In this case, we can simply open a terminal and, using the default parameters defined in the environment, run the oph_term command. As an example, we can create a new container using the 'cc' alias, and then create a random data cube using the 'rc' alias, providing the dimension sizes for the latitude, longitude and time dimensions. We can then check the data cubes available in the virtual file system by using the 'll' alias for the OPH_LIST operator. ECAS also includes an accounting system in order to properly track resource usage on a per-user basis. In particular, starting from the information tracked by the Ophidia server about workflows and jobs, a set of accounting metrics has been identified to extract useful statistics about users and computing resource usage. For example, until now 180 users have joined ECAS from 23 different countries, and about 2 million jobs were run, for a total of 10,000 core hours. To conclude, thanks to the EC3 platform, researchers can easily deploy on demand a full ECAS cluster on the EGI infrastructure and perform their data analysis experiments exploiting a server-side and parallel approach, as well as visualize the results using the wide set of integrated Python libraries. The next steps regard the integration of the EGI Check-in service into ECAS, in order to allow user login through the federated authentication mechanism; this task is currently ongoing. Moreover, we are exploring the Onedata features for a stronger integration in terms of metadata management and access, using the OnedataFS Python library in addition to the POSIX file system, as well as the integration with Jupyter notebooks via the OnedataFS Jupyter plugin. And here are some useful links about ECAS, ECASLab, EGI and PyOphidia. Just one more thing: we would also like to thank the EC3 team at UPV and the EGI and Onedata teams for their precious support. Thank you all for your attention.

Thank you. Thank you very much for the presentation. We have a question from Sean, but we will take it at the end, after the presentations, because it is addressed to all speakers. Is there any specific question for this presenter now? Well, then we can move quickly to Dieter and keep the questions for the end. Okay. Okay, Dieter. Thank you very much, Debora. I will start the video here and share my screen. Okay. Can you see everything? Yeah, perfect. Thanks a lot.
Yeah, so good afternoon. I'm Dieter Van Uytvanck from CLARIN ERIC, where I'm the technical director, and today I will try to give you an overview of what we have been doing in EOSC-hub from our work package. The title is "From Hub to Impact for Humanities and Social Sciences". I'll try to give you an overview of what the results look like for, say, an average end user from the humanities and social sciences. I will not give a complete technical overview of everything that has been done; if you have questions on that, I will be glad to take them or to refer you to relevant information. Okay, so what is CLARIN? CLARIN stands for Common Language Resources and Technology Infrastructure. We are a research infrastructure serving the humanities and social sciences with digital language data and the tools to process that data. This goes very broadly: you can think of written language data, spoken recordings, video recordings, multimodal captures of all kinds of language exchange, and in very different sizes too, from, say, a single poem up to huge digital collections of videos of people interacting, for instance. Together with the data also come the tools to discover the data sets, to explore them, use them, annotate them, analyse them, and combine all these data sets wherever they are located. It is important to mention here that we are a distributed virtual infrastructure, so we don't have one big installation where everything is running; we consist of a network of so-called centres. We have existed as an ERIC since 2012, and in 2016 CLARIN ERIC became a so-called ESFRI landmark. It may be good to add that, besides providing access to the data and the tools online, specifically for language data there is often a lot of material that is not openly available due to privacy concerns or copyright legislation. In such cases we still want to give people access in as easy a way as possible, through single sign-on; I'll come back to that later in the demonstration I will be giving. Who is powering CLARIN? Well, 95% of the things that I will be showing today are really realized by our national consortia. We have 24 members and observers, mostly in European countries, and since recently we also have a consortium in South Africa; these provide access to the language data and the tools through their national organizations, so to say, and through the centres inside these national consortia. That brings me to the content of the presentation. I'll try to give a kind of example today with a very concrete case where a historian is doing research on the concept of mental health policy in the 19th century, and this researcher is basically doing the research in two steps. There is a first stage of searching for relevant material and then evaluating the content quickly using so-called distant reading methodology. I'll come back later to what that exactly is, but the essence is quickly going through a large amount of textual data to understand what is in there. Later on, in the second stage, the researcher will further prepare and enrich the data for further analysis, and this is typically something that is done as a research group or with the help of a research assistant.
And then in the next step, the data will be analyzed linguistically, for instance by analyzing the syntactic relations that exist between the parts of sentences, in order to answer a specific research question. An example here could be to find all objects of the verb "to treat"; obviously this requires a syntactic analysis, and this is where the infrastructure comes in. Here is a short overview of the so-called thematic services that we have been integrating into EOSC and with the other EOSC-hub services. First of all, we have the Virtual Language Observatory, which is a catalogue of metadata that allows you to quickly search for and identify relevant data sets. Then we have the Virtual Collection Registry, which is a kind of digital bookmarking tool that allows you to collect links to data sets at different locations, or at the same location, and then publish them as a kind of metadata set, so that people later on can find the exact data that you have been using and can also reuse it for machine processing. And finally there is our Language Resource Switchboard, which is a tool that makes a match between an incoming data set and a processing tool: you throw a novel at it, and it will give you all kinds of tools that are suitable to process the language contained in that digital novel. As said, this is very much from a user perspective; behind these services we have a full metadata infrastructure running, which again comes with lots of services. I will not bother you too much with this today, but if you're interested you can find more information on the website. Of course, these services do not stand on their own; they are integrated, and this, I think, is an important part of what I want to convey today. From these pre-existing services we have gone through deeper and deeper integration, both within the already existing CLARIN ecosystem and outside of it, with the e-infrastructures and with neighbouring communities. This really reflects the hub functionality, putting the hub in EOSC-hub, so to say. So what are these connections?
Well, first of all, we have searchable metadata being sent from, for instance, virtual collections to the Virtual Language Observatory so that you can search it, but we also make the data actionable: metadata records that are registered in the Virtual Language Observatory or the Virtual Collection Registry can also be sent through the Switchboard for processing. This might sound a bit abstract; I hope to make it more concrete in a few seconds when I switch to a practical demonstration. Good, so as I said, we have integrated the CLARIN infrastructural components. Of course we have also looked into integrating this further with the e-infrastructure services, and more concretely we have done this by importing metadata from EUDAT's B2SHARE repository into the Virtual Language Observatory, specifically the subsection that is relevant for language data and the humanities. In a similar way we have also developed, in collaboration with the EUDAT team, a plug-in for B2DROP which allows you to make data that you have uploaded into your own workspace actionable by sending it through the Language Resource Switchboard; I will demonstrate this in a few seconds. Going further, beyond the e-infrastructure scene, we have also added additional sources of relevant metadata to the Virtual Language Observatory, for instance from our colleagues at Europeana, who provide lots of data and metadata from the cultural heritage sector, including lots of digital books and other relevant data sets; these are now also findable through the Virtual Language Observatory. And on the other side, at the Switchboard corner so to say, we have also integrated some external services to process the data, for instance Voyant, a very popular tool in the digital humanities for doing this kind of distant reading on material. The nice thing about this picture is that it means you can now, say, process data coming from Europeana with, for instance, Voyant; all the arrows here really mean there is a connection, and it is not only a theoretical connection: it makes the data and metadata actionable all the way through the network. That is what I will be presenting in the upcoming demonstration. It will consist of two stages: first we will search in the Virtual Language Observatory for relevant texts and then perform distant reading by calling the Switchboard, and in the second stage we will take a file that has been collaboratively edited and again send it off, from B2DROP, to a specific linguistic analysis chain and look at the outcome. Okay, so let me switch to my browser. We have found the Virtual Language Observatory in the EOSC portal and we will now go to the service by clicking here, which presents us with the Virtual Language Observatory. We said that we were a researcher interested in the concept of mental health, so we will search for this, which means that we will be performing a search over about a million metadata records. As you can see, there are 18 relevant results here; with the facet browser we can further narrow down the search results. In this case we have selected those records in the so-called Oireachtas library, which is the parliamentary library of Ireland, and for instance we can have a look at a specific record in there on so-called lunacy law, a term that was used a lot at the time to refer to mental health issues. And this is really quite useful, because from here on we can get access to the actual data, so we can click on this icon
and then we get to see the resource that is downloaded from the library, and indeed you can see here that this is a digital book, a book that has been scanned and also OCRed, so we have access to the textual material. Now, as we said, we were interested in doing distant reading based on this file, so we can click on the three dots here and click on "process with the Language Resource Switchboard", which will open up the Switchboard as mentioned. What this small in-between tool actually does is match the data with tools that can process it: it has found out that this is a PDF file containing English-language text, and it now shows us all the tasks we can perform on this specific file. As said, we were interested in distant reading, so we can see some more information here, and most importantly we can now start this specific tool on the data that we found earlier. So what is being done now is that the content of the book is sent to this external service, which in this case is actually provided by a computing centre in Canada, and we immediately see some basic results, for instance the high-frequency terms. If we click on a specific term like "lunacy", we can see how often it occurs and where in the document, and much more information can be found in a similar way. Similarly, we can also perform additional analysis on this file; for instance, we could perform a grammatical analysis, so-called constituency parsing, through one of the other services available in CLARIN, in this case WebLicht, which is made available by our German colleagues. This service requires federated login, but that is quite convenient: it means I can use my account at Utrecht University to authenticate to this service and then send the data over to have it analysed. Here we get to see the interface of WebLicht and we can let the separate stages of this pipeline run and analyse the data. Since this can take a while, it is a computationally expensive operation, I will let it run in the background and already move on to the second part of my demonstration; we can have a look at the results when it is finished. The second part of the demonstration starts with logging in to B2DROP, the online storage service provided by EUDAT. Here, again, I can use federated login to authenticate to this service, and, yes, there we are. Here we have collaboratively edited one of the files that we just downloaded from the parliamentary library; you can see the PDF file, which we have converted into a text file and edited collaboratively through this interface. As you can see, we have done some basic steps like removing the title page, which is irrelevant for a linguistic analysis. Now, similarly to what we did for the PDF, we can send the text file off to the Switchboard; there is also a button here, "send to the Switchboard", and again the Switchboard automatically detects that this is an English text file and then gives an overview of the tasks that we can perform based on this input file. In this case we could say we are interested in dependency parsing, which is a different kind of grammatical analysis of the file, and we can start, for instance, the UDPipe tool provided by our colleagues in Prague. And here we see that the text has been submitted, an analysis is made, and you can get access to
the output in different ways: you can have a look at the tables, or you can also see, for instance, the tree structure behind this grammatical analysis. You can click on a specific term, say "law" or "lunacy", and see where in the results it is to be found. Okay, so this gives a quick illustration of how the data can actually be made actionable. Let me also have a look at the outcome of WebLicht. Here, I'll switch to a slightly more relevant sentence; we can see that the analysis has also been made, and again we have access to the data in several ways, either tabular or in a tree structure, and we can zoom into it to understand a bit better what is where. Again, this is something that you can later on analyse further; we don't have the time for that today, but it gives an idea of how the Switchboard and the connected tools make it possible to analyse the data and to really make the available data actionable. Okay, so let me switch back to the presentation. There are further steps that can be taken here; I'm not claiming that this is the full research process, it is just an illustration. Obviously, based on the further analysis that has been made, you could make a publication, you could again have references to the relevant data sets or to the outcome of the group analysis, and register that through a virtual collection or, for instance, in one of the repositories like B2SHARE, or a combination: you could have a virtual collection pointing to data in B2SHARE, and so on. Good, that brings me to the end of the presentation, coming back to the impact. So what have we been doing? CLARIN has integrated three specific thematic services with one another; secondly, also with several of the EOSC-hub services, the e-infrastructure services B2DROP and B2SHARE more specifically; and as a next step we also integrated this with neighbouring humanities and social sciences data sets and services, for instance through Europeana and services such as Voyant, but also many more that are in the pipeline. I think it is very important to realize that this high level of integration strengthens the multiplier effect for data and for services. I haven't even mentioned the concept of FAIR data here, but it is very obvious that if you have FAIR data, as demonstrated here, it becomes much easier to find it, to achieve interoperability, for instance through the Switchboard, and to increase productivity; that goes without saying. In the end, and I think this is the most important impact of the whole exercise, you can process more data with more language analysis tools, leading to an easier research process, less loss of time for researchers, and also this whole network effect: you are basically connecting all the data sets with all the tools. I think this is a very important achievement and also something that could help to make clear why this hub functionality of EOSC-hub is so important.
That brings me to the end of the presentation. If you're curious and want to learn more about specific examples, go and have a look at these URLs; otherwise I'm happy to take some questions. Thank you very much. Thank you very much, Dieter. We are behind schedule, as always in this kind of session, so I will start with a couple of questions that have been asked in the chat and which are for all presenters. The first one is from Sean, who asked: if you were not part of the EOSC-hub project, would you still have integrated these services into EOSC, and why? In other words, what benefits do you get from the integration of your services? Dieter, would you like to start? Good question. I think we would probably have connected some of the services, but probably a bit more loosely. For instance, one of the tighter integration parts that we have done here, and that was not literally mentioned in my presentation, is the fact that we are using some of the computational resources made available through EOSC-hub, for instance for running the component registry and some of the other backend services that we provide; some services like the Virtual Language Observatory, in its several development stages, also run on EOSC-hub provided computing resources. I think we would probably also have paid less attention to the policy side, and this is something where the active participation in EOSC-hub has really helped us a lot, in terms of understanding concepts like virtual access and all the reporting that comes with it; as bureaucratic as it might be, it is a useful lesson in how these concepts are put into practice. The same goes for the policy framework, for instance FitSM, where several of our people have followed one of the courses; it really helps a lot in understanding how to run a service professionally and how to achieve high levels of user satisfaction for your services, and I think that helped a lot. Besides that, there were also some unforeseen interactions with EOSC-hub, like our activities in the Reprolang track, a kind of workshop where people reproduced results based on natural language processing, which required lots of computational resources to be re-run at short notice. There, too, we were very happy to have this connection to EOSC-hub and the e-infrastructure partners to be able to run some of these experiments, and this worked out very well. I think that without being part of EOSC-hub this would probably have been a lot more difficult for us, so there was really a lot of added value in active participation in EOSC-hub. Thanks, Dieter. Any other comment from Daniela or Fabrizio on this question? Yes, I'm not sure I understood the question, but I see great value in having a rich scientific environment to perform research experiments without, for example, having to download data or results on the client side. I don't know if there was more to it. The question was more: would you have been doing the same level of integration without the EOSC-hub project? Sorry, Deborah, I didn't understand the question. The point was: would you have achieved the same level of integration of your thematic service without the EOSC-hub project? Let's suppose the EOSC-hub project had never existed; would you have achieved the same type of integration with other services as the one shown in your presentation? Any comment from Daniela, if you want? Yeah, please, very quickly. I share part of the
previous comments. Probably not at the same level, at least on the technical side: of course, for the integration of services, being part of the project has been helpful in creating a lot of synergy and so on. As was commented in the answers after my talk, there is also the part related to the exploitation of the dissemination programme of EOSC-hub, which is really stimulating and helpful and which bore a lot of fruit for us; I don't know if this would have happened without EOSC-hub. So I would split the question between technical and non-technical activities: from the technical perspective, of course, not at the same level, but certainly on the non-technical side I think we would not have reached the very same level of maturity and activity, also in terms of preparing training programmes and sharing with communities. At that level, I think that was a push we got from the project. I think it's clear from those two answers that if EOSC-hub still existed but you weren't part of the project, you would still be integrating your services into EOSC, but it would be more difficult because you wouldn't have the close involvement. Is that a fair summary? To some extent that may be correct. I mean, if you want, we can state it even more strongly: the vision is completely shared, so the vision of the project and the vision of my thematic service are surely well aligned, and in that respect there is the possibility to contribute in any case. Without EOSC-hub it is very hard to say what would have happened, so I don't know if you see what I mean; it's not so trivial to say. So in principle, yes, you are right, but the reality has a lot of implications: we are talking because we have had almost three years of intense activity in such a big project, so it's not that simple. Okay, thank you. Thank you, everyone. I think we are very late. Rob, shall we close here? I know there were another couple of questions, but I think we are required to close. Yeah, if you are okay with closing now, we can collect further questions from the chat. I have already copied them from the chat, so I can forward these to the speakers and they can answer offline. Sorry, but I know the staff has to get ready for the next session, so unfortunately we cannot run too long. So I would like to close by thanking all the speakers for sharing their experience today in this session; it was very important and very interesting, and I hope to see you all soon at the next event. Thank you. Bye. Thank you. Bye.