Thank you very much, dear guests and organizers. Thanks a lot. My name is Panos Bamidis. I'm the director of the Lab of Medical Physics and Digital Innovation at the School of Medicine of the Aristotle University of Thessaloniki, and we participate in the project for which this webinar is held. RAISE is the project that provides the infrastructure that we hope will facilitate a decentralized system for processing data based on the concept of crowdsourcing. This basically involves a transition from what is usually called classical processing, I would say, into something more innovative, something that will utilize open data and enable open access for processing. Instead of sending the data to the algorithm, in RAISE we will develop the facility to send the algorithm to the dataset, so that things can be done in a better way.

So what we hope to do in this workshop today, in the limited time that we have, is to introduce all participants to an initial mock-up of what we hope will become the RAISE system. We'll show how this will actually be done: how the processing algorithm, which is usually small in size, will be sent directly to the dataset, which is usually large in size. So we avoid a lot of traffic, but we also safeguard what can be done with our data when we make them open in the system. The current mock-up is an interactive version that demonstrates some basic functionalities of the system. It will allow us to demonstrate what we have named the Research Analysis Identifier System. Through this navigation we will demonstrate all the basic features to you, the participants, so that you can better understand the system and perhaps provide some valuable insights into how we could improve the user experience. I don't want to talk long; I just want to say how important this is.
This is the first public interaction of the mock-up system with its users, as I mentioned, but the goal is for all of us to appreciate the value of not transferring data around, but rather doing only what is minimally required, which is more secure in the sense of protecting the IP of the people who have collected the data as well as of the people who have created and developed the algorithms. So hopefully this mock-up presentation will be followed by a discussion in which you can share your experiences, make suggestions, and do all the brainstorming that is achievable in such a short timeframe. Well, I'd like to stop here. I thank you in advance for your participation and I hope we have a fruitful webinar. Thank you.

Thank you, Mr. Bamidis. Let me pass the floor to Dr. Evdokimos Konstantinidis now, who will talk about Open Science and the RAISE project. Dr. Konstantinidis.

I want to make sure that you can see my screen. Yes, sure, go ahead. Thank you. Can you see the full slide? Yes, it's all good. Okay, thanks.

Following Panos' introduction, I will go a bit deeper into the details of what Panos described. He gave us the vision of what we want to achieve, but there are intermediate steps and milestones that we have to reach. First of all, I am sure that all of us are aware of the value of open data: increasing transparency and accountability, improving decision-making, increasing innovation and efficiency, but also driving economic growth, improving public services, and empowering citizens. So, before going into the details, I would like all of us to ask ourselves: how many of us believe in the value of open data? And let me answer on behalf of all of us that we do believe in the value of open data. I'm sure this is why we are here.
This is why we have this webinar. But how many of us look for available open data before we start collecting our own for a study? If I speak for myself, I would say that I always do that, but not as a first step. In the beginning I try to understand how I should collect the data, and only then do I say: but wait, there is the open data community, let's see if there is anything there. How many of us have collected data in the last five years? I'm sure many, many of us. And how many of us have made these data open and available? I know the answer, but is it only us? No, because researchers actually feel more secure, they feel happier, storing their data not as open data but in personal physical storage, institutional local storage, or the institution's cloud. And they prefer to share their data mainly with researchers working on the same project, which is sometimes due to obligations that come from the funding mechanisms, or with researchers who do not work on the same project but whom they know personally. So you see that there is a culture of not sharing the data in a way that would be beneficial for everyone.

So the obstacles to opening research data are, some of the time: the lack of data repositories, or the complexity of using a data repository; the fear of someone commercializing our data or the analyses performed on our data; lack of recognition, where someone uses your data, publishes, and never acknowledges your contribution; and the time and effort required not only for collecting the data but also for making them openly available, which means you might need to transform them, learn how the repository works, learn how the metadata work, and also create the metadata. This also implies the skills needed for something like this.
There is also the financial cost: since we all understand that opening the data takes time, you might have to decide whether you move on with your study, or you pause it and invest time and money in opening your data. And there are legal restrictions, such as sensitive or personal data, especially in the health domain that most of us come from, where things are difficult, which also comes from data protection and the GDPR. So who should change? Should it be us, the researchers, pushing ourselves to follow what open data needs, or should the mechanisms for open data adapt to the researchers' fears and our culture? If I had to answer, I would say that both sides need to make some small steps in order to meet in the middle. And this is actually what the RAISE project does, with a consortium of 20 partners, of which more than half are technical partners. We have three more years and five million euros in order to materialize part of our vision: an open, fair, and reliable research community where every researcher is credited for their work and all research data are equally accessible for processing without violating data protection regulations.

So the mission of our project is to move, as Panos mentioned earlier, from open data to data open for processing, because we believe that the value of the data is not in owning them but in processing them. How does it work? I will briefly explain the idea and then go through it step by step. The idea is that when we have a dataset, we don't have to upload it to an existing open repository; instead, we register it as available for processing. Then other, external researchers can send their algorithm, the script, to the place where the dataset is, and get back the results. Let's see, step by step.
So, as I said, researchers can make their data available for processing by uploading them to their own or any other RAISE-certified node. A RAISE-certified node is a server of a trustworthy crowdsourced network offering data storage and processing resources. It could be an organization's cloud server, a small lab's server, or our own computer at home. Then any researcher can look for datasets that meet their needs based on the metadata, and can find and access datasets or existing results from previous processing algorithms that ran on a dataset. Once they find the dataset, they can download a sample: a small-scale dataset or synthetic data derived from the source dataset, in order to prepare the algorithm and make sure that it follows the format of the dataset and can process it. Once they have done this, they can upload the script or algorithm to the place where the dataset is, to the RAISE node, possibly someone's personal computer, and there the script is executed. The results of the execution are registered on a blockchain network along with the dataset, its metadata such as the owner of the dataset, the results, and also, if possible, the algorithm or the script. This registration yields the Research Analysis Identifier, which is a persistent identifier. Then we can publish the results by publishing this Research Analysis Identifier number, because anyone with this RAISE number can find the results, but also who the owner of the dataset was and which algorithm or script was used. And a researcher with this number can validate the results through the RAISE system. I'm sure all of you are familiar with the Digital Object Identifier, the DOI, where you enter the DOI and get all the information about the publication.
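The step-by-step flow just described, registering a dataset on a node, sending the script to the data rather than the data to the script, and registering the results under a persistent identifier, can be sketched in a few lines of Python. Everything below (the `RaiseNode` class, the method names, the `raid:` prefix, the record layout) is an illustrative assumption for the sake of the walkthrough, not the project's real API:

```python
# A minimal, self-contained sketch of the RAISE flow described above.
# All names here are invented for illustration, not the real RAISE API.
import hashlib
import uuid


class RaiseNode:
    """A toy stand-in for a RAISE-certified storage and processing node."""

    def __init__(self):
        self.datasets = {}   # dataset_id -> {"metadata": ..., "rows": ...}
        self.registry = {}   # RAID -> experiment record (the "blockchain")

    def register_dataset(self, metadata, rows):
        # Step 1: the owner registers data as available for processing;
        # the raw rows never leave this node.
        dataset_id = str(uuid.uuid4())
        self.datasets[dataset_id] = {"metadata": metadata, "rows": rows}
        return dataset_id

    def sample(self, dataset_id, n=2):
        # Step 2: a researcher downloads a small sample to prepare a script.
        return self.datasets[dataset_id]["rows"][:n]

    def run_script(self, dataset_id, script):
        # Step 3: the script travels to the data, not the other way round.
        ds = self.datasets[dataset_id]
        results = script(ds["rows"])
        # Step 4: results plus provenance are registered and receive a
        # persistent Research Analysis Identifier (RAID).
        raid = "raid:" + hashlib.sha256(
            (dataset_id + repr(results)).encode()).hexdigest()[:12]
        self.registry[raid] = {
            "dataset_id": dataset_id,
            "owner": ds["metadata"]["owner"],
            "results": results,
        }
        return raid

    def resolve(self, raid):
        # Step 5: anyone holding the RAID can look up results and provenance.
        return self.registry[raid]


node = RaiseNode()
ds_id = node.register_dataset(
    {"owner": "Lab A", "category": "health", "format": "CSV"},
    rows=[{"hr": 72}, {"hr": 80}, {"hr": 65}],
)
raid = node.run_script(
    ds_id, lambda rows: {"mean_hr": sum(r["hr"] for r in rows) / len(rows)})
record = node.resolve(raid)
```

The point of the sketch is the direction of travel: only the small script and the small result record move between parties, while the dataset stays on its node.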
Similarly, here you can enter the RAISE ID and get all the information about the experiment, or rather all the information that the owner of the experiment allowed to be publicly available. By this I mean that the dataset information is always available, but the script or algorithm might be proprietary, so not open, or it could be open to everyone, because, as you understood, we want to eliminate the researchers' fears about opening their data, but we don't want to block companies from commercializing their own algorithms. And since we have the script, an existing script can also be applied to other, different datasets, or we can even find similar datasets and create a big group of small datasets to which we can apply the algorithm. And of course, as we said before, we then get the results, and these results can be shared and processed further by the research community.

As for what we offer: at the Aristotle University of Thessaloniki we have the infrastructure for providing the central hub services, because every request goes to the central hub and is then routed to the corresponding processing node, the RAISE node. We provide the research community with services like synthetic data generation, data curation and preservation, data versioning, and many other services across the application life cycle. And for the beneficiaries, the data providers and the data processors, we try to make their life easier by providing the corresponding APIs, SDKs, and other tools, in order to make sure that such a complex system can be easily used by the research community.
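The DOI-like lookup with the visibility rule just described (dataset provenance and results always public, the script only if its owner opted in) could behave like this small sketch; the record layout and the `<proprietary>` placeholder are assumptions made for the example:

```python
# Sketch of resolving a Research Analysis Identifier (RAID), with the
# visibility rule described above. The record layout is invented here.

RECORDS = {
    "raid:ab12cd34ef56": {
        "dataset_owner": "Lab A",
        "results": {"mean_hr": 72.3},
        "script": "print('analysis source ...')",
        "script_public": False,  # owner kept the algorithm proprietary
    }
}


def resolve_raid(raid):
    """Return only the publicly visible part of a registered experiment."""
    rec = RECORDS[raid]
    public = {
        "dataset_owner": rec["dataset_owner"],  # provenance: always shown
        "results": rec["results"],              # results: always shown
    }
    # The script itself is exposed only if its owner opted in.
    public["script"] = rec["script"] if rec["script_public"] else "<proprietary>"
    return public


info = resolve_raid("raid:ab12cd34ef56")
```

This mirrors how a DOI resolves to a publication's metadata: the identifier is public, while what it reveals is controlled by the owner's choices.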
The application domains for experimenting with the RAISE system during the life cycle of the project are health, mobility, and the environment, but we also try to have a cross-disciplinary study, in Thessaloniki, where we try to make combined use of health, mobility, and environmental data in order to come up with insights for the city. As I said in the beginning, we are mainly a technical consortium, 11 out of the 20 partners are technical partners, and a high TRL is expected at the end of the project, so that the system is ready to be used by the EOSC and the research community. That's all from my side. Thank you.

Thank you very much, Evdokimos. Now, Despoina Petsani will present the initial mock-up of RAISE's approach, which is an interactive version that demonstrates the basic functionalities of the Research Analysis Identifier System. Despoina, the floor is yours.

Thank you. I will share my screen; I hope that you can see it properly. Good, thank you. And apologies, because I have two screens, so I will be looking at both from time to time. So, this session is about scientific community engagement and how we envisage this happening in RAISE. And of course, we wanted it to be an interactive session because, as Panos said, we want RAISE to gather feedback from the whole community in order to create something that is useful for them. We talk a lot about open data, open science, open software, what the EOSC does, what skills are needed, citizen science, all these themes and ideas that are very central for the research community. But who is at the center of all this, and who benefits? Who is the one that will take advantage of open science tools and open data? For us, and for you, the answer is the researcher, the scientist. So this is whom we tried to put at the center of the whole design process in RAISE.
As you might have heard, there is a popular saying that if you had asked people some years ago what they would like for getting around, they would have asked for a faster horse. So, of course, we should have a central idea to build upon, and not only ask people what they would do. We have the central system and idea of RAISE that Evdokimos presented, but we want to build it in a way that the researcher and the scientist can use. We want to start from what the researcher needs, and not from what the data or the software or the innovation needs. So we are willing to sacrifice, and we must sacrifice, technological innovation in order to create value for the end user. There is no point in creating something that might be very innovative but that no one will use. So, we have a whole work package dedicated to this in RAISE, which is monitoring and assessing this, and this webinar is the first of a series of interactions with the community that aims to gather this feedback: to show what we have to the community, to experiment on how they can use it and what they can get back, and to learn how we can improve the system. So, we want a system that addresses the concerns that we have and does not create new problems about how we will store, access, and process data. And we try to build on the current culture. We try to look at what is currently happening, how people currently share and analyze their data, in order to provide something that will not disrupt their everyday work but will actually build on top of it and add tools to their toolkit. We want a system that is intuitive, easy to use and, of course, useful. So, we want your views, and as we move on we will have a more interactive version with which you can experiment, the very first version of the system. And to do all that, we have a methodology with several steps of releases of the RAISE system. And for each release, we involve the community that is being created.
So, we hope that you, as part of this first webinar, are part of this community that is being created, and that you will follow the releases of the system and provide your feedback as we continue. We are now at the first release, which demonstrates simple data upload and data finding functionality, which you will see in a bit, with simple mock-ups that are interactive up to a point, of course, because they are still mock-ups. Then, moving on to the next step, we would like to provide the functionality of preparing and analyzing a script, the scripts that will be sent to the datasets for analysis. This functionality is the next step to be designed and delivered to the end users. Then we have remote processing and registering of experiments: how you can process the data remotely and how you can register the experiments to get that persistent ID that Evdokimos talked about, which will be similar to a DOI but for your experiment and dataset, and which will give credit for the work that you have done in collecting the data, but also in analyzing them. Then we want to move on to getting the registered results and being able to publish them, because in a community of open science we don't only want to share and make available our data; we also want to share the analyses that we do, so that anyone can benefit, build on top of them, and move science and scientific evidence further. So we want the registered results, the results that have this persistent ID, to become available and published for others to check and build upon. These pink lines on the slide build up the whole RAISE system, and then we move to another version, the blue ones, which adds more functionalities on top of the basic system. So, we want to offer popular-dataset preservation: popular datasets from out there, but also from inside the system, datasets that are processed a lot, get specific preservation and mention.
We also want a researcher to be able to enhance a processing script, because in the first version you can only upload new scripts, but we want to reach a point where you can enhance your scripts and make changes in order to improve your experiments. Then, deployment of new nodes: nodes are what we call the local storage systems that will be installed in various places across Europe during RAISE, and of course we want this to continue after the project with new deployments. Our vision is to create crowdsourced data storage, so that you have the infrastructure you need for storing your data and can then easily send your scripts to the data for analysis. And then, after gathering all this feedback, we want to make final improvements during the project lifetime. So, after each of these steps, we will have interactions with the RAISE community, either via webinars, questionnaires, maybe one-to-one interactions, or other ways of adding comments directly to the system, in order to gather requirements and make progress on what we want to do.

So, I think I've talked enough; it's time for you to start telling us your opinion and what you think. I hope that you are all familiar with Menti; if not, it's nothing difficult. If you are on your computers or on your phones, please open a new browser tab, go to menti.com and add this code. Or, if you don't want to do that, you can simply scan this QR code that you see on your screen now with your phone's camera. Or there is a third option: you can get the link from the chat, which my colleague Lenny will kindly share with you, so that you can enter. So, I'll leave the QR code up just a bit longer; four people have joined. The code is 11435592 for those who want to enter with the code. Okay, good, you have already started answering. Just wait a minute, you have just entered. So far only two people have entered.
So please, the rest of you, go to menti.com and enter this code. It will ask for a code; I can show it to you. If you go to menti.com, it will ask for a code and you will enter 11435592. Anyone else who would like to join? Only four people? If you have any difficulties, please unmute yourself and let me know. And do not answer yet; the question has just landed. Okay, five people, seven. Do not answer yet, I have something to show you first. Yes, the test is working. Okay, eight people, ten. Does anyone want to ask how to enter, or shall we continue? Okay, so please, the rest of you, enter; we would like to hear your views.

So, let me go back; I hope that you can still see my screen. The first mock-up and the first flow that we have is for the data consumer: the researcher who wants to enter the system in order to find datasets, analyze them, and get the results back. So, I am a researcher, I have a research problem and I need some data to answer my research question. What can that person do? If you have an account, you can just log in; if not, you need to click "create an account". So imagine that we have an account and we enter. And here is my board. In my board, what the user can see is my datasets, my scripts, and my experiments. My datasets are the data that I have chosen to process. My scripts are the programming scripts that I want to upload in order to process the data. And my experiments are the actual runs of the scripts on the data that I have chosen. So, this is what we are proposing for the board, and that's where the first question goes: what would you like to see in your main board? Is that enough? Would you like to see something more, or something less? So now, please answer in the Menti that you have open. Those of you who have entered the code are already registered.
So, there's no need to enter the code again; just enter your answer. And if you are having trouble with that, you can just unmute your mic and tell us what you think should be included in this board. So, I see here: the available datasets. That is available in the metadata description plus in the data categories. Okay. Origin of datasets, okay. So, if you want to speak, what do you mean by origin of datasets? Where the data come from, who collected them, and from which experiments? Okay. So, this is the general view, and some nice ideas about metadata: what is available, where, and from whom. But let's see if we cover that. If you go to my datasets, you can see all the datasets that you have saved for use: the ones that are your favorites, the ones that have already been used, the ones that are unused, and the ones that you currently have no access to. So, if we go to a dataset, here we can see a general description, some details that are the metadata of this dataset, the files, here is a CSV for example, and similar datasets that you can find in RAISE. So, here goes the next question: what information would you like to have for each dataset? Is what we have thought of enough? Do you want more? What are your ideas? Would you like to have something different? So, if you go back to your Menti you can answer, or you can just speak up, or writing it in the chat is okay for us. Dataset type or category. So, do you want to comment more on what you mean by type? Do you mean whether they are CSV or JSON, or do you mean the domain? What would you expect to see as the categories? Okay, you want to see how the data really are. Okay, and would you also like to see categories for the domains, for example health-related data, mobility-related data, environmental data? Would that also be useful?
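The metadata fields being discussed here (category, format, origin, and so on) are what would drive dataset discovery. A tiny sketch of such metadata-based filtering, with field names and a catalogue invented purely for illustration, might look like this:

```python
# Illustrative sketch of metadata-based dataset search, as discussed above.
# The catalogue and field names (category, fmt, node) are invented here.

DATASETS = [
    {"name": "City air quality", "category": "environment", "fmt": "CSV",  "node": "AUTH"},
    {"name": "Gait sensors",     "category": "health",      "fmt": "CSV",  "node": "Lab B"},
    {"name": "Bus GPS traces",   "category": "mobility",    "fmt": "JSON", "node": "AUTH"},
]


def find_datasets(catalogue, **filters):
    """Return the datasets whose metadata match every given filter."""
    return [d for d in catalogue
            if all(d.get(key) == value for key, value in filters.items())]


# e.g. a health researcher looking for CSV data:
hits = find_datasets(DATASETS, category="health", fmt="CSV")
```

The richer the metadata each dataset carries (domain, format, origin, hosting node), the more precisely a researcher can narrow the marketplace down before requesting access.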
Would you expect something like that? Feel free to write it in the chat or just open your mic and talk. That's the main thing, okay. Responses, publications, oh, that's a good idea, thank you. So, you can think of whatever more you want to add, statistics for example. So, these are the datasets that you have chosen to work with, let's say, but there is also the dataset marketplace, in which, as you can see, we have the category, the data format, the date, and the lab, that is, in which node, as we call it, the dataset is located. We also have a search field with which you can narrow things down. And here you can see all the datasets that are available. You can, for example, choose one and either save it or request access. If you request access, you will send a request: some datasets are available instantly, but for some you need to request access. And then, when you have access, you can go back to your datasets and find the ones that you already have access to, but also the ones for which access is pending. For those you will need an approval from the lab that the dataset is coming from. Let's see, we have more: geographical coverage. Okay, so where the data are coming from, or the geographical coverage of the actual information in the dataset. The person who entered that, please clarify. Metadata, data collection protocol, okay. Preview of the data, that's very interesting. Okay, so we have seen more or less how you can locate data, how you can request access, and how you can store or save your datasets for further use. Let's see how my scripts are structured. My scripts are the scripts that I create in order to process a dataset that I have stored in my datasets. You can have several scripts, and you also have the status: have you tested it, have you already run that script on your dataset or not?
And if it's a big, time-consuming process, the status might be "running", because it's still running on the dataset and hasn't produced results yet. So, here you can also see the script information: the script that you have uploaded and the requirements that you have uploaded along with it, for some information about the script. And we can go to the next question, which is: what information would you like to have for your script? What would you like to see more of? If you're okay with it as it is, you can write nothing, but please feel free to propose something more that can help you when you have a script and you want to run it on a specific dataset. So, what would you see there: requirements. Do you mean the requirements text file, the TXT? If you could please write some more info, or share with us what you have in mind. Interesting. So, the programming language the script is written in, which could also be inferred from the file extension, but that's important, thank you. Any other thoughts from anyone? Please also unmute yourselves and speak; keep your cameras closed if you want. Okay, you want to ask?

Can I ask a question? I'm Tasos from Inovax. How is it possible to create the scripts for a specific dataset? Do we have a clue how it is organized? Or are there sample scripts where you can understand how to use the data?

Yeah, that's a very good question. You actually can. This is a mock-up, but here you can download the sample. Of course, because this is mock-up data, the sample will not actually be downloaded. So, you can download the sample, run some experiments locally with the script, and then, when your whole script is ready, upload it and run it on the whole dataset. Okay. Was that not your question? No? Okay, it was clear.
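The workflow just described, download a small sample, develop and test the script locally against it, then submit the same script to run on the full dataset, could look roughly like this; the file contents and column names are made up for the example:

```python
# Develop against a downloaded sample before submitting to the full dataset.
# The sample contents and column names below are invented for this sketch.
import csv
import io

# Stand-in for the small sample a researcher would download from a node.
SAMPLE_CSV = """patient_id,heart_rate
p1,72
p2,80
"""


def analyze(csv_text):
    """The 'script' that would eventually be uploaded and run on the node."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rates = [int(r["heart_rate"]) for r in rows]
    return {"n": len(rates), "mean_heart_rate": sum(rates) / len(rates)}


# A local dry run on the sample verifies the script handles the format
# correctly, before the very same function is sent to the full dataset.
local_result = analyze(SAMPLE_CSV)
```

Because the local dry run and the remote execution use the identical function, a script that works on the sample should also read the full dataset, which is exactly the safeguard the sample download is meant to provide.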
But in the same manner that you provide a sample, I think it's a good idea to also have a sample script that will help us, maybe together with the full dataset. A sample script would make it faster for us to run a test, meaning that we use the fields of the data that we need, in order to understand the dataset as well.

No, that's a very good idea. And in this way, if I can intervene, this is why we would also like the first researchers that use a dataset, if they make their script available to the community, meaning they don't care about protecting it and are okay with disclosing it, then other researchers can use it as an example in order to see how to read the data. And maybe, if another researcher has already implemented a filter, let's say, for the dataset, then you can use it in your experiment and move on with your own algorithm.

Yeah, exactly. And to add something to what Evdokimos said: you may have ten health datasets, let's say, but you will lose a lot of time figuring out which one is best for you. So having an example will help you to save some time and say, okay, this one is better for me, based on what we discussed. Yeah, that's a very good idea, thank you.

And let's continue a bit more. So, this is how you upload your new script. There is the name, and a URL option, or there is a file option where you can pick the file from your computer. And of course, if you click upload, you have the new script, and you can go back and see the new script uploaded among your scripts. You can then use that script once it is tested, and you can constantly upload new scripts to run new experiments. And so we have another question: do you have any comments on the script upload? How do you find it? Do you find it cumbersome to use, or is it okay for you?
What would you propose as improvements? How would you do it differently? So, I'll leave this question open for your comments. Can you refer to older versions of scripts? Actually, right now that is not available, but we hope to deliver it in the enhanced processing-script functionality that we will add as we continue. This is an ongoing process that we want to improve, and that is a very good idea: being able to refer to older versions if you have made a mistake or want to rethink something. So please brainstorm on what we can do to improve this functionality, and I will show you some more.

Last is my experiments. An experiment is a dataset plus a script. You can see here I have script one on dataset one, and the experiment is running, because that dataset is huge; it is at 58%. And I have another one, for example, script one on dataset one, that is completed: I have the results, which I can download, the results that my script has produced. And there are also the ones that have produced an error. So when an experiment is finalized, it might result in something completed, but it also might result in an error that has stopped the process, and you will get the error messages that were returned. You have already added some comments on how we can have the script and the dataset combined in the view, but how do you see this run-experiment functionality? Do you find it easy, or do you have any comments? I will leave that question running as I show you some more flows that we have.

You can also access the system as an unregistered user; we want to have that option in order to be more open. And if you access it as an unregistered user, of course you will not have my board, and you will not have scripts or data that you can access, or experiments.
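The experiment record just walked through, a dataset plus a script, with a status of running, completed, or error, plus progress, downloadable results, and returned error messages, can be sketched as a small data structure; the exact layout is an assumption for illustration:

```python
# Sketch of the experiment record described above: a dataset plus a script,
# moving from "running" to "completed" or "error". Layout is invented here.

def make_experiment(dataset, script_name):
    """A freshly started experiment: a (dataset, script) pair in progress."""
    return {"dataset": dataset, "script": script_name,
            "status": "running", "progress": 0,
            "results": None, "error": None}


def finish(exp, results=None, error=None):
    """Finalize an experiment as either completed or errored."""
    if error is not None:
        exp["status"] = "error"
        exp["error"] = error          # error message returned to the researcher
    else:
        exp["status"] = "completed"
        exp["progress"] = 100
        exp["results"] = results      # made available for download
    return exp


ok = finish(make_experiment("dataset-1", "script-1"),
            results={"mean": 42})
failed = finish(make_experiment("dataset-2", "script-1"),
                error="column 'hr' missing")
```

Keeping the three states mutually exclusive, with results only on completion and an error message only on failure, matches the board view described in the mock-up.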
Those are only for users that are registered, but you will have access to the dataset marketplace, and from there you will be able to see what each dataset contains and also download the sample, and, as was proposed, we could also have not only a dataset sample but also sample code that you can download. You can also see the details, the data category, type, et cetera, and the description that has been provided, so that someone can first see what is there and then decide whether he or she wants to register to get access.

And last is the flow for the data provider: suppose I have a dataset that I want to make available for the research community to use. So again, you log in, and then in my datasets you can click upload new dataset. You have to write a name for the dataset and a description, which is the main part of its presentation in the marketplace, and also all the information that we request: the contributors, license, categories, all the information that you have seen in the marketplace. Some of these are mandatory. You also need to select to which node you will store your data: the datasets are stored in the local places that RAISE provides for storage, and these are located in different areas, in order to give that crowdsourced opportunity of space and infrastructure, so you can choose. Once you have filled in all the options, you get a message that the dataset is uploaded, and you can go to your datasets and see the dataset that you have uploaded. From here you can also see its status: how much it is used, whether it is popular. If it is popular, you get more credits, and you will also be mentioned, through this persistent identifier, for the data that you have offered. And so this is also a question, about the data upload functionality. And, very quickly, those of you that have disconnected from the Menti, please enter again.
And this is a final question that we would like you to answer. So please go to menti.com, use this code, and answer, from strongly disagree to strongly agree, about the current version that you see, as honestly as you can: whether you believe that this system will be easy to use, whether you find the functionality useful, whether you imagine that people will learn to use it quickly, and whether you think that you yourself would actually use it. So these are a few very simple questions; please enter the Menti and answer as honestly as possible. We want to improve this; we don't want it to be something that you will not use. There is no need to build something like that, so this is very important for us. And with that, I will conclude my presentation in this webinar, and I will also ask you to get involved in this process, in this attempt of ours to create a system that will help us analyze data and find the available datasets. You can go to the link that has been posted in the chat and sign up to our community in order to help us make the system better. That's all from me; we have one minute to go for any comments.

Thank you very much, Despoina. We have only one minute. Thank you everyone for your valuable input. Please feel free to raise your hand if you have any questions in the little time that remains, or use the chat function. So perhaps we can close this webinar here, and thanks very much everyone for participating. We will follow up with all of you to share the recording and the slides, and also some ways for you to get more actively involved. We hope to see you at another RAISE science event. Many thanks also to our presenters.