Okay, I will start. So, this is a very impressive conference, and as the conference moves from OpenStack to Open Infrastructure, you should expect more and more topics beyond just OpenStack and cloud infrastructure. My presentation is about moving data science projects, which are mostly developed as Python Jupyter notebooks, to the cloud after the analytics application has been developed, and this is where all the difficulties appear. First of all, a data science or big data application is still a software application that needs to be deployed. But the interaction between the data science team and the DevOps team and engineers creates a kind of misunderstanding, and this is actually what we are trying to solve in our work. I will first talk about the motivation for why we are doing this, then give some information that is quite useful for DevOps teams, software developers, and data science teams. After that I will explain how we run this experiment on the Amazon and Azure clouds. Finally, as you will see on my last slide, we are ready to look at OpenStack as a basis for the tasks we will develop and for education, because a very big part of my work is education and training, both for students and for my colleagues who work on research infrastructures. Now the motivation. The first item is that we work at the University of Amsterdam, which is branded as a research university. This means we run a lot of infrastructure and application development projects in cooperation with industry. At the same time we need to address education and training, both for students and for specialists in different projects.
The range of training that our faculty provides actually covers application areas from the maritime industry to data science, artificial intelligence, and data spaces. And currently Europe is building research infrastructures; I don't know how many people know about this interesting area of activity and actual investment in Europe. Europe runs a number of research infrastructures that serve all researchers in Europe: infrastructures like EOSC, the European Open Science Cloud; EGI, the European Grid Initiative; GÉANT, the network infrastructure for universities; PRACE, the supercomputing infrastructure; and also an infrastructure for brain emulation. SLICES is now a new infrastructure that will be built over around five years, and one of the tasks we are currently looking at is to select technologies and establish a platform for running experimental research and providing services to users. Of course we follow the best development practices and cloud standards, and we call this platform "research infrastructure as a service"; it actually uses the platform model from the TM Forum. That is not in this presentation; I am only covering aspects related to what we call running data-driven research, whose core is data preparation, data processing, and so on, together with research infrastructure management. You can see that the list of these research infrastructures covers the whole of Europe, with many sites, running hundreds of courses on research computing and supercomputing. Running these infrastructures requires really special training for the personnel.
These infrastructures run IT and development courses, and currently we are looking at SRE, because we really see it as a benefit for running what we call data-driven research. By the way, if you look at the SRE books from Google, they actually describe the whole cycle for supporting infrastructures that are user focused but data driven, which is an interesting aspect. We are not yet at the point of teaching or training this, but we are preparing to include it in our solution. Our motivation, and what I will cover in this presentation, is that data-driven research requires data science and analytics for data processing, and also methods and practices to manage the whole lifecycle of a data-driven application. What is different from normal software? There are two cycles, or maybe more. The first is the cycle of developing the machine learning model; after that comes deployment, using the standard DevOps process. Besides this, we also need to manage the whole data lifecycle, because if we work with data we first of all need to collect and store it; we then use it for model development, and in production we use real or real-time data. All stages of a data science project need data staging and lineage support, for many reasons, so managing data storage, preservation, and lineage is an important aspect of all data-driven projects. On this slide I simply want to draw your attention to a problem with data science projects: the majority of them start as a standard Jupyter notebook, and not every Jupyter notebook originally runs on the cloud. There is a statistic somewhere that around 80% of data analytics projects end as nothing, because they could not be implemented or deployed on real infrastructure, which is typically cloud based and big data.
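The data staging and lineage point above can be sketched in a few lines: a minimal, hypothetical lineage record that fingerprints each dataset version with a content hash, so every stage of the project can be traced back to the exact data it consumed. The function and field names here are illustrative, not from any specific lineage tool.

```python
import hashlib
import json
import time

def lineage_record(stage, data, parent_hash=None):
    """Record which data bytes a pipeline stage consumed, keyed by a content hash."""
    return {
        "stage": stage,                              # e.g. "collect", "clean", "train"
        "sha256": hashlib.sha256(data).hexdigest(),  # fingerprint of this data version
        "parent": parent_hash,                       # hash of the upstream version, if any
        "timestamp": time.time(),
    }

# Chain two stages: the cleaning step records which raw version it derives from.
raw = b"sensor_id,value\n1,0.42\n2,oops\n"
rec_raw = lineage_record("collect", raw)

cleaned = b"sensor_id,value\n1,0.42\n"  # bad row dropped during cleaning
rec_clean = lineage_record("clean", cleaned, parent_hash=rec_raw["sha256"])

print(json.dumps([rec_raw, rec_clean], indent=2))
```

Walking the `parent` links from any record reproduces the data path of the whole project, which is the minimum needed for the reproducibility and lineage requirements mentioned above.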
So managing a running big data application is a critical problem. What do we need to do? We need to give data science developers the knowledge of how to deploy to their target infrastructure, and give software engineers the knowledge of how data science projects are organized, what their processes are, and what to look for. This is one of the tasks we solve in our current project on research infrastructure for digital research and education. Education itself is actually a big problem if you try to run courses where the students are sometimes much more experienced than you as a teacher. Still, we try to provide benefits for students and give them a system: for data science students we provide some basic knowledge of DevOps and software engineering, and for software engineering students we provide an introduction to data science projects. This is how we work. Returning to the data science development process: it contains a number of steps that are not typical or standard for software engineering, because it starts with collecting data and involves a lot of work with the data. Data preparation and cleaning is typically, as some people say, up to 90% of a data science project. After combining the data comes the actual data science or machine learning work: feature engineering, algorithm selection, model training. Before deployment, the analytics application needs to be serialized into a specific machine learning model format; we will look at the formats and the processes. The serialized model is then moved to the deployment stage, which is typically done by software engineers, sometimes called data science engineers or big data infrastructure engineers, depending on how the company defines the position.
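The serialize-then-hand-over step described above can be sketched with Python's built-in pickle; real projects more often use joblib, ONNX, or PMML, and the `ThresholdModel` class here is a toy stand-in for a trained model, not any real library's API.

```python
import pickle

class ThresholdModel:
    """Toy stand-in for a trained model: predicts 1 if x exceeds a learned threshold."""
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

# Data science side: "train" the model, then serialize it to a deployable artifact.
model = ThresholdModel(threshold=0.5)
artifact = pickle.dumps(model)

# Engineering side: deserialize and serve the artifact, without retraining.
served = pickle.loads(artifact)
print(served.predict([0.2, 0.9]))  # -> [0, 1]
```

The point of the handoff is exactly this split: the engineering team only ever sees the serialized artifact, which is why agreeing on a format (pickle, ONNX, PMML, and so on) between the two teams matters.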
This model actually shows you everything: the first group of stages is related to the data science development part, and after that come deployment and production operation. A lot of problems concern what kind of data you use at each stage. The recommendation is that the datasets used in the different stages are kept separate, because if you mix or intersect them you get a kind of model drift and problems in the final product. If you look at the standard DevOps model, it is well known to you, but the question is where we combine the two, and possibly that is only after the model is developed. And if you compare how DevOps and MLOps relate to each other, they are more or less the same: standard DevOps has develop, build, deploy, operate; MLOps, or DataOps, benefits from the same well developed DevOps processes and adds the data-related issues on top. I will not spend long on data science process models. The first model, CRISP-DM, is something we advise software engineers to learn in order to understand what a data science or data analytics project is and which steps are needed. This model includes stages like business understanding, data understanding, data preparation, modeling, evaluation, and deployment, and the whole cycle runs iteratively through each stage. That is the more generic model. A more specific model was developed by IBM, the Analytics Solutions Unified Method (ASUM), where you clearly see the cycle of developing the data science or machine learning model, followed by deployment, preparation, and optimization, and how they need to interact. It is quite an interesting model, and if you are trying to master this area it is quite useful to know these best practices.
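The "MLOps is DevOps plus data issues" relation above can be pictured with a tiny sketch; the stage names are just an illustrative summary of the talk, not a standard taxonomy.

```python
# Standard DevOps loop, as named in the talk.
devops = ["develop", "build", "deploy", "operate"]

# MLOps keeps those stages and prepends the data/model cycle in front of them.
ml_stages = ["collect data", "prepare data", "train model", "serialize model"]
mlops = ml_stages + devops

for stage in mlops:
    marker = "shared with DevOps" if stage in devops else "ML-specific"
    print(f"{stage:16s} {marker}")
```

The combination point the talk asks about ("where do we combine them?") is visible here: the DevOps tail only starts once the serialized model exists.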
In particular, the Microsoft Team Data Science Process (TDSP) is very well developed and actually implemented as a process template in the Azure DevOps framework, with the same stages; on the left and right you see the specific stages and the artifacts you need to produce for each. It is well developed: you get a full template for the project, and you can run it. As for data analytics model formats, there is a whole zoo here, but knowing the formats that allow serializing machine learning models is quite useful. And again, if you, or we, try to make the same available from OpenStack, this is actually what you need to do to take this niche of supporting data-driven data science projects. Currently we have investigated well, and use for education, the Amazon and Azure platforms. We tried Google as well, but for education we use mostly only Amazon and Azure. The most powerful tool on Amazon is SageMaker, which covers the full stack and full cycle of developing a machine learning application; after that you use the standard way of deploying the application to the cloud. This is how you do it: on the left you extract the CloudFormation template and deploy it, and the whole cycle shown in the big rectangle runs on SageMaker and allows many tricky things, such as tuning the model and selecting the algorithm automatically. Azure has the same inside its MLOps framework, and one of its interesting benefits is that it also provides a special Data Science Virtual Machine, quite affordable even for education purposes, which plugs the virtual machine fully into an environment with everything needed to run and deploy data science projects. On this point I just want to say that, for the purpose of education and training, we actually developed a so-called body of knowledge.
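The "standard way of deploying" on SageMaker mentioned above boils down to three API calls against the SageMaker service (create_model, create_endpoint_config, create_endpoint, as exposed by boto3). Since actually issuing them needs live AWS credentials, this sketch only builds the three request payloads; the S3 path, container image, and IAM role below are placeholders, not real resources.

```python
def build_sagemaker_deploy_requests(name, model_data_url, image_uri, role_arn,
                                    instance_type="ml.t2.medium"):
    """Build the three request payloads used to deploy a serialized model
    as a SageMaker endpoint (to be passed to boto3's sagemaker client)."""
    create_model = {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    create_endpoint_config = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    }
    create_endpoint = {
        "EndpointName": f"{name}-endpoint",
        "EndpointConfigName": f"{name}-config",
    }
    return create_model, create_endpoint_config, create_endpoint

# Placeholder values; replace with a real S3 artifact, ECR image, and IAM role.
reqs = build_sagemaker_deploy_requests(
    "demo",
    "s3://my-bucket/model.tar.gz",
    "123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-image:latest",
    "arn:aws:iam::123456789012:role/SageMakerRole",
)
print(reqs[2]["EndpointName"])  # -> demo-endpoint
```

The CloudFormation route shown on the slide wraps this same model/endpoint-config/endpoint triple into a template, so either way the serialized model artifact from the notebook side is what the deployment consumes.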
Many of you got your education or training based on the Software Engineering Body of Knowledge, a DevOps body of knowledge, or Agile; we extended these with DataOps- and MLOps-specific elements, and this is what we did. As I said, we are currently looking at SRE to cover the task of whole-ecosystem optimization, because for big research infrastructures, optimization of energy and other green aspects is very important. You cannot do this in one step, but using the SRE methodology you can actually create a model for optimization using specific KPIs. I think that in the next couple of years we will also try to introduce OpenStack into our courses and training, and we will possibly look for cooperation. Thank you. I slightly overran my time, but okay. Questions?