Let me introduce our next two speakers, the last two of this day, of this afternoon. In recent years there has been a rapid increase in the use of Jupyter notebooks, applications that, as you know, combine text, visualization and code in one document. They're widely used for prototyping, research, analysis and machine learning. However, Jupyter notebooks have been seen as inappropriate for creating scalable, maintainable and long-lasting production code, at least until now, because our next speakers believe it's possible and they want to tell us how. Please welcome Viacheslav Kovalevskyi and Gonzalo Gasca, both from Cloud AI at Google. How are you, guys? I hope I pronounced those names correctly. Probably not; this happens all the time in Spain. How are you? Lovely to see you. You are the golden closure to our second day, no pressure. But you've seen we had a lot of different topics and top speakers, so guys, the pressure is on. Let me take this chance to remind our viewers that they can ask you questions through the chat; take advantage and start sending your questions now. I'll try to ask them all in order, but the chat is on fire, so I'll do my best, okay? You can write in Spanish or in English or whatever, we'll sort it out. So welcome to the attic, welcome to this Big Things conference, and we're looking forward to listening to you both. Whenever you're ready.

Thank you, and thank you for the introduction. Yes, there might be a very small delay between what we're saying and what you folks are hearing, because we are on the West Coast of the U.S., on Pacific Time, so there is some latency, but we will try to make it as manageable as possible. Nevertheless, let me start the introductions before we jump to the slide deck. My short name is Slava; I usually tell people not to even try to pronounce my full name, Slava is good enough. I lead several teams that build different solutions for making sure researchers can productize their work through the services we are creating, and we will be covering several of those today. With me today I have Gonzalo. Gonzalo, do you want to quickly introduce yourself before we jump into the topic?

Thanks, Slava. I'm the lead for Cloud AI Platform Notebooks, and we're very happy to talk today about how to integrate these notebooks into your pipeline. So, Slava?

Yes. The first part will be a kind of dialogue with my friend in the studio. When we started putting this presentation together, at some point I realized that a fair name for it would probably be something like MLOps; yes, I know, wrong slide. But the thing is, MLOps is somewhat overhyped and somewhat broken in many ways, and so many people mean so many different things by MLOps that it would not have been right to simply say we are talking about MLOps. So, with that, let me quickly show why I believe the current state of MLOps in the industry is suboptimal. Let me start by showing a high-level overview of a very simple, typical ML pipeline. When anyone starts their work, they usually do it, as you will see, in a JupyterLab notebook. This is what everyone has seen in the wild. With several caveats: when you're starting prototyping, yes, you're prototyping in your notebooks.
When you're done with manual prototyping, you have everything there in the notebook and you manually upload the model. This is what we call level zero of automation: you have everything directly on your hard drive or desktop and you're doing manual deployment.

And is this for Python scripts or R scripts, or is it just focused on Jupyter notebooks?

Correct, yes, you're absolutely on the spot. This obviously doesn't mean that you need to work only in notebooks. Today we're going to focus on the people who do use notebooks; I've seen many teams that build their processes around plain Python code, avoiding the notebook step, and that is fine. This particular talk focuses on the cases where you do use a Jupyter notebook. So if at some point you're screaming, "no, this is not how things work, because we don't use notebooks," then you're probably not the audience for this specific topic. Now, after you've done your level zero, at some point we see teams investing time in building a continuous integration system that allows you to continuously retrain the model you have in the notebook. This is what we call level one of automation. After that, you can finally invest in automation and do an automatic release of your model to production, to inference, to prediction; obviously there will be a lot of testing in between, this is a huge simplification. That gets you to level two. And as you can guess, as soon as you close the loop from monitoring back to feeding signals into the POC stage, you finally have the full automation, from level zero to level three. So, my friend, let me ask you this: at this point, does this description of the pipeline feel understandable?

Yes, I really like how you put all these building blocks together; it's very easy to understand. I like the level zero to level three flow, but this is not something that I have seen. When you look into MLOps, into different presentations and different architectures, I have seen a lot of different things, just to name a few: data preprocessing, data ingestion, feature stores, model versioning. I don't see any of those concepts here. Are we missing something?

You're right on the spot. Let me try to guess; I'm pretty sure I can read your mind and guess that on the internet you saw something that looks pretty much like this one. Am I correct?

Yeah, exactly, that is what I'm talking about. When you think of an enterprise-ready machine learning pipeline, this is what you want to achieve.

Yes, exactly, right on the spot. If you go and research MLOps topics right now, you will probably find a set of blueprints that looks exactly like this, which can be overwhelming. And obviously this is the North Star where you want to go, and knowing the North Star can be useful in several cases. For example, if you're building an MLOps team and you want to figure out where you will be in several quarters, or you're managing an MLOps team and this is the state you want to reach in several quarters; then you can go backwards, assess what you have, and adjust the mismatches between the North Star and what's actually happening. Or, if you're building a new MLOps department and you're doing it waterfall-style, you want to plan everything upfront and know where you're going to be.
Wait, did you say waterfall? I haven't heard that term since, like, 2001. Right now I don't think any software engineering teams use waterfall.

Yes, you're right on the spot, exactly. However, when we describe the North Star of where MLOps needs to be, it's inevitable that we work with the assumption that everyone is focused on this level-three automation. When we define the North Star, we're saying, okay, this is how the end goal will look. That is an assumption a lot of such blueprints make, and that is normal: when you're new in the field, you first want to figure out the best-case state you want to end up in. But in such a world there is no level zero. Each of the components you see on this huge diagram cannot be used by itself; it can only be used when you have the whole picture built around it. Now, let me ask you this question, my friend: what is your favorite tool for building MLOps infrastructure these days?

Kubeflow; Kubeflow is the solution these days, right? Why are we looking into something else?

Of course, yes, exactly right. I also use Kubeflow Pipelines left and right in many of my pet projects, and Kubeflow Pipelines is a tool designed to give you exactly that ability: to deliver that North Star, to go from the theoretical North Star to your practical one. You can even find many pictures showing that it resembles the specific North Star we just showed you. However, let's try to map where it sits on our simplified automation lifecycle. On the simplified cycle, we have level three, which can be achieved by investing in building pipelines, for example with Kubeflow Pipelines. And there is level zero, where you have a notebook, just a notebook; let's say you have some notebook that visualizes Bitcoin prices or anything like that. That source code is not aware of the existence of the pipeline at all. The question now is: how are you going to connect these two worlds, level zero and level three? And if you just try to build a theoretical path from level zero, you might end up with one interesting but wrong assumption. There are many chains of thought that confuse the statement "tool X can be used with notebooks" with "tool X can be used from notebooks." Even though it's just a one-word difference, with or from, it's actually what makes the difference between life and death. Because if you can use tool X from the notebooks, it means that, in theory, you can kind of go from level zero to level three: you can install the Kubeflow Pipelines SDK inside your notebook and use that SDK. So in theory, this means you should be able to simply go from level zero to level three, right?
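To make that "from the notebook" idea concrete, this is roughly what it looks like: a minimal sketch, assuming a placeholder Kubeflow Pipelines endpoint and a toy one-step pipeline, neither of which is part of the actual demo.

```python
# Minimal sketch of using the Kubeflow Pipelines SDK *from* a notebook cell
# (KFP v1 style). The endpoint and the pipeline body are placeholders.
# !pip install kfp

import kfp
from kfp import dsl


@dsl.pipeline(name="train-from-notebook",
              description="Toy one-step pipeline submitted from a notebook cell")
def train_pipeline():
    # In a real pipeline this container would run your actual training code.
    dsl.ContainerOp(
        name="train",
        image="python:3.8-slim",
        command=["python", "-c", "print('training step goes here')"],
    )


# Hypothetical endpoint of your Kubeflow Pipelines deployment.
client = kfp.Client(host="https://<your-kfp-endpoint>")
client.create_run_from_pipeline_func(train_pipeline, arguments={})
```

This is exactly the kind of extra machinery Gonzalo pushes back on next: it works, but it asks the notebook author to learn and maintain a second SDK on top of their ML framework.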
Yeah, but installing an SDK into the notebook is not really what I want to do. Think about real data scientists: they are using machine learning frameworks such as TensorFlow or PyTorch, and some of them already have a hard time trying to catch up with those frameworks because they move very rapidly; you have TensorFlow 2.x, you have PyTorch 1.6, and they need to understand and master that. And now if we want to introduce additional concepts into the notebook, such as an SDK or maybe Kubernetes infrastructure, that would be challenging. On top of that, once you create a notebook and make sure it runs, rewriting the notebook to introduce an SDK is not something I think is going to be realistic for my data scientists.

In many cases you're exactly on the spot. Surprisingly, this problem was even discussed today at the Big Things conference in many, many talks; it's actually not as easy as some of us like to think. In reality it plays out like this: imagine you have six notebooks in your enterprise, in your department. Out of those six notebooks, if you ask folks to invest in heavily rewriting them in order to enable continuous integration, only three, about 50%, will actually end up with automation of the training process. Out of those three, if you ask folks to rewrite them again to support auto-deployment, maybe two of the artifacts that originated from the POC notebooks will ever reach level two. And out of those, maybe only the most critical model, the one the company is built around, will actually be refactored in a way that supports all the levels of automation. So you see the conceptual problem now, right?

Yeah. You have provided the building blocks for each of these stages, and it makes sense that when you start with experimentation in a notebook and then move from level zero to one, two and three, you want to have a path. You want to understand what you need at each of these levels and how you can reuse the previous step, so you don't need to modify it tremendously. If you want to start with a notebook, you want to end up with a notebook as well; you don't want to start with a notebook and then have to rewrite everything. So it's really nice that we now have a North Star, the one you presented in that diagram with all those features, but if I really want to reach that stage, I need to have a path. And I think this diagram makes sense, yeah.

Yes, you're right on the spot. So effectively, what we want to discuss today is what we can do to help you move from level zero to level three. Eventually you will get to that North Star; we just want to make sure you have as smooth a transition as possible. Now, a year ago we already presented a different set of views, a different mindset, which, if applied in practice, can solve some of these problems. Let's very quickly revisit what was shown back then. We stated that there are several principles which, if followed, let you use some percentage of your notebooks directly in production as they are. These principles are very simple, but you would be surprised how many teams ignore them completely. First of all, if you're working with notebooks, you need to follow established software development best practices. In my mind, people using Jupyter notebooks are conceptually not that far from niche development, like, for example, Android UI development, and it would be really strange for Android UI developers to say, "we don't want to follow best engineering practices because we are niche." Yet somehow, in the market, this is completely acceptable for notebooks: we are niche, so we can drop all the practices. No; that is the key principle, the practices need to be followed, and the rest of the principles build on it. Second, you need to start version controlling your notebooks.
Third, you need to have fully reproducible notebooks: irrespective of who is running them and where, there should be a way to reproduce exactly the same environment, guaranteeing a green execution top to bottom. You should also have a continuous integration system that can take those reproducible notebooks and actually test them, verifying that they are green. The notebooks should be parameterizable; this enables different kinds of tests, for example by overriding the variables that hold table names so they point to testing tables. You also need continuous delivery that releases all the artifacts produced by your notebooks. And last but not least, all experiments should be logged automatically. Those are the principles. Today we want to show a deeper set of proofs of concept, built around different pieces of Google Cloud services, that already allow us to implement some of them. We're going to show you how to go from level zero up to, in some cases, level two inclusive, almost without lifting a finger, by implementing some of these principles through different POCs that you can go and play with. The majority of them are already available on GitHub, so you can try them yourself. And with that, to show you how easily you can go from level zero to level one without changing any source code, I will give the microphone back to Gonzalo.

Thanks a lot. Gonzalo, do you want to start sharing?

Okay, perfect, can you see my screen?

Yes.

Perfect. So right now let's try to land some of the concepts we have seen before. We're going to go through a demo that covers some of these levels, and let's start with level zero. We're going to be using AI Platform Notebooks, which is a Google Cloud product that gives you access to a virtual machine with different machine learning libraries pre-installed; we call it the Deep Learning VM. You can use TensorFlow, PyTorch, XGBoost, or other, more experimental frameworks as well, like RAPIDS, Caffe or CNTK. So in this case we have a Jupyter notebook, and, to echo what Slava was saying, compared to last year we've introduced a new feature called notebooks metadata. If you're a data scientist, you know this problem. Let's say you write a notebook; it runs from the first cell to the last cell, and then you want to share it with a colleague. How can your colleague run it? They need to make sure the right libraries are installed, at the same versions. Maybe they're using a different NumPy or pandas version and you just get an error, a DataFrame error. Or what about when you find a notebook online and want to run it, and it just throws a pile of errors? You've probably been there. So this year we're presenting something called notebooks metadata: if you are using AI Platform Notebooks and you save one of your Jupyter notebooks, we actually write into the notebook's metadata the environment that you are using. In this case I just installed the latest version of AI Platform Notebooks, and you can see that I'm using TensorFlow 2 for GPUs, version 2.3; that's right there in the metadata. So if you want to share that notebook with someone else, they can know in which environment the notebook was created.
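To make this concrete, here is a minimal sketch of how you could inspect that recorded environment yourself; the "environment" key and its fields below are illustrative assumptions, not the official schema.

```python
# Minimal sketch: reading the environment recorded in a notebook's metadata.
# The "environment" key and its contents are illustrative assumptions,
# not the official schema written by AI Platform Notebooks.
import nbformat

nb = nbformat.read("reproducible.ipynb", as_version=4)

# Notebook-level metadata travels inside the .ipynb file itself, so whoever
# receives the file also receives the description of the environment.
env = nb.metadata.get("environment", {})
print(env)
# Expected to reference a Deep Learning Containers image, for example:
# gcr.io/deeplearning-platform-release/tf2-gpu.2-3:m59
```

Because the environment rides along inside the file, any downstream system that can read the metadata can recreate the exact container the author was using.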
So once you have a notebook, your normal development flow is: you finish your model at the end of the day, you test it locally, and then you want to push those changes to your GitHub repository. You submit your commit, and AI Platform Notebooks gives you a nice Git integration, so you can do it directly from your JupyterLab interface. What we're going to present now is the second level, which is continuous integration. For this we're going to be using GitHub Actions. If you're not familiar with GitHub Actions, it's a workflow that runs directly in GitHub, triggered when you submit a new commit or a new PR, and in it you define the set of instructions you want to launch on that notification. In this case, when a data scientist submits a notebook, we're going to execute the notebook on Google Cloud infrastructure. To do that, we install an SDK, the gcloud notebook training library, which reads the metadata we write into the notebook and pulls the right environment, in this case the Docker container. So we have three products involved: AI Platform Notebooks, AI Platform Training, and Container Registry for the Docker containers. Because we know from the metadata which environment the notebook was created in, we can say with confidence that the notebook will run just as it ran in your local environment. So what happens is: GitHub Actions connects to the cloud and uses this library, which launches the job on AI Platform Training; we get the right Docker container, and after the notebook execution is completed we write the result to Google Cloud Storage, so it's available there. If everything goes well, we report the state back to GitHub Actions and you will see a green checkbox in your GitHub repo.
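Conceptually, the step that runs inside that workflow boils down to something like the following sketch. The metadata keys, bucket and job arguments here are illustrative assumptions, and the gcloud notebook training library that Gonzalo mentions wraps this plumbing for you.

```python
# Rough sketch of the CI step: read the environment from the notebook's
# metadata and launch the notebook execution as an AI Platform Training job
# inside that same container. Metadata keys, bucket and job arguments are
# assumptions for illustration only.
import subprocess
import time

import nbformat

NOTEBOOK = "reproducible.ipynb"
OUTPUT_BUCKET = "gs://my-notebook-ci-results"   # hypothetical results bucket

nb = nbformat.read(NOTEBOOK, as_version=4)
image = nb.metadata.get("environment", {}).get(
    "uri", "gcr.io/deeplearning-platform-release/tf2-gpu.2-3")

# The job runs in the same container the author used. The container is expected
# to execute the notebook top to bottom (e.g. with papermill) and upload the
# executed copy to the bucket; the real library handles that plumbing.
job_id = f"notebook_ci_{int(time.time())}"
subprocess.run([
    "gcloud", "ai-platform", "jobs", "submit", "training", job_id,
    "--region", "us-central1",
    "--master-image-uri", image,
    "--",
    NOTEBOOK, f"{OUTPUT_BUCKET}/{NOTEBOOK}",
], check=True)
```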
But let's take a look at how this actually looks. This is the interface for AI Platform Notebooks: when you access a notebook you get a URL that gives you direct access to the JupyterLab interface. In this case we have some sample notebooks already, and I'm connected to my GitHub repository. I have a notebook called reproducible.ipynb, and you can see we provide the Git integration. When you open the notebook, which is a very simple one, you can see the metadata I was referring to before: the latest version of the environment, TensorFlow 2.3, version M59, and the notebook itself just contains some print statements. But what if you want to download a notebook from some website? In this case I'm getting a notebook from the TensorFlow website; it's a Colab notebook, so it hasn't been run here before. I'm going to upload it into my instance, and you can see that when I open it and look into the metadata, there's really no environment: I only know it's a Colab notebook, but there's nothing about my local environment, like TensorFlow 2.3. So I'm going to close the notebook, save it, and reopen it, and you can see how the system overrides that with the Docker environment. This is a really nice feature you get now: the metadata is pointing to a Google Docker container repository, the deep learning platform release, and if you search for that container image you can find it there, TensorFlow GPU 2.3. So those are the first two use cases: the local notebook, and the downloaded notebook that now gets the image as well.

In this case I'm just going to make a commit. I add another print statement, very simple, so the notebook executes really fast; I run it, it runs fine, no errors. Then I go to Git and commit those changes. You can see here that the reproducible notebook changed; I can see a Git diff, we provide that view as well, so you can see the difference from the previous version. I update the notebook, do the commit, and push it to GitHub. This triggers the GitHub Actions workflow I was mentioning before: I have a workflow configured in GitHub so that a new commit of a notebook launches its execution on Google Cloud. The commit was successfully submitted, so let's refresh the GitHub repository, and you can see that commit now; there's an amber light showing it's processing. In this case I have two workflows: one for validation, and the other for continuous integration, which is the second block Slava was talking about, the level one. This GitHub Actions workflow sets up the integration with Google Cloud, because we need to authenticate with the cloud; I pre-defined my project and my credentials in the GitHub configuration. So this is how the GitHub Actions workflow looks: the first step sets up the authentication to connect to Google Cloud. And the second step is the magic, the magic sauce: the gcloud notebook training library. This is the one that actually reads the metadata and launches the job to AI Platform Training, and it launches the job as soon as it knows which notebook changed in that commit. So let's just review: you can see that the notebook is executing just fine, and if we go to AI Platform Training, you can see there's a new job there now, launched a few seconds ago, and you can see the different options. One of the advantages here is that not only do you run in the same environment, you can also define the same infrastructure: if you want to use a GPU, AI Platform Training supports GPUs, which is a good advantage over other solutions. That was the reproducible notebook and it's working, so let's do the same for the notebook from the TensorFlow website, the classification notebook. I do the commit and push it again to GitHub. This is a notebook that trains on an MNIST dataset, so it's a longer notebook, but I just want you to see that both cases succeed. If we go back to our GitHub repo, we see something very similar: there's this new commit for the classification notebook, it's also processing, and you can see the workflow running. And now let's actually do a bad commit, so you can see the case where there's an issue in your notebook: maybe there's a typo, maybe there's a wrong parameter in your model.
You want to make sure that, before you submit something, it's actually a healthy notebook. So this is going to be the third job that we'll see in the GitHub repo and in AI Platform Training. We have the very first one there, the reproducible notebook, and you can see how, when we created the notebook, the environment was there. So that's the demo of submitting the three notebooks. And let's go to the next one. Oops. Okay, let's go to the demo of how this actually looks when it completes. We saw the success case; now we want to see the case when it actually fails, so let's take a look at this short video. You can find the status of the three jobs we launched: you can see the three jobs there in AI Platform Training, each deployed with the right environment, as I was saying. Two of them have already completed, and you can see the green checkbox there. The last one, the bad commit, is still running. If we go and look into the GitHub repo you will see the status, and if you want to look into the details of what might have gone wrong, you can look into the logs (although let me show you later why this might not be the best option). You can see how the print statement is wrong and fails, and there's an error there. So you have two success cases and one case that is supposed to fail: the classification and the reproducible notebooks are in the green, healthy state, and let's just wait a few seconds for the last one to complete. And there you go: you see that your commit is not healthy. And with this, I finish the demo. We were able to commit code for a Jupyter notebook, and have it executed and verified, without any modifications to the notebook at all; everything worked transparently for the data scientist. Slava is going to show you a new product that we have next. Slava, over to you.

Yes, thank you, Gonzalo. Let me start sharing my screen; let me know if it's working. I believe we should be there. Perfect. So what we just showed, and a huge thank you, Gonzalo, for this amazing part of the demo, is how easily you can go from level zero to level one of automation without changing your notebook. But here is what is critical. First of all, we have already published this whole example on GitHub, and we are effectively enabling your ops team, people who may never have worked with machine learning platforms before and have no idea what MLOps or pipelines are, to take their current knowledge and skills and help the department that works with notebooks automate its processes easily. What we just showed is just a normal CI: you can substitute GitHub Actions with Cloud Build, with Jenkins, with anything you want, as long as you preserve the process, preserve the metadata, and submit the execution through the SDK that we showed you. Now, another proof of concept we're going to show you will let you reach level two automation for some of your notebooks. Let me first say a few words about those notebooks. In many cases you have a notebook that is already visualizing something; I'm actually going to show a notebook that visualizes some trends in Bitcoin prices, I believe. And this notebook, by itself, might already be production ready. What do I mean by production ready? It could be a notebook as a dashboard: a document with visualizations that you're sharing with someone within your company.
Imagine the financial department that needs to see this visualization. You have built them a notebook, and there is really no need for you to refactor it into a separate dashboard. You can, and should, be able to just take the execution output from the level one that Gonzalo just showed and visualize it. The POC for this is not yet available on GitHub; soon you should be able to go there and see how this POC is built. But let me show you the underlying premise, because you can reproduce it easily by piping together several pieces. As we just showed, you now have a level-one continuous integration system that produces a guaranteed-green commit of the notebook and puts the executed notebook on Google Cloud Storage. It just sits there: an .ipynb file with all the cells populated according to your execution. And if your notebook is a good example of a dashboard, technically you should be able to just view it. This is exactly what I'm going to show you. There is a simple way to create a notebook viewer service that you can deploy on Cloud Run, or host yourself, which is effectively an nbconvert wrapper that takes the notebook from Google Cloud Storage, converts it to HTML and serves it to you. And if you host it on GCP, it comes with Cloud IAM out of the box. Just to show how this POC might look if you invest some time in reproducing our setup: here is a notebook with a visualization, and here is the same notebook, with its visualization, in the viewer. So now you have a notebook in your IDE, you have a CI that enables level one and builds it, and you have level two: this viewer is, quote-unquote, your production dashboard, the result you're showing to another department, and people can just view it. So this is, by design, already a full CD that delivers to your production. As I said, it's not exactly the full level two; there are many other production cases not yet covered, and believe me, we will cover them eventually. Hopefully next year we will have many more things and toys to show you. With that, I want to reiterate that we showed you how to go from level zero to level two without changing the code or refactoring the notebooks. We're not trying to replace the North Star; our North Star is still what we showed you at the beginning. We're just helping you go from level zero through one to two without changes, and when you are ready, you can start refactoring individual automation steps to be, for example, more pipeline-native. You can build an MLOps team that develops expertise in building Kubeflow pipelines, and they will slowly replace the things we showed you with more native Kubeflow pipeline steps. That is the right way to go. But when you look at the huge zoo of notebooks in your company, we expect only a few of them will ever be fully refactored into enterprise-grade pipelines, while the rest might reach just level two without being refactored at all. And that is fine. We want to help every notebook you have that needs to be automated reach as high a level as possible with as little friction as possible. And we also want to enable your ops team to help the data science team: people who know how to do ops can now easily use all these tools to help the people who know how to use JupyterLab.
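A minimal sketch of such a viewer service, assuming a hypothetical bucket holding the executed notebooks, could look like this; nbconvert does the heavy lifting, and on Cloud Run you would put Cloud IAM (or IAP) in front of it for access control.

```python
# Minimal sketch of a notebook viewer service: fetch the executed .ipynb that
# the level-one CI wrote to Cloud Storage, convert it to HTML with nbconvert,
# and serve it. The bucket name is a placeholder.
from flask import Flask
from google.cloud import storage
import nbformat
from nbconvert import HTMLExporter

app = Flask(__name__)
BUCKET = "my-notebook-ci-results"   # hypothetical bucket with executed notebooks


@app.route("/view/<path:notebook_path>")
def view(notebook_path):
    # Download the executed notebook produced by the CI run.
    blob = storage.Client().bucket(BUCKET).blob(notebook_path)
    nb = nbformat.reads(blob.download_as_text(), as_version=4)

    # Render cells and outputs to HTML; this is the "dashboard" people view.
    html, _resources = HTMLExporter().from_notebook_node(nb)
    return html


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```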
With that, I think we can move to the question and answer section.

Let me see what we have in the chat. We're here, we're here. Oh my God, that was very interesting, thank you so much. And I cannot believe, Slava, that only now you tell me you have a short version of your name. I wrote you an email and you never said so, and I was practicing all night. Now you come and tell me; I'll never forgive you for that. We're running out of time, so we have time for one question, guys. I'm so sorry, but the audience can contact you otherwise. I cannot read all the questions, but: could you give us more detail on how to monitor the pipeline in order to react to failures?

Failures? Yes. This is a very good question, because there are two types of failures there. One is a failure of the model: it keeps predicting results, but the behavior of the target you're predicting changes. That is one kind, and that is where MLOps is currently being developed: how your team reacts, how you retrain. The second kind is when your serving infrastructure has failed. That part is, quote-unquote, simpler, because it's very similar to your web server failing; it's just normal ops. The first part would require a whole talk of its own, and it looks like that's probably what we will be preparing for next time. Do you agree, Gonzalo? Are you okay with that? Do you want to add anything?

Yeah, I think next year we will be bringing the whole pipeline together.

You're going to leave us like this? You're saying I'll have to wait till next year for more things? Then I'm going to have to invite you; next year you're starting on the first day, now that I've learned all your names. Actually, you do such a great job, Slava, with your mic and asking questions, you're a natural, so next year you can come and do my job from here, asking all the speakers. There you go. But guys, it was excellent to have the two of you as the final keynote for the second day of this Big Things Conference 2020. It was fascinating, and you're leaving us with all these new things to look forward to next year, so we'll have to see you then. In the meantime, I invite everyone to stay around and join us tomorrow for our third and last day. So thank you to Slava and to Gonzalo from Google for coming on our second day. A big kiss to the West Coast; enjoy your day and we'll see you soon. Thank you. Thank you, bye-bye.

Well, that's it for today in the attic. I hope you found it as fascinating as I did; the different subjects and speakers we had today were mind-blowing.