Thank you for the kind introduction. I'm Alain, the group product manager for Google Kubernetes Engine. The world is moving towards cloud-native computing, and Kubernetes has become the de facto standard for it. Many organizations around the world are building machine learning platforms on Kubernetes. Since there are many commercially available, fully managed machine learning platforms, one might wonder: why do organizations build their own platform on Kubernetes? We interviewed many organizations to find out their motivations and to learn best practices, and we found four key reasons why organizations build their own machine learning platform on Kubernetes.

The first and foremost is portability. Organizations often design their platform for multi-cloud and hybrid architectures. They want the freedom to write their code once and run it anywhere, including on-prem and across multiple clouds, so they prefer an open-source-based stack. The second key reason is customizability. Organizations with deep Kubernetes expertise like to customize their stack and appreciate the flexibility to do so; they also want granular control over resources. The third key reason is performance. Many organizations that are deeply technical like to hyper-optimize their architecture to fine-tune performance and save cost, and they are often also motivated by unique security and networking requirements. The fourth key reason is consistency. Very often, organizations want consistency between their machine learning operations and their DevOps. They don't want the two to be siloed, and they don't want diverging architectures.

When you think about building an end-to-end machine learning application, you typically think about models and training. In practice, however, you have to accomplish many complex tasks: gathering the data, cleaning the data, extracting features, building models, provisioning resources to train the model, and, once the model is trained, putting it into production to serve predictions and then monitoring it. All of this requires a sophisticated machine learning platform and tooling that can make the entire end-to-end process simple, easy, reliable, automated, and reproducible.

Machine learning is inherently an experimental discipline. Data scientists have to gather data, extract features, try out different features and different algorithms, run hyperparameter optimization, and much more in order to build a model, and then compare many models against each other. They want to track what worked and what didn't, and they have to do this in a reproducible and reusable way. There are four key technological building blocks that make this prototyping and experimentation fast. The first and foremost is a feature store. What is a feature store? A feature store is a centralized feature repository that enables ML practitioners across an organization to discover and reuse features. It is also a scalable service that serves features for training and prediction with low latency, and it helps alleviate training-serving skew. Feast is a very popular open-source feature store that is widely used with Kubernetes.
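To make that concrete, here is a minimal sketch of online feature retrieval with Feast. The feature view, feature names, and entity key below are hypothetical examples, and the exact API can vary a bit between Feast versions, so treat this as a sketch rather than a recipe:

```python
# Minimal sketch of low-latency online feature retrieval with Feast.
# Assumes a Feast feature repository exists in the current directory and
# defines a "driver_hourly_stats" feature view keyed by "driver_id" --
# both names are hypothetical examples.
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Fetch online features for a single prediction request.
features = store.get_online_features(
    features=[
        "driver_hourly_stats:trips_today",
        "driver_hourly_stats:avg_rating",
    ],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

print(features)
```

The same feature definitions can also be used to build point-in-time-correct training datasets, which is how a feature store helps keep training and serving consistent.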
The second key building block is notebooks. Notebooks are the gateway to data science and an integral part of development, training, and deployment workflows; Jupyter is one of the most popular open-source notebook environments. The third key building block is experiment tracking and visualization. Data scientists want to analyze and compare different training runs, and TensorBoard is one of the most popular open-source visualization tools for this. The fourth is pipelines. Pipelines are the way to automate workflows for efficiency and rapid iteration, and Kubeflow Pipelines is a very popular open-source pipeline framework that we have seen many of our customers use with Kubernetes.

Another key aspect is constant retraining. The world is always changing and data moves fast, so models get stale. Data scientists have to constantly use new training data and new algorithms to build new models, and we want a platform that makes this simple, easy, and cost-effective. This is exactly where Kubernetes shines. Kubernetes offers unmatched scalability: you can scale your training environment from a single node to thousands of nodes, which makes it an ideal platform for distributed training and hyperparameter optimization. The open-source Kubeflow Training Operator is a popular operator that supports the major machine learning frameworks and helps you run distributed training.

The second key requirement is performance, and here again Kubernetes is quite helpful. It natively supports hardware accelerators such as GPUs and TPUs, which can save you time and money by speeding up training. However, GPUs and TPUs alone are not sufficient to get the best performance; you also need high-throughput networking and storage, as well as high IOPS, to keep your GPUs and TPUs busy. Last but not least, one of the key requirements that do-it-yourself machine learning platform builders have is cost efficiency. They make heavy use of discounted spot VMs to minimize cost. However, as we all know, spot VMs can be preempted, so remember to checkpoint frequently or use elastic training operators, which do this automatically for you. We all know GPUs and TPUs are very expensive resources, and getting the highest possible utilization is one of the key design considerations for ML platform builders. Platform builders also run training as batch workloads, since batch scheduling gives them flexibility on timing, which saves money as well.

After your model is trained, you want to put it into production with live data to serve predictions. Here you have to enable logging and monitoring that can alert you when things go wrong or SLOs are violated, and rollouts have to be done in a way that is vetted, auditable, and reversible. Kubernetes is one of the best ways to optimize your serving cost, because it allows you to autoscale from zero nodes to thousands of nodes depending on traffic. You can also do scalable multi-model serving to tightly bin-pack multiple models and reduce cost. KServe is one of the popular open-source stacks that runs on Kubernetes and helps you achieve many of these properties.
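To illustrate the scale-to-zero point, here is a rough sketch of deploying a model as a KServe InferenceService using the kserve Python client. The service name, namespace, and storage URI are made-up placeholders, and the client classes shown here may differ across KServe versions, so treat this as an assumption-laden sketch rather than the definitive API:

```python
# Sketch: deploy a scale-to-zero InferenceService with the KServe client.
# The service name, namespace, and bucket path are hypothetical placeholders.
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
)

isvc = V1beta1InferenceService(
    api_version="serving.kserve.io/v1beta1",
    kind="InferenceService",
    metadata=client.V1ObjectMeta(name="sklearn-demo", namespace="models"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            min_replicas=0,   # allow scale-to-zero when there is no traffic
            max_replicas=10,  # autoscale up under load
            sklearn=V1beta1SKLearnSpec(
                storage_uri="gs://example-bucket/models/sklearn/iris"
            ),
        )
    ),
)

KServeClient().create(isvc)
```

With min_replicas set to 0, KServe's serverless mode can scale a model's pods down to nothing when there is no traffic and bring them back on demand, which is a big part of what makes serving many infrequently used models economical.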
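And going back to the earlier point about spot VMs: here is a minimal sketch of the checkpoint-and-resume idea in PyTorch. The model, paths, and checkpoint interval are arbitrary placeholders; in a real cluster you would write checkpoints to object storage or a persistent volume rather than local disk:

```python
# Minimal sketch of periodic checkpointing so a preempted spot VM can
# resume training instead of starting over. Model, path, and interval
# are arbitrary placeholders.
import os
import torch
import torch.nn as nn

CKPT_PATH = "/mnt/checkpoints/model.pt"  # should point at durable storage

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
start_epoch = 0

# Resume from the last checkpoint if one exists (e.g. after preemption).
if os.path.exists(CKPT_PATH):
    ckpt = torch.load(CKPT_PATH)
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    start_epoch = ckpt["epoch"] + 1

for epoch in range(start_epoch, 100):
    # ... one epoch of training would go here ...

    # Checkpoint every 5 epochs so at most a few epochs of work is lost.
    if epoch % 5 == 0:
        torch.save(
            {
                "epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
            },
            CKPT_PATH,
        )
```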
Another key requirement is model performance monitoring. Kubernetes observability primitives are typically designed to help you monitor uptime, latency, CPU utilization, memory utilization, and other service-level indicators. They are not well suited for machine learning model monitoring, for several reasons. Machine learning practitioners really care about accuracy, precision, F1 scores, data drift, and concept drift. They also care about training-serving skew, KL divergence, feature importance, and similar signals. Last but not least, model fairness is an important consideration. The good news is that there is an open-source framework called Evidently that can do all of this, so you can use it if you are designing an open-source-based machine learning stack. Upgrades are also a very important part of the machine learning lifecycle, and here platform builders typically provide building blocks for A/B-style testing, so you can test your models with different demographics before they go into production.

Model management and governance considerations span the entire machine learning operations workflow from end to end. Model management deals with registering your models, collecting metadata, and managing versions. Model governance deals with reviewing the model, approving it for production, and rolling back when things go wrong. Here again, there is a very nice open-source tool, MLflow, which can be used for model management. Another key requirement is metadata and lineage tracking for governance and compliance. We want to automatically track the inputs and outputs of every component in the pipeline in order to create a lineage graph, so that all the dependencies that produced a particular model run can be tracked and visualized. We also need auditability for model provenance and compliance reasons. Last but not least, explainable AI is very important in order to build trust in a model and to assess whether it is fit for purpose.

For those of you who are considering building a machine learning platform natively on Kubernetes, one of the great places to start is Kubeflow. And I'm very pleased to announce that Google has decided to donate Kubeflow to the CNCF, so all of us together can innovate and take it to the next level. Machine learning and AI are changing the way we work in an unprecedented way, and may the AI force be with you. Thank you.

All right, Ian, thank you very much, Morgan. Yeah, awesome. If you didn't see that news, it's really cool, right? Kubeflow joining the CNCF. I think it's nice that it's going to be an official part of everything we're all doing this week. So yeah, thank you very much. I don't think we have time for questions; we need to move straight on to the next talk. But thank you again.