Hello everyone. It's amazing to be here with this incredible audience. I'm part of the team at Bloomberg that provides the machine learning infrastructure driving Bloomberg's AI advancement. My name is Leon, and I work on that team building performant ML infrastructure.

At Bloomberg, we believe in fostering transparent and efficient financial markets. Bloomberg's business is built on technology that makes news, research, financial data, and analytics accessible, discoverable, and actionable across the global capital markets. Bloomberg's flagship product is the Bloomberg Terminal. It delivers a diverse array of information, news, and analytics to help our clients make business and investment decisions. Currently, there are more than 350,000 subscribers across 170 countries.

Bloomberg handles a massive amount of data every day: more than 300 billion market messages from hundreds of exchanges, which are collected, normalized, and redistributed in real time to clients. Bloomberg's systems also house a collection of more than 200 million trusted documents, including sell-side research reports, presentations, press releases, transcripts of earnings calls, and more. Over 1.5 million curated news stories in 30 languages are ingested each day from over 125,000 sources, and they're made available and searchable. Across all of Bloomberg's products, we handle more than 450 billion data points each day.

Bloomberg has been investing in AI for almost 15 years now, and the Bloomberg Terminal uses AI to perform a range of tasks for its users, such as extracting valuable insights, enriching and enhancing information, searching for and retrieving what's most relevant to them, and summarizing their information consumption with daily digests. Let's take Key News Themes as an example. Terminal users need to find the most relevant, market-moving news for their specific industries and companies.
To do that, they would need to sift through almost 2 million news stories each day. This is no easy task, but thanks to Key News Themes, this AI-enhanced Terminal function automatically clusters related news stories and summarizes them to extract headlines and key insights. This, along with other Terminal functions powered by AI, such as company analysis and earnings summaries, makes previously daunting daily tasks much more manageable for our users.

To support these features, we need performant and scalable ML infrastructure. At Bloomberg, many teams work together to facilitate every stage of the machine learning lifecycle: annotating high-quality datasets, building bespoke knowledge graphs, model exploration and model training, model evaluation using quantitative methods, all the way to model deployment and gathering feedback, feedback that kicks off the next iteration of the ML lifecycle. All of these stages are essential and require infrastructure support at scale. However, building the entire infrastructure stack from scratch takes a lot of time and effort, so we decided to bring in various CNCF projects to reduce the overall delivery timeline and provide infrastructure support at Bloomberg scale. First, we leverage open source solutions to build out our compute clusters, such as projects that help us manage Kubernetes clusters, as well as projects that help with cluster observability and orchestration. On top of that, to support machine learning workloads, we look to projects like Cloud Native Buildpacks for packaging source code and Karmada for efficient multi-cluster scheduling, as well as projects that help with the model exploration, model training, and model deployment phases. Let's first take a look at managed notebooks.
Our Managed Notebook Platform is a key service we provide to users for exploring and analyzing data and experimenting with model training algorithms. It handles data access control, compute resource management, compute framework management, and system upgrades automatically for users. We brought in multiple CNCF projects to build this managed platform. For example, we use Istio, WebAssembly, and Open Policy Agent for the authorization process that decides who is authorized to access a notebook. And Calico, as a network policy tool, is very useful for controlling egress access, which determines which external systems a Jupyter notebook can reach.

This diagram shows how authorization happens in our managed Jupyter notebook platform. A user request first comes into our cluster, hits the Istio ingress gateway, and eventually reaches the Jupyter notebook pod. Inside that pod, the Istio sidecar runs alongside the notebook container, together with a WebAssembly plugin. The WebAssembly plugin extracts the necessary information from the URL and the JWT token payload and patches that information into the request headers. Those request headers are then checked against Open Policy Agent Rego code to perform the authorization decision.

A Jupyter notebook also opens interactive sessions to data storage and external services. To ensure data security, we want notebooks to connect only to a pre-approved list of storage systems and external services, so we can restrict their access. Calico provides a mechanism for us to configure network policies at layer 4 and layer 7. Bringing this technology into the managed Jupyter notebook system helps us improve data security. Next up, we'll talk about managed model training, multi-cluster management, and model deployment.
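As a rough illustration of the egress control described above, here is a minimal sketch of a Calico NetworkPolicy. All names, labels, and addresses are hypothetical, not Bloomberg's actual configuration; the point is that once a policy selects the notebook pods, only the explicitly allowed destinations remain reachable.

```yaml
# Hypothetical Calico policy restricting a notebook pod's egress
# to a pre-approved storage endpoint; everything else is denied.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: notebook-egress-allowlist   # hypothetical name
  namespace: notebooks              # hypothetical namespace
spec:
  selector: app == 'jupyter-notebook'   # hypothetical pod label
  types:
    - Egress
  egress:
    # Allow DNS lookups so the notebook can resolve approved hosts.
    - action: Allow
      protocol: UDP
      destination:
        ports: [53]
    # Allow traffic only to the pre-approved storage subnet.
    - action: Allow
      protocol: TCP
      destination:
        nets: [10.0.20.0/24]   # hypothetical storage subnet
        ports: [443]
    # Any destination not matched above is implicitly denied
    # once a policy selects the pod.
```

Layer 7 rules (for example, restricting by HTTP method or path) would be layered on top of this via Calico's application-layer policy integration.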
For model training, we use Cloud Native Buildpacks to package the training code, Kubeflow to enable distributed machine learning training, and Karmada to improve our GPU resource utilization rate. This diagram shows how we use Buildpacks and Kubeflow to enable secure, distributed machine learning training at scale. As a first step, the user's training code is pulled into the Buildpacks cluster and packaged into images. The GPU training cluster then downloads these images containing the training code from the image repositories and orchestrates the distributed training process using Kubeflow's training operator.

As our training workloads scale, so do our compute resources. And as we scale to multiple clusters and data centers, GPU resource fragmentation starts to occur at the cluster level and deteriorates our utilization rate. As shown in this diagram, we address that by using Karmada to efficiently schedule GPU workloads through a highly available control plane spanning multiple member clusters. By performing resource-aware scheduling, Karmada greatly improves our GPU utilization rate, and it eases our maintenance effort at the same time by decoupling the scheduler from the member clusters.

In our model serving platform, we use Istio, Knative, and KServe to provide a fully managed model serving experience. KServe is an open source project that Bloomberg initiated together with the Kubernetes community, and going forward we would love to work with the CNCF community to continue driving its roadmap. KServe can serve popular large language models by integrating with various open source model serving runtimes, such as TorchServe, Hugging Face TGI, and Triton Inference Server. In our managed serving platform, Istio serves as the ingress gateway, and we use Knative as the underlying serving layer to auto-scale the number of instances based on the number of incoming requests.
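To make the serving setup above concrete, here is a minimal sketch of a KServe InferenceService manifest. The name, storage URI, and resource limits are hypothetical, and the exact fields available depend on the KServe version deployed; this is an illustration of the pattern, not the platform's actual configuration.

```yaml
# Hypothetical KServe InferenceService: KServe provisions the predictor,
# and Knative scales replicas with request load (including scale-to-zero).
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo                         # hypothetical name
spec:
  predictor:
    minReplicas: 0                       # allow scale-to-zero when idle
    maxReplicas: 4                       # hypothetical upper bound
    model:
      modelFormat:
        name: huggingface                # pick a supported serving runtime
      storageUri: s3://models/llm-demo   # hypothetical model location
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Behind this single resource, KServe wires up the Knative service and the Istio routing, which is what makes the experience feel fully managed to the model owner.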
This is the overall timeline of how we started the Bloomberg Data Science Platform back in 2017 and gradually brought different CNCF open source projects into the platform by working closely with the community. It has been a very rewarding experience, and we are looking forward to more opportunities to work together. Thank you so much for listening. Please scan the QR code below for more information, and stay tuned for more talks on ML infrastructure, workflows, and Buildpacks by my amazing colleagues. Thank you, everyone. I hope you enjoyed the talk.