Hello, everyone. Welcome to the A Day in the Life of a Data Scientist talk. My name is Terry Chang. I'm a data scientist on the Ezmeral Container Platform team. With me, moderating the chat, I have Matt McCow as well as Doug Cackett. We're going to dive straight into what we can do with the Ezmeral Container Platform and how we can support the role of a data scientist. So just a quick agenda: I'm going to do some introductions and set the context of what we're going to talk about, and then we're going to dive straight into the Ezmeral Container Platform. We'll walk through what a data scientist does, pretty much a day in the life of a data scientist, and then we'll have some question and answer. Big data has been the talk of the last few years, the last decade or so. With big data there are a lot of ways to derive meaning, and a lot of businesses are trying to optimize every decision in their applications using data. Previously there was a lot of focus on data analytics, but recently we've seen a lot of data being used for machine learning: taking any data available and sending it off to the data scientists to start doing some modeling and prediction. That's where we're seeing modern businesses rooted in analytics. And data science itself is a team sport. We need more than data scientists to do all this modeling. We need data engineers to take the data, massage it, and do the data manipulation needed to get it right for the data scientists. We have data analysts who are monitoring the models. And we have the data scientists themselves, who are building and iterating through multiple different models until they find one that satisfies the business needs.
Once they're done, they can hand it off to the software engineers, who will build it into their application, whether it's a mobile app or a web app. And then we have the operations team assigning the resources and monitoring it as well. So we're really seeing data science as a team sport, and it does require a lot of different expertise. Here's the basic machine learning pipeline we see in the industry now. At the top we have the training environment, and this is an entire loop: we'll have some registration, we'll have some inferencing, and at the center of it all is the data prep, as well as your repositories, for your data and for your GitHub code, things of that sort. So we're seeing the machine learning industry follow this very basic pattern. At a high level, and I'll glance through this very quickly, this is what the machine learning pipeline looks like on the Ezmeral Container Platform: at the top left we have our project repository, which is our persistent storage; we'll have some training clusters, a notebook, an inference deployment engine, and a REST API, all sitting on top of a Kubernetes cluster. The benefit of the container platform is that this is all abstracted away from the data scientist. So I'll go straight into that. Just to preface, before we go into the Ezmeral Container Platform: what we're going to look at is an example machine learning problem, trying to predict how long a specific taxi ride will take. With a Jupyter notebook, the data scientist can take all this data, do their data manipulation, train a model on a specific set of features, such as the location of a taxi ride and the duration of a taxi ride, and then use the model to figure out what kind of prediction we can get for a future taxi ride. That's the example we'll talk through today.
I'm going to hop out of my slides and jump into my web browser. Let me zoom in on this. Here I have a Jupyter environment, and it's all running on the container platform. All I need is this link and I can access my environment. As a data scientist, I can grab this link from my IT admin or system administrator and quickly start iterating and coding. On the left-hand side of Jupyter we have a file directory structure. This is already synced up to my Git repository, which I'll show in a little bit on the container platform. So I can quickly pull any files that are in my GitHub repository; I can even push with a button here. I can open up this Python notebook, and with all the features of the Jupyter environment, I can start coding. Each of these cells can run Python code. In particular, on the Ezmeral Container Platform team we've built our own in-house line magic commands. These are unique commands we can use to interact with the underlying infrastructure of the container platform. The first line magic command I want to mention is %attachments. When I run this command, I get the available training clusters that I can send training jobs to. This specific notebook has been created for me to iterate and develop a model very quickly; I don't have to use all the resources, and I don't have to allocate a full 8-GPU box to my little Jupyter environment. With a training cluster, I can attach individual data science notebooks to those training clusters, and the data scientists can utilize those resources as a shared environment. So essentially the large 8-GPU box can be shared; it doesn't have to be allocated to a single data scientist. Moving on, we have another magic command, a %% cell magic for Python training.
This is how we utilize that training cluster. I prepend the cell with %% and the name of the training cluster, and that tells the notebook to send the entire cell to be trained on that training cluster's resources. So the data scientist can quickly iterate on a model, then format that model and all that code into one large cell and send it off to the training cluster. Because the training cluster is located somewhere else, it has no context of what has been done locally in this notebook, so we have to copy everything into that one large cell. As you see here, I'm importing some libraries, I'm defining some helper functions, and I'm reading in my dataset. As in the typical data science modeling lifecycle, we have to take in the data and do some data preprocessing. Maybe the data scientist does this, maybe the data engineer does, but they have access to that data. Here I'm reading the data from the project repository, which I'll talk about a little later; all the clusters within the container platform have access to a project repository that has been set up using the underlying data fabric. Then I have some data preprocessing: I cleanse the data wherever I notice something is missing, some data looks funky, or the data types aren't correct. That all happens here in these cells. Once that's done, I print out that the data cleaning is done, and I can start training my model. We have to split our dataset into a train/test split, so that we have some data for actually training the model and some data for testing it. So I split my data there.
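Putting those pieces together, a self-contained training cell of the kind just described might be laid out roughly like this. This is only an illustrative sketch: the magic names follow what is described in this talk, the cluster name trainingcluster is hypothetical, and the fragment only runs inside the platform's Jupyter environment, not in a plain Python interpreter.

```
%attachments                 # list the training clusters this notebook can send jobs to

%%trainingcluster            # cell magic: run this entire cell on the named training cluster
# imports, helper functions, the data read from the project repository,
# preprocessing, the train/test split, model training, and the model save
# all go in this one cell, since the cluster has no local notebook context

%logs <history-url>          # tail the remote training job's output in real time
```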
I create my XGBoost object to start my training; XGBoost is a gradient-boosted decision tree machine learning algorithm. I fit my data with the XGBoost algorithm and then do some prediction. In addition, I'm tracking some of the metrics and printing them out. These are common metrics that data scientists want to see when they train an algorithm: whether the accuracy is improving, whether the loss is improving, the mean absolute error, things like that. At the end of this training job, I save the model back into the project repository, which we'll have access to, and print out the end time. So I can execute that cell, and I've already executed it, so you see all of these print statements here: importing the libraries, the training run, reading in data, et cetera. All of this has been printed out from that training job. When we send the training job to the training cluster, the cluster sends back an output with a unique history URL, which we use with the last magic command I want to talk about, %logs. %logs parses out that response from the training cluster, and we can then track in real time what is happening in that training job. So quickly we can see that the data scientist has a sandbox environment available to them. They have access to their Git repository. They have access to a project repository from which they can read in their data and save their model. It's a very quick, interactive environment for the data scientist to do all their work, it's all provisioned on the Ezmeral Container Platform, and it's all abstracted away.
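The training cell just described could be sketched as follows. This is a minimal stand-in, not the demo's actual code: the taxi data is replaced by a small synthetic dataset, scikit-learn's GradientBoostingRegressor stands in for XGBoost (both are gradient-boosted tree models), and the save path is illustrative rather than the project repository's real location.

```python
import pickle
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the taxi dataset read from the project repository.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "distance_km": rng.uniform(1, 20, n),
    "pickup_hour": rng.integers(0, 24, n),
})
# Ride duration in seconds: roughly proportional to distance, plus noise.
df["duration_s"] = 180 * df["distance_km"] + 15 * df["pickup_hour"] + rng.normal(0, 60, n)

# Train/test split, as described in the talk.
X = df[["distance_km", "pickup_hour"]]
y = df["duration_s"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient-boosted trees (stand-in for the XGBoost object in the demo).
model = GradientBoostingRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
preds = model.predict(X_test)

# One of the common metrics mentioned: mean absolute error.
mae = mean_absolute_error(y_test, preds)
print(f"mean absolute error: {mae:.1f} seconds")

# Save the model (in the demo this goes back into the project repository).
with open("taxi_model.pkl", "wb") as f:
    pickle.dump(model, f)
```

On the platform, this whole block would sit inside the cell that gets shipped to the training cluster, with the prints showing up in the %logs output.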
Here I want to mention again that this URL is surfaced through the container platform; the data scientist doesn't have to interact with that at all. But let's take a step back. That was the day in the life of the data scientist. Now let's go back into the container platform and walk through how it was all set up for them. Here is my login page to the container platform. I'm going to log in as my user, and this brings me to the view of the MLOps tenant within the container platform. This is where everything has been set up for me; the data scientist doesn't have to see this if they don't need to. What I'll walk through now are the topics I mentioned previously that we'd come back to. First is the project repository. This project repository comes with each tenant that is created on the platform. It's nothing more than a shared, collaborative workspace in which any data scientist who is allocated to this tenant has a POSIX client where they can visually see all of their data and all of their code. It's actually taking a piece of the underlying data fabric and using that as your project repository. You can see here I have some code, I can see my scoring script, and I can see the models that have been created within this tenant. So it's a powerful tool in which you can store your code and your data, with the ability to read and write from any of your Jupyter environments or any of the clusters created within this tenant. It's a very nice addition that lets you quickly interact with your data. The next thing I want to show is the source control. Here is where you plug in all the information for your source control, and if I edit this, you'll see all the information I passed in to configure it.
On the back end, the container platform takes these credentials and connects the Jupyter notebooks you create within this tenant to that Git repository. This is the information I passed in. If GitHub is not of interest, we also have support for Bitbucket here as well. Next I want to show you that we have these notebook environments. The notebook environment was created here, and you can see I have a notebook called Terry notebook, all running on the Kubernetes environment within the container platform. Either the data scientist can come here and create their notebook, or their project admin can create it for them. All you have to do is come to these notebook endpoints; the container platform maps the notebook to a specific port, and you can just give this link to the data scientist. The link brings them to their own Jupyter environment, and they can start doing all their modeling just as I showed in that previous Jupyter environment. Next, I want to show the training cluster. This is the training cluster that was created, to which I can attach my notebook to start utilizing those resources. And the last thing I want to show is the deployment cluster. Once the model has been saved, we have a model registry where we can register the model into the platform, and the last step is to create a deployment cluster. Here on my screen I have a deployment cluster called taxi deployment, and all these serving endpoints have been configured for me. Most important is this model endpoint. The deployment cluster wraps the trained model with a Flask wrapper and adds a REST endpoint to it. So I can quickly operationalize my model by taking this endpoint and creating a curl command or even a POST request. Here I have my trusty Postman tool, in which I can format a POST request.
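To make the "Flask wrapper around the model" idea concrete, here is a minimal sketch of the kind of scoring function such a wrapper might expose. Everything in it is illustrative: the hand-rolled linear "model" stands in for the pickled XGBoost model from the project repository (so the sketch is self-contained), and the feature names are made up for the example.

```python
import json
import pickle

# Stand-in for a trained model loaded from the project repository:
# a dict of linear coefficients instead of a real XGBoost object.
model_bytes = pickle.dumps(
    {"coef": {"distance_km": 180.0, "pickup_hour": 20.0}, "intercept": 300.0}
)

def score(request_body: str) -> str:
    """Parse a JSON feature payload and return a JSON ride-duration prediction."""
    m = pickle.loads(model_bytes)
    features = json.loads(request_body)
    duration = m["intercept"] + sum(
        m["coef"][name] * value for name, value in features.items()
    )
    return json.dumps({"duration": round(duration, 1)})

# A Flask wrapper would expose score() behind a route such as
# @app.route("/model", methods=["POST"]), which is what the deployment
# cluster's serving endpoint gives you without writing that wiring yourself.
print(score(json.dumps({"distance_km": 10, "pickup_hour": 17})))
```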
I've taken that endpoint from the container platform and formatted my body right here. These are some of the features I want to send to the model: I want to know how long a taxi ride at this location, at this time of day, will take. So I go ahead and send the request, and quickly I get an output saying the ride will take about 2,600 seconds. So we've walked through how a data scientist can quickly interact with their notebook and train their model. Then, coming into the platform, we saw the project repository, we saw the source control, we can register the model within the platform, and we can quickly operationalize that model with our deployment cluster and have it up and running and available for inference. That wraps up the demo. I'm going to pass it back to Doug and Matt to see if they want to come off mute and whether there are any questions. Doug, you there? Hey, Terry, sorry, just had some trouble getting off mute there. No, that was an excellent presentation. I think there are generally some questions that come up when I talk to customers around how integrated into the Kubernetes ecosystem this capability is, and where Ezmeral starts and open source technologies like Kubeflow begin. Yeah, sure, Matt. So this is one layer up: we have our MLOps tenant, and it's all running on a piece of a Kubernetes cluster. If I log back out and go into the site admin view, this is where you see all the Kubernetes clusters being created, and it's all abstracted away from the data scientists. They don't have to know Kubernetes; they just interact with the platform if they want to. But here in the site admin view I have this Kubernetes dashboard, and on the left-hand side I have all my Kubernetes sections.
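The Postman request just described can be reproduced in a few lines. The sketch below only builds and parses the JSON; the feature names are hypothetical (not the platform's actual schema), the response is a canned example rather than a live call, and the curl line in the comment shows the equivalent command against the serving endpoint.

```python
import json

# Hypothetical feature payload for the taxi-duration model endpoint.
payload = {
    "pickup_latitude": 40.767,
    "pickup_longitude": -73.982,
    "dropoff_latitude": 40.748,
    "dropoff_longitude": -73.985,
    "pickup_hour": 17,
    "passenger_count": 1,
}
body = json.dumps(payload)

# Equivalent curl command against the deployment cluster's model endpoint:
#   curl -X POST -H "Content-Type: application/json" -d '<body>' <endpoint-url>

# A response like the one in the demo can then be parsed back:
response_text = '{"duration": 2600}'   # canned example response, not a live call
duration = json.loads(response_text)["duration"]
print(f"predicted ride duration: {duration} seconds")
```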
If I just add some compute hosts, whether they're VMs or cloud compute hosts like EC2 hosts, we can have those resources abstracted away and then create a Kubernetes cluster. Moving on down, I've created this Kubernetes cluster utilizing those resources. If I go ahead and edit this cluster, you'll see these hosts in a simple drag-and-drop interface: I can move different hosts around to configure my Kubernetes cluster. Once my Kubernetes cluster is configured, I can create a Kubernetes tenant, in this case a namespace. Once I have this namespace available, I can go into that tenant, and as my user I don't actually see that it's running on Kubernetes. In addition, with our MLOps tenants you have the ability to bootstrap Kubeflow. Kubeflow is an open source machine learning framework that runs on Kubernetes, and we have the ability to link that up as well. So coming back to my MLOps tenant, I can log in. What I showed is the Ezmeral Container Platform version of MLOps, but as you see here, we've also integrated Kubeflow, a nod to HPE's contribution to open source; it's all configured within our MLOps platform. So hopefully that answers the question, Matt. Yeah, actually, Terry, can you hear me? It's Doug. There were a couple of other questions about Kubeflow that came in. I wonder whether you could comment on why we've chosen Kubeflow, because I know there was a question about MLflow instead, and what the difference is between MLflow and Kubeflow. Yeah, sure. So just to reiterate, there are some questions about Kubeflow that I'm going to talk through. Obviously one of the people watching saw the Kubeflow dashboard there and couldn't help but get excited about it. And there was another question about MLflow versus Kubeflow and what the difference is between them.
Yeah, so Kubeflow is an open source framework that Google developed. It's a very powerful framework that comes with a lot of unique tools for Kubernetes. With Kubeflow, you have the ability to launch other notebooks, and you can utilize different Kubernetes operators like the TensorFlow and PyTorch operators. You can use some of the frameworks within Kubeflow to do training, like Kubeflow Pipelines, which let you visually see your training jobs within the Kubeflow dashboard. It also has a plethora of serving mechanisms, such as Seldon for deploying your machine learning models; you have KFServing, you have TF Serving. So Kubeflow is a very powerful tool for data scientists who want a full end-to-end open source stack and know how to use Kubernetes. It's just another way to do your machine learning model development. MLflow is actually a different piece of the machine learning pipeline: it mainly focuses on model experimentation, comparing different models, and tracking training, and it can be used together with Kubeflow. They're complementary, Terry, I think is what you're saying. Sorry, I know we're dramatically running out of time now. That was a really fantastic demo, thank you very much indeed. Exactly, thank you. So, yep, I think that wraps it up. One last thing I want to mention: there's a slide I want to show in case you have any other questions. You can visit hpe.com/ezmeral or hpe.com/containerplatform if you have any questions. And that wraps it up. So thank you, guys.