Hello, everybody. Thanks for joining. We're going to get started in a minute or two. While we're waiting, Don from Hewlett Packard Enterprise has shared a bunch of links in the chat, so feel free to check them out. It looks like they may not be visible to the newer attendees, so I'll go ahead and repost them now. Okay, the links are up in the chat; let me know if you still don't see them. Looks like it works. Great, and we'll get started in a minute.

Okay, let's go ahead and get started. I want to thank everybody for joining us today for the webinar "Building Dynamic Machine Learning Pipelines with KubeDirector," presented by Hewlett Packard Enterprise. Our presenters today are Tom Phelan, Fellow in the software organization; Karthik Mathur, Master Technologist; and Don Wake, Technical Marketing Engineer. A few housekeeping items before we get started: as an attendee you're not able to talk during the presentation, but you can ask questions in the Q&A tab at the bottom of your screen, and we'll be answering those at the end of the presentation. Also, this webinar is subject to the CNCF code of conduct, so please be mindful of what you say in the chat and in the questions, and please be respectful of your fellow attendees as well as the presenters. The webinar recording will be up later today on the CNCF webinar page. And with that, I'll pass it over to Don, Karthik, and Tom.

Great, thanks a lot, Julius. I'm Don Wake, and I'll have each presenter wave so you know who they are. Tom, go ahead and say hi. Hello, everyone. And there's Karthik right there. Hey, everyone. I love your shirt, Karthik. At first I thought it said "got milk," and I wondered why you were wearing a "got milk" shirt. No, it's "got ML." You've got to pay attention to details here.

Okay, let's move along. We're going to talk about building dynamic machine learning pipelines. So, machine learning: what is it all about when you're deploying an enterprise artificial intelligence application? How do you do that at scale, and what kind of tools do you have available to do all these things? Number one, it's a very complex thing to set up, execute, and operationalize, as the Gartner quote on the right points out: 60% of models may be developed with the intention of getting answers from them, but they quickly grow to a level of complexity that's hard to manage. The data sets change frequently, so it becomes hard to keep your machine learning pipeline dynamic and up to speed. And then Kubernetes is obviously the tool of choice that everybody is using to orchestrate and manage their containers, but it is also fairly complex. So what can we do to improve the usability of your machine learning pipelines? That's what we're going to talk about today. We've got our overview; stateless and stateful applications will be discussed briefly; and then KubeDirector, really the subject of this webinar, and how it is used to help you build these dynamic machine learning pipelines.
What you do when you're creating these clusters of KubeDirector apps, called KD clusters; and then, once you have your clusters, how you use them to train your ML model, register the model, create an inference deployment, and serve queries to get answers. So let's jump right in. Go ahead, Tom.

Thanks, Don. As Don pointed out, we're talking today about ML pipelines and the use of Kubernetes to deploy and control those pipelines. But before we can begin, we have to look at the application types that are involved in that pipeline. Now, as we all know, Kubernetes is best used for, or was mostly designed for, stateless applications. And what do we mean by stateless applications? We mean an application designed from the ground up as a microservices-architected application. That means the storage, the state of the application, is separate from the compute instance that is running. So when we use containers, we can horizontally scale the compute: we bring on more and more containers running in parallel in order to run our application more quickly, so we get to an answer more quickly, in ML parlance. These are typically referred to as cattle, because any of these containers can be killed, can exit, can move around, and they don't lose the state of the overall application. This is key when we think about the applications that we're using to build our pipeline. So, Don, let's go to the next slide.

Now we'll contrast stateless applications with stateful applications. Stateful applications are typically legacy applications. They may come from the '80s, the '90s, the early 2000s, before the industry embraced cloud-native architecture, microservices solutions, and that sort of thing. What these applications do is tend to co-mingle state with compute. With stateless applications on Kubernetes, as we spoke about, there's a separation of compute and storage. But with stateful applications there's some metadata, maybe a little bit of configuration information, that's unique to each container running a portion of the compute for that application. If that container exits, there can be a loss of state, a loss of continuity, that has to be rebuilt when we spin up a new instance to replace the missing container. These are referred to as pets. So even though they're horizontally scalable and there's some separation of compute and storage, if you lose a running instance of a container you have to be careful about how you bring it back to life, because each pet is unique. Even though there are a hundred or a thousand cats that look much like your own pet, your pet is unique; it has a different personality. So when we talk about these applications, and Karthik will get into more detail about the specific applications used here, they tend to be a little bit stateful, not entirely stateless. So we have to be careful as we assemble our pipeline. Don, can you move forward, please?

And the Kubernetes open source community has made a lot of progress over the years. Like I said earlier, Kubernetes was originally designed for stateless applications, but there are now various resources and objects to help: StatefulSets, the PersistentVolumes and PersistentVolumeClaims I already mentioned, and the concept of an operator, a custom resource definition backed by a specific piece of Go code, which helps manage stateful applications a little more easily. But they can be cumbersome.
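To make those primitives concrete, here is a minimal sketch of a StatefulSet that treats its pods as "pets": each replica gets a stable network identity and its own PersistentVolumeClaim. This example is ours, not from the webinar slides; the names and the ZooKeeper image are illustrative:

```yaml
# Minimal StatefulSet sketch: stable pod identities plus per-pod storage.
# All names here are illustrative placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk           # gives pods stable DNS names: zk-0.zk, zk-1.zk, ...
  replicas: 3
  selector:
    matchLabels:
      app: zk
  template:
    metadata:
      labels:
        app: zk
    spec:
      containers:
      - name: zookeeper
        image: zookeeper:3.6          # example image tag
        volumeMounts:
        - name: zk-data
          mountPath: /data            # per-pod state survives pod rescheduling
  volumeClaimTemplates:               # one PersistentVolumeClaim per replica
  - metadata:
      name: zk-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

Even with these primitives, the application-level choreography (which service waits for which, how members discover each other) is still left to the operator author, and that is the gap KubeDirector targets.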
Don, let's go forward. That's why we're introducing KubeDirector. And let me just back up a moment, in full disclosure to the audience. Karthik and I come originally from a startup company known as BlueData. BlueData was acquired by Hewlett Packard Enterprise a few years ago. At BlueData, we specialized in running artificial intelligence, machine learning, and deep learning applications in containers and on Kubernetes. What we found is that the existing tool set for setting up pipelines of stateful applications didn't really meet our needs. So we kicked off this KubeDirector open source project under the auspices of the CNCF, and that's what we're going to be talking about today. There's a link here on the slide to our publicly available GitHub site; go ahead and begin to play around with KubeDirector. So Don, one more slide forward, please.

Will do. And just real quick, I pasted this link into the chat window, labeled "KubeDirector code," if you'd like to see the source.

Thank you very much. So Karthik, I'll turn it over to you to talk about how we support legacy applications with KubeDirector.

Thanks a lot, Tom. So Tom talked about stateful and stateless applications, and about how important state is if you want to configure a legacy application. And we don't even have to go that far; take an example like Hadoop. It's a classic example of a stateful application. A lot of services depend on one service; in the Hadoop world, a lot of services depend on ZooKeeper. They need to know the state of ZooKeeper: how many nodes there are, where they are running, what the addresses are. Until ZooKeeper is up, they cannot complete their configuration. So that's a very good use case for something like KubeDirector. How do we achieve it? We maintain an in-cluster data structure that gives you a cluster-level view: what services are running, how many pods there are for every role. So a service A that needs to talk to service B can look at this data structure and do its orchestration accordingly. That data structure we call the config metadata; it is managed by KubeDirector and constantly updated as the state of the cluster changes. We also have lifecycle events, what we call guest config hooks; basically, they tell any service what's happening in the cluster. Is the cluster being expanded? Is it being shrunk? Nodes are being added or deleted, the config metadata is updated, and the service can react accordingly. That whole orchestration layer is what we call the app config, which relies heavily on the config metadata and the guest config hooks. And for any service to query this data structure, there's a nice little tool that KubeDirector injects whenever a pod comes up: configcli. Let's keep going; I think Tom is going to talk about how you deploy it.

Good. And we may have been remiss, so just to be clear: KubeDirector is a Kubernetes custom resource definition itself. It's an operator, a piece of Go code that is deployed and installed within a Kubernetes cluster.
And then what you can do with it is submit a JSON or YAML file to that operator, and it will manage the instantiation and the life cycle of that stateful application on Kubernetes. So here, with a standard kubectl command, we are creating a deployment of KubeDirector in our Kubernetes cluster. What we're showing with the little graphic is the KubeDirector operator. When you specify an application to be deployed and managed by KubeDirector, you provide a definition (we'll go through examples in the following slides) that specifies an image, which it will locate in the image repository, connection information, configuration information, and the scalable roles of that application. You apply it, and KubeDirector will deploy it. If you wish to update it, you use kubectl again, probably passing in a new configuration file, and KubeDirector will do the reconciliation: it will bring the running instance of the application into alignment with the new configuration specification from the YAML file. That's standard operator behavior on a Kubernetes cluster, but what we're providing is a solution that allows you to run any stateful application without having to write your own operator. All you have to do is provide some JSON or YAML to KubeDirector, and KubeDirector will do the rest. So I'll pass it over to Karthik to go through more detail.

Yeah, and just to give you an overview before Karthik talks about the applications, the KD apps, just to level-set: I think everybody on this call knows what we're talking about with a basic machine learning pipeline. This is what we're going to be working towards building out with KubeDirector. We have the training, we have the input data sets, and this is an illustration of your model once it's created, all of it circling around a central persistent storage repository where your data is prepped, stored, and accessible by the clusters, and where the models themselves are made available for inferencing. So Karthik, take it away.

Sure. Thanks, Don; thank you, Tom. Let's talk about KubeDirector applications. We know by now that KubeDirector is a Kubernetes operator. In the Kubernetes operator world you have custom resource definitions, CRDs, and instantiations of those are custom resources. The KubeDirector application is a generic CRD; it's where all the expressiveness for any given application comes in. Without writing any Go code, an application developer can define what the application is going to look like. We'll be going deeper into that, defining things like roles, what services are in a role, cardinality, and a lot of other things. So that custom resource is the KubeDirector application, and we'll look at some examples. For this presentation specifically, we'll be focusing on three KubeDirector applications: a training application, a Jupyter application, and finally a deployment application, and then on how we stitch them together using KubeDirector features like connections.

Awesome. And again, we pasted the link you see on this slide into the chat; look for the label "GitHub example applications" and you'll see exactly the code we're going to be walking through, which is right here. So back to you.
Okay, so this is a concrete example of a KubeDirector application. We'll be focusing on the green blocks; those are the areas of most interest. Now, KubeDirector knows about this custom resource called a KubeDirector application; what goes in it entirely depends on the application developer. They can define the roles, and behind the scenes, for every role, KubeDirector creates a StatefulSet to convert it into the Kubernetes-native resource. For every role you can define a bunch of services, and to every service you can attach metadata. All that definition happens in the KubeDirector application JSON (this can be YAML as well; they're used interchangeably). If we focus on the first green block, we're defining a role, and the cardinality here means just one pod for that particular role. The heart of it is the package URL, where you supply your orchestration code. KubeDirector invokes these scripts once the pod is launched for that role, and all the business logic for configuring whatever service you need in that particular pod happens there. That logic actively uses the config metadata, via the configcli utility, to know what's happening in the rest of the cluster. And then the role also takes an image, a Docker image. The second green block is pretty similar, just describing another role with a different image; in this particular example we have three roles using three different images. Every role has its own orchestration layer, which you supply via the package URL. And finally, you have a bunch of endpoints defined under services, and you can tell KubeDirector something more about those services, for example that you want a service to be secured, using a property called hasAuthToken. Once you do that, KubeDirector generates a unique authentication token, so that anybody trying to hit that endpoint has to supply the token in the HTTP header; inside the pod you can write the logic to verify that the token is correct. A lot of these features KubeDirector supplies out of the box, so people don't have to keep rewriting them in different operators, which is how it tends to happen outside of KubeDirector: whether you're writing Kafka, Cassandra, Mongo, or a machine learning application, you have to keep rewriting a lot of the same code. All of that is taken care of behind the scenes by KubeDirector; a trimmed-down sketch of such a definition appears just below.

Awesome. Thanks, Karthik. It really shows how you start to wrangle the difficulties you run into building these complex machine learning pipelines. So here's the example pipeline we're going to use, and even the data set for this is available online at our GitHub; I believe we linked that as well. The problem description: an artificial intelligence application has to predict the travel time for a proposed taxi ride. We're going to take this huge data set, years and years of compiled data from New York, and then allow the user to query specific points A to B to figure out what the average travel time would be. Back to you, Karthik. Actually, I'll quickly go through this one. So Don, thanks for setting up the demonstration we're going to show today. Karthik has already talked a little bit about the three components; we saw those in the YAML file. So let me piece this together.
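First, for readers following along, here is that trimmed-down sketch of the kind of KubeDirectorApp definition Karthik just walked through. The field names follow the public KubeDirector example catalog, but the apiVersion string, names, image tag, and package URL below are placeholders; the real demo files are in the "GitHub example applications" link in the chat:

```yaml
# Sketch of a KubeDirectorApp custom resource (all values are placeholders).
# Register with: kubectl create -f training-engine-app.yaml
apiVersion: kubedirector.hpe.com/v1beta1    # version string varies by release
kind: KubeDirectorApp
metadata:
  name: training-engine
spec:
  label:
    name: "ML training engine"
  distroID: example/training-engine
  version: "1.0"
  roles:
  - id: controller
    cardinality: "1"                        # exactly one pod for this role
    imageRepoTag: example/training:1.0      # Docker image for the role
    configPackage:
      packageURL: https://example.com/appconfig.tgz   # orchestration scripts
  services:
  - id: api
    endpoint:
      port: 8080
      urlScheme: http
      hasAuthToken: true    # KubeDirector generates a token callers must present
  config:
    selectedRoles:
    - controller
    roleServices:
    - roleID: controller
      serviceIDs:
      - api
```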
So, as Don points out, what we're going to do is build a model, a neural net model, that will allow us to predict travel time by taxi from point A to point B. So we need to train a model, and for that we will deploy a training cluster. That training cluster will typically be deployed on resources with GPUs, because GPUs are more efficient for training a model. We also need to deploy an inference engine; inference typically runs fine on regular CPUs and doesn't need the special power of the GPU. So we'll have two different specifications: a specification for our training engine, and a specification for our deployment engine. And then we'll also have a Jupyter notebook, as Karthik points out. Why a Jupyter notebook? Because it's easy for the data scientist; they're familiar with the Jupyter notebook as the way to interact with the training cluster and train it appropriately, given the data sets we'll be using. So what we're showing in this little box is the three kubectl commands that deploy the specification for each of the components we'll be using in our pipeline: our training engine, our Jupyter notebook, and our deployment engine. What's shown in the next few slides is how we assemble these, and how we share data between the training engine and the deployment engine. That's the real power of KubeDirector. So I'll send it back to Karthik.

Thanks, Tom. Now we're looking at the second important custom resource that KubeDirector manages: the KubeDirector cluster. The KubeDirector application is kind of a template, a fairly static thing where you define a bunch of mappings between roles and services, and services and endpoints. The more interesting part that KubeDirector manages is the instantiation of that KubeDirector application. You can create one or many KubeDirector clusters, and KubeDirector constantly watches this custom resource, reacting to anything that happens to it. A KubeDirector cluster looks relatively simple compared to a KubeDirector application, if you just look at the YAML. You've already predefined your application, which in this case, inside the spec, is the training engine, and you already defined its roles while defining the app; so now, for every role, all you're asking for is how many members and what resources each role gets. For this presentation we don't have a GPU YAML, but you can request a GPU as a resource (see the sketch just after this exchange).

Okay, and connections, I guess, are how you use that between the clusters? Yes, exactly. It goes back to the title of this presentation: building dynamic pipelines. There are two keywords there, pipeline and dynamic, and connections are at the heart of both. A pipeline means you're stitching together a bunch of things; in this particular case, a bunch of KubeDirector clusters. Those are unrelated clusters, and the way one cluster knows about another is through this KubeDirector feature called connections. We talked about the data structure called config metadata; if you want to extend the config metadata to carry information about other running clusters, that's what connections let you do.
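Here's that sketch: a minimal KubeDirectorCluster instantiating the hypothetical training-engine app from the earlier sketch, with no connections yet. Member counts and resource values are illustrative, and the GPU line shows the kind of request Karthik mentions rather than what the demo actually used:

```yaml
# Sketch of a KubeDirectorCluster custom resource (values are placeholders).
# Create with: kubectl create -f training-cluster.yaml
apiVersion: kubedirector.hpe.com/v1beta1    # version string varies by release
kind: KubeDirectorCluster
metadata:
  name: training-engine-instance
spec:
  app: training-engine        # references the KubeDirectorApp registered earlier
  roles:
  - id: controller
    members: 1                # how many pods to run for this role
    resources:
      requests:
        cpu: "4"
        memory: 16Gi
      limits:
        cpu: "4"
        memory: 16Gi
        nvidia.com/gpu: "1"   # optional: schedule the role onto a GPU node
```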
If you focus on the YAML on this slide: in the spec there is a connections section, and you list, say, two config maps. KubeDirector reads those config maps, gets the data, and sticks it inside the running cluster's config metadata. So now this cluster also knows about those config maps, and as those config maps change, the metadata changes and the services can react accordingly. For connections we have currently implemented three resource types: a connection can be a secret, a config map, or another KD cluster. But the idea is more generic; there can be more resource types in the future. You could extend KubeDirector so that another custom resource, perhaps one managed by an operator in itself, can be a connection. Once you do that, this cluster will know a lot more about what's happening elsewhere, and as those things change, this cluster is constantly evolving and reacting to the change. That's how we get the dynamic part of it. Now, this is a very simplistic pipeline, as you've observed; in actual machine learning you can drill down on every step. Training itself will do a lot of data munging and data preprocessing, and each of those can be a separate running cluster in itself; using connections, you can tie them all together into a nice pipeline.

All right. So this is what we're going to build. I think we've laid out the foundation for what the problem statement is. At the end of the day, as Karthik just mentioned, we're going to have three individual clusters doing individual jobs, all part of the pipeline, and we'll build this out step by step. As you can see at the top, there is a persistent storage layer. That's where our taxi ride data set will be, which the training cluster will read and train from; eventually our model gets stored in that same place, where it can be accessed by the Jupyter notebook and by the inferencing deployment engine, along with the scoring scripts, also in persistent storage. So let's jump right in and get this thing built.

We're looking at the KubeDirector training cluster here; we've been talking about the difference between the KubeDirector app and the cluster. It's a simple kubectl create, for people familiar with K8s, with Kubernetes: you just do a kubectl create or apply for a given YAML, and Kubernetes takes care of creating and managing that resource. This creates the first block in the pipeline, the actual training engine where the training computation will happen. Okay, let's keep going.

So we've built the ML training cluster. As promised, we're going to slowly build this picture out: there's our taxi ride data set, there's our KubeDirector cluster. And now comes the most interesting part for data scientists, the Jupyter notebook, which is a KubeDirector application itself. The Jupyter notebook is the most popular interface for people doing machine learning or any data science problem solving. In the same spirit, you could bring up any notebook; it could be Zeppelin or something else you're more familiar with. For this example, we've taken the Jupyter notebook. Now you have this lightweight notebook and this beefy training cluster, and using connections, multiple data scientists can spawn these lightweight notebooks and connect them to the training cluster.
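Putting the two pieces together, here is a sketch of how that notebook cluster might declare its connections. The names are invented for illustration; the three list keys mirror the connection types Karthik lists (clusters, config maps, secrets):

```yaml
# Sketch of a KubeDirectorCluster using connections (names are placeholders).
apiVersion: kubedirector.hpe.com/v1beta1
kind: KubeDirectorCluster
metadata:
  name: jupyter-notebook-instance
spec:
  app: jupyter-notebook
  connections:
    clusters:                   # other KD clusters exposed in config metadata
    - training-engine-instance
    configmaps:                 # e.g. a config map describing a trained model
    - taxi-model
    secrets:
    - notebook-credentials
  roles:
  - id: notebook
    members: 1
```

As those connected resources change, KubeDirector refreshes the cluster's config metadata, which is what makes the pipeline dynamic.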
And we have extended the notebook, as part of the KubeDirector examples, to utilize this connection so that you can remotely submit your code. Whatever you're doing in the Jupyter notebook need not be executed locally; it can be posted to the training cluster, so you get access to the beefier resources, more GPUs.

So we've got that beefy cluster over here with physical resources, potentially GPUs, where we can do a lot of that hardcore machine learning. And as you mentioned, with just a couple of kubectl create commands and definition files, you've got a connection between the lighter-weight Jupyter notebook the data scientist is using and this cluster it's accessing and driving. And this slide, I guess, talks about how the Jupyter notebook itself utilizes that connection?

Right. As I mentioned on the last slide, we've added some intelligence to the Jupyter notebook about this feature we've been talking about, connections. Jupyter gives you some hooks to extend the interface; those hooks are called magics. We have defined our own magic: you see %attachments there. That's a line magic that runs some code behind the scenes to get all the metadata for a given connection; in this case, it tells you what your training cluster is. You can continue to add more training clusters by adding connection snippets to your notebook, and when you rerun this magic you'll see more than one here if you've added more, or fewer if you've removed some. All of that is real time. And as you add and remove clusters, you can try your training example on different clusters, all from one notebook session.

Great. We actually had a question come in from Rock about what the Jupyter notebook is, and hopefully what we just showed gives you an idea. How would you put it, Tom, Karthik? It's really what runs the show; it's your direct interface to your abstract model. We're doing all the hard plumbing in the background, but the Jupyter notebook is where you're really building the model, where you make things happen, right? Right. They traditionally used to be called IPython notebooks; now it's the Jupyter notebook, and they continue to add more and more features. It can give you authentication against external LDAP servers, for example. It's a very nice, industry-standard tool that's constantly evolving for data scientists, and that's why it's so popular. It's predominantly used for writing Python code, but with the example we've shown, even when you're in a Python kernel, using magics you can put in your R code, and we'll take that code and post it to an R kernel, or one in some other language. Awesome, thanks.

Okay, so we're building out our pipeline here, showing the training engine with the connections, and now we actually see a model also being stored in the central persistent storage. Now, the config map. A model is a very subjective concept; it depends on what library you're using. Once you have a model, to deploy it in a properly productionized pipeline you need to serialize the model, and then you need a way to deserialize it and expose an endpoint that people can hit to make inferences; the sketch below shows roughly how the demo records that hand-off.
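As Karthik explains next, the demo captures the model hand-off in an ordinary Kubernetes config map. Here is a minimal sketch of what such a config map might contain; the keys, paths, and the label marking it as a model are our invention for illustration, and the real demo files are in the example applications repo linked in the chat:

```yaml
# Sketch of a config map representing a trained model (keys/paths are placeholders).
apiVersion: v1
kind: ConfigMap
metadata:
  name: taxi-model
  labels:
    kubedirector/cmType: model     # hypothetical label tagging this as a model
data:
  name: taxi-trip-duration
  description: "Predicts NYC taxi trip time from pickup/dropoff features"
  model-version: "1"
  path: /mnt/persistent/models/taxi-model.pkl      # serialized model on shared storage
  scoring-path: /mnt/persistent/models/scoring.py  # script that deserializes and serves it
```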
For us, we have represented the model in a very popular K8s resource type, the config map, which is nothing but a key-value structure. You provide the path of your serialized model: where you stored it on persistent storage, as part of the training job run on the training cluster. And then the intelligence about that model (what libraries it was serialized with, how to deserialize it, how to plug new input into it) goes into a scoring script, which whoever is interested in deploying this model will have to write. And we record where the scoring script is and where the model is in this config map. Again, we'll be using the connections feature while creating the deployment cluster, to make the deployment engine aware of what models you want it to deploy.

Great. So we've now gotten to the point where we almost have the full thing built. We've got our config map built, and now we're going to create an inference server deployment. Right. So the deployment engine is the third block in the pipeline: you have your training, you have your notebook, you have your config map, which represents your model, and now, finally, you want to deploy it. So again, bring up a deployment engine, which has the compute libraries to run the scoring script, and just connect it with the config map. That's what we're doing here: we defined a model in a config map, and we use connections to let the deployment engine know about that model. And I see here it says KubeDirector can inject the contents of the config map into the JSON within the clusters. Exactly, and that's what the connections feature is. Once the cluster is connected, KubeDirector will be constantly watching that config map for you; as things change there, it reads the contents and sticks them inside your config metadata. And you have configcli, which you can constantly query. On top of that, we also provide the lifecycle event hooks, which tell you that things have changed: do you want to reconfigure, do you want to make use of this new model? So all of that is happening and changing in real time, and things are constantly evolving. That's the dynamicity of this pipeline.

Fantastic. So what would you say is happening here? This is it: finally, you have your ML pipeline running. You have published an endpoint for whatever that model was; in this particular example, you're interested in predicting how much time it will take to go from point A to point B, based on the training data set, which was for New York City. While training the model, these were the features of interest to the data scientist: latitude, longitude, day of the week. You give new input for the deployed model through the pipeline, it goes to the deployment engine, you plug it in, and it makes a prediction for you. So at the end of the day, 1,679 seconds is the answer. Yes. It's not the most optimized model. No, it's actually fantastic, because this is really demonstrating everything: this is from the perspective of a user coming from some URL anywhere into this model, and after a few simple inputs you get the output. Of course, as you mentioned before, it's simplified for demonstration. Exactly. And this is just making a REST API call to the deployed model on the published URL. It could be an RPC call.
Instead of Python, you could write a REST client in another language, or use Postman or any REST tool, to consume the model as an end user. Okay, so that brings it all full circle. And it looks like we've got lots more to tell people about; this is a very live open source project with more to come. Right. For this example we're showcasing how you can build dynamic machine learning pipelines, but this project is much more than that. It's trying to solve the problem of rewriting operators every day for every application; whatever the common day-two operations are, we're trying to take care of them inside KubeDirector. We'll continue to enhance it based on the use cases we're seeing. Like we mentioned, the concept of a model is very simplistic right now, just a config map, but it could be a CR in itself, with much more intelligence, and we could monitor it better once we define it as a CR. We're brainstorming about that. We'll be enriching our example catalog, which already has quite a few applications; we'll be adding more, something like distributed TensorFlow to utilize GPUs. We can add more CRs that make the pipeline richer, bringing things like a model registry, data set management, and feature engineering applications into the mix. A secret is already better than a config map because it's base64 encoded, but we want to solidify that further and encrypt those secrets, so that people who can kubectl exec inside the pod cannot just decode them. We're working on that. There are policies for role scaling, placement constraints; there's a whole bunch of things we're already working on. All those issues are listed at the URL, Don; people can just go there, take a look, start contributing, and ask questions.

Yeah, I was just going to say: contributions. That's the beauty of this; it's an open source project. These are the things that we have seen as the experts and founders, but Tom, is there more you would say, given this is open source? Sure. We're trying to get to the incubation stage; we're in the process of going through that with the CNCF, and of course we'll move through the pathway to a fully supported, released open source project within the CNCF. But please, don't let that slow you down. We'd love to build out the community of developers. The more people we have working on this, the more quickly we can enhance it and add additional functionality.

And Karthik, what's this? We have basically all the examples we showed you, and a whole lot more, available; they're also pasted in the chat, right? Right. We are constantly updating the open source project with more and more applications; internal teams are building them, and outside folks are also continuing to build and add more. So we have already built quite a few to start with: legacy applications, if you call them legacy today, using the Cloudera distribution, using MapR, and Kafka. It can be a huge pain to write an operator for something like Kafka, right? And we've tried to solve that by having a KubeDirector application that takes care of Kafka.
We have quite a few machine learning applications, like TensorFlow; Spark is another very popular one, which can be used in machine learning as well as for general compute; and many more, and we'll continue to add more here. I want to point out, similar to how BlueData was acquired by HPE, we've also acquired MapR, which is very popular on the data side of machine learning. That's why you see MapR 6.1.0 there; it has been rebranded as the Ezmeral Data Fabric, and it's integrated into the HPE Ezmeral Container Platform as well.

So that completes the main part of our presentation. We've got our Twitter handles here, so you can bother us; please bother us as much as you like. We love this stuff; we're the nerds. We have a Slack channel, you can reach us by email, and the blog post is also pasted in the chat. There was a question about when you can get this actual presentation; Julius can answer when that will be available. But Tom, take us back to the beginning here and wrap it up.

Sure. Thanks, Don. Remember what we spoke about at the very beginning: the difference between stateful and stateless applications. What we've demonstrated today is that, with TensorFlow and the Jupyter notebooks, you can use KubeDirector without having to write your own operator for TensorFlow or Jupyter. Just by passing in the YAML files, as we've shown here, KubeDirector is able to deploy and manage those applications. It doesn't have to be TensorFlow; it could be Spark, it could be something else. As Karthik pointed out, we're adding more examples of the YAML files, but we'd love to have other people jump in and provide YAMLs showing how you can run other stateful applications with KubeDirector on Kubernetes. That'll help us demonstrate the usefulness of this open source project, not only for machine learning pipeline deployment but for all types of applications.

And we've been trying to keep up with some of the questions; please go ahead and type them in right now, in the chat or the Q&A. I'm not seeing the Q&A pop up, Julius, so you can take over on Q&A. Yeah, everybody, feel free to ask your questions in the Q&A tab at the bottom right of your screen. We've already got one from Steven: how does this relate to, or interact with, Kubeflow in the context of MLOps? And will this also be available in Ezmeral CP? Anyone want to answer that?

Yes. Since you mentioned Ezmeral CP, I can briefly answer that, but KubeDirector is an open source project in itself, so it's pretty extensible in terms of what you want to use it with. Specifically in ECP, we have plans in coming releases to tie it nicely with Kubeflow, where a KubeDirector application can be your starting point, where you're building your application and coming up with a pipeline. Then you want to take it to the next level, where Kubeflow already has richer components in terms of hyperparameter tuning and composing workflows using Argo, and you can utilize those. How do you bridge them together? First you need common storage. We're trying to solve that problem, because Kubeflow solves the storage problem slightly differently, where every user has their own persistent volume, which is not the case in KubeDirector.
So in ECP specifically, we'll try to solve it so that users, depending on what tenant they belong to, get the tenant storage. If they have access to the common storage and can get the serialized model, then, having trained the model in KubeDirector, you can deploy it using Seldon or KFServing on the Kubeflow side. That's a nice bridge, and those are the kinds of things we're already working on; you should see some of them in future releases.

Thank you, Karthik. Let me just clarify a couple of things here. For the audience's sake, when we say ECP, that stands for Ezmeral Container Platform. For full disclosure again, the Ezmeral Container Platform is an HPE product; it deploys and manages your applications on Kubernetes. What we presented today is an open source solution called KubeDirector that does work on the Ezmeral Container Platform from Hewlett Packard Enterprise. However, you can also deploy KubeDirector on any open source Kubernetes system, so what we're showing is general usage. And then there's a great question: what are your plans, how are you going to integrate with Kubeflow, are you planning to replace Kubeflow? The answer is no. Kubeflow is a wonderful tool. We're going to augment KubeDirector so that, together, these two tools provide additional functionality, as Karthik pointed out.

And Julius, I don't know if you saw the other one that came through in the chat. Okay, go for the Spark question. Yeah, I can take it: is there a difference between using KubeDirector to manage Spark and using the Spark operator? That's a good question. Tom just mentioned ECP; we also have the Spark operator there. Fundamentally, the difference is that the Spark operator is more native to Kubernetes. We're not trying to say that KubeDirector is one size fits all, that it can solve every kind of problem that all the operators are solving individually. No, it's more generic: if it doesn't serve your use case, then you evaluate whether you can make use of it or whether you have to write your own operator. So for Spark consumers, if they want the resource manager to be native Kubernetes, they should use the Spark operator. The Spark application that we have will include some resource manager as part of the virtual cluster, which could be YARN or Spark standalone. KubeDirector just launches pods for the given roles; the more intelligent resource management it leaves to a resource manager and relies on that. That's where a specific operator like the Spark operator will be ahead of a KubeDirector Spark application. Does that answer the question? Yeah, that one came in from Rock, so if he has a follow-up, we still have a few minutes.

Does anybody have any other questions? And another question: what are the advantages of KubeDirector over Network Service Mesh in providing connectivity among clusters? Tom, you want to take that? I'm not entirely sure I understand the question. A service mesh is a way for different applications to find out about each other and to connect. I wouldn't say that KubeDirector has an advantage over Network Service Mesh; it would use it. So when we deploy the inference cluster or the training cluster, it would register itself with the service mesh.
And then the pipeline would be assembled by querying the service mesh to get the endpoints for the components of the pipeline. Right. And just to clarify, the connections feature is not doing anything at the network layer; it's just giving the config metadata of different clusters to a given cluster. When you're talking about a service mesh, something like Istio, that's definitely a very different concept. What we can do is have a KubeDirector application utilize something like Istio, and we've been talking about that. So they would be complementary, not competing, for sure. Great. Well, I hope that answered your question; if not, please feel free to ask a follow-up. And if there are any more questions, please ask away in the Q&A tab.

I've got one for Karthik: how do I get one of those shirts? Oh, these shirts? Yeah, where'd you get that? There used to be tons of these in the BlueData office; I think we came up with them for one of the events. Well, we own them then, so we should just rebrand them. I think we just got rid of that building. Yeah, I mean, we own .ml; we've got to have more t-shirts, that's it. I think it's standard swag for the Ezmeral Container Platform now. Next time we have a face-to-face conference and we're giving out t-shirts again, come by; we'll be happy to chat and give out t-shirts. Absolutely.

Okay, well, if there are no more questions, we can go ahead and wrap up, but feel free to keep asking away. I want to thank Tom, Karthik, and Don for the great presentation and for the Q&A. We at the CNCF will have the slides up later today on the webinar page, cncf.io/webinars. And it looks like we have a question that just came in. Yeah: is configcli similar to... oh, no, I think I answered that one already. There we go. Could you please give a brief explanation of KubeDirector and Kubeflow and their differences? Right. That's a great question, actually, since we are talking about machine learning pipelines. KubeDirector is a more generic operator; it's not specific to any particular domain like machine learning. It basically lets you build any kind of application. For this presentation, we utilized KubeDirector to build an end-to-end machine learning pipeline. Kubeflow is an operator in itself that lets you build these pipelines and gives you various components; it's a specific technology, and for any technology you can build an operator in the Kubernetes world, and that's what Kubeflow is. KubeDirector might have some overlap, but essentially it's very different. KubeDirector lets you build your own application without writing any Go code. Kubeflow is an operator in itself; to extend it, you have to write Go code, and to add more components to the Kubeflow operator, you have to bring in new operators. KubeDirector is an operator where you don't have to write new operators, just new CRs, a KubeDirector application for every new application.

Great. Well, thank you, guys, for a great presentation. The webinar recording will be up later today on the CNCF webinar page, and I'd like to thank everybody for attending. Have a great day, everybody. Thank you. Thank you. Thank you.