Welcome to the Red Hat Data Services Office Hour. Maybe we should rename this to All About Data Analytics, so don't be surprised if the name changes. I'm Michelle DePama, Red Hat Principal Solutions Architect, and today I'm hosting a RHODS partner call with Intel. Remember that RHODS stands for Red Hat OpenShift Data Science, which is an add-on to our OpenShift managed services. But today's focus is Intel's oneAPI AI Analytics Toolkit. I expect we'll have time for introductions, we'll set up the topic, then we'll do a deep dive into the toolkit including a demo, and hopefully we'll have time for Q&A. I'm here with Audrey and Rachel. Audrey, fellow Red Hatter, could you give us a brief introduction? Of course. Can you hear me okay? I just wanted to... Yes, and we can see you. Lovely, my camera's working now. So hello, everybody. My name is Audrey Resnick. I'm part of the Red Hat Data Science team here at Red Hat. I have been with Red Hat for a year, but I've been in the IT industry for over 20 years, from full-stack development to data science. And as a data scientist, I have had the pleasure, or you could say the torment, of developing and deploying AI/ML models into production, in a world of Jupyter notebooks, VMs, GitLab runners, S2I containers, public and hybrid cloud, on-prem, et cetera, plus OCP, the OpenShift Container Platform. I'm very excited to work with the Red Hat Data Science team and specifically to share information on the Red Hat OpenShift Data Science offering with you. Rachel will be introducing herself, and Michelle, when we're ready, just cue me and I'll get into my topic for today. Okay. Hi, Rachel, can you introduce yourself? Hello, yes. My name is Rachel Oberman. I am a technical consulting engineer here at Intel. What that means is that I work with the product teams for our AI offerings, in particular the AI Analytics Toolkit, which I'll be talking about today, to make sure that our products are the best they can be and the customer's voice is heard. In turn, I also work directly, end to end, with customers, either on high-touch engagements or on enablement at conferences and trainings such as this one. It's a very exciting job, and I'm really excited to talk about what Intel and Red Hat are doing today with Red Hat OpenShift Data Science and the AI Kit. Awesome. Okay, so Audrey, I was going to ask you to set the topic up. So I'm going to share; I think, Rachel, is this your screen I'm sharing? Yes, I think so. Okay, it's a few slides down, but there's an overview slide, and since we don't actually have this chat every week, I think we need to remind people of where we are in our ecosystem. So back up one, maybe. No. It's our favorite slide. Keep going. There we are. Wait, oh, these are the slides I see. Maybe you could share your screen if you have the slides up. Yes, I do. I see it. Nope, I'm sharing your screen right now. So, Audrey, can you tell us where we are? Is this the slide you were thinking of speaking to? Oh, did we lose her? We may have lost Audrey. Hang on. I think we're having technical difficulties; we're all streaming from home, so when we do these things this happens once in a while. It does, it does. Okay, so let's see. Hang on.
Okay, I'll let her know. We can't see or hear her at the moment. I'm just going to go back to the title slide right now. Yeah, honestly, if you want to start talking, you can. The idea was just to set the topic up and show where we are in our ecosystem, and right after that we were going to go into your slides anyway. I think she'll rejoin; she just dropped off. Okay. All right. So go ahead, Rachel. Yeah, sure. So I think Audrey was supposed to go first and set up the Red Hat and OpenShift side, and then I would talk about the AI Kit. But maybe we could talk about the AI Kit first, and then talk about Red Hat OpenShift and how the two connect together. I think that sounds good. So I'm actually going to skip ahead to that part, and we'll get back to the other material later on. Let's talk first about the oneAPI AI Analytics Toolkit and how Intel, as a partner, is enabling the AI Kit in Red Hat OpenShift Data Science. To understand the oneAPI AI Kit, the first thing you have to understand is: what is Intel oneAPI? Essentially, this is a big drive by Intel built on two specific principles. One is performance. The other is open-source programming across XPUs, and I'll explain what XPU means in just a moment. oneAPI is not just AI libraries, and it's not just Python; it's also built in C++, Fortran, and so on. It's basically taking all of these low-level, beloved libraries and building them on C++ and the SYCL standard, keeping them all consistent, so that you're not dealing with different standards or different compilers. You're able to mix and match between the libraries, and on top of that, they're designed to run on all different types of hardware. So we're not talking about just Intel CPUs; we're talking about GPUs, FPGAs, and so on. That's the XPU idea, where X stands for whatever kind of accelerator you're trying to run on. The idea is that once you are coding on CPU, you should be able to port your code to, say, GPU with much more ease, rather than dealing with a very fragmented ecosystem, which is really hard to handle. Now, with that said, what you see on the right is a mix of different things that are all part of the oneAPI ecosystem. We have the languages encompassing oneAPI: C++ with the SYCL standard and that DPC++ compiler I mentioned, Fortran, and Python. We have libraries that Intel has built and maintains, such as oneTBB, oneMKL, oneDNN, oneDAL, and oneCCL in particular, which are of great importance to the AI Kit. And then we have media and rendering libraries as well, such as Embree and OSPRay. With oneAPI, we actually package many of these libraries to cater to different personas. In this case, for example, the AI Kit persona is a data scientist or AI developer, and I'll talk more about that in a moment. And beyond the libraries, we also provide tools to help with these libraries, such as the DPC++ Compatibility Tool and the VTune Profiler. Now, that's everything going on with oneAPI, and with all of these libraries built in C++, I'm sure if you're a data scientist or an analyst, you're thinking: I don't want to learn C++, that's a bit too much.
I just want to go ahead and code in the libraries I already know, and that's okay, because what I'll be talking about today is how we've taken these very performance-optimized libraries and put them underneath beloved frameworks such as TensorFlow, PyTorch, scikit-learn, and so on, so that in a drop-in acceleration format, with less than, I'd say, five lines of code (maybe two is the average), you can accelerate these beloved libraries to a significant degree just by adding a line of code or turning on an environment variable. You don't need to overhaul your whole framework or your whole pipeline to fit a specific framework or a specific piece of hardware. That is what's powering the oneAPI AI Kit libraries underneath, and I'll give an example later of how that works, as well as a demo. But let's talk about what is included in the oneAPI AI Analytics Toolkit, and what you see on the right is all installed by default when you install the AI Kit operator for Red Hat OpenShift Data Science. As I mentioned before, the goal of the oneAPI AI Analytics Toolkit is essentially to accelerate end-to-end machine learning and data analytics pipelines with frameworks and libraries optimized for Intel architecture. So: go really fast, but don't compromise the integrity of your pipeline in doing so, and don't go insane trying to fit a specific architecture or framework. You can use your favorite frameworks without having to fight to make your hardware work for you. In terms of who this is targeted toward, these accelerations are aimed at data scientists, AI researchers, and so on. Now, in terms of the actual optimizations you get with the toolkit: in the deep learning realm, you get the Intel Optimization for TensorFlow and the Intel Optimization for PyTorch, which are powered underneath by oneDNN, the oneAPI Deep Neural Network Library, to bring great accelerations to TensorFlow and PyTorch. We also have the Model Zoo for Intel Architecture, so if you want pre-trained models that are already well optimized by Intel, you can pull those from the Model Zoo. And then we have the Intel Neural Compressor: if you have a great model coded in, say, FP32 and you want to convert it to INT8 for better performance and memory usage, but you don't want to lose accuracy, you can run it through the Intel Neural Compressor and it will search out the best way to do that conversion without losing your accuracy. This tool was previously named the Low Precision Optimization Tool, or LPOT for short.
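To make that quantization flow concrete, here is a minimal sketch of post-training quantization with Intel Neural Compressor. This assumes the neural_compressor 2.x Python API; fp32_model and calib_loader are placeholders for your own FP32 model and calibration dataloader, so check the release notes for the API that ships in your AI Kit version.

    # Minimal post-training quantization sketch (assumes neural_compressor 2.x;
    # fp32_model and calib_loader are placeholders you supply yourself)
    from neural_compressor import PostTrainingQuantConfig
    from neural_compressor.quantization import fit

    config = PostTrainingQuantConfig(approach="static")  # static INT8 quantization
    q_model = fit(model=fp32_model, conf=config, calib_dataloader=calib_loader)
    q_model.save("./int8_model")  # tuned to stay within the accuracy criterion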
Now, in terms of machine learning, we have the Intel Extension for Scikit-learn, which plugs directly into typical scikit-learn; when you add two lines of code to it, and I'll be showcasing this later, you see a huge, huge performance benefit. And we have optimizations for XGBoost included as well. In the data analytics realm, and this is really exciting for me personally, we have the Intel Distribution of Modin and the OmniSci backend that we help power and maintain with it. With the Intel Distribution of Modin in particular, you can think of it as distributed pandas: with a single-line code change you can distribute your pandas DataFrames, which typically run on a single core, across all available cores. So if you hate waiting for read_csv to run for hours on end, don't worry, this really speeds it up. And in core Python, as I mentioned with pandas, we also have Intel optimizations for NumPy, SciPy, and Numba that are backed by our oneMKL library, and we are also working on DPPY, which is part of that larger XPU drive I talked about at the beginning: enabling Python, and these frameworks, across all of your architectures and all different hardware. Now that I've told you everything that's included, why should you care? Why should you use the oneAPI AI Kit on Red Hat OpenShift Data Science? Besides really maximizing your performance, you also get the opportunity to scale. We've added features in things like the Intel Extension for Scikit-learn and Modin where you not only get this huge performance gain, but, say in the case of Modin, you can also distribute and scale your data to sizes the stock library may not handle. You also get streamlined end-to-end workflows: all of the components I listed before work seamlessly together, so you have everything in one place and you don't need to worry about one thing not working with another. That improves your productivity, because everything is packaged together and you don't need to worry about version conflicts; no more "do I have to downgrade my scikit-learn because of this pandas thing I'm trying to use?" You're also speeding up development, because when you're trying to make these frameworks work with specific hardware, you'd otherwise have to learn a brand-new library to fit your specific framework, and that means a lot of time and a lot of hidden costs in learning a new framework, or a new piece of hardware and how to optimize it. By being able to skip all of that, you save a lot of time just by adding two lines of code and using the Intel optimizations from this toolkit. Rachel, we have a question: does the Intel Distribution of Modin come packaged with Ray and Dask?
Yes, correct. The Intel Distribution of Modin comes with different backends: it has both a Ray and a Dask backend, so you can choose between them as needed, and you can also turn the OmniSci backend on and off. We partnered with a company known as OmniSci, a really great partner, and their library, I think it's the OmniSciDB library, runs in the backend to bring even more performance on Intel hardware, and you can toggle that on and off too. So you can use Ray or Dask depending on what you need; it's a really great library that's helping in the backend of Modin do the heavy lifting of distributing work across all available cores. Fun fact.
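For reference, the single-line change being described is just swapping the pandas import for Modin's drop-in replacement. A minimal sketch, assuming Modin is installed with the Ray engine available; the CSV path is a placeholder:

    # One-line drop-in replacement: pandas operations such as read_csv
    # now distribute across all available cores
    import modin.pandas as pd  # instead of: import pandas as pd

    # Optionally select the execution engine explicitly (Ray or Dask)
    import modin.config as cfg
    cfg.Engine.put("ray")

    df = pd.read_csv("census_2020.csv")  # placeholder path; reads in parallel
    df = df.dropna()                     # same pandas API, distributed underneath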
So, with that, and we'll be talking about this more (I'll probably come back to this slide when Audrey comes back), we've actually enabled all of these great optimizations within the Red Hat OpenShift Data Science platform. We've enabled the AI Kit in a friendly JupyterLab interface, including a pre-installed environment. When you go into JupyterLab, and I'll be showing this later, you get three pre-installed conda environments, so these optimizations are there for you from the start, or if you want to mix and match or build your own environment, you can do that as well; those three starter environments are just there to help you get going. And on OpenShift, of course, this is all available as a Red Hat certified operator, and we also have options for inference and deployment with deep learning frameworks such as TensorFlow, working with the OpenVINO toolkit, which I believe was covered previously on this program, for deployment as well. Now, I've been talking about how you get these performance benefits and how they actually work, but what is the performance benefit you can actually see? I'll be showcasing this later in a demo using our XGBoost daal4py inference optimization, but as you can see from this diagram, you can actually connect and stack these different optimizations together so that you can optimize every single step of the data science pipeline. Starting on the left with the data preprocessing step: by using the Intel Distribution of Modin and making just a single-line code change, you can see up to 38 times faster ETL in the case of a US Census 2020 workload that we ran; ETL being where you're dropping NaNs and cleaning up the data to get it ready for your model training. Then, using the Intel Extension for Scikit-learn on that same use case (I think ridge regression was used in this particular one), when you use it for both fitting and predicting you can see up to 21 times faster fit and prediction compared to stock, unoptimized scikit-learn, just by turning on the simple drop-in acceleration, for an overall 40% accelerated performance. Now, looking at the model training step: it's not only about scikit-learn, but I'm going to talk about it a little more because I'll be covering it right after this slide. With the Intel Extension for Scikit-learn, as you saw in that use case, it was around 21 times faster fit and prediction, but you can actually see up to 100 times faster performance, or even 200 times, and I think in recent benchmarks we've even seen over a thousand times faster than regular stock scikit-learn, through the optimizations we have in our oneAPI Data Analytics Library. We also compared against our competitors: this can be up to 10 times faster compared to an Nvidia GPU, or up to five times faster compared to running on an AMD CPU. Now, that's all great for data preprocessing and classical machine learning, but moving on to the deep learning side of model training (and full disclosure, we have TensorFlow optimizations as well, just not shown here): with PyTorch, if you use our PyTorch optimizations you can see up to 1.55 times faster DLRM training by using the BF16 instruction set rather than FP32, and with those same optimizations, DLRM inference can be up to 2.8 times faster using INT8 versus FP32. Now, in terms of XGBoost optimizations: actually, as of XGBoost 0.81 we've been upstreaming our training optimizations into mainstream XGBoost, so if you use XGBoost with the hist tree method, you already get those. But we have even further optimizations in model inference, and that's up to 4.5 times faster inference compared to an Nvidia GPU. And for TensorFlow, if you use the TensorFlow optimizations and quantized inference (again, that Neural Compressor tool converting your model from FP32 to INT8), you can see up to 2.8 times faster quantized inference. Now, some of these numbers are really huge and some are smaller, but let me tell you: if that 2.8 times faster is the difference between hours and minutes, that can be a huge, huge gain for a data scientist. They can spend less time waiting around and more time improving their model and getting other things done. So it's a really exciting thing to see. Now, with that, let's take an example of how a user would actually use some of these optimizations. I will begin by talking about the Intel Extension for Scikit-learn, and then I will give an example of how to use our XGBoost and daal4py optimizations. You may be thinking to yourself, what is daal4py? Don't worry, I'll walk you through it; you can think of it as basically the Python version of that oneDAL C++ library I talked about at the beginning. So, if I am a user on the AI Kit operator for Red Hat OpenShift Data Science, how can I actually use these optimizations? Let's say, for example, I'm trying to use this for a support vector classifier, as in the case of this little code snippet here. What I want to do is use the Intel Extension for Scikit-learn, which is powered by the oneAPI Data Analytics Library and its Python API equivalent, daal4py (which can also be used directly if you so choose), while staying within my regular scikit-learn API. So, I'm ready, I know a bit about how to turn it on, let's go ahead and do it.
I take my original, regular scikit-learn code and I add these two lines: from sklearnex import patch_sklearn, and then a call to the patch_sklearn() function. What this is telling the computer is: hey, anything that can be optimized by the Intel Extension for Scikit-learn and the oneAPI Data Analytics Library, which is one of the most optimized machine learning libraries we have here at Intel, please do so; and anything that is not optimized by the oneDAL library, please just run through regular scikit-learn. So there's no loss, only gain. There are also options where, if you don't want to turn on everything that oneDAL can optimize, you can patch just a specific model, or whatever you choose, and you can turn this off at any time with the unpatch method: you would just use unpatch_sklearn instead, and that's all you need to do. And suddenly your SVC fit and predict are really optimized. Now, this already comes pre-installed for you as part of the AI Kit operator, but if you want to try it out as a standalone component in, say, a sandbox environment, it is available in the default conda channel, the Intel channel, conda-forge, and via pip install as well. Now, in terms of what this looks like as an actual speedup compared to regular scikit-learn, you can look at the chart here. On the left is a bunch of popular algorithms, such as k-means fit and predict, random forest, and so on; we even optimized train_test_split, a very popular preprocessing method used in the scikit-learn library. All we've done is take our code and add those two lines, and this is how much speedup you can see. In some cases, as with support vector classifier predict on the MNIST dataset, you can see up to 200 times faster. That is really the kind of performance you can get just by adding that simple drop-in acceleration; no further change is needed. Once you use the patch method, everything is turned on for you, and again, it's very flexible: you can turn it off at any time, or enable just a specific algorithm. But the results of using this are very, very big, and all you did was add two lines of code using the AI Kit operator, which is a really huge thing if you think about the alternative of changing your whole workload to chase this kind of performance gain, versus getting it in two lines of code without having to deal with C++ or anything else you would typically need to do.
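To recap what those two lines look like in practice, here is a minimal sketch of the patching pattern on a support vector classifier; X_train, y_train, and X_test are placeholders for your own data:

    # Two-line drop-in acceleration: patch scikit-learn with the Intel extension
    from sklearnex import patch_sklearn
    patch_sklearn()  # call before importing the estimators you want accelerated

    # From here on, regular scikit-learn code runs on oneDAL where supported
    from sklearn.svm import SVC
    clf = SVC(kernel="rbf").fit(X_train, y_train)
    preds = clf.predict(X_test)

    # To revert to stock scikit-learn at any time:
    # from sklearnex import unpatch_sklearn; unpatch_sklearn()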
Now, before I move on to the next bit and then the demo: as I explained, the Intel Extension for Scikit-learn is powered by the oneDAL library, the oneAPI Data Analytics Library. What we've done is take that library, which is coded in C++, and wrap it in a Python-friendly library known as daal4py, so it has a Python equivalent. You can think of it as a cake: there's oneDAL underneath in C++, then there's daal4py, which is the Python equivalent, and on top you have the Intel Extension for Scikit-learn. But that daal4py library has more capability than just accelerating scikit-learn through the Intel Extension. It also has optimizations such as distributed linear regression and other algorithms found in scikit-learn, so if you actually want to distribute, or use an online streaming mode, you can use the daal4py library directly. This is also included in the AI Kit operator, though we only recommend it if you really need it; that said, it's a very friendly library if you know scikit-learn. I'm not lying, it's a very easy library to pick up. Beyond that, it also has accelerations for other libraries we've included, such as XGBoost and LightGBM. So if I'm a user and I have this great optimized XGBoost model that's using those training optimizations we already upstreamed into regular XGBoost, but I want more, I want to optimize my XGBoost inference: well, you can do that by converting the model into something daal4py can read and then running it with daal4py, and I'll be demonstrating that in just a moment in the AI Kit operator for Red Hat OpenShift Data Science, or RHODS. So say I've finished the training part of my XGBoost model (this capability is also available for the LightGBM library, but we're talking about XGBoost for now) and I'm ready to do inference. What I can do is take that pre-trained model, import daal4py (which is already installed for you and available as part of the AI Kit operator), and call get_gbt_model_from_xgboost, which says: hey, please convert this model into something daal4py can read. Afterward, I just call the gradient boosting classification prediction as needed, the daal4py equivalent of the prediction I would usually run in XGBoost, on that converted model. What this is doing is taking advantage of the daal4py optimizations for gradient boosting inference: a more efficient model representation in memory, use of the AVX-512 instruction set, and better cache locality. The result is a huge, huge performance increase compared to just using the regular XGBoost inference call; you see in this case up to 20 times faster speed when you use these two lines of code to convert your model, and there is no accuracy loss when you do it. So again, these simple little code changes can prove really profitable if you are looking for better time usage and better cost usage. Being able to do this at roughly a 23 times performance boost is really huge.
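Here is roughly what that conversion pattern looks like in code. A minimal sketch, assuming a binary classifier trained with the XGBoost scikit-learn API; the variable names are illustrative:

    import daal4py as d4p
    import xgboost as xgb

    # xgb_model: a trained xgboost.XGBClassifier (binary classification here).
    # Convert its booster into a daal4py gradient boosted trees model.
    daal_model = d4p.get_gbt_model_from_xgboost(xgb_model.get_booster())

    # Run inference through daal4py instead of xgb_model.predict();
    # same predictions, more efficient execution on Intel CPUs.
    predict_algo = d4p.gbt_classification_prediction(nClasses=2)
    preds = predict_algo.compute(X_test, daal_model).prediction.ravel()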
Now, with that, let's actually demonstrate what this looks like with the AI Kit on Red Hat OpenShift Data Science, and then we will try to go back to Audrey if possible. So I'm going to stop sharing my screen for a moment, and then I'll reshare with the demo and show what the AI Kit looks like on Red Hat OpenShift Data Science, aka RHODS. Let me move my screen, and there we go; now I'll share. Okay, all set. I see it. You can see my screen? Okay, awesome. So here we are in the Red Hat OpenShift Data Science platform. As you can see, we are in the applications tab, which is what you see if you have installed the Red Hat OpenShift Data Science operator and launched into it. If you've also installed other partner software that Red Hat has collaborated on, such as Intel's, it will show up here too; in this case I have already pre-installed the oneAPI AI Kit container as well as the OpenVINO toolkit, which is also available on Red Hat OpenShift Data Science. So if I play the video, I'm just going to wave my cursor around to say: look, it's there. Now, besides showing what's enabled on the Red Hat OpenShift Data Science platform, there are also all these great resources and optional applications that you can learn about. In this case I clicked the AI Kit application tile, and it comes up with information on what the AI Kit is as well as how to install the operator, so all of that is available there. Besides that, there are also resources available from Red Hat and its partner software; if I go ahead and search "AI", you will see the AI Kit pop up here, along with documentation to help you get started. Now, say I've finished reading the documentation, it's amazing, I have it installed, and I'm really excited about using the AI Kit; let's launch it. One thing I want to disclose is that, for the sake of this demo, it's not going to show, just based on how it was recorded, but typically this would show a spawner screen with all the available containers that you can select, and when you pick the AI Kit tile, AI Kit will already be pre-selected, so you just click AI Kit and launch. Then it goes to where it's going now, which is the JupyterLab environment of the AI Kit, where you can do a lot of great development using our optimizations and pre-loaded conda environments. If I stop the screen here, what you'll see on the left is a couple of things. First, we have the models Git repository, which is where those pre-trained models for our deep learning libraries are located. We have the oneAPI samples folder, as we've already included samples to help users get started; there are also more samples linked in this welcome page I'm looking at, but we'll talk about that later, and we'll be showing off one of these samples today as the demo. And then there is the AI Kit welcome page. As you can tell from what I covered at the beginning, there are a lot of optimizations here and there, so you may be wondering: where are they in these pre-installed environments, and what if I want more information? I would definitely recommend, if you install the AI Kit operator, starting with the AI Kit welcome page. As you can see here, it's basically an intro to the AI Kit as well as environment package information: what is where. In the case of the scikit-learn, XGBoost, and Modin environment, which I'll highlight, it shows the Intel Distribution of Modin, the Intel Extension for Scikit-learn, which we just talked about, including our NumPy and SciPy optimizations, and Intel-optimized XGBoost, and if you click on each of these links, which are a little faded right now, it will take you to more resources on each.
Now, if that's not enough for you, we have also provided more resources to get you started, such as access to the AI Kit website, AI Kit code samples (which include more code samples than what is included here), as well as the AI Kit release notes, where you can get more information. So, say I've finished the welcome page review: when you click that plus tab to start a new notebook, you will see these preconfigured environments again. In the Intel PyTorch and quantization tools environment you get the Neural Compressor and the Intel PyTorch optimizations; in the Intel scikit-learn, XGBoost, and Modin environment, as I just mentioned, you get those kinds of optimizations; and in the Intel TensorFlow and quantization environment you get the TensorFlow optimizations and the Neural Compressor as well. So besides having these available for the samples, you can launch an experiment in those conda environments, or if you want to create your own environment, you can do that too. There is also terminal access, which, if I recall correctly, is going to show off here, so I'll highlight that. And the reason this demo is recorded, just as a full disclosure, is that you'll see in a moment that it takes a while for the unoptimized XGBoost inference to run, and I didn't want us waiting around for that; that's why it's recorded. So again, those resources are there for you. Say I've read all this and I'm ready to get started: let's go to the samples. There are three samples, if I pause here: we have a use case using the Intel Distribution of Modin and the Intel Extension for Scikit-learn optimizations; we have the Intel optimizations for TensorFlow and the LPOT tool that is now renamed the Neural Compressor; and then we have our Intel-optimized XGBoost sample, using our upstreamed XGBoost training optimizations and the daal4py inference method I talked about right before going into this demo. We are going to show off those XGBoost optimizations and that daal4py inference, so I'll click into that notebook. As you can see here, this is an end-to-end sample. The premise is that you're a data scientist, essentially, and you are using the HIGGS dataset to predict whether a particle process will produce a Higgs boson or not: if it produces one, we classify it as a one; if it does not, we classify it as a zero, and we analyze and predict using the particle features and functions of those features. That's all you need to know; you don't need to know any physics. As any data science workflow first starts, we'll begin by importing and organizing the data, so we'll do that here; you can see I've already imported XGBoost and daal4py, and we're just going to load the data and do a train_test_split. Now, I'm actually going to run through this very rapidly so that it can run while I'm explaining it; we've already covered that first part, so it's going to pause there. In this case, the HIGGS data is pretty vast, so we're
only using the first 100,000 rows in this case, but we will still be able to see a performance speedup with that. Now, once I am done importing and organizing the data, I'm going to train the model. As you can see here, I am using the XGBoost classifier from the package with these different parameters, and the tree_method parameter is set to hist, meaning that we are taking advantage of those Intel optimizations that are upstreamed into the XGBoost library. And let me tell you, this is going to take a bit of time to run just to train the model, but I've run it without the optimizations before and it takes quite a long time; I think in this case it takes about two minutes to train the model with them, and in many cases, compared to the unoptimized run, it is a huge performance gain. So I'm going to pause and wait here and skip to where it finishes training, which should be right about now. Give it a moment. There we go. After it is done training, it prints out: hey, I'm done training, here's your fitted model object. And after that, we're ready to do some prediction. Now, maybe I'm kind of a skeptic, and I'm thinking: is the daal4py prediction really faster? Let's try it out. First, we're going to try it without the daal4py optimizations for gradient boosting, just a typical XGBoost prediction, and add a timer for that as well as an error count, because even if daal4py turns out faster, is it going to be as accurate? After that, I'm going to run the daal4py prediction, which is again using those two lines to take this model that we just trained and convert it into something daal4py can read, and then run the prediction with that converted model. Then we're also going to compare the accuracy and use the timers there. So we're going to run those now; I think they already pre-ran at this point. As you can see here, I'm just highlighting the lines that are running to give that performance speedup, and let's look at the results. This is the printout of the comparison; if I go back and pause it one moment: this is the regular XGBoost inference result, and this is the daal4py gradient boosting inference result using the same exact model. As you can see, both have the same error count and the same accuracy score, and on this smaller dataset, kept small for the sake of the sample, the XGBoost prediction time took around 0.2 seconds and the daal4py inference time took around 0.06 seconds. If we go ahead and visualize that, you can see this is approximately a 3.47 times speedup just on that little dataset, and again, that performance gain can really scale up as your dataset grows and your training model gets very complex. Now, looking at accuracy: the great thing about these daal4py inference optimizations, which again are those two simple converter lines, is that there's no accuracy difference compared to the typical stock XGBoost inference accuracy. So you get a huge performance gain just by adding those two lines of code, and you keep the same accuracy as you continue coding.
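Condensing the notebook's comparison into one place, here is a sketch of the stock-versus-daal4py timing pattern being described; dataset loading is elided, and the parameters are illustrative rather than the sample's exact settings:

    import time
    import daal4py as d4p
    import xgboost as xgb
    from sklearn.metrics import accuracy_score

    # Train with the hist tree method, which carries the upstreamed
    # Intel training optimizations
    model = xgb.XGBClassifier(tree_method="hist").fit(X_train, y_train)

    t0 = time.time()
    xgb_preds = model.predict(X_test)          # stock XGBoost inference
    xgb_time = time.time() - t0

    # Convert once, then run inference through daal4py
    daal_model = d4p.get_gbt_model_from_xgboost(model.get_booster())
    t0 = time.time()
    d4p_preds = d4p.gbt_classification_prediction(nClasses=2) \
                   .compute(X_test, daal_model).prediction.ravel()
    d4p_time = time.time() - t0

    print(f"speedup: {xgb_time / d4p_time:.2f}x")
    print(accuracy_score(y_test, xgb_preds), accuracy_score(y_test, d4p_preds))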
So that's just a bit of an example of using the AI Kit, and the kind of optimization gain you can see with the AI Kit operator on Red Hat OpenShift Data Science. With that, we've finished talking about the AI Kit, some of its great performance optimizations, how they actually look in code, and a demo of what it looks like on the Red Hat OpenShift Data Science platform. Now I think I'm going to turn it over to Audrey to talk more about the Red Hat OpenShift Data Science platform and how exciting it is to have the AI Kit operator on it. So let me stop sharing my screen; Audrey, are you going to share your screen, or should I reshare these slides? Oh, how's your connectivity? Okay, maybe I should share. Yes, yeah. Okay, so I will go ahead and share from slide two, and we will talk about where you can run the AI Kit optimizations with Red Hat. Let me present, and Audrey, whenever you're ready. You're on mute. Oh no, sorry, I couldn't quite see what it said, hang on. Audrey, it says you're muted. She's not muted? No, she's there. Sorry, no, you're as green as Rachel is. No? Huh. Maybe try unplugging your headphones; maybe it's an audio setting thing, because I had to set my default differently for this. No, nothing? Huh. I'm looking at my settings, hang on. I don't think there's any more for me to do for a guest; it's on your side. Okay, I'll put you in the background and bring you back. Okay, let's see. And she's back; we're all learning our new platform. Any luck? No, nothing. So, I think with this, there was a great presentation that Audrey has already done with Ryan Loney on the OpenVINO toolkit on Red Hat OpenShift Data Science, so if you want to learn more about what Red Hat OpenShift Data Science is and what the AI Kit operator looks like on it, you've already got the second part, since you now know all the AI Kit optimizations that I ran through. But, because of technical difficulties (again, we're streaming from our homes, so this can happen from time to time), I think it might be good to look at those Red Hat OpenShift Data Science resources. Just from my personal experience, Red Hat OpenShift Data Science is a really great platform; I know Audrey's thinking, it's awesome, it's an awesome product and all that. Red Hat OpenShift Data Science is basically this really great open hybrid cloud platform, powered by and focused on AI and data science, that helps you get rid of that really complex infrastructure setup and gives power back to the data scientist. Once you are using the Red Hat OpenShift Data Science platform, there's no more infrastructure hassle with cloud or Kubernetes or anything like that, and you can couple that with the power of the AI Kit, where you no longer have to overhaul your whole pipeline to fit a certain piece of hardware or infrastructure. The two really couple together to give power back to the data scientist. I'm just going to flash the features of Red Hat OpenShift Data Science so that viewers can read them, but again, it's a really great
open hybrid cloud service that you can use across different vendors to really speed up your data science workflow. And I'm going to flash the slide of all the great partners Red Hat is working with to bring these optimizations, on behalf of Audrey; I hope I'm doing it justice. I'm getting some positive signs, so I think I'm doing it some justice. So in terms of the partners, the oneAPI AI Kit sits up here with the certified ISV software, since the Intel AI Kit is partner software. I'm actually going to skip to the end here. If you want to learn more about the AI Kit and how to sell Red Hat OpenShift Data Science, here are a couple of resources that will also be available for you after this presentation. If you want to download the AI Kit, I definitely recommend clicking this link, which takes you to the Red Hat Marketplace, where the AI Kit operator can be installed. Beyond that, there are just some notices and disclaimers about the numbers that I showed, but I really encourage you to go try out the AI Kit operator on Red Hat OpenShift Data Science today. And with that, I'm going to stop sharing my screen. Well, I want to thank you; that was really impressive, actually, and I'm not a data scientist, so I didn't have any questions to ask, but I was impressed. And Audrey, I want to thank you for trying so hard to join. Of course, I want to thank our listeners for bearing with us as we work through all of our technical issues on our new platform. I don't see any questions at this time, so I'm going to go ahead and wrap up, even though we're a few minutes early, and I will do my best to make sure that we promote these demos and videos on our product page, so anyone who wants to come in later can just go look at the RHODS product page and find them. So again, thank you both, I really, really appreciate it. Have a good day, everybody. Take care. Bye.