Hello. Welcome, everyone. This is Jim Spohrer from IBM, and Prasanth Pulavarthi from Microsoft and I are happy to welcome you to this session on ONNX. We hope you're all staying safe in these pandemic times, and we're sorry we can't all be together in Austin, Texas. We'd like to briefly introduce ourselves. I'll go first. I'm Jim Spohrer, Director of the Cognitive Open Technologies Group at IBM. I was also recently elected Technical Advisory Council chairperson at Linux Foundation AI, and you should know that ONNX is a graduated project of the Linux Foundation AI. Also recently, I was elected to the ONNX Steering Committee. You can see my LinkedIn and a few other links there. Prasanth is a Principal Program Manager at Microsoft for artificial intelligence platforms. He's also an ONNX Steering Committee member and an ONNX co-founder.

Today, in my portion of the presentation, I'd like to briefly take you through a little of the past, present, and future of ONNX. I'll do that by describing a little of the work that has happened in ONNX, recapping one of our recent community meetings, and then challenging you all to get involved in the ONNX community.

So what is ONNX, and why do we need a standard like it? If you look at the left part of this slide, you see TensorFlow, PyTorch, Spark, Caffe2, Keras: lots of different machine learning and deep learning frameworks out there, each in its own silo with its own unique representation and format. Having ONNX as a standard interchange format allows inference to be optimized and all of the different tools to interoperate. When you get a copy of these slides, you'll see at the top that this slide comes from one of Nick Pentreath's presentations, and all of these are hyperlinks that can take you out to additional information about ONNX. I also like this slide by Jarek Kargil, which likewise shows the different tools that are out there. ONNX is an interchange format, and we'll get into it in a bit more depth shortly.

The best way to learn about the past of ONNX is to take a website tour. If you go to onnx.ai, I encourage you to first check out the news. ONNX has been around a few years, and in the news you'll see items like the version 1.7 release that just came out, announcements of new ONNX members and tool vendors joining the community, and information about upcoming community meetings and other activities around ONNX. So please do check out the ONNX website and its news. The next thing you might look at is the about section and the getting-started information. There are lots of different ways to get started with ONNX: building models, using models from the Model Zoo, and there are a number of supporting tools as well. If you want to go to the GitHub, the link is there on the ONNX website, and there's also a Gitter if you want to join in and be welcomed into the community. I urge you to start with the website and check out those aspects.

To learn about where ONNX is today, we recently had our first virtual ONNX community meeting. That link will take you to the recordings of all the presentations. After a brief welcome and updates, we had the partner presentations, which is typical of the ONNX community meetings. IBM's chief data officer welcomed us and spoke to the work that IBM is doing with ONNX.
Then we had a great presentation by Huawei on MindSpore and what they're doing with ONNX, and a great Microsoft presentation on ONNX Runtime optimization. Xilinx is doing a lot with the FINN project and FPGAs with ONNX. We had a fantastic presentation from UC Santa Cruz about how ONNX is being used in genomics. Then Microsoft presented more of the work they're doing with Azure and OCR. Many people are familiar with MathWorks; MathWorks is also a tool vendor that uses ONNX, and we had a fantastic presentation from them as well. If you want a snapshot of where ONNX is today, I encourage you to go out to this most recent community meeting and check it out. In the community meetings, after the partners present, each of the SIGs and each of the working groups presents: the SIGs, for things like infrastructure and operators, and the various working groups, which are short-term initiatives that the community starts.

A few highlights of the last community meeting: there were over 200 registrants from over 100 different organizations. I believe this is the largest we've had at any community meeting. ONNX continues to grow, and we continue to welcome new organizations and individuals into the community. This chart shows some of the tremendous growth we've seen in the ONNX community: pull requests up 11 percent, contributors up 21 percent, GitHub stars up 22 percent, GitHub forks up 31 percent, published papers about ONNX up an amazing 111 percent, and Model Zoo models up over 24 percent. I should mention these figures are the progress over the year ending in April. Since then there's been even more progress, and I've put those numbers in red here; you can see that even since April the ONNX community has kept growing, and we're excited to see this growth and encourage you to get involved. This next chart, also presented at the last community meeting, shows membership; even more members have joined since, but you can see we are constantly adding new members to the ONNX community.

Now, in closing, before handing it over to Prasanth to tell you more about ONNX at Microsoft and get into some of the details: as I mentioned at the beginning, ONNX is a graduated project of the Linux Foundation Artificial Intelligence. If you haven't visited the Linux Foundation AI and seen the other projects there, the incubating projects and the graduated projects, I encourage you to go take a look at the Linux Foundation AI landscape. It's a diagram showing lots of different open source AI projects hosted on GitHub: over 250 GitHub repositories are indicated in the landscape. Together they have a combined 1.4 million stars, they come from companies and organizations with a combined market cap of 12 trillion dollars, and the ones from startups represent a total of 5 billion dollars of investment. So if you're interested in artificial intelligence, machine learning, or deep learning, please go take a look at the Linux Foundation AI landscape. And I would have to say that right at the center is ONNX, because when you look at all of these different tools, each with its own format, you really get a sense of the need for ONNX, a standard interchange format.

So again, sorry we're not there in Austin to interact with you directly, but we do have a chat window open, so please start putting questions in the chat. We really would like to find a way to welcome you into this ONNX community. Now I'd like to turn things over to Prasanth.
Prasanth, are you there? Hey, thanks so much, Jim. Hi, everyone. I'm Prasanth Pulavarthi, and today I'd like to take a few minutes to talk to you about ONNX in practice: why we use it at Microsoft, and some tips on how you can use it as well.

A little bit of background. At Microsoft, AI and ML are used in all of our products. We have a diverse set of products that span the spectrum of solutions, and they all have different machine learning components in them. This means there are a large number of people developing AI and ML solutions for use in these products, and they're using a variety of different tools to do that job. As part of the AI platform team, we talked to a lot of these teams at Microsoft and worked with them closely, and we identified a number of common themes that were slowing people down as they developed ML solutions and tried to deploy them into production.

The common problems people were facing are listed here. A top issue was that inference latency was too high to put into production. Teams developed a really awesome model that did the task they were trying to do, but then, to put it into the production service or the actual app, they needed to deploy into a C#, C++, or even Java application. They were using popular Python toolkits, whether Scikit-learn, PyTorch, or TensorFlow, and then they had to figure out: how do I take this model I've just developed in Python and fit it into my production application, which is not written in Python? There were also a number of teams trying to run their models on edge and IoT devices. They can train in the cloud, but then they need to figure out how to make the model work on a much smaller device that doesn't have the full power and capabilities of the cloud. In some cases, the same model needed to run on different hardware and different operating systems, truly cross-platform: different GPU vendors, different CPU vendors, with the same model running across that whole variety of platforms. Another issue was that some teams need to take models from different people who use different frameworks and run them all in their product. These are products like Windows or SQL Server, because those products have customers who use a variety of tools, so they have to support models that come in various formats. More recently, especially with the popularity of transformer models for natural language processing, we've seen that training very large models takes way too long, which also impacts agility. All of these items were hurting machine learning productivity for our data scientists and developers.

So the solution for us was ONNX and ONNX Runtime. ONNX, which Jim just told you a bit about and which we'll talk about some more, is the common format that lets you represent models from various frameworks, PyTorch, TensorFlow, Keras, Scikit-learn, a whole host of them, in one common format. And ONNX Runtime is a highly optimized runtime: it's highly compact, only a few megabytes, and it's cross-platform, running on all the operating systems and on various devices and hardware accelerators.
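As a quick illustration of what "common format" means in practice, here is a minimal sketch of loading and validating an exported model with the open source `onnx` Python package; the file name is just a placeholder.

```python
import onnx

# Load a serialized ONNX model (a protobuf file) and validate it
# against the ONNX spec. "model.onnx" is a placeholder path.
model = onnx.load("model.onnx")
onnx.checker.check_model(model)

# The model records which operator-set version it targets and the
# graph of framework-neutral operators it is built from.
print(model.opset_import)
print([node.op_type for node in model.graph.node])
```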
ONNX Runtime basically takes these ONNX models and runs them really well on a variety of platforms. So this was the situation we were in: different training frameworks were being used, and the models generated from those frameworks had to be deployed to different types of devices with different types of hardware accelerators in them. With ONNX and ONNX Runtime, our developers and machine learning scientists got the freedom to use the tool of their choice, and they were able to deploy their solutions to these different targets with strong performance and compatibility across a variety of platforms and accelerators.

I'm going to give a couple of examples of scenarios where we're using this at Microsoft, and then I'll dive into how you can use it for your own scenarios. One example is speech. Microsoft has a speech service that's used in a lot of our products, Xbox and Office among them, and it's available as a Cognitive Service from Microsoft Azure as well. The speech service uses ONNX Runtime to power their product. They didn't always use ONNX Runtime; they chose it because it gave them an improvement in agility and in performance. For agility, they saw a 10x reduction in the time to take a model from development to production. Why is that? With ONNX and ONNX Runtime, they didn't have to rewrite the model into the language needed for their production system, and they didn't have to spend a lot of time manually tuning and optimizing it for their production scenario; they got all of those benefits automatically by using ONNX Runtime. For performance, once they put it in, they got latency improvements. I've also listed here that they got accuracy improvements, and you might be wondering how that works. Well, it ties back to agility: because they could run more types of models, they were able to come up with new types of models to deploy that gave accuracy improvements. Also, because of the latency improvements, they were able to use larger models without exceeding their latency budget.

Speech, as I mentioned, is part of Microsoft's Cognitive Services, and there is a whole host of different services in this portfolio using ONNX Runtime. The text services, text-to-speech, computer vision, custom vision, ink recognizer, visual search, and image search are just some of the services making use of ONNX Runtime to deliver their product.

A few other examples. Azure Kinect, if you're familiar with it, is a small device that you connect to your PC over USB, and it gives you depth sensing and tracking capabilities using the cameras and other sensors that are part of the device. It comes with an SDK that you can run on your desktop machine, and it provides different machine learning models to do things like body tracking. When they started making use of ONNX Runtime, they got a significant improvement in performance: 7.8x here for the first-frame processing time. The interesting thing to call out is that while some people use the Azure Kinect SDK on their desktop PC, there are also a lot of scenarios where they want to run it on smaller devices like the NVIDIA Jetson, an ARM-powered device that has a GPU in it.
So with ONNX Runtime, they're able to run the same runtime and the same model on both of these devices, which gives that cross-hardware-platform support we were talking about earlier. Elaborating on that, this portability is an important scenario that comes up often. People run the same model on laptops and PCs, and they run it on edge and IoT devices like the UP Squared or the Jetson. With ONNX Runtime, they're running the same model and the same application code across these different platforms. Here we're showing ONNX Runtime being used with an ONNX model through the Python API, running on these different devices with different hardware accelerators: on the PC they're using CUDA for the NVIDIA GPU, on the UP Squared device they're using OpenVINO for the built-in VPU, and on the Jetson they're using TensorRT for the GPU. They're able to get the same results running on these different devices. The underlying hardware will of course give different performance, but with ONNX and ONNX Runtime you can rest assured that you're getting the best performance you can out of the hardware available to you, and you don't have to do a lot of customization and tuning for the different hardware. So you save a lot of time.
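As a rough sketch of what that looks like with the Python API (the model path and provider lists are illustrative, and which providers are available depends on the ONNX Runtime package installed on each device):

```python
import onnxruntime as ort

# See which execution providers this install of ONNX Runtime offers.
print(ort.get_available_providers())

# The same model file and the same application code run on every device;
# only the provider preference list changes.
# e.g. on a PC with an NVIDIA GPU:
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# On an UP Squared you might list "OpenVINOExecutionProvider" first,
# and on a Jetson "TensorrtExecutionProvider", keeping the rest identical.
```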
Another solution that makes use of ONNX Runtime is Windows ML. Windows ML is part of the Windows operating system. It's an application API that lets people run inference on machine learning models without having to worry about installing all the different drivers and libraries needed to make use of a GPU, VPU, or other hardware acceleration device. It does this through DirectML, which is based on DirectX, so it uses the drivers that come with Windows and with Windows-compatible devices to make machine learning inferencing really easy. When Windows was building this out, they chose ONNX Runtime because it gives them a common format for the models they accept. As a data scientist, you can create your models with TensorFlow or PyTorch or Scikit-learn, export them to ONNX, and then run them with Windows. Windows doesn't have to worry about making sure all the frameworks are installed and updated, and neither does the application developer, because it's part of the installed operating system. And because of the pluggable nature of ONNX Runtime, it can work with different types of hardware accelerators: DirectML just plugs in as a hardware accelerator, and itself provides an abstraction layer over different types of hardware, whether GPU, VPU, or other devices. And of course ONNX Runtime also has highly optimized CPU inferencing.

Some other customers use ONNX Runtime on Azure, which as you know has many customers. One example is an ISV who uses ONNX Runtime for economic scenario modeling. They train their financial models in Python with Scikit-learn, but their production environment is pure C#. So how do they go from these Pythonic environments, whether Scikit-learn or PyTorch, into C#? They use ONNX Runtime, because the C# API made it really easy: they just train their models as they normally do, convert them to ONNX, and run them with ONNX Runtime. They were primarily looking for this cross-language support, but they also happened to get a bonus 2x speedup when they did that.

Some more recent news is our work on transformer inferencing. Earlier this year we announced that ONNX Runtime provides breakthrough inferencing speed for transformer models like BERT and GPT-2. Hugging Face maintains a popular library of transformer models, especially popular in the NLP space, and they provide models that can be trained with either PyTorch or TensorFlow. About a month or two ago, Hugging Face added new capabilities to let you save your models as ONNX models. Their library is called Transformers, and they added a module called convert_graph_to_onnx that generates an ONNX model for you. The reason for doing that is that once you have the ONNX model for these transformers, they run really fast with ONNX Runtime. There's a blog post we did about this, and the graph here is from that post. It shows that when you use ONNX Runtime to inference these transformer models, you get significant speedups whether you're using CPU or GPU, and because ONNX Runtime is compatible with frameworks like PyTorch and TensorFlow, you don't have to change how you do the training; you can just start getting these inference benefits by plugging in ONNX Runtime.
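A hedged sketch of that export flow, based on the convert_graph_to_onnx module in the Transformers library; the model name, output path, and opset here are illustrative:

```python
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

# Export a Hugging Face model to ONNX. framework="pt" uses the PyTorch
# weights; "tf" would use TensorFlow. The output folder should be new/empty.
convert(
    framework="pt",
    model="bert-base-uncased",
    output=Path("onnx-export/bert-base-uncased.onnx"),
    opset=11,
)
```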
Another recent announcement was transformer training. Just as we optimized the inferencing of transformers, we've also been able to optimize the training of these models. They can take a very long time to train, depending on the complexity of the model and how much data you're feeding into it; some can take days or even weeks. So any optimizations that can be provided are a huge savings, both in agility and in cost, for these users. With ONNX Runtime's latest capabilities for training, we're able to bring that cost down. In the preview release we've put out, it integrates with PyTorch, and the TensorFlow flow is coming soon. It incorporates the latest algorithms and techniques: Microsoft has published techniques such as DeepSpeed, ZeRO, Parasail, and Adasum in various papers and proof-of-concept projects that demonstrate these algorithms, but we wanted to make sure they were available in one place that is easy to use and fully supported, and that place is ONNX Runtime. All of these are incorporated, so you don't have to pick and choose different libraries; they're all just there. This is already being used by different teams at Microsoft, Office, Visual Studio, and others, for their production models, and it's available as a preview for anyone to use on GitHub. The chart here shows some of the gains we saw when using ONNX Runtime with PyTorch (again, ONNX Runtime training currently integrates with the PyTorch framework): significant gains of about 30 to 40 percent savings on training these very large models. That's pretty significant, because it takes you down, for example, from eight days of training to about four and a half days, which is a considerable savings.

All right. I've been talking about all the different ways ONNX is used at Microsoft, so now I want to talk about how you can get started with ONNX and ONNX Runtime.

Jim mentioned the Model Zoo, and I want to elaborate on that a little. The Model Zoo, available at this URL, provides a variety of models that have already been pre-trained and are available in the ONNX format. You can pretty much just download these models and start using them with ONNX Runtime. We support a number of categories, vision and language, with some speech models coming soon as well. For vision, for example, we have all the popular models for image classification, object detection, and so on: the tried-and-tested models that have been around for a while, as well as some of the latest ones, whether it's YOLOv4 or Mask R-CNN. For each of these models there's a notebook that shows you how to use it: how to do the input processing, for example taking an image and turning it into the format the model needs, and how to interpret the output of the model in your application. This is a great way to get started easily with ONNX and ONNX Runtime, so I encourage you to take a look. We would also love folks to contribute even more models; as Jim mentioned earlier, the Model Zoo has been constantly growing, and that's all thanks to community members who contribute their models.

Another way to obtain an ONNX model is by exporting or converting an existing model. If you have your own model that you've written and trained yourself, you can export it to ONNX. All the frameworks shown here support ONNX export, though the APIs for doing it differ between frameworks, so I'll show a little example for some of the more popular ones. For PyTorch, ONNX export is actually built in: there's a module called torch.onnx that allows you to export the model. You basically load the Torch model, specify what kind of input it should have, and export the model to the ONNX format. For Keras, the steps are very similar: you load the model, convert it, and save it out. The Keras functionality is available in a module called keras2onnx, which is pip-installable; you can just import it and incorporate it into your scripts. All of these are also runnable from the command line. For TensorFlow I chose to show the command-line version, where you use the tf2onnx module, call its convert entry point, pass in some parameters, and it outputs the ONNX model for you. Scikit-learn works similarly: there's a module called skl2onnx that you can pip-install, import, and use to save out your file. With Scikit-learn we've made a number of performance optimizations, so whether you're using a small batch size or a large batch size, you should see pretty good performance improvements; we'll be publishing a blog about those results in the near future, so stay tuned, but you can try it out yourself now.
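Here are minimal sketches of those four export paths; the model choices, input shapes, and file names are illustrative, and exact arguments vary by framework and version:

```python
# PyTorch: export is built into the framework via torch.onnx.
import torch, torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)          # an example input with the right shape
torch.onnx.export(model, dummy, "resnet18.onnx", opset_version=11)

# Keras: the pip-installable keras2onnx module converts a loaded model.
import keras2onnx
from tensorflow.keras.applications import MobileNetV2

kmodel = MobileNetV2(weights="imagenet")
onnx_model = keras2onnx.convert_keras(kmodel, kmodel.name)
keras2onnx.save_model(onnx_model, "mobilenetv2.onnx")

# TensorFlow: the tf2onnx module is typically driven from the command line:
#   python -m tf2onnx.convert --saved-model ./my_saved_model --output model.onnx

# scikit-learn: the pip-installable skl2onnx module converts a trained estimator.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
onx = convert_sklearn(clf, initial_types=[("input", FloatTensorType([None, 4]))])
with open("logreg.onnx", "wb") as f:
    f.write(onx.SerializeToString())
```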
For ONNX Runtime, the best way to get started is to go to the website onnxruntime.ai. There's a picker there that lets you specify the configuration you're using, and it gives you the instructions for getting the specific package to install for that particular setup. You'll see that we support Windows, Linux, and Mac, and we support Python, C++, C#, C, Java, JavaScript, and WinRT. The JavaScript one is basically for use in Node.js; it provides a JavaScript or TypeScript interface you can use from your Node applications. WinRT is really useful if you're writing UWP apps on Windows. For architectures, we support x64, x86, ARM64, and ARM32; some of these have pre-built packages, some you need to build from source, and we provide instructions on how to do that.

Hardware acceleration is something I touched on a little earlier. We support a wide variety of hardware devices through a mechanism called execution providers. ONNX Runtime has an execution provider API that allows different hardware accelerators to just plug in. Of course we have a highly optimized CPU implementation. We also have a CUDA implementation that comes with the ONNX Runtime GPU package, and then we have different hardware vendors working with us to integrate their optimizations into ONNX Runtime as well. For example, NVIDIA partners with us to integrate TensorRT with ONNX Runtime. TensorRT helps accelerate various models on NVIDIA GPUs, and by integrating with ONNX Runtime you get all the performance benefits TensorRT provides while making sure any ONNX model can run, because ONNX Runtime provides full support for the entire ONNX spec. So even if a particular operation in a model is not supported by TensorRT, for example, ONNX Runtime will still support it as a fallback, and the whole model will continue to run. Similarly for OpenVINO and other accelerator solutions out there: Intel has partnered with us closely to integrate OpenVINO with ONNX Runtime, so you get the performance acceleration from the VPU and the other optimizations that are part of OpenVINO, but you still get full ONNX compatibility and support. So I encourage you to try this out. You can go to this site, and there's also a link to the GitHub if you want to get the source and compile it yourself.

Once you have this, how do you use it? We have APIs in all the different languages, but I just wanted to show Python and C# here, which are fairly simple to use. For Python, after you pip-install onnxruntime, you can just import it and create a session; there's basically one session per model. You initialize that session, and then you run it by passing in the different inputs. It's pretty much the same story for C#: you use the ONNX Runtime package, instantiate the inference session, and run it by passing in the input.
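A minimal sketch of that Python flow (the model path and input shape are illustrative; the C# flow is analogous, using an InferenceSession from the Microsoft.ML.OnnxRuntime package):

```python
import numpy as np
import onnxruntime as ort

# One session per model: create it once, then reuse it for many requests.
session = ort.InferenceSession("resnet18.onnx")

inp = session.get_inputs()[0]
print(inp.name, inp.shape)                     # inspect the expected input

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in for real data
outputs = session.run(None, {inp.name: x})     # None means "return all outputs"
print(outputs[0].shape)
```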
So now I want to talk a little about how you can get involved with the ONNX community. Jim gave a very nice overview of all the participation in the ONNX community: all the different people involved, the quarterly or periodic community workshops, the meetups. I want to talk a little about the resources and the different things happening in the community.

First, ONNX has open governance. It's part of the Linux Foundation AI, as Jim mentioned, so there's a well-documented governance model that describes how decisions are made, how transparency is provided, and so on. There's an annual steering committee election, and people can run for it. There are five steering committee members. The steering committee meets every week, and the meetings are open to everyone, so people can dial in and watch; the meeting agenda is published beforehand and the meeting notes are published afterward, so you can always take a look at those and propose items for the agenda as well.

The technical decisions, the day-to-day decisions, are actually made by SIGs and working groups. Jim touched on this a little, and I want to elaborate. SIGs are permanent organizations that own different parts of the code. We have four SIGs currently: one for architecture and infrastructure, one for operators, one for converters, and one for models and tutorials. These SIGs are responsible for ensuring the long-term health of the project. Every repository, every file in ONNX, is owned by one of these four SIGs, so they have to make sure that any new features we add fit properly into the architecture. Working groups, on the other hand, are temporary. They have a specific goal or charter: they spin up when there's sufficient momentum behind that charter, they work the problem until it's solved, and then they disband. There are currently a couple of active working groups, one for training and one for the release management process, and we have a number of completed working groups as well; if you go to our GitHub, you can see the list of completed working groups. There was one for quantization: they successfully added quantization support into ONNX, and they disbanded. And there are some that complete their mission by doing the investigation and finding that, hey, we don't need this, or it doesn't make sense, and they disband as well. We consider that a success too, because it was a useful thing for the community to look into and figure out the right decision, and sometimes that doesn't mean code has to be produced.

The SIGs and working groups meet periodically as well, and everything is open: the meetings are all open to everyone and are published on the calendar at onnx.ai/calendar, where you can see all the different meetings, and you're welcome to attend any of them. The other thing is that for the SIGs we have contributors and approvers, which are the two permission levels, if you will. Contributors are folks who contribute extensively to the ONNX project, and they get voting rights on different decisions. Approvers are more expert contributors who've been making good decisions for a while, and they have the merge permissions. The open governance documentation describes how these folks are selected as well, and everyone should aspire to join those ranks. A lot of our communication currently happens on Gitter; the URL is listed here, gitter.com/onnx.
You'll see that we have different rooms for the different SIGs and the different working groups there, and a lot of active discussion going on, so I welcome you to join. And then of course I encourage everyone to sign up for the mailing lists. We have a number of them, but the most important one to sign up for is onnx-announce. The traffic is very low, but this is where we notify people about upcoming community workshops, virtual meetups, things like that.

So I've gone through how Microsoft uses ONNX extensively, I've talked a little about how you can use ONNX, and I've talked a little about how you can get involved. Now we'd like to open it up to any questions you have, and Jim and I will be happy to answer them. I also wanted to make sure you have the resources here to follow up later: you can go to onnx.ai to learn more about ONNX, and onnxruntime.ai to learn more about ONNX Runtime. So thanks so much, and we'll take any questions now.

Prasanth, I tried to answer the one question I saw in there, about IBM Watson Studio support for ONNX: yes. I tried to include in that reply a link to our slides on SlideShare, in case anybody wants to click on any of the hyperlinks. I think there was also a question, I don't know if you can read it or not, Prasanth, about the Philippines, asking where the Azure Kinect can be bought there. I don't know; I can respond to this question later, since I believe these questions remain available afterward, in case you can't find the information on the Azure Kinect website. I think there's another question about Azure Kinect availability worldwide; I don't know that offhand.

I see another question: can you provide some information about the current state and development of pipelines, basically raw data to preprocessing to training, in ONNX? So pipelines was a working group; there was a working group for it, but it has not been active in a while. I think there's some interest in reactivating it, so if that's something you're interested in, I'd recommend joining the Gitter channel for it and making your views known there. It is something we're very interested in. There's some support for this today, especially for images, modifying images to make sure they can be fed into the model, but for other data types there are some more operators and such that are needed, and there are a lot of interesting discussions to be had there as well: how general or how custom should the operators be, and so on. So it's a great question, and I encourage you to get involved in the Gitter channel for pipelines for now, and hopefully soon we can announce the restarting of that working group. And I'll just add to what Prasanth said: there's a group at our IBM Tokyo Research Lab and our Yorktown Research Lab interested in reactivating this working group to take another look at some particular new ideas they have, so we should definitely connect and make sure you get connected into that; I guess it's a proto working group that may be forming.

I see another question about the URL for the Gitter. Yeah, thanks for catching that; the correct URL is actually gitter.im. Thanks for that correction. I see another question: are there any books coming on this subject? Great question. Jim, are you planning on writing any books? I don't have plans, but it's a great idea, and I know it came up in our last community meeting as well, so we definitely should. There are some people who could contribute something; we should get a community effort going around that, for sure.
Yeah, definitely. If you have thoughts on that, please reach out; I would love to collaborate and see what can be done. There are a number of experts in the community, and if we have a good idea there, I think that's something we can do together.

Yes, there's a question on concept drift, I don't know if you saw that. Sorry, which one? There was one on concept drift; there was a question, but I can't find it now.

Yeah, one question I see is: does ONNX Runtime log the prediction requests? That's up to you. ONNX Runtime has some debugging and logging capabilities, and you can make use of them if you'd like, but ONNX Runtime is basically a component that you include in your application or service. So to record the full request and the response, and maybe the time it takes and things like that, that's going to be up to the service or application hosting ONNX Runtime. There are different formats and different mechanisms people use to log their requests; ONNX Runtime itself doesn't handle all of that, but the information is all there for you to use.
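For example, a hosting service might wrap the session itself; here is a minimal sketch, assuming you want to record input shapes and latency at the application level (the logger name and model path are illustrative):

```python
import logging
import time
import numpy as np
import onnxruntime as ort

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("model_service")

session = ort.InferenceSession("model.onnx")   # placeholder model path
input_name = session.get_inputs()[0].name

def predict(x: np.ndarray):
    # ONNX Runtime doesn't record requests/responses itself, so the
    # hosting application decides what to log and in what format.
    start = time.perf_counter()
    outputs = session.run(None, {input_name: x})
    logger.info("input_shape=%s latency_ms=%.2f",
                x.shape, (time.perf_counter() - start) * 1000)
    return outputs
```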
I see another question: how do you manage to decrease the time of ML training? So, especially for large transformer models, which is where ONNX Runtime currently provides the most benefit, it comes down to a few things. One is how optimized and efficient the operator kernels are for doing the different computations. Some of our kernels have been highly optimized so that they do fewer operations and run really fast; they're hyper-optimized kernels. The other thing is making sure the GPU or other hardware accelerator is kept completely busy: if you spend a lot of time moving data back and forth between CPU and GPU, or waiting for other things to be processed before running the computation on the GPU, you're losing time in which you could be utilizing the GPU more. So there are a lot of optimizations and efficiencies in ONNX Runtime to make sure the GPU is always cranking on the computation, and that helps reduce the time as well. And then there are some other algorithmic techniques to make sure things are done in parallel and so on, so that the overall training runs faster. Take a look at the ONNX Runtime website, which links to blogs that describe this work in more detail.

Another question: are there any commercial deployments of ONNX-based models? Yes, all the scenarios I talked about are commercially deployed. I talked about Azure Cognitive Services, which has speech and OCR and image search and all these things; they're all using ONNX Runtime and ONNX models. Office uses ONNX models for everything from grammar checking to suggesting replies or text to put in your email. The Bing search engine also uses ONNX models quite a lot for different scenarios, whether it's image search and vision-based things or question-answering and natural-language scenarios. So yes, there are a lot of commercial deployments of ONNX models at Microsoft, and a lot of commercial deployments of ONNX models outside of Microsoft as well. I mentioned one customer we work with on Azure, the financial ISV, but there are a number of other ones out there too. It's definitely production-grade and definitely being used for commercial deployment. And MathWorks as well, of course, is another place where you can get quick access to ONNX.

I see another question that just came in: does ONNX support CNTK? The answer is yes. CNTK has ONNX export built into it, so if you're using CNTK, you can export your model to ONNX and run it with ONNX Runtime.
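A hedged sketch of that CNTK export, with illustrative file paths:

```python
import cntk as C

# Load a trained CNTK model and save it back out in the ONNX format.
z = C.Function.load("model.cntkmodel")
z.save("model.onnx", format=C.ModelFormat.ONNX)
```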
Another question just came in: do you need knowledge of hardware acceleration tools such as TensorRT and OpenVINO, or does ONNX Runtime provide an abstraction layer? The short answer is that ONNX Runtime provides an abstraction layer. TensorRT and OpenVINO integrate with ONNX Runtime through the execution provider API, so you would just be using the ONNX Runtime APIs with your ONNX model. You do need to configure your installation to have TensorRT or OpenVINO available, but when you're interacting with ONNX Runtime it's pretty much the same: you load the model, specify your input, run it, and process the output. This is what makes it very easy to move between devices, which is one of the scenarios I mentioned earlier. Different teams and different companies have this requirement where the application stack sometimes needs to run on a Jetson device that has TensorRT on it, and sometimes on the UP Squared, which has OpenVINO and a VPU in it; rather than rewriting the stack for these different devices, you can use the same ONNX model and the same ONNX Runtime API.

It looks like another one just came in, about any scholarship or certification plans. I'm not quite sure what's meant by scholarship; I did mention there are hundreds of ONNX citations in various scholarly articles. As for a certification plan, I don't know, how would you answer that one? I guess everything is open source, so a lot of the information is out there; if you're looking for tutorials or training, a lot of it already exists, so you can take a look at that. If you have ideas on what a certification would look like, I'd love to hear about it. We haven't really thought about it, honestly; feel free to contact us, or post on the Gitter, or attend one of the steering committee meetings, or file an issue on the GitHub and propose what that might look like. Different people use ONNX in different ways, so I'm not sure what a certification would cover, but I would love to hear from the community.

I see another one that just came in: what are the benefits of using ONNX in training instead of just converting to ONNX after training? These are solving different problems. You can certainly train in your framework and then save to ONNX for inferencing purposes; the inferencing side of things gives you device portability, acceleration, and so on. The benefit of using ONNX Runtime in training is that the training itself will be faster than if you were just using the base framework, so it speeds up the training, currently for transformer models. So there are two different scenarios there, and you can use one or the other, or both.

I see one more question that came in, about scikit-learn conversion: does it support all the pipelines, or only some of the model types? It supports a variety of the model types. It's not 100 percent, because scikit-learn has a lot of stuff in it that currently isn't represented, but you should be able to convert all of the common model types; we have a number of people using this. If you go to the scikit-learn-to-ONNX converter on GitHub, they have a list of the different things that are supported, and you can check whether your particular pipeline can be fully converted or not. You can also just try it, and it'll tell you what didn't work.

All right, I think we're at time for the session. Jim, do you have any other final words? I'd just like to thank everyone for joining. Stay safe, and please get on the ONNX mailing list or on the Gitter, and join one of our next community meetings or one of our other meetings. Please get involved. Yeah, thanks so much for joining us. Take a look at the resources, and like Jim said, we'd love to have you join us in the ONNX community online or in some of our regular calls. So thanks again.