Hi, I'm Peter Burris, analyst at Wikibon. Welcome to another Wikibon theCUBE digital community event. This one's sponsored by HPE. Like all of our digital community events, this one will feature about 25 minutes of video followed by a crowd chat, which will be your opportunity to ask your questions, share your experiences, and push forward the community's thinking on important issues facing business today. So what are we going to be talking about today? The strategic importance of machine learning and how it's transforming the way businesses work, but especially the need to introduce an operational style to how we handle machine learning. The big challenge we face is that companies are seeing important advances from these new technologies, but the successes are episodic, not consistent. And so Stu Miniman, another Wikibon analyst, and I were able to sit down with a number of HPE thought leaders over the course of the last couple of weeks to talk about the need to bring greater operationalization to how we deliver the outcomes associated with these machine learning use cases. The first conversation was mine, with Anant Chintamaneni. Let's see what Anant had to say about some of the important reasons why we need to start rethinking the challenges of creating business value with machine learning technologies.

I hear all the time from customers that it's just too complex. These use cases often drive bespoke workflows and a variety of different roles and responsibilities. That seems like a prescription that's only going to lead to periodic success. Have I got that right?

Absolutely. I mean, if you look at what you can do with data and analytics, there are obviously different types of business users and business use cases, whether we're talking about financial services or even retail.
Any of these large enterprises with customer-facing operations can generate value at the moment they intersect with the customer. There's value in identifying opportunities to upsell and cross-sell. There are also opportunities around revenue generation, coming up with new business models. Let's face it, all these industries are being disrupted, and they're trying to find ways to be more data-driven and to create these new business models. The problem is that when you have these different groups, there are a number of use cases and a number of different ways to solve them. You have human beings involved who have their tools of choice, who have their specific methodologies for going after a specific problem. So there's no uniformity, and no uniform platform either; each becomes its own silo of environments. So you have this trend where you have an exponentially growing set of use cases, but the workflows are not there to scale those use cases in a consistent, repeatable fashion, even if you're using different tools.

And I think what we're really trying to suggest is that enterprises allow problems to suggest their own solutions using data science and related technologies, but come up with a way to ensure uniformity of success. Now, to do that, it seems as though we need to start thinking about how we're going to operationalize the workflows that tie the data science work to the actual implementation and runtimes that lead to the business getting the outcomes it wants.

Absolutely. You know, for the last several years, everybody was fascinated by creating the best Python-based machine learning model, or more recently by doing modeling with automated machine learning techniques. And there are a lot of different ways to create these models that demonstrate some success in the lab.
But ultimately, if you want to get business value from those models and all the hard work you've done, they have to be injected into the business process. Whether that's, like the use cases we talked about, doing scoring at the edge to find a defect in a manufacturing process that carries a multimillion-dollar cost, or running something on a nightly or hourly basis to identify fraud or security breaches. So you're absolutely right that operationalization of machine learning is ultimately the key. And I think that's the progression enterprises have to make: they've made lots of investments in talent and in tools to create these models, but they have to figure out how to operationalize them. That's absolutely the next frontier. And if you look at the new-age companies, they've got unified platforms where it's easy for their data scientists to come up with an idea, try out different tools, access the data, and then operationalize the model. So a feature or capability becomes available in these new-age internet properties within days, sometimes even hours. That's a capability that's missing in the enterprises. So that discipline of operationalization, allowing users to work with their tools of choice and access their data sets, but all in the context of security and governance, and then operationalizing it, is absolutely where these enterprises need to go in order to get success and real business value.

Great stuff from Anant. I've always enjoyed speaking with him here on theCUBE. But let's dig into this issue a little more deeply. Jim Kobielus is the lead Wikibon analyst on issues pertaining to machine learning. Stu Miniman was able to sit down with him recently and discuss some of the roles, the responsibilities and the crucial workflows required to realize value out of these technologies.
It's a complex arena, as Jim explains. Let's hear what he had to say.

Jim, we've been hearing a lot about data science and how machine learning is coming into this environment. Give us a little bit of guidance on how this whole space fits into data science. How does that infrastructure fit in with data science today?

Yeah, well, Stu, data science is a set of practices for building and training statistical models, often known as machine learning models, to be deployed into applications to do things like predictive analysis, automating next-best offers in marketing, and so forth. So what machine learning is all about is the statistical model, and those are built by a category of professionals known as data scientists. Data scientists operate in teams. There are data engineers who manage your data lake. There are data modelers who build the models themselves. There are professionals who specialize in training the models and deploying them; training is like quality assurance. So these functions are increasingly being combined into workflows that have to conform with DevOps practices, because this is an important set of application development capabilities that are absolutely essential to deploy machine learning into AI. And AI is really the secret sauce of so many apps nowadays.

All right, Jim, as we look at data science ops, walk us through the tech, the process and the people.

Okay, data science ops, or what we at Wikibon often refer to as DevOps for data science, really starts with the people, and I've already begun to sketch those out. In terms of the people, these are the professionals involved in building, training, deploying, evaluating and iterating machine learning models. There are the data scientists, who are the statistical modelers.
You might call them the algorithm jockeys, though that may be regarded as pejorative; nonetheless, these are the high-powered professionals who know which algorithm is correct for which challenge. They build the models. There are the data engineers, who among other things manage your data lakes. The data lakes are where the training data is maintained: the data for building the models and for training them. The data engineers manage that, and they also manage data preparation, data transformation and data cleansing, to get the data clean and correct so that it can be used to build high-quality models. There are other functions that are absolutely essential. There are what some call ML, or machine learning, architects. I like to think of them as subject matter experts who work with the data scientists to build what are called the feature sets, the predictors that need to be built into machine learning models for those models to perform their function correctly, whether it be a prediction, or face recognition, or natural language processing for your chatbots, and so forth. You need the subject matter experts to provide guidance to the data scientists as to what variables to build into these models. There are also coders; a lot of coding in data science and ML ops is done in Python and Java and a variety of other languages. And there are other functions as well, but these are the core functions that need to be performed in a team environment, really in a workflow. And that is where the process comes in. The workflow for data science in teams is DevOps. It's really the continuous integration of different data sets, different models and different features into the building and training of AI. So these functions need to be performed in a workflow that's highly structured, where there are checkpoints, and there's governance, and there's transparency and auditability.
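The structured workflow described above, building and training models behind checkpoints with an audit trail, can be sketched as a toy promotion gate. The "model", the accuracy metric, and the 0.8 gate below are illustrative stand-ins chosen for this example, not a prescription from any particular team or product.

```python
# A minimal sketch of a DevOps-style checkpoint for data science: each
# candidate model must clear an evaluation gate before being promoted,
# and every decision is logged for transparency and auditability.

def train(threshold):
    """'Train' a toy classifier: predict positive when x >= threshold."""
    return lambda x: 1 if x >= threshold else 0

def evaluate(model, examples):
    """Accuracy on held-out (x, label) pairs."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)

def promote_if_passing(model, holdout, gate=0.8, audit_log=None):
    """Checkpoint: only promote models that clear the accuracy gate."""
    acc = evaluate(model, holdout)
    record = {"accuracy": acc, "promoted": acc >= gate}
    if audit_log is not None:
        audit_log.append(record)     # auditable trail of every decision
    return record["promoted"]

holdout = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
log = []
good = promote_if_passing(train(0.5), holdout, audit_log=log)   # passes
bad = promote_if_passing(train(0.95), holdout, audit_log=log)   # rejected
print(good, bad)  # True False
```

In a real pipeline the gate would sit in CI, the holdout set would be versioned alongside the training data, and the audit log would go to a governed store rather than an in-memory list.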
Great stuff from Jim and Stu, talking about the crucial roles and responsibilities that dominate a lot of machine learning outcomes. With that kind of variability, though, it really does become important to think about machine learning and machine learning operations from an architected approach. And to better understand how that's going to look, and the direction it's all going to take, we had a great conversation recently with HPE's Nanda Vijaydev. Nanda had some interesting insights on how we're going to do a better job of architecting these solutions, bringing together all the different users and use cases so that we can have an evolvable but nonetheless stable and productive foundation for accelerating, and increasing the likelihood of, success with our overall machine learning outcomes. Let's hear what Nanda had to say.

Should this architected approach be tied to one or another set of algorithms, or one or another set of implementation infrastructure, or does it have to be able to serve a wide array of technology types?

Yeah, great question, right? This is a living ecosystem. We can no longer plan something for just the next two or three years. Technologies are coming every day, and the reason is that the types of use cases are evolving, and what you need to solve one use case is completely different from what you need for another. So whatever standards you come up with, the consistency has to be across how a user is onboarded into the system. Consistency has to be about data access, about security, about how one provisions these environments. But as far as what tool is used, or how that tool is applied to a specific problem, there is a lot of variability there, and your architecture has to make sure that this variability is addressed, because it is growing.

As these solutions have become more popular and have diffused across the industry, a lot more people are engaging.
Are all roles being served as well as they need to be?

Absolutely, I think that's the biggest challenge, right? In the past, when we talked about very prescribed solutions, end to end happened within those tools, so the different user personas were probably part of that particular solution. And the way those models came into production, which really means making them available to a consumer, was by re-coding or redeveloping them in technologies that were production-friendly: rewriting them in SQL, re-coding them in C. So a lot of detail is lost in translation. And the third big problem was really having visibility, or having a say, from a developer's or a data scientist's point of view, into how these things are performing in production: how do you take that feedback back into deciding whether the model is still good, or how to retrain? So when you look at this lifecycle holistically, it is an iterative process. It is no longer a workflow where you hand things off. This is not a waterfall methodology anymore. This is a very, very continuous and iterative process, especially with the new-age data science tools that are developing, where you build the model, the developer decides what the runtime is, and the runtimes are capable of serving those models as is. You don't have to re-code, and you don't lose things in translation. So, back to your question of how to serve the different roles: now all those personas, all those roles, have to be part of the same project, part of the same experiment. They're just serving different parts of the lifecycle, and whatever tooling or architecture technologies you provide have to look at it holistically. There has to be continuous development. There has to be collaboration. There have to be central repositories that actually cater to those needs.
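The "central repository" idea described above, where a model is registered once with enough metadata that downstream runtimes can serve it as-is, can be sketched as follows. The in-memory registry, the field names, and the toy model are all hypothetical illustrations; a real system would use a shared, governed store and a production serialization format.

```python
# Minimal sketch of a model registry: each model is stored with version,
# framework and evaluation metadata, so serving runtimes need no re-coding.

import pickle
from datetime import datetime, timezone

class ModelRegistry:
    def __init__(self):
        self._store = {}

    def register(self, name, model, framework, metrics):
        """Store a serialized model plus the metadata a runtime needs."""
        version = len(self._store.get(name, [])) + 1
        entry = {
            "version": version,
            "framework": framework,       # e.g. "sklearn", "tensorflow"
            "metrics": metrics,           # offline evaluation results
            "registered_at": datetime.now(timezone.utc).isoformat(),
            "blob": pickle.dumps(model),  # the artifact itself
        }
        self._store.setdefault(name, []).append(entry)
        return version

    def load(self, name, version=None):
        """Fetch a model for serving; latest version by default."""
        entries = self._store[name]
        entry = entries[-1] if version is None else entries[version - 1]
        return pickle.loads(entry["blob"]), entry

registry = ModelRegistry()
registry.register("churn", {"weights": [0.2, 0.8]}, "toy", {"auc": 0.91})
v2 = registry.register("churn", {"weights": [0.3, 0.7]}, "toy", {"auc": 0.93})
model, meta = registry.load("churn")
print(v2, meta["version"], model["weights"])  # 2 2 [0.3, 0.7]
```

Because every persona reads and writes the same entries, the data scientist's experiment, the reviewer's metrics, and the operator's deployed artifact all refer to one versioned record rather than to hand-offs.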
So what we have to look at, in this lifecycle, is making sure that all these communities are represented and addressed. If they build a model in a specific technology, how do we consume that? How do we take it in? Then how do we deploy it? From an enterprise point of view, it doesn't matter where a model gets built. It does matter how end users access it. It does matter how security is applied to it. It does matter how scaling is applied to it. So really, a lot of consistency is required in the operationalization, and also in how you onboard those different tools. How do you make sure that consistent methodology and standard practices are applied across this entire lifecycle? And also monitoring; that's a huge aspect. When you have deployed a model and it's in production, monitoring means two different things. One is availability: when you go to a website and click on something, is the website available? Very similarly, when you go to an endpoint and score against a model, is that model available? Do you have enough resources? Can it scale depending on how many requests come in? That's one aspect of monitoring. And the second aspect is really how the model is performing: what is the accuracy, what is the drift, when is it time to retrain? So you no longer have the luxury of looking at these things in isolation, right? We want to make sure that all of this can be addressed, knowing that this iteration can sometimes take a month, sometimes a day, sometimes probably a few hours. And that is why it can no longer be isolated. And even from an infrastructure point of view, some of these workloads may need things like GPUs, and you may need them for only a very short amount of time. How do you make sure that you give what is needed for the duration required, then take it back and assign it to something else? Because these are very valuable resources.

Great stuff from Nanda.
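The two monitoring aspects described above, endpoint availability and model performance, can be sketched as below. The availability check and the drift rule are deliberately simplified stand-ins: a real system would HTTP-probe a health endpoint and use richer drift statistics (population stability index, KL divergence) than a shift in the input mean.

```python
# Sketch of the two faces of model monitoring: (1) is the scoring endpoint
# available, and (2) have the live inputs drifted enough to retrain?

from statistics import mean, stdev

def endpoint_available(ping) -> bool:
    """Availability: 'ping' is any callable that returns truthy when the
    scoring service responds (a real check would GET a /health URL)."""
    try:
        return bool(ping())
    except Exception:
        return False

def drift_detected(train_sample, live_sample, z_threshold=3.0) -> bool:
    """Flag drift when the live input mean moves more than z_threshold
    standard errors away from the training-data mean."""
    mu, sigma = mean(train_sample), stdev(train_sample)
    se = sigma / (len(live_sample) ** 0.5)
    return abs(mean(live_sample) - mu) > z_threshold * se

train_sample = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
stable_live = [10.1, 9.9, 10.4, 10.0]
shifted_live = [14.0, 15.2, 14.8, 15.5]   # inputs have moved: retrain?

print(endpoint_available(lambda: True))           # True
print(drift_detected(train_sample, stable_live))  # False
print(drift_detected(train_sample, shifted_live)) # True
```

Both checks would run on the iteration cadence the speaker describes, whether that is monthly, daily, or hourly, and a drift alarm is what closes the loop back to the data scientist.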
How is this operationalized approach to machine learning operations going to turn into offerings? Well, let's hear what HPE had to say in this short product video.

Across industries, artificial intelligence and machine learning have the potential to help organizations innovate and deliver real business value. However, implementing AI at scale is hard. Many enterprises dabble with AI, but very few can scale up their AI/ML operations to deliver high-quality, reliable machine learning models in production. HPE Machine Learning Ops is an end-to-end data science solution that brings DevOps-like speed and agility to the machine learning lifecycle. Data scientists operate in a shared workspace and can instantaneously spin up development, training or production environments, with their choice of machine learning tools, in a matter of minutes. With a project and model repository, the model handoff is standardized and DevOps workflows are executed before full production deployment. HPE ML Ops provides native capabilities to deploy machine learning models to scalable, load-balanced and secure endpoints. An intuitive dashboard provides full visibility into utilization across all infrastructure resources. Integrations with third-party tools provide model performance monitoring and interpretability. With HPE ML Ops, you can standardize your ML workflows, reduce risk, improve team productivity and accelerate the time to value of your ML projects.

Always great to hear how we're going to take complex concepts and turn them into real products that deliver real business value. But let's close out this digital community event with a couple of observations on how HPE is going to advance the state of the art in machine learning and machine learning operations, and start to tie it together with some of the crucial infrastructure decisions that need to be made as we try to succeed more with machine learning inside our businesses.
Stu Miniman was able to sit down with Patrick Osborne of HPE recently. Let's hear what Stu and Patrick had to talk about.

Here with Patrick Osborne, who's the vice president and general manager of big data and secondary storage with HPE. Patrick, help us put into context what we've been talking about: ML Ops, AI and HPE's strategy in this area.

Yeah, thanks, Stu. As you all have reported in the past, and we've worked with you, HPE is very strong in infrastructure solutions. We've had quite a bit of success in AI and ML, including the Deep Learning Cookbook, which we released last year. So we're definitely helping customers along the maturity curve for AI and ML. You see that we've got a number of advisory services; one of the big things that gets called out by customers is that there's a skills gap in operationalizing and putting AI and ML workloads into production. We're also a thought leader, with quite a bit of research at HPE Labs on memory-driven computing and Gen-Z, and on being able to scale those workloads within the enterprise. So those are things we're building on, in addition to some pretty high-profile and very valuable software acquisitions for us in the last year: first BlueData, which we talked about today in the context of ML Ops, and then most recently MapR, which has a very powerful, scalable, persistent data layer for analytics. So AI is a top priority for us at HPE, it's part of our corporate narrative, and helping customers along that maturity curve is definitely where we're focused.

Great, so how are HPE and its partners helping customers along their journey with AI?

Yeah, so I think at the end of the day, HPE is very focused on our customers, especially from a go-to-market perspective. So we're in the phase now where we're helping customers not just explore, but operationalize AI and ML.
So whether it's cookbooks and reference architectures, or specific products like Machine Learning Operations, which helps you scale from a data scientist or a data engineer developing an algorithm on their laptop to running it at scale in the data center. For us, that journey is very important, especially around the outcome. From the technology partner perspective, we have a number of really high-profile new relationships that we're building for this new ecosystem around AI, ML and DL: folks like Dataiku and H2O, and on the hardware side, Intel and Nvidia. We're bringing that to our customers to provide a complete solution: being able to take those ISVs and run them in containerized, stateful deployments, and to partner with all of our hardware and software vendors. And then for the channel, we feel this is a great opportunity for them to move up the stack in how they talk to customers about their business outcomes. So it's part of a three-pronged strategy, and we're really focused on those key areas.

Yeah, no doubt it's an area that's getting attention from all sectors of the marketplace. So for those who are watching HPE, what should we expect to see from them in the near future?

Yeah, so from our perspective, we've got a number of releases coming up over the next year, and we're pretty excited about that, in addition to Machine Learning Operations. I think the world will continue to move toward containers for more than just stateless applications. We're starting now with AI and ML, and I think there's a big future for other applications, whether they're cloud native or refactored. Certainly, living within a world of Kubernetes is becoming more of a reality from a deployment perspective. So for us, we're very focused on the customer outcome. The other area HPE has become very well known for lately is consumption-based services.
So we're able to bring that vetted ecosystem, the containerized deployment model and platform, your accelerators, compute, networking and storage, even a persistent data layer, and even the cloud experience, to the customer as a business outcome and a consumption experience through GreenLake, and that's something we think is very valuable for our customers.

Thank you, Stu and Patrick. I'm Peter Burris, and again, this has been another Wikibon theCUBE digital community event, sponsored by HPE. Now stay tuned for our crowd chat, which follows immediately after this. This will be your opportunity to ask your questions, share your experiences and push forward the community's aggregate thinking on machine learning operations, a very, very important topic. So thank you very much for watching and participating in our crowd chat.