From around the globe, it's theCUBE, with digital coverage of IBM Think 2021, brought to you by IBM.

Welcome back, everyone, to theCUBE's coverage of IBM Think 2021 virtual. I'm John Furrier, your host. I've got a great guest here: Robin Hernandez, Vice President, Hybrid Cloud Management and Watson AIOps. Robin, great to see you. Thanks for coming on theCUBE.

Thanks so much for having me, John.

You know, hybrid cloud, the CEO of IBM, Arvind, loves cloud. We know that. We've talked to him all the time about it. And cloud is now part of the entire DNA of the company. Hybrid cloud is validated, multicloud is around the corner. This is the underpinnings of the new operating system of business. And with that comes massive change. We've seen IT move to large scale. You're seeing transformation driving innovation, driving scale. And AI is at the center of it. So AIOps is a huge topic. I want to jump right into it. Can you just tell me about your day-to-day IT operations teams? What are you guys doing? How are you organized? How are you bringing value to customers? What are your teams responsible for?

Yeah, so for a few years we've been working with our IT customers, our enterprise customers, on this transformation they're going through as they move more workloads to cloud while they still have some workloads on premises, or they have a strategy of using multiple public clouds. Each of those cloud vendors has different tools, so they're faced with: how do I keep up with the changing rate and pace of this technology? How do I build skills on a particular public cloud vendor when maybe six months from now another cloud vendor, or another technology, will be introduced? It's almost impossible for an IT team to keep up with that rate and pace of change.
So we've really been working with IT operations, transforming their processes and the skills within their teams, and looking at which tools they use to move to this cloud operations model. And then, as part of that, how do they leverage the benefits of AI and make it practical and purposeful in this new mode of cloud operations?

The trend that's been booming is this idea of the site reliability engineer. It's really an IT operations role that's become a new mix of engineering, IT, and development. I mean, classic DevOps: we've seen dev and ops, right? You've got to operate what the developers build as modern apps come in. Infrastructure as code has been around for a while. But now, with the maturation of things like Kubernetes and microservices, people are programming the infrastructure, and the scale is there. It's going to a whole enterprise level with containers and other things. How has the site reliability engineering persona, if you will, or IT ops, changed specifically? Because that's where the action is, and that's where you hear things like observability, "I need more data," break down the silos. What's that all about? What's your view?

Yeah, so site reliability engineering, or SRE practices as we call it, hasn't really changed the processes per se that IT has to do, but it has accelerated those processes at an enormous rate and pace. And the tools, as you mentioned, cloud-native tools like Kubernetes, have accelerated how those processes are executed: everything from releasing new code, and working with development to actually code the infrastructure and policies into the development process, to maintaining and observing the application over its lifecycle, the performance, the availability, the response time, and the customer experience.
All of those processes that used to happen in silos, with separate teams and sort of a waterfall approach, are now, with SRE practices, happening instantaneously. They're being scaled out, and failback is happening much more quickly, so that applications do not have outages. The rate and pace of this has accelerated enormously. This is the transformation of what we call cloud operations. And we believe that as IT teams work more closely with developers and move toward this SRE model, they cannot do it just by changing personnel, skills, and tools. They have to do it with modernized capabilities like AI. This is where we recommend applying AI to those processes, so you get automation on the back end that you would not think about in traditional IT operations, or even in an SRE practice. You have to leverage new technologies like AI to accelerate even further.

Let's unpack the AIOps piece, because I think that's the key point, but it's also kind of confusing to some folks. IT operations, people see that changing; you just pointed out why. Obviously the tools and the culture are changing. But AI becomes a scale point because of the automation piece you mentioned. How does that thread together? How does AIOps specifically change a customer's approach in terms of how they work with their teams and how that automation is applied? Because I think that's the key thread, right? Everyone kind of gets the cultural shifts and the tools, but now they want to scale it, and that's where automation comes in. Is that the right way to think about it? What's your view? This is important.

It's absolutely right.
And I always like to talk about AI in other industries before we apply it to IT, because a lot of times IT looks at AI as a buzzword and says, oh, sure, this is going to help me. We've been doing AI for a long time at many different companies, not just at IBM. But if you think about the other industries where we've applied it, healthcare in particular is tangible for most people. AI didn't replace the doctor, but it helps the doctor see things that would otherwise take weeks or months of studying and analyzing different patients: hey, John, this may be a symptom we overlooked, or a diagnosis we didn't consider, without manually going through all that research. AI can accelerate that so rapidly for a doctor. It's the same notion for IT. If we apply AI properly to IT, we can accelerate things like remediating incidents or finding a performance problem that might take you or me weeks, or even hours, to find. AI applied properly can find and diagnose those issues for IT much more rapidly, just as it can in healthcare.

You know, I want to get your thoughts on something while you're here, because you've been in the business a long time, 20 years of experience, from the old world to the modern era you're managing now. Clients are having a scenario where, okay, I'm changing over the culture. I've got some cloud, some public, some hybrid, and we did some agile things where we provisioned, it's all done, it's out there, and all of a sudden someone adds something new and it crashes. Now I've got to get in: where's the risk? Where are the security holes? They're seeing this kind of day-two operations, as some people call it, another buzzword, but it's becoming, okay, we've got it up and running, but we're still going to push some code, and things are starting to break.
And that's a net-new thing, so it's kind of like they're out of their comfort zone. This is where I see AIOps evolving quickly, because there's a DevSecOps piece, and there's also data involved, observability. How do you talk to that scenario? Okay, you sold me on cloud, I've been doing it, I did some projects, we're up and running, we've got a production system, and we added something new, maybe something trivial, and it breaks stuff.

Yes. With the new cloud operations and SRE, IT teams are much more responsible for business outcomes, not just, as you say, the application being deployed and available, but the life cycle of that application and the results it's bringing to the end users and the business. What this means is that IT needs to partner much more closely with development, and it's hard for them to keep up with the tools being used and the new code, architectures, and microservices that developers are adopting. So we like to apply AI in what we call the change risk management process. Everyone's familiar with change management: a new piece of code is being released, you have to maintain where that code is being released to as part of the application architecture, and make sure it's scaled out and rolled out properly within your enterprise policies. When we apply AI, we attach what we call a risk factor to that change, because we know application outages so often occur when something new enters the environment. By applying AI, we can give you a risk rating that says there's an 80% probability that this code change you're about to roll out is going to cause a problem in this application. That allows you to go back and work with the development team and say, hey, how do we reduce this risk? Or to decide to take that calculated risk, but push it with visibility into where those problems may occur.
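The change-risk rating described here can be sketched in a few lines. This is a toy illustration, not Watson AIOps' actual model: the feature names and weights below are invented for the example, whereas a real system would learn them from the organization's own incident history.

```python
# Toy change-risk scorer: rate an incoming change by signals that have
# historically correlated with outages. Features and weights are invented
# for illustration only.

def change_risk(change: dict) -> float:
    """Return a 0..1 risk estimate for a proposed change."""
    score = 0.0
    if change.get("touches_shared_service"):
        score += 0.3                               # blast radius is large
    if change.get("lines_changed", 0) > 500:
        score += 0.2                               # big diffs fail more often
    score += 0.3 * change.get("past_incident_rate", 0.0)  # component history
    if change.get("off_hours_deploy"):
        score += 0.2                               # fewer eyes on the rollout
    return min(score, 1.0)

risky = {
    "touches_shared_service": True,
    "lines_changed": 800,
    "past_incident_rate": 1.0,
    "off_hours_deploy": True,
}
print(f"risk: {change_risk(risky):.0%}")
```

A high score would gate the release for review, exactly the "go back to the development team or take a calculated risk" decision described above; the value of the real AI system is that the weights come from learned history rather than hand-tuning.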
So this is a great example, change risk management, of how applying AI can make you more intelligent in your decisions, much more tied to the business and to the application release team.

That's awesome. While I've got you here on this point of change management, the term "shift left" has come up a lot in the industry. I'd like to get your quick definition of what that means. What does shift left mean for ops teams with AIOps?

Yeah, so in the early days of IT there was a hard line between your development and IT teams. We always said it was throwing it over the fence, right? The developers would throw the code over the fence and say, good luck, IT, figure out how to deploy it, where it needs to be deployed, and cross your fingers that nothing bad happens. Well, shift left is really about breaking down that fence. If you think of your developers on your left-hand side, you being the IT team, it's really shifting more toward that development team: getting involved in the code release process, getting involved in their CI/CD pipeline to make sure your enterprise policies, and whatever that code needs to run effectively in your enterprise application architecture, are coded ahead of time with the developer. So it's really about partnering between IT and development, shifting left toward collaboration, versus throwing things over the fence and playing the blame game, which happened a lot in the early days.

Yeah, and they get a smarter team out of it. Great point, great insight. Thanks for sharing that. I think it's super relevant. That's the hot trend right now: making developers more productive, building security in from the beginning while they're coding, code it right in, make it security-proof, if you will.
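The shift-left practice of coding enterprise policies into the CI/CD pipeline can be illustrated with a minimal sketch. The `check_deployment` helper and the policy rules here are hypothetical, invented for this example, not part of any IBM product; real pipelines typically delegate this to a dedicated policy engine.

```python
# Toy "policy as code" gate: validate a deployment spec in the pipeline,
# before anything reaches production. Rules shown are examples only.

def check_deployment(spec: dict) -> list:
    """Return a list of policy violations for a deployment spec."""
    violations = []
    if spec.get("replicas", 0) < 2:
        violations.append("replicas must be >= 2 for high availability")
    for container in spec.get("containers", []):
        if "memory_limit" not in container:
            name = container.get("name", "?")
            violations.append(f"container {name} is missing a memory limit")
    return violations

# A spec that would have been "thrown over the fence" in the old model
# now fails fast, with the developer, in the pipeline.
spec = {"replicas": 1, "containers": [{"name": "api"}]}
for problem in check_deployment(spec):
    print("POLICY:", problem)
```

The point of the sketch is the placement, not the rules: the check runs in the developer's pipeline, so policy problems surface where the code is written rather than after deployment.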
I've got to ask you one of the organizational questions, as an IBM leader: what are some of the roadblocks you see in organizations when they embrace AIOps, or try to scale it, and how can they overcome those blockers? What can you share with folks who may be watching and trying to solve this problem?

Yeah, so AI in any industry or discipline is only as good as the data you feed it. AI is about learning from past trends and creating a baseline for what is normal in your environment, what is most optimal, your enterprise application running in steady state. Think back to the healthcare example: if we only have five or six pieces of patient data to feed the AI, then the AI's recommendation to the doctor is going to be pretty limited. We need a broad set of cases across a wide demographic of people. It's the same with applying AI to IT: you need a broad set of data. So one of the roadblocks we hear from many customers is, well, I'm using an analytics tool already and I'm not really getting good recommendations or automation out of it. And we often find it's because they're pulling data from one source. Likely they're pulling performance metrics, what's happening with the infrastructure: CPU utilization, memory utilization, storage utilization. Those are all good metrics, but they lack the context of everything else in your environment: what's happening in your logs, unstructured data from things like collaboration tools, what your teams are saying, what customers are saying about their experience with your application. You have to pull in many different data sets across IT and the business to make the AI recommendations the most useful.
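The "learn the baseline, then flag deviations" idea, applied across more than one data source, can be sketched like this. It's an illustrative toy using a simple z-score test; real AIOps platforms ingest many more sources and use far richer models than this.

```python
# Toy baseline-and-detect sketch: learn "normal" from history, then flag
# deviations -- here across two sources (an infrastructure metric plus a
# log-derived signal), since one source alone lacks context.

from statistics import mean, stdev

def is_anomalous(history: list, current: float, threshold: float = 3.0) -> bool:
    """Flag the current value if it deviates > threshold sigmas from baseline."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold

cpu_history = [40, 42, 41, 39, 43, 40, 41]   # % CPU utilization samples
err_history = [2, 1, 3, 2, 2, 1, 2]          # errors/min parsed from logs

# Correlate the two signals: both deviating together is a stronger
# indicator of a real incident than either one alone.
alert = is_anomalous(cpu_history, 85) and is_anomalous(err_history, 30)
print("incident likely" if alert else "normal")
```

Requiring both signals to deviate is the crudest possible correlation, but it captures the roadblock Robin describes: a single-source analytics tool can't distinguish a noisy metric from a real incident, while cross-source context can.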
And so we recommend a more holistic, true AI platform, versus a segregated data approach to feeding the analytics or AI engine.

That's awesome. It's like a masterclass right there, Robin. Great stuff, great insight. Quickly, to wrap: I'd love to take a quick minute for you to share some of the use cases to get started and really get into AIOps and see some successes, for people who want to explore more, dig in, and get into this fast. What are some use cases? What's some low-hanging fruit? What would you share?

Yeah, we know that IT teams like to see results and they hate black boxes. They like to see into everything that's happening and understand it deeply, and so this is one of our major focus areas; as we say, we're making AI purposeful for IT teams. We have visions, and lots of our enterprise customers have visions, of applying AI to everything from the customer experience of the application to cost management of the application and infrastructure, in many different aspects. But some of the low-hanging fruit is really expanding the availability and the service level agreements of your applications. Many people will say, I have 93% uptime availability, an agreement with my business that this application will be up 93% of the time. Applying AI, we can increase that to 99.9%, because it learns from past problems, creates that baseline of what's normal in your environment, and then tells you before an application outage occurs. So avoiding application outages, and then improving performance: recommendations on scalability, the number of users coming in versus your normal scale rate, and automating that scaling. Performance and scalability improvements are another low-hanging-fruit area where many IT teams are struggling.

Yeah, I mean, why wouldn't you want AIOps there? Totally cool, very relevant.
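For context on the 93% versus 99.9% figures, a quick bit of arithmetic shows what each availability level allows in downtime (a sketch of standard SLA math, not an IBM calculation):

```python
# Downtime arithmetic behind availability SLAs: how much outage each
# availability level permits over a year.

def downtime_hours_per_year(availability: float) -> float:
    """Hours of permitted downtime per year at a given availability."""
    return 365 * 24 * (1 - availability)

print(f"93%   availability -> {downtime_hours_per_year(0.93):.1f} hours/year")
print(f"99.9% availability -> {downtime_hours_per_year(0.999):.1f} hours/year")
```

Going from 93% to 99.9% is the difference between roughly 613 hours (about 25 days) and about 9 hours of allowable downtime per year, which is why outage prediction is such high-value low-hanging fruit.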
You're seeing hybrid cloud standardized all across business. You've got to have that data, you've got to have that incident management working there. Robin, great insight. Thank you for sharing. Robin Hernandez, Vice President of Hybrid Cloud Management and Watson AIOps. Thanks for coming on theCUBE.

Thank you so much for having me, John.

Okay, this is theCUBE's coverage of IBM Think 2021. I'm John Furrier, your host. Thanks for watching.