from Burlingame, California. It's theCUBE, covering Sumo Logic Illuminate 2019, brought to you by Sumo Logic.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're at the Sumo Logic Illuminate 2019 event at the Hyatt Regency San Francisco Airport. About 800, 900 people. It's our second year; it's the third year of the event. Excited to be here and watch it grow. We've seen a bunch of these things grow from little to big over a number of years, and it's always fun to be here on the way up to the zenith. We're excited to be joined by our next guest. She's an analyst: Nancy Gohring, senior analyst at 451 Research. Nancy, great to see you.

Thank you, thanks for having me.

Absolutely. So first off, just kind of impressions of the event here.

Yeah, good stuff, you know? Definitely trying to get on top of some of the big trends. The big news here was their new Kubernetes monitoring tool, so obviously they're staying on the leading edge of the cloud-native technologies.

It's amazing how fast it's all growing. I was doing some research for this event and found some of your stuff on the internet, and just one quote, I think from a couple of years ago, gives people a sense of the scale: you said Google was launching four billion containers a week, Twitter had 12,000 services, Uber 4,000 microservices, and Yelp was ingesting 25 million data points per minute. And that's a two- or three-year-old presentation. The scale at which the data is moving is astronomical.

Yeah, well, if you think of Google launching four billion containers every week, they're collecting a number of different data points about a container spinning up, about the operation of that container while it's alive, about the container spinning down. So it's not even just four billion pieces of data; multiply that by 10 or 20 or many more. The volume of operations data that people are faced with is just out of this world. And some of that is beginning to get abstracted away in terms of what you need to look at. Kubernetes is an orchestration engine, so that's helping move things around, but you still need to collect that data to inform automation tools, right? Even if humans aren't really looking at it, it's being used to drive automation. It still has to be collected.

Right, and there are still configurations and settings and dials, and it seems like a lot of the breaches that we hear about today are people just misconfiguring something on AWS. It's human error.

It's human error.

And so how do we square the circle? Because the data is only growing: the quantity, the sources, the complexity, the lack of structure. And that's before we had IoT, and now we've got edge devices, and they're all reporting in from home. It's a crazy problem.

It's really driving, I think, a lot of the investment and focus in more sophisticated analytics. That's why you're hearing a lot more about machine learning and AI in this space: humans can't just look at that huge volume of data and figure out what it means. So machine learning tools, for instance, are being developed to pull out the piece of data that's important: here's the anomaly, this is the thing you should be paying attention to. And then obviously they're getting increasingly sophisticated in terms of correlating data from different parts of your infrastructure in order to make sense of it.
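(Editor's note: a minimal sketch of the kind of anomaly surfacing Gohring describes, flagging the handful of data points worth a human's attention in a large metric stream. The rolling window, z-score threshold, and sample series are illustrative assumptions, not any vendor's actual algorithm.)

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(points, window=60, threshold=3.0):
    """Surface points that deviate sharply from the recent rolling baseline.

    A toy stand-in for the ML-driven anomaly detection discussed above:
    instead of a human scanning millions of data points, the pipeline
    flags only the values worth looking at.
    """
    recent = deque(maxlen=window)  # rolling window of recent samples
    anomalies = []
    for ts, value in points:
        if len(recent) >= 2:
            mu, sigma = mean(recent), stdev(recent)
            # z-score: how many standard deviations from the baseline?
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((ts, value))
        recent.append(value)
    return anomalies

# A latency-like series with one spike buried in 10,000 samples.
series = [(t, 100 + t % 7) for t in range(10_000)]
series[5_000] = (5_000, 900)  # the needle in the haystack
print(detect_anomalies(series))  # -> [(5000, 900)]
```

Real systems replace the fixed z-score with learned baselines, but the point is the same: the human only sees the one flagged point, not the ten thousand raw ones.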
Right. And then, oh, by the way, they're all made up of microservices that are all interconnected, plus APIs to third-party providers. I mean, the complexity is ridiculous.

Yeah, and I've actually been thinking and talking a lot recently about organizational issues within companies that exacerbate some of these challenges. So you mentioned microservices. A lot of times you've got DevOps groups, and an individual DevOps group is responsible for one or multiple microservices, right? They're all running sort of autonomously, doing their own thing, so that they can move quickly. But is there anybody overseeing the application that's made up of maybe 1,000 microservices? In some cases, the answer is no. So it may look like all the microservices are operating well, but the user experience actually is not good, and no one really notices until the users start complaining. So you have to think about the organizational side. Who's responsible for that? If you're on a DevOps team and your job has been to support certain services and not the whole, then who's responsible for the whole application? It's a challenge. In our surveys, we're actually hearing from people that they're looking for that skill set: someone who understands how to look at microservices as they work together to deliver a service. It's a pain point.

Wouldn't the project or product manager for that application hopefully have some visibility into what they're trying to optimize for?

In some cases, they're not technical enough, right? A product manager doesn't necessarily have the depth to know that, or they're not used to using the types of tools that the DevOps team or the operations team would use to track the performance of an application. So sometimes it's just a matter of getting the right tooling in front of them.

And then even on performance, what are you optimizing for? Are you optimizing for security? Are you optimizing for speed? Are you optimizing for experience? You can't optimize for everything. You've got to stack-rank them at some point, and that would then drive different prioritization, or the way that you look at the performance of those microservices.

Yeah, yeah. Interesting.
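(Editor's note: a hypothetical sketch of the gap Gohring describes, where every microservice passes its own team's health check while the composed request path still fails the user. The service names and latency budgets are invented for illustration.)

```python
import random

# Hypothetical request path: five dependent microservices, each owned
# by its own DevOps team and each comfortably inside its own budget.
SERVICES = ["gateway", "auth", "catalog", "pricing", "checkout"]
PER_SERVICE_BUDGET_MS = 100   # each team's own latency target
USER_BUDGET_MS = 300          # what the user will actually tolerate

def service_latency_ms(name: str) -> float:
    # Every team's dashboard is green: 60-90 ms, well under 100 ms.
    return random.uniform(60, 90)

latencies = {s: service_latency_ms(s) for s in SERVICES}
per_service_ok = all(l < PER_SERVICE_BUDGET_MS for l in latencies.values())
end_to_end = sum(latencies.values())  # the user waits for the whole chain

print(f"every microservice healthy: {per_service_ok}")  # always True
print(f"user-facing latency: {end_to_end:.0f} ms")      # ~300-450 ms
# False: the chain blows the user's budget even though no single
# service ever missed its own target.
print(f"user experience ok: {end_to_end < USER_BUDGET_MS}")
```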
So another big topic that comes up often is the vision of a single pane of glass. And I can't help but think, in my own work day, how often I'm tabbing between Salesforce and email and Slack and Asana, with a couple of browsers open. I mean, it's bananas; it's no longer just email that's open on my desk all day. And you can only imagine the DevOps world, with the crazy complexity we just talked about: managing all the microservices, all the APIs. So what's the story? What are you seeing in the development of that? There are so many vendors now and so many services. It's not just, we're going to put in HP OpenView and that's the standard and that's what we're all on.

So if you're looking at it from the lens of monitoring or observability or performance, traditionally you had different tools that looked at, say, different layers of a service. You had a tool that was looking at infrastructure: your infrastructure monitoring tool. You had an application performance monitoring tool. You might have a network performance monitoring tool. You might have point tools that look just at the database layer. But as applications get much more complex, looking at that data in a siloed tool tends to obscure the bigger picture. When you're looking at these separate tools, you don't understand how some piece of infrastructure might be impacting the application, for instance. So the idea is to bring all of that operations data about the performance of an application into one spot where you can run, again, these more sophisticated analytics, so that you can understand the relationships between the different layers of the application stack, and also horizontally: how microservices that are dependent on each other, how one microservice might be impacting the performance of another. That's conceptually the idea behind having a single pane of glass. Now, the execution can happen in a bunch of different ways. There are vendors that are growing horizontally, collecting data across the stack. There are other vendors positioning themselves as that sort of central data repository: they may not directly collect all of that data themselves, but they might ingest data that another monitoring vendor has collected. And there are always going to be good arguments for best-of-breed tools, so in most cases businesses are not going to settle on just one monitoring tool that does it all. But that's conceptually the reason: you want to bring all of this data together, however it's being collected, so that you can analyze it and understand the big-picture performance of a complicated application.

Right, but even then, as you said, you're not really monitoring the application performance per se; you're waiting for some of those needles to fall out of the haystack, because you just can't watch it all. There's so much stuff. Where do you focus your priority? What's most critical? What needs attention now? Without a machine to help point you in the right direction, you're going to have a hard time finding that needle.

Yeah, and there are a lot of different approaches beginning to develop. One is this idea of SLOs, or service level objectives. For instance, a really common service level objective that teams look at is latency: the latency of the service should never exceed, whatever, 100 milliseconds, and if it does, I want to be alerted. And also, if you're missing that objective for a certain amount of time, that can actually help you as a team allocate resources. If you're not living up to that service level objective, maybe you should shift some people's time to working on improving the application instead of developing a new feature, right? So it can really help you prioritize your time, because there was a time when people on operations teams or DevOps teams had a really hard time, and they still do, figuring out which problems are important. People always have a lot of performance problems going on, so which do you focus your time on? And it's been pretty opaque. It's hard to see: is this performance problem impacting the bottom line of my business? Is this impacting my customers? Are we losing business over this? That's a really common question that people can't answer. So people are beginning to develop these approaches to try to figure out how to prioritize work on performance problems.
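(Editor's note: a minimal sketch of the SLO arithmetic described here, using the 100-millisecond latency objective from the conversation; the 99.9% compliance target and the sample request data are assumptions for illustration.)

```python
# Assumed objective: 99.9% of requests complete within the 100 ms
# latency target over the measurement window.
SLO_TARGET = 0.999
LATENCY_OBJECTIVE_MS = 100

def slo_report(latencies_ms):
    """Return SLO compliance and the fraction of error budget remaining."""
    total = len(latencies_ms)
    good = sum(1 for l in latencies_ms if l <= LATENCY_OBJECTIVE_MS)
    compliance = good / total
    allowed_bad = 1 - SLO_TARGET        # the error budget: 0.1% of requests
    actual_bad = (total - good) / total
    return compliance, 1 - actual_bad / allowed_bad

# 10,000 requests in the window, 25 of them slow: 0.25% bad
# against a 0.1% budget.
samples = [80] * 9_975 + [250] * 25
compliance, budget_left = slo_report(samples)
print(f"compliance: {compliance:.4%}")               # 99.7500%
print(f"error budget remaining: {budget_left:.0%}")  # -150%: overspent
# A negative budget is the signal to shift people from new features
# to reliability work, as described in the conversation.
```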
It's interesting, because the other one you've mentioned before is this idea of a post-incident review instead of a post-mortem. You've talked about culture, and words matter. I think that's a really interesting take, because it implies we're going to learn and we're going to go forward, as opposed to: it's dead, we're going to yell at each other, someone's going to get blamed.

That's exactly it.

And we're going to move on. So how has that evolved? And how does that really help organizations do a better job?

I mean, there's much more of a focus on setting aside time to do that kind of analysis: look at how we're performing as a team, look at how we responded to an incident, so that you can find ways to do better next time. Some of that is really tactical. It's tweaking alerts: did we not get an alert? Did we not even know this problem was happening? So maybe you build new alerts, or get rid of a bunch of alerts that did nothing. There's a lot you can learn. And to your point, I think part of the reason people have started calling it a post-incident review instead of a post-mortem is because you don't want it to be a session where people feel blamed: this is my fault, I screwed up, I spent way too long on this, I hadn't set things up properly. It's meant to be productive. Let's find the weak points and fill those gaps.

It's funny, there was another thing I found where you were talking about, not necessarily the post-mortem, but people being much more proactive, much more thoughtful about how they're going to take care of these things. And it's really more of a social, cultural change than the technical piece. That culture piece is so, so important.

It is. And especially right now there's a lot of focus on tooling, and that can cause some interesting issues. In an organization that has really adopted DevOps practices, the idea of a DevOps team is that it's very autonomous: they do what they need to do to move fast and get the job done, and that often includes choosing their own tools. But that has created a number of problems, especially in monitoring. If you have 100 DevOps teams and they've all chosen their own monitoring tools, that's not efficient. Those tools aren't talking to each other, even though the microservices are dependent on each other. It's inefficient from a business perspective: you've got all these relationships with vendors, and in some cases, with a single vendor, you might have 50 instances of the same monitoring tool and 50 accounts with them. That's just totally inefficient. And then all the individual DevOps teams have a person who's supposed to be the resident expert in these tools; maybe you should be sharing that knowledge across teams. But my point is, you get into this situation where you have dozens of monitoring tools, sometimes 40 or 50. You realize that's a problem. How do you address it? Because you're going to have to go out and tell people they can't use this tool that they love, that helps them do their job, that they chose. And so, again, this whole cultural question comes up: how do you manage that transition in a way that's going to be productive?
The other one that you brought up, which I thought was interesting, is where the support team basically tells the business team: you only get X number of incidents. We're going to give you a budget, and if you exceed the budget, we're not going to help you. It's a really different way to think about prioritization.

Yeah, I don't necessarily think that's a great approach. I mean, there was somebody who did that, but...

But I think it's an interesting idea. You talked about it in one of your presentations, and it makes you rethink: why do we have so many incidents? There shouldn't be that many incidents. Maybe some of the responsibility should shift to thinking about why and how, treating it as more of a systemic problem than a feature problem or a bug or a piece of broken code. So again, I think there are so many cultural opportunities to rethink this in a world of continuous development, continuous publishing, continuous pushing out of new code.

Yeah, yeah, for sure.

All right, Nancy, well, thanks for taking a few minutes. It was really great to talk to you.

Thanks for having me.

All right, she's Nancy, I'm Jeff. You're watching theCUBE; we're at Sumo Logic Illuminate 2019. Thanks for watching, we'll see you next time.